Large blobs in Azure are uploaded as a series of blocks. A blob may contain a maximum of 50,000 blocks.
Therefore, the wasi-blobstore implementation cannot upload data as the guest writes it, because that could result in running out of blocks. That's fine: the implementation can just buffer the data until it has accumulated a reasonable amount to write...
...but how can it determine a "good" amount? A buffer size that is safe for even the largest of large blobs is 4GB (the maximum block size), but that's a ridiculous amount of buffering for what is realistically an edge case! Or I might guess 1MB, which would let me upload blobs up to 50GB (1MB × 50,000 blocks), but then someone is going to get mad at me when their 51GB upload fails.
A possible way to address this is to allow a length hint on outgoing values, either at construction time or when calling outgoing-value-write-body. This would let the implementation choose a buffer size large enough to keep the number of blocks within the 50,000-block limit.
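For illustration, here is a minimal sketch (Rust, with made-up constants and a hypothetical block_size_for_hint helper; none of this is part of the wasi-blobstore interface) of how a host could turn such a hint into a block size:

```rust
// Hypothetical sketch: pick a block size from a length hint so the upload
// stays within Azure's 50,000-block limit without a worst-case 4GB buffer.
const MAX_BLOCKS: u64 = 50_000;
const MIN_BLOCK_SIZE: u64 = 1024 * 1024;            // 1MB floor (arbitrary choice)
const MAX_BLOCK_SIZE: u64 = 4 * 1024 * 1024 * 1024; // ~4GB per-block limit

/// Returns a block size such that MAX_BLOCKS blocks cover `length_hint` bytes,
/// or None if the hinted length exceeds what a block blob can hold at all.
fn block_size_for_hint(length_hint: u64) -> Option<u64> {
    let needed = length_hint.div_ceil(MAX_BLOCKS).max(MIN_BLOCK_SIZE);
    if needed > MAX_BLOCK_SIZE {
        None // beyond the ~190TB ceiling
    } else {
        Some(needed)
    }
}

fn main() {
    // A 300GB hint lands on roughly 6MB blocks instead of a 4GB buffer.
    println!("{:?}", block_size_for_hint(300 * 1024 * 1024 * 1024));
}
```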
Additional thoughts:
I could cook up some sort of adaptive scheme: say, start with a 1MB buffer, then double the buffer size on each flush until I hit the maximum block size. Nobody is going to complain about losing roughly 48GB of their maximum 190TB. But as an implementer, this does not spark joy...
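Roughly what that adaptive scheme would look like (again just a sketch; AdaptiveUploader and stage_block are hypothetical names, and a real implementation would also flush the final partial block and commit the block list):

```rust
// Adaptive buffering sketch: flush a block whenever the buffer reaches the
// current target size, then double the target, capped at the ~4GB maximum.
const MAX_BLOCK_SIZE: usize = 4 * 1024 * 1024 * 1024; // assumes a 64-bit host

struct AdaptiveUploader {
    buffer: Vec<u8>,
    target_block_size: usize, // starts at 1MB, doubles on every flush
}

impl AdaptiveUploader {
    fn new() -> Self {
        Self { buffer: Vec::new(), target_block_size: 1024 * 1024 }
    }

    fn write(&mut self, data: &[u8]) {
        self.buffer.extend_from_slice(data);
        while self.buffer.len() >= self.target_block_size {
            let block: Vec<u8> = self.buffer.drain(..self.target_block_size).collect();
            self.stage_block(&block);
            self.target_block_size = (self.target_block_size * 2).min(MAX_BLOCK_SIZE);
        }
    }

    fn stage_block(&self, _block: &[u8]) {
        // Placeholder: a real implementation would issue an Azure "Put Block"
        // request here, and a finish step would commit the block list.
    }
}

fn main() {
    let mut up = AdaptiveUploader::new();
    up.write(&vec![0u8; 3 * 1024 * 1024]); // stages a 1MB block, then a 2MB block
}
```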
In many situations, letting an arbitrary guest fill a 4GB buffer is a one-way trip to a denial of service, so it's possible I'd want to limit the buffer size (and hence the maximum blob size) anyway. Even in that case, a length hint would still be useful: it would let me send an informative error when the guest asks for an excessive length, rather than failing partway through the upload.
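As a sketch of that last point (BUFFER_CAP and check_length_hint are made up for illustration, not anything in the spec):

```rust
const MAX_BLOCKS: u64 = 50_000;
const BUFFER_CAP: u64 = 64 * 1024 * 1024; // example per-value buffer cap: 64MB

/// With a length hint, an oversized value can be rejected up front with a
/// clear message instead of failing partway through the upload.
fn check_length_hint(length_hint: u64) -> Result<(), String> {
    let max_blob_size = BUFFER_CAP * MAX_BLOCKS; // 64MB blocks x 50,000 ≈ 3TB
    if length_hint > max_blob_size {
        Err(format!(
            "blob of {} bytes exceeds this host's maximum of {} bytes",
            length_hint, max_blob_size
        ))
    } else {
        Ok(())
    }
}

fn main() {
    assert!(check_length_hint(1024).is_ok());
    assert!(check_length_hint(u64::MAX).is_err());
}
```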