
Investigate reducing memory usage of chunk uploading #2291

Open
szokeasaurusrex opened this issue Dec 3, 2024 · 9 comments

Comments

@szokeasaurusrex (Member)

Chunk uploading appears to use a very large amount of memory when uploading large files, since we appear to read the entire file contents into memory. For instance, the (currently experimental) Proguard chunk uploading feature greatly increases memory usage compared to uploading with chunk uploading disabled. While profiling an upload of a 160 MB Proguard file, I observed memory usage of over 300 MB with chunk uploading enabled, versus roughly 90 MB with it disabled.

As this problem likely affects all chunk uploads, not just Proguard files, we should investigate if we can reduce memory usage, e.g. by not reading the entire files into memory.

@szokeasaurusrex (Member, Author)

Apparently chunk uploading's large memory usage is causing problems for the Ingest team. Thanks @IanWoodard and @asottile-sentry for bringing this to my attention.

@asottile-sentry (Member)

Just so it's written down -- my hypothesis is that it might not be the chunk uploading itself, but the preprocessing / loading of the debug files themselves. Probably the easiest way to get to the bottom of it would be to use a Rust memory profiler to better understand where the overhead is happening.

In theory, a proper streaming uploader shouldn't need much more than the chunk size of total memory to perform a successful upload (on the order of MBs, ideally), though we're running out of multi-GB headroom at the moment.
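
As a rough illustration of that bound: a fixed-buffer read loop keeps peak memory near the chunk size regardless of the file size. A minimal sketch, not sentry-cli code; the function name and the 8 MiB constant (the chunk size mentioned in the next comment) are illustrative:

```rust
use std::fs::File;
use std::io::{self, Read};

// Illustrative chunk size; matches the 8 MiB chunks mentioned below.
const CHUNK_SIZE: usize = 8 * 1024 * 1024;

// Reads the file one buffer at a time and hands each chunk to a callback.
// Peak memory stays close to CHUNK_SIZE, independent of the file's size.
fn stream_chunks(path: &str, mut handle_chunk: impl FnMut(&[u8]) -> io::Result<()>) -> io::Result<()> {
    let mut file = File::open(path)?;
    let mut buf = vec![0u8; CHUNK_SIZE];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break; // end of file
        }
        // Only `n` bytes are valid; the same buffer is reused for every chunk.
        // A real chunked uploader would fill the buffer completely (e.g. with
        // repeated reads) before hashing/sending, but the memory bound is the same.
        handle_chunk(&buf[..n])?;
    }
    Ok(())
}
```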

@szokeasaurusrex (Member, Author)

With the memory profiler, I discovered that what is eating memory is indeed the chunk uploading process. The reason we consume much more than one chunk's worth of memory is that we send multiple 8 MiB chunks at once (we group chunks into requests of up to 32 MiB). Additionally, due to how the curl library works, we need to allocate twice that amount of memory per request (64 MiB). We also use 8 threads to upload requests concurrently, yielding a theoretical memory usage of 64 MiB * 8 = 512 MiB. In reality, we use even more than this (I observed close to 900 MB while uploading a ~500 MB file), likely due to overheads.

We can greatly reduce memory usage by reducing the request and chunk sizes. Setting both to 1 MiB yielded a maximum memory usage of ~130 MB in my testing (again, with a ~500 MB file), with only a slight slowdown.
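
Spelled out, the back-of-the-envelope arithmetic from the profiling above (the constants are the values mentioned in this comment, not configuration read from sentry-cli):

```rust
// 8 MiB chunks are batched into requests of up to 32 MiB each.
const MAX_REQUEST_MIB: usize = 32;
const CURL_COPY_FACTOR: usize = 2;   // curl ends up holding the request body twice
const CONCURRENT_UPLOADS: usize = 8; // parallel upload threads

fn main() {
    let per_request = MAX_REQUEST_MIB * CURL_COPY_FACTOR;    // 64 MiB
    let theoretical_peak = per_request * CONCURRENT_UPLOADS; // 512 MiB
    println!("per request: {per_request} MiB, theoretical peak: {theoretical_peak} MiB");
    // Observed usage (~900 MB for a ~500 MB file) is higher still, due to other overheads.
}
```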

@szokeasaurusrex (Member, Author)

During an offline discussion with @jan-auer and @mitsuhiko, we decided that we will not adjust the chunk and request sizes from their current settings, due to the high risk associated with changing them, and because the more robust solution is to simply stream the chunks, avoiding the need to ever load them into memory in the first place.

Although curl provides an API to stream data into a multipart form (which is what we currently use to send the chunks), there is no Rust binding for this functionality. Because of this, we have decided to migrate the chunk uploading API code from curl to reqwest, which does support streaming multipart forms.

We will divide this work into two stages:

  1. Move the current chunk uploading request logic from curl to reqwest, without changing how the upload is performed (i.e. we will still load the chunks entirely into memory, as we do today).
  2. Once we are using reqwest, refactor the code so that we stream the chunks rather than reading them entirely into memory (see the sketch below).

We may want to consider moving the rest of the code to reqwest long term. However, that would be a separate initiative; migrating anything other than chunk uploads to reqwest is out of scope here.
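
As a rough sketch of what stage 2 could look like with reqwest's async multipart API (the URL, part name, and file-per-chunk handling are placeholders rather than the actual sentry-cli request format; assumes reqwest with the `multipart` and `stream` features, plus tokio and tokio-util):

```rust
use reqwest::multipart::{Form, Part};
use reqwest::{Body, Client};
use tokio_util::io::ReaderStream;

async fn upload_chunk_streaming(
    client: &Client,
    url: &str,
    chunk_path: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    // Stream the chunk straight from disk instead of reading it into a Vec first.
    let file = tokio::fs::File::open(chunk_path).await?;
    let body = Body::wrap_stream(ReaderStream::new(file));

    // Build a multipart form whose file part is backed by the stream above.
    let part = Part::stream(body).file_name("chunk");
    let form = Form::new().part("file", part);

    client
        .post(url)
        .multipart(form)
        .send()
        .await?
        .error_for_status()?;
    Ok(())
}
```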

@szokeasaurusrex (Member, Author)

Or, perhaps an easier option (thanks @mitsuhiko for pointing this out): write our own logic to generate the multipart form, which should be trivial. Then we can stream it with curl.

We can still consider changing to reqwest, but this would now be a completely separate task.
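
For completeness, the hand-rolled multipart/form-data framing really is mostly string concatenation around the raw file bytes. A minimal sketch of the idea using only the standard library (the boundary, field name, and reader-chaining approach are illustrative; wiring the resulting reader into curl's read callback is omitted):

```rust
use std::fs::File;
use std::io::{Cursor, Read};

/// Wrap a file in multipart/form-data framing without reading it into memory.
/// Returns a reader that yields: preamble + file bytes + closing boundary.
fn multipart_reader(path: &str, boundary: &str) -> std::io::Result<impl Read> {
    let preamble = format!(
        "--{boundary}\r\n\
         Content-Disposition: form-data; name=\"file\"; filename=\"chunk\"\r\n\
         Content-Type: application/octet-stream\r\n\r\n"
    );
    let epilogue = format!("\r\n--{boundary}--\r\n");
    let file = File::open(path)?;
    // `Read::chain` lets the consumer (e.g. curl's read callback) pull the body
    // incrementally, so only small buffers are resident at any one time.
    Ok(Cursor::new(preamble.into_bytes())
        .chain(file)
        .chain(Cursor::new(epilogue.into_bytes())))
}
```

A real implementation would also need to compute the Content-Length (preamble + file size + epilogue) and set the Content-Type header with the chosen boundary.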

@xinsight

Is there any way to log the memory usage of uploading? I've spent a ridiculous amount of time trying to figure out why some chunks of my debug files are not being uploaded, and `--log-level trace` has not been helpful in diagnosing the issue. (It simply shows that there are 5 chunks to send and then, at the end, an error that 2 chunks are missing.) It would be useful to see logging related to memory usage. Is that something that could be added sooner, rather than waiting for this migration from curl to reqwest?

@xinsight

The memory usage doesn't seem to be related to the chunk uploading but, as @asottile-sentry guessed, to how sentry-cli loads the entire debug files into memory before processing.

To confirm this, set up some dummy data files:

dd if=/dev/zero of=.dist/10mb bs=1024 count=10240
dd if=/dev/zero of=.dist/100mb bs=1024 count=102400
dd if=/dev/zero of=.dist/200mb bs=1024 count=204800

Now run debug-files check with memory profiling:

/usr/bin/time -l sentry-cli debug-files check .dist/10mb 2>&1 | grep max
            17907712  maximum resident set size
/usr/bin/time -l sentry-cli debug-files check .dist/100mb 2>&1 | grep max
           112754688  maximum resident set size
/usr/bin/time -l sentry-cli debug-files check .dist/200mb 2>&1 | grep max
           217710592  maximum resident set size

As you can see, the memory usage grows with the file size. On my MacBook, processes have no memory limit, and sentry-cli has no problem uploading chunks. My guess is that there is a memory limit when running from a GitHub Action, so the limit is reached and this is not being caught or logged by sentry-cli (oh the irony...).

@szokeasaurusrex (Member, Author)

@xinsight thank you for writing in.

Regarding your first comment, we do not plan to add any logging of memory usage, since there are external tools you can use to monitor it.

Regarding the second comment, I see you are using the debug-files check command, not the debug-files upload command; the check command uses a completely different code path from what we use for uploading. Nevertheless, I did some investigating, and it appears that debug-files check memory-maps the debug files. My understanding is that the "maximum resident set size" reported by the time command can include any portion of the memory-mapped files that happens to be loaded into memory. However, this memory should not count towards a memory limit, because it is simply a cache of a file that is still on disk, and how much of the file is resident at once is OS- and system-dependent (memory-mapping is an OS abstraction).

I confirmed with another memory profiler (cargo instruments) that sentry-cli debug-files check does not allocate any memory for the file.
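
For anyone reproducing this, here is a minimal sketch of the memory-mapping pattern (using the memmap2 crate purely for illustration; not necessarily the crate or code path sentry-cli actually uses). Touching the mapped bytes grows the resident set as the OS faults pages in, but there is no corresponding heap allocation:

```rust
use memmap2::Mmap;
use std::fs::File;

fn checksum_mapped(path: &str) -> std::io::Result<u64> {
    let file = File::open(path)?;
    // Safety: the file must not be truncated or modified while it is mapped.
    let map = unsafe { Mmap::map(&file)? };
    // Iterating over the bytes pages them in lazily; RSS grows, heap does not.
    Ok(map.iter().map(|&b| b as u64).sum())
}
```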

@szokeasaurusrex (Member, Author)

That said, @xinsight, if you have been experiencing an error uploading debug files that you have been having trouble debugging, please feel free to open a bug report issue. If possible, include a reproduction when opening the new issue.
