Suggestion: Should preserve_finished_jobs really default to true? #560
Comments
Hey @stefanvermaas, sorry that happened and thanks for taking the time to write this up.

The current default is the right one. Not deleting jobs right after they're executed is more performant (as you don't have to do multiple deletes) and necessary to ensure recurring tasks aren't enqueued twice. Resque and Sidekiq's behaviours aren't comparable because they use Redis and implement the queue by popping a job from a Redis list (roughly), so the job is always deleted when it's picked up. GoodJob, another database-backed adapter and thus comparable, preserves finished jobs by default as well.

The right way to approach this, as I see it, is to periodically clean finished jobs. This is mentioned here:
So what I could see is having this recurring task added by default to the generated
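For readers following along, the periodic clean-up suggested above could look roughly like the fragment below. This is a sketch, not necessarily what the maintainers ship: it assumes recurring tasks are defined in `config/recurring.yml` and relies on `SolidQueue::Job.clear_finished_in_batches` as described in the Solid Queue README. The task name, the schedule, and the `sleep_between_batches` value are illustrative choices.

```yaml
# config/recurring.yml (assumed location of the recurring task definitions)
production:
  # Task name is illustrative; any unique key works.
  clear_solid_queue_finished_jobs:
    # Deletes finished jobs in batches, pausing briefly between batches
    # to keep load on the database low.
    command: "SolidQueue::Job.clear_finished_in_batches(sleep_between_batches: 0.3)"
    # Arbitrary schedule; pick whatever interval suits your job volume.
    schedule: every hour at minute 12
```

As far as I understand it, only jobs that finished longer ago than `clear_finished_jobs_after` (one day unless configured otherwise) are removed, so recently finished jobs stay available for inspection.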
I knew there was a good reason for this. This makes sense.
Truthfully, I was actually already aware of this. But I forgot, and then this happened. The reason is that I never thought about it twice because we're still using
This makes sense. I'll open a PR for it.
I am getting the above error when I try with the code in the PR. Did I miss something?
Ruby 3.4.3
Ahhh yes!
Thanks @rosa It works now.
Config:
We've just hit a tricky issue that I think many teams might run into, especially when adopting solid_queue in smaller projects without deep background job needs.

By default, SolidQueue preserves all finished jobs (source). That feels like a safe option at first glance, but if you're not aware of this setting, you might only discover it once it's too late.
In our case, we had a small Rails app using SolidQueue out of the box. It was deployed with Kamal, and everything worked perfectly for months. Until one day, the app became unreachable. No traffic, no response, and only a single error message from hours earlier: "could not connect to database."

The database server was up. The app server was up. But after digging in, we discovered that the entire server had run out of disk space.
The root cause? All those finished background jobs were quietly preserved. Since this app had processed hundreds of thousands of jobs over time, including all error notifications, which also went through the job system, things spiraled. Once the disk was full, new jobs couldn't run, including error reporting. We were blind to the failure.
To make matters worse, recovering the machine wasn't easy. With no space left, even SSH access became unreliable. Freeing up disk space in that state is non-trivial, especially if you don't have automated clean-up in place.
Yes, we should have monitored disk usage more closely. But we did clean up our own records — we just didn’t realize that finished jobs were being preserved automatically.
Proposal
Should SolidQueue.preserve_finished_jobs really default to true?
Most job libraries don’t retain finished jobs unless you explicitly ask for that. For example, Sidekiq and Resque both discard successful jobs by default. It’s easy to opt into persistence if you need auditing or observability, but it's safer to start with a clean slate.
For a lot of teams, especially those adopting SolidQueue for the first time, this default might feel invisible until it causes problems. Would it be worth reconsidering?