Add capability to discard duplicate jobs with concurrency configuration #523

Open
wants to merge 6 commits into
base: main
7 changes: 6 additions & 1 deletion app/models/solid_queue/job/concurrency_controls.rb
@@ -8,7 +8,7 @@ module ConcurrencyControls
included do
has_one :blocked_execution

-delegate :concurrency_limit, :concurrency_duration, to: :job_class
+delegate :concurrency_limit, :concurrency_at_limit, :concurrency_duration, to: :job_class

before_destroy :unblock_next_blocked_job, if: -> { concurrency_limited? && ready? }
end
@@ -34,8 +34,13 @@ def blocked?
end

private
def discard_concurrent?
concurrency_at_limit == :discard
end

def acquire_concurrency_lock
return true unless concurrency_limited?
return false if Semaphore.at_limit?(self) && discard_concurrent?

Semaphore.wait(self)
end
2 changes: 2 additions & 0 deletions app/models/solid_queue/job/executable.rb
@@ -66,6 +66,8 @@ def prepare_for_execution

def dispatch
if acquire_concurrency_lock then ready
elsif discard_concurrent?
discard
Member

I don't think this is the way 🤔 We're in the middle of a transaction here, and the job hasn't even been committed to the DB. It makes no sense to delete a record in the same transaction you're creating it. It'd make sense to roll that transaction back instead.

Author

Perhaps I don't fully understand how we're getting into this code path, but from my investigation, it looks as though there isn't an open transaction at this time.

I tried adding the following and running the test suite, and didn't hit any open transactions.

raise "open transactions" if ApplicationRecord.connection.open_transactions.positive?

I still believe we'd want to discard here. Let me know your thoughts.
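
To make the two options in this thread concrete, here is a minimal plain-Ruby sketch of the difference between rolling back an enqueue and deleting a record that was just created. `ToyRollback`, `ToyJobTable` and `toy_enqueue` are hypothetical stand-ins invented for illustration; they are not ActiveRecord or Solid Queue classes, and the sketch assumes the enqueue does run inside a transaction, which is exactly the point under discussion.

```ruby
# Toy stand-ins for illustration only; NOT ActiveRecord or Solid Queue code.
class ToyRollback < StandardError; end

class ToyJobTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Runs the block. If it raises ToyRollback, the rows are restored to what
  # they were before the block ran and false is returned; otherwise true.
  def transaction
    snapshot = @rows.dup
    yield
    true
  rescue ToyRollback
    @rows = snapshot
    false
  end
end

# Hypothetical enqueue helper. `lock_acquired` stands in for the result of
# the concurrency check; when it is false the insert is rolled back instead
# of the row being written and then deleted again.
def toy_enqueue(table, name, lock_acquired:)
  table.transaction do
    table.rows << name
    raise ToyRollback unless lock_acquired
  end
end

table = ToyJobTable.new
results = [
  toy_enqueue(table, "A", lock_acquired: true),
  toy_enqueue(table, "B", lock_acquired: false)
]

puts table.rows.inspect
puts results.inspect
```

In this toy model the second job never becomes visible in the table, and the caller learns from the return value that nothing was enqueued.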

else
block
end
12 changes: 12 additions & 0 deletions app/models/solid_queue/semaphore.rb
@@ -10,6 +10,10 @@ def wait(job)
Proxy.new(job).wait
end

def at_limit?(job)
Proxy.new(job).at_limit?
end

def signal(job)
Proxy.new(job).signal
end
@@ -39,6 +43,14 @@ def initialize(job)
@job = job
end

def at_limit?
if semaphore = Semaphore.find_by(key: key)
semaphore.value.zero?
Member

This is vulnerable to race conditions, which is the reason we don't check it this way when blocking jobs. If two concurrent jobs are claiming the semaphore, both of them will see it open, and neither will be discarded. Then both will run together, because we won't block them either.

Author

You raise an excellent point. Do you have a suggestion for how we might avoid this race condition?

Are you thinking pessimistic locking might help?

Member

Yes, I was thinking that perhaps we could rely on the same check we're already doing to block the job, but instead of blocking the job, we'd roll back the transaction 🤔 What I'm not sure about is whether this should actually raise some exception to indicate the job hasn't been enqueued. Maybe that's not necessary, but at least it should set the successfully_enqueued attribute in the active job to false.
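
The check-then-act hazard described in this thread can be sketched in plain Ruby. `ToySemaphore`, `ToyJob` and `enqueue_or_discard` are hypothetical names invented for this illustration; the real semaphore is a database row, and this is not Solid Queue's implementation. The first part interleaves two claimants by hand so that both read the counter before either decrements it. The second part folds the check and the decrement into one atomic step and records the outcome in a `successfully_enqueued` field, as the comment above suggests.

```ruby
# Hypothetical in-memory stand-in for a semaphore row; illustration only.
class ToySemaphore
  attr_reader :value

  def initialize(limit)
    @value = limit
    @mutex = Mutex.new
  end

  # Read-only check, comparable in spirit to the at_limit? in the diff.
  def at_limit?
    @value.zero?
  end

  # Unconditional decrement; only safe if nobody acted since the check.
  def decrement!
    @value -= 1
  end

  # Check and decrement as a single atomic step. Returns true only for a
  # claimant that actually obtained a slot.
  def try_acquire
    @mutex.synchronize do
      return false if @value.zero?
      @value -= 1
      true
    end
  end
end

ToyJob = Struct.new(:name, :successfully_enqueued)

# Hypothetical helper: discard by flagging the job instead of blocking it.
def enqueue_or_discard(semaphore, job)
  job.successfully_enqueued = semaphore.try_acquire
  job
end

# Racy variant: both claimants check before either one decrements.
racy = ToySemaphore.new(1)
saw_open = [!racy.at_limit?, !racy.at_limit?]
saw_open.each { |open| racy.decrement! if open }
racy_running = saw_open.count(true)

# Atomic variant: two threads compete for the single slot.
atomic = ToySemaphore.new(1)
jobs = [ToyJob.new("A"), ToyJob.new("B")]
jobs.map { |job| Thread.new { enqueue_or_discard(atomic, job) } }.each(&:join)
atomic_enqueued = jobs.count(&:successfully_enqueued)

puts "racy check let #{racy_running} jobs through"
puts "atomic claim let #{atomic_enqueued} job through"
```

In a database-backed version the atomic step would typically be a conditional update or a row lock rather than a `Mutex`; which of those fits here is left open in the discussion.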

else
false
end
end

def wait
if semaphore = Semaphore.find_by(key: key)
semaphore.value > 0 && attempt_decrement
4 changes: 3 additions & 1 deletion lib/active_job/concurrency_controls.rb
@@ -11,15 +11,17 @@ module ConcurrencyControls
class_attribute :concurrency_group, default: DEFAULT_CONCURRENCY_GROUP, instance_accessor: false

class_attribute :concurrency_limit
class_attribute :concurrency_at_limit
class_attribute :concurrency_duration, default: SolidQueue.default_concurrency_control_period
end

class_methods do
-def limits_concurrency(key:, to: 1, group: DEFAULT_CONCURRENCY_GROUP, duration: SolidQueue.default_concurrency_control_period)
+def limits_concurrency(key:, to: 1, group: DEFAULT_CONCURRENCY_GROUP, duration: SolidQueue.default_concurrency_control_period, at_limit: :block)
self.concurrency_key = key
self.concurrency_limit = to
self.concurrency_group = group
self.concurrency_duration = duration
self.concurrency_at_limit = at_limit
end
end

50 changes: 48 additions & 2 deletions test/models/solid_queue/job_test.rb
@@ -10,6 +10,14 @@ def perform(job_result)
end
end

class DiscardedNonOverlappingJob < NonOverlappingJob
limits_concurrency key: ->(job_result, **) { job_result }, at_limit: :discard
end

class DiscardedOverlappingJob < NonOverlappingJob
limits_concurrency to: 2, key: ->(job_result, **) { job_result }, at_limit: :discard
end

class NonOverlappingGroupedJob1 < NonOverlappingJob
limits_concurrency key: ->(job_result, **) { job_result }, group: "MyGroup"
end
@@ -98,6 +106,40 @@ class NonOverlappingGroupedJob2 < NonOverlappingJob
assert_equal active_job.concurrency_key, job.concurrency_key
end

test "enqueue jobs with discarding concurrency controls" do
assert_ready do
active_job = DiscardedNonOverlappingJob.perform_later(@result, name: "A")
assert_equal 1, active_job.concurrency_limit
assert_equal "SolidQueue::JobTest::DiscardedNonOverlappingJob/JobResult/#{@result.id}", active_job.concurrency_key
end

assert_discarded do
active_job = DiscardedNonOverlappingJob.perform_later(@result, name: "A")
assert_equal 1, active_job.concurrency_limit
assert_equal "SolidQueue::JobTest::DiscardedNonOverlappingJob/JobResult/#{@result.id}", active_job.concurrency_key
end
end

test "enqueue jobs with discarding concurrency controls when below limit" do
assert_ready do
active_job = DiscardedOverlappingJob.perform_later(@result, name: "A")
assert_equal 2, active_job.concurrency_limit
assert_equal "SolidQueue::JobTest::DiscardedOverlappingJob/JobResult/#{@result.id}", active_job.concurrency_key
end

assert_ready do
active_job = DiscardedOverlappingJob.perform_later(@result, name: "A")
assert_equal 2, active_job.concurrency_limit
assert_equal "SolidQueue::JobTest::DiscardedOverlappingJob/JobResult/#{@result.id}", active_job.concurrency_key
end

assert_discarded do
active_job = DiscardedOverlappingJob.perform_later(@result, name: "A")
assert_equal 2, active_job.concurrency_limit
assert_equal "SolidQueue::JobTest::DiscardedOverlappingJob/JobResult/#{@result.id}", active_job.concurrency_key
end
end

test "enqueue jobs with concurrency controls in the same concurrency group" do
assert_ready do
active_job = NonOverlappingGroupedJob1.perform_later(@result, name: "A")
@@ -289,8 +331,12 @@ def assert_blocked(&block)
assert SolidQueue::Job.last.blocked?
end

-def assert_job_counts(ready: 0, scheduled: 0, blocked: 0, &block)
-assert_difference -> { SolidQueue::Job.count }, +(ready + scheduled + blocked) do
+def assert_discarded(&block)
+assert_job_counts(discarded: 1, &block)
+end
+
+def assert_job_counts(ready: 0, scheduled: 0, blocked: 0, discarded: 0, &block)
+assert_difference -> { SolidQueue::Job.count }, +(ready + scheduled + blocked + discarded) do
assert_difference -> { SolidQueue::ReadyExecution.count }, +ready do
assert_difference -> { SolidQueue::ScheduledExecution.count }, +scheduled do
assert_difference -> { SolidQueue::BlockedExecution.count }, +blocked, &block