Avoid deadlock when creating ready execution #229

Closed

@jk-es335 jk-es335 commented May 21, 2024

I hit a deadlock when running multiple workers/dispatchers with Docker Compose.

  • ruby 3.2.4
  • mysql 8.0.31
  • solid_queue 0.3.1

I found @rosa's PR and fixed this using a similar approach.
I don't think it completely eliminates the deadlock, but it should mitigate it.

The deadlock error is as follows:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2024-05-21 01:15:11 281472305999808
*** (1) TRANSACTION:
TRANSACTION 5223, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 3 row lock(s), undo log entries 2
MySQL thread id 172, OS thread handle 281471652687808, query id 11099 192.168.0.5 root update
INSERT INTO `solid_queue_ready_executions` (`job_id`, `queue_name`, `priority`, `created_at`) VALUES (469, 'default', 0, '2024-05-21 01:15:11.201125')

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 4 n bits 264 index PRIMARY of table `handson`.`solid_queue_ready_executions` trx id 5223 lock_mode X locks rec but not gap
Record lock, heap no 144 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 8; hex 80000000000001d5; asc         ;;
 1: len 6; hex 000000001467; asc      g;;
 2: len 7; hex 810000010e0121; asc       !;;
 3: len 8; hex 80000000000001d5; asc         ;;
 4: len 7; hex 64656661756c74; asc default;;
 5: len 4; hex 80000000; asc     ;;
 6: len 8; hex 99b36a13cb0311a5; asc   j     ;;


*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 6 n bits 264 index index_solid_queue_poll_all of table `handson`.`solid_queue_ready_executions` trx id 5223 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;


*** (2) TRANSACTION:
TRANSACTION 5227, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 10 lock struct(s), heap size 1128, 23 row lock(s), undo log entries 10
MySQL thread id 177, OS thread handle 281471649517504, query id 11103 192.168.0.4 root updating
DELETE FROM `solid_queue_ready_executions` WHERE `solid_queue_ready_executions`.`job_id` IN (464, 465, 466, 467, 468)

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 6 n bits 264 index index_solid_queue_poll_all of table `handson`.`solid_queue_ready_executions` trx id 5227 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

Record lock, heap no 39 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000000; asc     ;;
 1: len 8; hex 80000000000001d1; asc         ;;
 2: len 8; hex 80000000000001d1; asc         ;;

Record lock, heap no 141 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000000; asc     ;;
 1: len 8; hex 80000000000001d0; asc         ;;
 2: len 8; hex 80000000000001d0; asc         ;;

Record lock, heap no 143 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000000; asc     ;;
 1: len 8; hex 80000000000001d4; asc         ;;
 2: len 8; hex 80000000000001d4; asc         ;;

Record lock, heap no 145 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000000; asc     ;;
 1: len 8; hex 80000000000001d2; asc         ;;
 2: len 8; hex 80000000000001d2; asc         ;;

Record lock, heap no 146 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 4; hex 80000000; asc     ;;
 1: len 8; hex 80000000000001d3; asc         ;;
 2: len 8; hex 80000000000001d3; asc         ;;


*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 4 n bits 264 index PRIMARY of table `handson`.`solid_queue_ready_executions` trx id 5227 lock_mode X waiting
Record lock, heap no 144 PHYSICAL RECORD: n_fields 7; compact format; info bits 0
 0: len 8; hex 80000000000001d5; asc         ;;
 1: len 6; hex 000000001467; asc      g;;
 2: len 7; hex 810000010e0121; asc       !;;
 3: len 8; hex 80000000000001d5; asc         ;;
 4: len 7; hex 64656661756c74; asc default;;
 5: len 4; hex 80000000; asc     ;;
 6: len 8; hex 99b36a13cb0311a5; asc   j     ;;
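
To make the dump above easier to read, here is a minimal sketch for decoding its hex fields. It assumes InnoDB's storage format for signed integers (big-endian with the sign bit flipped) and treats string fields as raw bytes. The helper names `decode_innodb_int` and `decode_innodb_string` are hypothetical and not part of any library.

```ruby
# Decode a signed integer field from an InnoDB record dump.
# Assumption: the value is stored big-endian with the sign bit flipped,
# so the stored number is the real value plus 2**(bits - 1).
def decode_innodb_int(hex)
  bits = hex.length * 4
  hex.to_i(16) - (1 << (bits - 1))
end

# Decode a string field: the hex digits are the raw bytes of the value.
def decode_innodb_string(hex)
  [hex].pack("H*")
end

puts decode_innodb_int("80000000000001d5")    # → 469
puts decode_innodb_int("80000000")            # → 0
puts decode_innodb_string("64656661756c74")   # → default
puts decode_innodb_string("73757072656d756d") # → supremum
```

Read this way, the PRIMARY record both transactions contend for carries the value 469, the `job_id` from transaction 1's INSERT, while the `index_solid_queue_poll_all` records held by transaction 2 decode to 464 through 468, the ids in its DELETE.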

Member

rosa commented Jun 7, 2024

Thanks @jk-es335, and sorry for the delay! Looking into this one now.

@@ -37,7 +37,7 @@ def lock_candidates(job_ids, process_id)
           return [] if job_ids.none?

           SolidQueue::ClaimedExecution.claiming(job_ids, process_id) do |claimed|
-            where(job_id: claimed.pluck(:job_id)).delete_all
+            where(id: where(job_id: claimed.pluck(:job_id)).pluck(:id)).delete_all
Member

The only problem with this one is that it introduces another query in the hot path... it's probably ok, but I think we could save it because we've already done a query to ready_executions before, where we select only the job_id. We could select id, job_id, and then filter in memory for the ones that were actually claimed. Let me see how it'd look.

Member

This is what I had in mind:

diff --git a/app/models/solid_queue/claimed_execution.rb b/app/models/solid_queue/claimed_execution.rb
index f2ed58d..c67c62a 100644
--- a/app/models/solid_queue/claimed_execution.rb
+++ b/app/models/solid_queue/claimed_execution.rb
@@ -15,7 +15,7 @@ class SolidQueue::ClaimedExecution < SolidQueue::Execution

       insert_all!(job_data)
       where(job_id: job_ids, process_id: process_id).load.tap do |claimed|
-        block.call(claimed)
+        block.call(claimed.map(&:job_id))
       end
     end

diff --git a/app/models/solid_queue/ready_execution.rb b/app/models/solid_queue/ready_execution.rb
index 8eeaddc..2c68cf9 100644
--- a/app/models/solid_queue/ready_execution.rb
+++ b/app/models/solid_queue/ready_execution.rb
@@ -24,20 +24,21 @@ module SolidQueue
           return [] if limit <= 0

           transaction do
-            job_ids = select_candidates(queue_relation, limit)
-            lock_candidates(job_ids, process_id)
+            candidates = select_candidates(queue_relation, limit)
+            lock_candidates(candidates, process_id)
           end
         end

         def select_candidates(queue_relation, limit)
-          queue_relation.ordered.limit(limit).non_blocking_lock.pluck(:job_id)
+          queue_relation.ordered.limit(limit).non_blocking_lock.select(:id, :job_id)
         end

-        def lock_candidates(job_ids, process_id)
-          return [] if job_ids.none?
+        def lock_candidates(executions, process_id)
+          return [] if executions.none?

-          SolidQueue::ClaimedExecution.claiming(job_ids, process_id) do |claimed|
-            where(job_id: claimed.pluck(:job_id)).delete_all
+          SolidQueue::ClaimedExecution.claiming(executions.map(&:job_id), process_id) do |claimed_job_ids|
+            ids_to_delete = executions.index_by(&:job_id).values_at(*claimed_job_ids).map(&:id)
+            where(id: ids_to_delete).delete_all
           end
         end
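
The in-memory filtering step in `lock_candidates` can be tried in isolation. This is a minimal plain-Ruby sketch, not the library's code: `Candidate` is a hypothetical stand-in for a ready execution row, the ids are made up, and `to_h` replaces ActiveSupport's `index_by` so the snippet runs without Rails.

```ruby
# Hypothetical stand-in for a ready execution row; only the two
# selected columns (id and job_id) matter for this step.
Candidate = Struct.new(:id, :job_id)

# Rows a worker selected as candidates (illustrative values).
candidates = [
  Candidate.new(101, 464),
  Candidate.new(102, 465),
  Candidate.new(103, 466)
]

# Assumption for the example: only two of the jobs ended up claimed.
claimed_job_ids = [464, 466]

# Plain-Ruby equivalent of `index_by(&:job_id)`: a hash keyed by job_id.
by_job_id = candidates.to_h { |c| [c.job_id, c] }

# Look up the claimed rows and collect their primary keys, without
# issuing another query.
ids_to_delete = by_job_id.values_at(*claimed_job_ids).map(&:id)

puts ids_to_delete.inspect # → [101, 103]
```

Note that `values_at` returns `nil` for a key that is missing, so this relies on the claimed job ids being a subset of the candidates, which holds in the diff because only candidate job ids are passed to `claiming`.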

Author

Hi @rosa, thank you for your response.

The solution works for me, and it's more efficient.

Will the fix be included in the next release?

Member

Yes! Let me add that one and get a new version out.

Author

Thanks!

rosa added a commit that referenced this pull request Jun 11, 2024
This is another take on #229, that
tries to solve a deadlock like this:

```
*** (1) TRANSACTION:
TRANSACTION 5223, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1128, 3 row lock(s), undo log entries 2
MySQL thread id 172, OS thread handle 281471652687808, query id 11099 192.168.0.5 root update
INSERT INTO `solid_queue_ready_executions` (`job_id`, `queue_name`, `priority`, `created_at`) VALUES (469, 'default', 0, '2024-05-21 01:15:11.201125')

*** (1) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 4 n bits 264 index PRIMARY of table `handson`.`solid_queue_ready_executions` trx id 5223 lock_mode X locks rec but not gap
Record lock, heap no 144 PHYSICAL RECORD: n_fields 7; compact format; info bits 0

...

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 6 n bits 264 index index_solid_queue_poll_all of table `handson`.`solid_queue_ready_executions` trx id 5223 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
 0: len 8; hex 73757072656d756d; asc supremum;;

...

*** (2) TRANSACTION:
TRANSACTION 5227, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 10 lock struct(s), heap size 1128, 23 row lock(s), undo log entries 10
MySQL thread id 177, OS thread handle 281471649517504, query id 11103 192.168.0.4 root updating
DELETE FROM `solid_queue_ready_executions` WHERE `solid_queue_ready_executions`.`job_id` IN (464, 465, 466, 467, 468)

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 12 page no 6 n bits 264 index index_solid_queue_poll_all of table `handson`.`solid_queue_ready_executions` trx id 5227 lock_mode X
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0

...

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 12 page no 4 n bits 264 index PRIMARY of table `handson`.`solid_queue_ready_executions` trx id 5227 lock_mode X waiting
Record lock, heap no 144 PHYSICAL RECORD: n_fields 7; compact format; info bits 0

```
Member

rosa commented Jun 11, 2024

Done in #240, so closing this one. Thank you so much, @jk-es335!

@rosa rosa closed this Jun 11, 2024
thomasnynas12 pushed a commit to thomasnynas12/solid-queue-e that referenced this pull request Mar 13, 2025
KingStar365 added a commit to KingStar365/solid_queue that referenced this pull request Mar 25, 2025