Kombu Resource Pool Corruption: acquire leads to AttributeError: 'tuple' object has no attribute '_debug' followed by TypeError: unhashable type: 'list' during release cleanup. #2291

Open
ykothari-hirefly opened this issue Apr 29, 2025 · 1 comment

ykothari-hirefly commented Apr 29, 2025

We are hitting an intermittent error in Kombu's resource pool when it is used through Celery 5.2.2 (with Kombu 5.5.3 installed) on Python 3.9. The error occurs during Celery task submission (apply_async) triggered from a Flask after_request handler.

Traceback:

Apr 29 14:32:19Z sourcerer-ai-staging app/web.1 Traceback (most recent call last):
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/resource.py", line 90, in acquire
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     R = self.prepare(R)  # R is unexpectedly a tuple here
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/connection.py", line 1099, in prepare
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     resource._debug('acquired') # This fails as R (resource) is a tuple
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1 AttributeError: 'tuple' object has no attribute '_debug'

# --- The above exception triggers the 'except BaseException:' block in resource.py:acquire ---

Apr 29 14:32:19Z sourcerer-ai-staging app/web.1 During handling of the above exception, another exception occurred:
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1 Traceback (most recent call last):
# --- (Flask/Gunicorn/Sentry/Celery stack frames leading to the final pool interaction) ---
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/app/app_setup.py", line 336, in log_request
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     amplitude_lro = AmplitudeLoggerLRO.create_and_start(events)
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/app/db/db_models/LRO/amplitude_logger_lro.py", line 91, in create_and_start
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     lro.execute()
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/app/db/db_models/LRO/base_lro.py", line 340, in execute
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     run_lro_task.apply_async(
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/celery/app/task.py", line 575, in apply_async
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     return app.send_task(
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/celery/app/base.py", line 784, in send_task
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     with self.producer_or_acquire(producer) as P: # This eventually calls Resource.acquire
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/celery/utils/objects.py", line 84, in __enter__
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     context = self._context = self.fallback(
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/resource.py", line 90, in acquire
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     R = self.prepare(R) # <<< Initial AttributeError happens here
# --- (Stack frames showing the exception handling during acquire) ---
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/pools.py", line 63, in prepare
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     conn = self._acquire_connection()
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/pools.py", line 38, in _acquire_connection
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     return self.connections.acquire(block=True) # This is the acquire call that fails
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/resource.py", line 97, in acquire
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     self.release(R) # R is the bad tuple; this call is made from the except block
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1   File "/app/.heroku/python/lib/python3.9/site-packages/kombu/resource.py", line 138, in release
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1     self._dirty.discard(resource) # Fails as resource (the tuple R) contains a list
Apr 29 14:32:19Z sourcerer-ai-staging app/web.1 TypeError: unhashable type: 'list'

Analysis:

The core issue appears to be pool state corruption: the resource queue (self._resource) somehow contains a tuple (which includes a list) instead of a valid, hashable resource object.
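The second error in the traceback can be reproduced in isolation. The snippet below is a minimal sketch in plain Python, not Kombu code: `bad_resource` is a hypothetical stand-in for whatever tuple ended up in the queue, since the actual contents are not known from the traceback.

```python
# A set must hash an object before it can look it up, even for discard().
# A tuple is only hashable if every element inside it is hashable.
dirty = set()

# Hypothetical stand-in for the corrupted queue entry: a tuple holding a list.
bad_resource = ("some error", ["some", "details"])

try:
    dirty.discard(bad_resource)
    message = None
except TypeError as exc:
    # The list inside the tuple makes the whole tuple unhashable.
    message = str(exc)

print(message)
```

This matches the observation that `self._dirty.discard(resource)` fails before the pool can do anything else with the bad entry.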

The exact mechanism for this corruption isn't evident solely from resource.py, but it likely occurs during error handling within a previous operation involving the pool (perhaps connection establishment or a prior release cycle), where exception information might be incorrectly treated as a resource and put back into the queue.
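To illustrate how one bad queue entry produces both exceptions in sequence, here is a toy model of the acquire/release flow. It is a simplified sketch written for this report, assuming only the control flow visible in the traceback (prepare, then release inside an `except BaseException:` block). `ToyPool` and the seeded tuple are hypothetical and do not reproduce Kombu's actual implementation or the real cause of the corruption.

```python
from queue import LifoQueue


class ToyPool:
    """Simplified stand-in for a resource pool; not Kombu's implementation."""

    def __init__(self):
        self._resource = LifoQueue()
        self._dirty = set()

    def prepare(self, resource):
        # Mirrors the failing call in the traceback: a tuple has no _debug.
        resource._debug("acquired")
        return resource

    def release(self, resource):
        # Hashing a tuple that contains a list raises TypeError.
        self._dirty.discard(resource)
        self._resource.put_nowait(resource)

    def acquire(self):
        R = self._resource.get_nowait()
        try:
            R = self.prepare(R)
        except BaseException:
            # Cleanup runs on the same bad object and fails a second time.
            self.release(R)
            raise
        self._dirty.add(R)
        return R


pool = ToyPool()
# Assumption: something earlier queued exception-like data instead of a resource.
pool._resource.put_nowait(("ConnectionError", ["arg"]))

errors = []
try:
    pool.acquire()
except TypeError as exc:
    errors.append(type(exc).__name__)              # raised during cleanup
    errors.append(type(exc.__context__).__name__)  # the original failure

print(errors)
```

In this model the `TypeError` from cleanup replaces the original `AttributeError` as the propagating exception, which is the same "During handling of the above exception, another exception occurred" chaining seen in the log.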

Impact:

This prevents the application from acquiring necessary resources (like broker connections) and leads to unhandled exceptions, disrupting task processing.

auvipy (Member) commented May 12, 2025

Can you please try the latest release of Kombu with the latest release of Celery and report back?
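When reporting back after an upgrade, it helps to state the exact versions that are installed in the affected environment. A small sketch using only the standard library follows; the helper names `installed_version` and `report` are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of a distribution, or a marker if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


def report(packages=("celery", "kombu")):
    """Map each package name to its installed version string."""
    return {name: installed_version(name) for name in packages}


for name, ver in report().items():
    print(f"{name}: {ver}")
```

Running this inside the same dyno or virtualenv as the web process confirms which Celery and Kombu versions are actually loaded, independent of what the requirements file requests.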
