No GPUs found #157

Open
rkfg opened this issue Apr 17, 2025 · 4 comments
rkfg commented Apr 17, 2025

Commit 4b0824d breaks compilation for my 3090 Ti. I'm building in Docker with nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04 as the base image. The relevant Dockerfile lines are:

ENV TORCH_CUDA_ARCH_LIST=8.6
RUN --mount=type=cache,target=/home/sd/.cache . /venv/bin/activate && python -m pip install --no-build-isolation git+https://github.com/thu-ml/SageAttention.git@4b0824de43a76027cd615b39d3b4baa724addb7a

which results in the following error:

0.363 Collecting git+https://github.com/thu-ml/SageAttention.git@4b0824de43a76027cd615b39d3b4baa724addb7a                                                                             
0.363   Cloning https://github.com/thu-ml/SageAttention.git (to revision 4b0824de43a76027cd615b39d3b4baa724addb7a) to /tmp/pip-req-build-zemg44h2                                     
0.364   Running command git clone --filter=blob:none --quiet https://github.com/thu-ml/SageAttention.git /tmp/pip-req-build-zemg44h2                                                  
10.17   Running command git rev-parse -q --verify 'sha^4b0824de43a76027cd615b39d3b4baa724addb7a'
10.17   Running command git fetch -q https://github.com/thu-ml/SageAttention.git 4b0824de43a76027cd615b39d3b4baa724addb7a
10.78   Running command git checkout -q 4b0824de43a76027cd615b39d3b4baa724addb7a
11.64   Resolved https://github.com/thu-ml/SageAttention.git to commit 4b0824de43a76027cd615b39d3b4baa724addb7a
11.64   Preparing metadata (pyproject.toml): started
12.75   Preparing metadata (pyproject.toml): finished with status 'error'
12.75   error: subprocess-exited-with-error
12.75   
12.75   × Preparing metadata (pyproject.toml) did not run successfully.
12.75   │ exit code: 1
12.75   ╰─> [18 lines of output]
12.75       No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
12.75       Traceback (most recent call last):
12.75         File "/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
12.75           main()
12.75         File "/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
12.75           json_out['return_val'] = hook(**hook_input['kwargs'])
12.75                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12.75         File "/venv/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
12.75           return hook(metadata_directory, config_settings)
12.75                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12.75         File "/venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 377, in prepare_metadata_for_build_wheel
12.75           self.run_setup()
12.75         File "/venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 522, in run_setup
12.75           super().run_setup(setup_script=setup_script)
12.75         File "/venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 320, in run_setup
12.75           exec(code, locals())
12.75         File "<string>", line 81, in <module>
12.75       RuntimeError: No GPUs found. Please specify the target GPU architectures or build on a machine with GPUs.
12.75       [end of output]
12.75   
12.75   note: This error originates from a subprocess, and is likely not a problem with pip.
12.75 error: metadata-generation-failed
12.75 
12.75 × Encountered error while generating package metadata.
12.75 ╰─> See above for output.
12.75 
12.75 note: This is an issue with the package mentioned above, not pip.
12.75 hint: See above for details.

Using the parent commit 2aecfa8 succeeds. Without ENV TORCH_CUDA_ARCH_LIST=8.6, even 2aecfa8 fails with the same "No GPUs found" error, so that variable is the only way I can build SageAttention at all.

The torch version doesn't seem to matter: I remember hitting this with 2.6.0, and the behavior is the same with the 2.8.0 nightly I use now.

@chappie90

Hi, I'm getting the same issue trying to build from a Dockerfile.

Forgive the LLM-noob move, but this isn't my area of expertise, so here is what I got from Gemini:

Okay, this PR diff provides the exact reason for the "No GPUs found" issue, even when TORCH_CUDA_ARCH_LIST is set!

Analysis of the change:
1. Removal of get_torch_arch_list(): the most critical change is the complete removal of the function responsible for reading the TORCH_CUDA_ARCH_LIST environment variable.
2. Unconditional GPU detection: the code now always tries to detect GPUs via torch.cuda.device_count() and torch.cuda.get_device_capability(i).
3. Changed error logic: RuntimeError: No GPUs found... is raised whenever this detection loop finds no suitable GPUs (the count is 0, or every detected GPU is below SM 8.0).

Why it fails in your Docker build:
- A standard docker build does not have access to the host's GPU, so torch.cuda.device_count() inside the RUN command returns 0.
- The GPU detection loop therefore never runs and compute_capabilities stays empty.
- The check if not compute_capabilities: is true, and the RuntimeError is raised.

Conclusion: after this PR, setup.py ignores the TORCH_CUDA_ARCH_LIST environment variable entirely and relies solely on GPUs detected at build time. This makes it impossible to build in a standard GPU-less Docker environment by simply setting the environment variable.
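For illustration, the old env-first behavior that the analysis describes can be sketched roughly like this (function name and details are illustrative, not the actual SageAttention setup.py):

```python
import os


def get_arch_list_from_env():
    """Sketch of the removed behavior: honor TORCH_CUDA_ARCH_LIST if set,
    and only fall back to probing physical GPUs when it is empty."""
    env = os.environ.get("TORCH_CUDA_ARCH_LIST", "")
    if env:
        # e.g. "8.6" or "8.6;9.0" -> ["8.6", "9.0"]; no GPU needed at build time
        return [a.strip() for a in env.replace(";", " ").split()]

    # Fallback: probe GPUs visible in this environment (fails in a
    # standard GPU-less docker build, where device_count() is 0).
    import torch
    caps = set()
    for i in range(torch.cuda.device_count()):
        major, minor = torch.cuda.get_device_capability(i)
        caps.add(f"{major}.{minor}")
    if not caps:
        raise RuntimeError(
            "No GPUs found. Please specify the target GPU architectures "
            "or build on a machine with GPUs.")
    return sorted(caps)
```

With the env-first branch in place, a GPU-less build succeeds as long as the variable is set; the current code skips straight to the probing fallback.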

This Dockerfile works for me if I checkout that commit mentioned above:

# Base image
FROM runpod/base:0.6.2-cuda12.6.2

ENV HF_HUB_ENABLE_HF_TRANSFER=0

# Install git and build essentials (needed for compiling SageAttention)
RUN apt-get update && apt-get install -y git build-essential && \
    rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt /requirements.txt
RUN python3.11 -m pip install --upgrade pip && \
    python3.11 -m pip install --upgrade -r /requirements.txt --no-cache-dir && \
    rm /requirements.txt

# Build and Install SageAttention
# Clone the repository
# Pinned to commit 2aecfa89c777ec46c4eaaab66082f188a1e00ae4 because the commit
# after it appears to have introduced the 'No GPUs found' issue
# https://github.com/thu-ml/SageAttention/issues/157
RUN git clone https://github.com/thu-ml/SageAttention.git /sageattention && \
    cd /sageattention && \
    git checkout 2aecfa89c777ec46c4eaaab66082f188a1e00ae4 && \
    git rev-parse HEAD

# Specify Target GPU Architectures
# Examples: "8.0" Ampere (A100), "8.6" Ampere (A40, A6000, RTX 30xx), "9.0" Hopper (H100), "8.6 9.0" (both)
ENV TORCH_CUDA_ARCH_LIST="8.6"

# Install SageAttention from source
# Clean up source code after installation
RUN cd /sageattention && \
    python3.11 setup.py install && \
    cd / && \
    rm -rf /sageattention

# Add src files
RUN mkdir -p /src
COPY src/ /src/

ENV RUNPOD_DEBUG_LEVEL=INFO

CMD python3.11 -u /src/handler.py

rkfg commented Apr 18, 2025

Looks like there are some ways to make the GPU accessible during the build, but it's quite a pain considering this is the only library that needs it: https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime

Especially since the working logic is right there in the history and just needs to be brought back. I'd be totally okay if the variable gets renamed etc., if it has to be for whatever reason. Just let me build an image for my GPU, I swear I actually have it plugged in and working! 😂
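For reference, the workaround in that Stack Overflow thread boils down to making nvidia the default Docker runtime so that RUN steps during docker build can see the host GPU. A sketch, assuming the NVIDIA Container Toolkit is already installed:

```shell
# Make the NVIDIA runtime the daemon-wide default so `docker build`
# RUN steps get GPU access (requires nvidia-container-toolkit).
sudo tee /etc/docker/daemon.json >/dev/null <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
sudo systemctl restart docker
```

This changes the default for every container on the host, which is exactly why it feels heavy-handed for a single library.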

@VitoChenLY

I'm hitting the same problem. How can I make this work during docker build?

rkfg commented Jun 6, 2025

I resorted to doing a one-time install in the entrypoint script like this:

#!/bin/sh
# One-time install on first container start; the container must run with GPU access.
if [ ! -f /home/sd/deps_installed ]
then
    # Target architectures: 8.9 = Ada (RTX 40xx), 12.0 = Blackwell (RTX 5090).
    export TORCH_CUDA_ARCH_LIST='8.9;12.0'
    pip install git+https://github.com/thu-ml/SageAttention.git && touch /home/sd/deps_installed
fi
cd /app
python main.py "$@"

The actual script I use installs more dependencies; this is the minimal version. It simply checks for a marker file and, if it's absent, installs SageAttention and then creates the file. The container of course needs GPU access for this to work, but that's expected anyway. Make sure TORCH_CUDA_ARCH_LIST lists the correct architectures: for me it didn't work without specifying them explicitly. I now have a 5090, and maybe not every arch is compiled by default.
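Since the install now happens at run time rather than build time, the container has to be started with GPU access, e.g. (image and volume names are illustrative):

```shell
# --gpus all exposes the host GPUs to the entrypoint script,
# so the first-start pip install can detect them and compile.
docker run --gpus all -v sd-home:/home/sd my-sd-image
```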
