1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
When allocating pinned memory with a size above some threshold, the following error occurs:
Traceback (most recent call last):
File "", line 1, in
RuntimeError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
Should sglang provide a server argument enable_pin_memory to use pin_memory dynamically?
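To make the suggestion concrete, here is a minimal sketch of how such an option could behave. Only the flag name `enable_pin_memory` comes from the question above; the helper `alloc_host_buffer` and its `alloc` parameter are hypothetical and not part of sglang. The allocator is passed in so the fallback logic can be exercised without a GPU. The sketch also assumes that retrying with pageable memory is safe after the pinned allocation raises.

```python
from typing import Any, Callable, Tuple


def alloc_host_buffer(
    alloc: Callable[[bool], Any],
    enable_pin_memory: bool = True,
) -> Tuple[Any, bool]:
    """Allocate a host buffer, falling back to pageable memory on failure.

    `alloc(pin)` is a hypothetical callable that performs the allocation,
    for example:
        lambda pin: torch.empty(shape, dtype=dtype, device="cpu", pin_memory=pin)
    Returns the buffer and a flag telling whether it ended up pinned.
    """
    if enable_pin_memory:
        try:
            return alloc(True), True
        except RuntimeError:
            # Pinning failed (e.g. "CUDA error: invalid argument").
            # Assumption: it is safe to retry with pageable host memory.
            pass
    return alloc(False), False
```

With `enable_pin_memory=False` the pinned path is skipped entirely, which would let affected users work around the error at the cost of slower host-to-device copies.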
Hi @beaulian, may I ask what threshold you were using?
Hi @xiezhq-hermann, it's very strange. When I call torch.empty repeatedly, it succeeds for any size. Maybe pin_memory=True does not behave consistently.
Checklist
Describe the bug
In sglang/srt/mem_cache/memory_pool.py:915-920, when allocating pinned memory with a size above some threshold, the error shown above occurs.
Reproduction
using torch 2.5.1 or torch 2.6.0
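The report does not state the failing size, so the sketch below is one way to look for it, not the reporter's actual reproduction. The helper `sweep_pinned_alloc` and its `alloc` parameter are hypothetical; the sizes to try are left to the caller.

```python
from typing import Callable, Iterable, List, Tuple


def sweep_pinned_alloc(
    alloc: Callable[[int], object],
    sizes_bytes: Iterable[int],
) -> List[Tuple[int, str]]:
    """Try each size once and collect (size, error message) for failures.

    `alloc(n)` is a hypothetical callable, for example:
        lambda n: torch.empty(n, dtype=torch.uint8, pin_memory=True)
    """
    failures = []
    for n in sizes_bytes:
        try:
            buf = alloc(n)
            del buf  # drop the reference before trying the next size
        except RuntimeError as exc:
            failures.append((n, str(exc)))
    return failures
```

Running the sweep several times with the same sizes would also show whether the failure is deterministic or intermittent.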
related issues
deepspeedai/DeepSpeed#7150
Environment
Python Packages
sglang==0.4.6
torch==2.5.1 or torch==2.6.0
System
Linux kernel: 5.10.112-005.ali5000.al8.x86_64
GPU: H20
Nvidia driver version: 550.144.04
CUDA version: 12.4