
[Bug] CUDA error: invalid argument when the host init_kv_buffer is called with pin_memory=True #6285

Closed
@beaulian


Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

In sglang/srt/mem_cache/memory_pool.py (lines 915-920):

def init_kv_buffer(self):
    return torch.empty(
        (2, self.layer_num, self.size, self.head_num, self.head_dim),
        dtype=self.dtype,
        device=self.device,
        pin_memory=self.pin_memory,
    )

When the pinned-memory allocation exceeds some size threshold, the following error occurs:

Traceback (most recent call last):
File "", line 1, in
RuntimeError: CUDA error: invalid argument
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Should sglang provide a server argument such as enable_pin_memory so that pin_memory can be enabled or disabled as needed?
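One possible shape for such an option is sketched below. This is only an illustration, not sglang's actual code: the helper name empty_host_buffer and its pin_memory parameter are hypothetical, and the sketch assumes the pinned allocation fails with a RuntimeError (as in the traceback above) and that the process can still allocate ordinary pageable memory afterwards.

```python
import torch


def empty_host_buffer(shape, dtype, pin_memory=True):
    """Hypothetical helper: allocate a CPU tensor, optionally pinned.

    If the pinned allocation raises RuntimeError, fall back to ordinary
    pageable host memory instead of aborting.
    """
    if pin_memory:
        try:
            return torch.empty(shape, dtype=dtype, device="cpu", pin_memory=True)
        except RuntimeError as exc:
            # Assumption: the failure is recoverable and only affects this call.
            print(f"pinned allocation failed ({exc}); falling back to pageable memory")
    return torch.empty(shape, dtype=dtype, device="cpu")


# Example: same shape as the reproduction below, pinned if possible.
# kv = empty_host_buffer((2, 32, 10000, 8, 128), torch.bfloat16)
```

A fallback like this trades transfer speed for robustness, since copies from pageable memory to the GPU are generally slower than copies from pinned memory. Whether that trade-off is acceptable is a design decision for the maintainers.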

Reproduction

Using torch 2.5.1 or torch 2.6.0:

import torch
t = torch.empty((2, 32, 10000, 8, 128), dtype=torch.bfloat16, device="cpu", pin_memory=True)
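For reference, the size of this allocation can be worked out without touching the GPU. The sketch below only does the arithmetic for the shape used above; the function name kv_buffer_bytes is hypothetical, and the 2-byte element size is the standard width of bfloat16.

```python
import math


def kv_buffer_bytes(layer_num, size, head_num, head_dim, itemsize):
    """Bytes needed for a (2, layer_num, size, head_num, head_dim) buffer."""
    shape = (2, layer_num, size, head_num, head_dim)
    return math.prod(shape) * itemsize


# Shape from the reproduction: (2, 32, 10000, 8, 128), bfloat16 = 2 bytes.
num_bytes = kv_buffer_bytes(layer_num=32, size=10000, head_num=8, head_dim=128, itemsize=2)
print(num_bytes)                    # 1310720000
print(round(num_bytes / 2**30, 2))  # 1.22 (GiB)
```

So the failing request is for a single pinned block of about 1.22 GiB. The exact threshold at which the error starts to appear is not stated in this report.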

Related issues:
deepspeedai/DeepSpeed#7150

Environment

Python Packages
sglang==0.4.6
torch==2.5.1 or torch==2.6.0

System
Linux kernel: 5.10.112-005.ali5000.al8.x86_64
GPU: H20
NVIDIA driver version: 550.144.04
CUDA version: 12.4
