Add CPU affinity setting to latency benchmark #3085

Merged: 12 commits, Jan 26, 2025

Conversation

hubertlu-tw
Contributor

@hubertlu-tw hubertlu-tw commented Jan 23, 2025

Motivation

This PR adds a CPU affinity setting (i.e., NUMA binding) to the latency benchmark (bench_one_batch.py) to improve performance.

Modifications

Call "set_gpu_proc_affinity" in bench_one_batch.py when the environment variable SGLANG_SET_CPU_AFFINITY=1 is set.
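For readers unfamiliar with the idea, here is a minimal sketch of env-gated CPU pinning for a per-GPU benchmark process. It is not the code from this PR: the helper names `cores_for_gpu` and `maybe_bind_current_process` are hypothetical, the actual signature of set_gpu_proc_affinity is not shown in this conversation, and the sketch assumes Linux and assumes that contiguous core ids roughly follow NUMA node boundaries (which depends on the machine).

```python
import os
from typing import List, Optional


def cores_for_gpu(gpu_id: int, gpus_per_node: int, available: List[int]) -> List[int]:
    """Pick the contiguous slice of `available` core ids for one GPU process.

    Hypothetical helper: splits the cores evenly across the GPUs on a node.
    """
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    per_gpu = len(available) // gpus_per_node
    if per_gpu == 0:
        raise ValueError("fewer available cores than GPUs per node")
    local = gpu_id % gpus_per_node  # index of this GPU within its node
    return available[local * per_gpu : (local + 1) * per_gpu]


def maybe_bind_current_process(gpu_id: int, gpus_per_node: int) -> Optional[List[int]]:
    """Pin the current process only when SGLANG_SET_CPU_AFFINITY=1.

    Returns the core ids that were applied, or None if nothing was changed.
    """
    if os.environ.get("SGLANG_SET_CPU_AFFINITY", "0") != "1":
        return None
    if not hasattr(os, "sched_setaffinity"):
        return None  # os.sched_setaffinity is not available on this platform
    # Start from the cores this process is already allowed to use.
    available = sorted(os.sched_getaffinity(0))
    cores = cores_for_gpu(gpu_id, gpus_per_node, available)
    os.sched_setaffinity(0, cores)  # pid 0 means the calling process
    return cores
```

With 16 available cores and 4 GPUs per node, `cores_for_gpu(1, 4, list(range(16)))` selects cores 4 through 7 for the second GPU's process. Keeping the binding behind the environment variable leaves the default benchmark behavior unchanged.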

Checklist

CC: @HaiShaw

@hubertlu-tw hubertlu-tw changed the title from "Cpu affinity bench one batch" to "Add CPU affinity setting to latency benchmark" Jan 23, 2025
@HaiShaw
Collaborator

HaiShaw commented Jan 23, 2025

@hubertlu-tw Could you provide offline benchmark numbers for two models on two different builds (machines) to show the difference? Previously we only enabled this for online cases. Thanks!

@HaiShaw
Collaborator

HaiShaw commented Jan 25, 2025

@hubertlu-tw Can you please fix the CI/Lint?

@hubertlu-tw
Contributor Author

hubertlu-tw commented Jan 25, 2025

> @hubertlu-tw Could you provide offline benchmark numbers for two models on two different builds (machines) to show the difference? Previously we only enabled this for online cases. Thanks!

I tested the changes on two different models (DeepSeek-V3 and Llama-3.1-70B) and two different servers (with different system setups). I consistently observed a 1-5% performance improvement with the CPU affinity setting enabled when running bench_one_batch.

@HaiShaw the Lint error is resolved. Thanks.

@zhyncs zhyncs requested a review from merrymercy January 26, 2025 07:14
@merrymercy merrymercy merged commit f8b28e4 into sgl-project:main Jan 26, 2025
15 of 16 checks passed
3 participants