Add CPU affinity setting to latency benchmark #3085

hubertlu-tw · 2025-01-23T16:30:26Z

Motivation

This PR adds a CPU affinity setting (i.e., NUMA binding) to the latency test (bench_one_batch.py) to improve performance.

Modifications

Call "set_gpu_proc_affinity" in bench_one_batch.py when SGLANG_SET_CPU_AFFINITY=1 is set.
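
For readers unfamiliar with the pattern, here is a minimal sketch of environment-gated CPU pinning. It is not the actual "set_gpu_proc_affinity" implementation: the helper names (affinity_enabled, cpus_for_gpu, maybe_pin_process) and the equal-slice partitioning are assumptions made for illustration, and the sketch uses only the Linux-specific os.sched_setaffinity from the standard library.

```python
import os


def affinity_enabled(env=None):
    """Return True when SGLANG_SET_CPU_AFFINITY is set to "1"."""
    env = os.environ if env is None else env
    return env.get("SGLANG_SET_CPU_AFFINITY", "0") == "1"


def cpus_for_gpu(cpu_ids, gpus_per_node, local_gpu_id):
    """Hypothetical policy: give each local GPU an equal contiguous CPU slice."""
    if not cpu_ids:
        raise ValueError("cpu_ids must not be empty")
    if gpus_per_node <= 0 or not 0 <= local_gpu_id < gpus_per_node:
        raise ValueError("invalid GPU layout")
    # At least one CPU per GPU, even if GPUs outnumber CPUs.
    per_gpu = max(1, len(cpu_ids) // gpus_per_node)
    start = (local_gpu_id * per_gpu) % len(cpu_ids)
    return cpu_ids[start:start + per_gpu]


def maybe_pin_process(gpus_per_node, local_gpu_id, env=None):
    """Pin the current process if the flag is set; return the CPUs used or None."""
    if not affinity_enabled(env):
        return None
    if not hasattr(os, "sched_setaffinity"):
        # os.sched_setaffinity is only available on some platforms (e.g. Linux).
        return None
    allowed = sorted(os.sched_getaffinity(0))
    chosen = cpus_for_gpu(allowed, gpus_per_node, local_gpu_id)
    os.sched_setaffinity(0, chosen)
    return chosen
```

With the flag unset, maybe_pin_process does nothing, so the default benchmark behavior is unchanged. A contiguous slice only approximates NUMA locality; a real implementation would need to consult the machine's actual core and NUMA topology.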

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling.

CC: @HaiShaw

Add CPU affinity setting for latency test

HaiShaw · 2025-01-23T19:26:47Z

@hubertlu-tw Could you provide offline benchmark numbers for two models on two different builds (machines) to show the difference? Previously we only enabled this for online cases. Thanks!

HaiShaw · 2025-01-25T07:32:42Z

@hubertlu-tw Can you please fix the CI/Lint?

hubertlu-tw · 2025-01-25T17:21:23Z

> @hubertlu-tw Could you provide offline benchmark numbers for two models on two different builds (machines) to show the difference? Previously we only enabled this for online cases. Thanks!

I tested the changes with two models (DeepSeek-V3 and Llama-3.1-70B) on two servers with different system setups. I consistently observed a 1-5% performance improvement with the CPU affinity setting enabled when running bench_one_batch.

@HaiShaw The lint error is resolved. Thanks.

hubertlu-tw added 2 commits January 21, 2025 18:00

Update bench_one_batch.py

e0a41ee

Add CPU affinity setting for latency test

Merge branch 'sgl-project:main' into cpu_affinity_bench_one_batch

463059e

hubertlu-tw changed the title ~~Cpu affinity bench one batch~~ Add CPU affinity setting to latency benchmark Jan 23, 2025

HaiShaw added 2 commits January 23, 2025 11:26

Merge branch 'main' into cpu_affinity_bench_one_batch

c544154

Merge branch 'main' into cpu_affinity_bench_one_batch

d85c9b9

hubertlu-tw and others added 4 commits January 25, 2025 09:22

Fix Lint errors

80311ff

Merge branch 'main' into cpu_affinity_bench_one_batch

0bb25ee

Merge branch 'main' into cpu_affinity_bench_one_batch

3047303

Merge branch 'main' into cpu_affinity_bench_one_batch

a4db80b

HaiShaw approved these changes Jan 26, 2025

HaiShaw added 4 commits January 25, 2025 18:33

Merge branch 'main' into cpu_affinity_bench_one_batch

272f5d4

Merge branch 'main' into cpu_affinity_bench_one_batch

2391c11

Merge branch 'main' into cpu_affinity_bench_one_batch

8bd3e6e

Merge branch 'main' into cpu_affinity_bench_one_batch

7c2140f

zhyncs requested a review from merrymercy January 26, 2025 07:14

merrymercy merged commit f8b28e4 into sgl-project:main Jan 26, 2025
15 of 16 checks passed
