Commit cdeb30b

More instructions.
1 parent ff8a5c7 commit cdeb30b

2 files changed: +18 -4 lines changed

cloud-infrastructure/ai-infra-gpu/ai-infrastructure/llm-benchmark-docker/README.md

Lines changed: 18 additions & 0 deletions
@@ -125,6 +125,24 @@ are gated and require an access token.
 results in the directory `./results`, containing information about the vLLM
 parameters and the shape used.

+To run only certain scenarios and concurrent request settings, modify
+[`compose.yaml`](files/compose.yaml) and have the `command` for the `perf`
+container read, i.e:
+```yaml
+command:
+- "wait-for-it.sh"
+- "--timeout=300"
+- "llm:8000"
+- "--"
+- "/appli/scripts/benchmark.py"
+- "--scenario"
+- "chatbot"
+- "--concurrency"
+- "1"
+- "4"
+- "16"
+```
+
 5. Run the plotting:
 ```sh
 docker-compose run plot

cloud-infrastructure/ai-infra-gpu/ai-infrastructure/llm-benchmark-docker/files/compose.yaml

Lines changed: 0 additions & 4 deletions
@@ -32,10 +32,6 @@ services:
 - "llm:8000"
 - "--"
 - "/appli/scripts/benchmark.py"
-- "--concurrency"
-- "1"
-- "4"
-- "16"
 plot:
 build: plot
 container_name: plot
