Releases: BerriAI/litellm
v1.60.8
What's Changed
- UI Updates by @krrishdholakia in #8345
- OIDC Scope based model access by @krrishdholakia in #8343
- Fix azure max retries error by @krrishdholakia in #8340
- Update deepseek API prices for 2025-02-08 by @Winston-503 in #8363
- fix(nvidia_nim/embed.py): add 'dimensions' support by @krrishdholakia in #8302
- fix: dictionary changed size during iteration error (#8327) by @krrishdholakia in #8341
- fix: add azure/o1-2024-12-17 to model_prices_and_context_window.json by @byrongrogan in #8371
- (Security fix) Mask redis pwd on `/cache/ping` + add timeout value and elapsed time on azure + http calls by @krrishdholakia in #8377 (see the curl sketch after this list)
- Handle azure deepseek reasoning response (#8288) by @krrishdholakia in #8366
- Anthropic Citations API Support by @krrishdholakia in #8382
- (Feat) - Add `/bedrock/invoke` support for all Anthropic models by @ishaan-jaff in #8383
- O3 mini native streaming support by @krrishdholakia in #8387
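To sanity-check the `/cache/ping` masking fix called out above, you can hit that endpoint on a running proxy. A minimal curl sketch, assuming the proxy from the Docker command below is up on localhost:4000 and `sk-1234` is a placeholder admin key (not a credential from this release):

```shell
# Ping the cache health endpoint; on v1.60.8 any Redis password in the
# response body should come back masked instead of in plain text.
curl -s http://localhost:4000/cache/ping \
  -H "Authorization: Bearer sk-1234"
```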
Full Changelog: v1.60.6...v1.60.8
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.8
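Once the container is up, the proxy serves an OpenAI-compatible API on port 4000. A minimal request sketch against the same `/chat/completions` route used in the load test below; the model alias `gpt-4o` and the key `sk-1234` are placeholders for whatever you have configured:

```shell
# Standard OpenAI-style chat completion routed through the LiteLLM proxy.
curl -s http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello from LiteLLM v1.60.8"}]
      }'
```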
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 170.0 | 189.56173781509457 | 6.206468643400922 | 0.0 | 1855 | 0 | 149.30551800000558 | 3488.08786699999 |
Aggregated | Passed ✅ | 170.0 | 189.56173781509457 | 6.206468643400922 | 0.0 | 1855 | 0 | 149.30551800000558 | 3488.08786699999 |
v1.60.4-stable
Full Changelog: v1.60.2-dev1...v1.60.4-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.4-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 140.0 | 150.928871417902 | 6.3094717994281515 | 0.0 | 1888 | 0 | 115.36740400003964 | 821.5439119999814 |
Aggregated | Passed ✅ | 140.0 | 150.928871417902 | 6.3094717994281515 | 0.0 | 1888 | 0 | 115.36740400003964 | 821.5439119999814 |
v1.60.6
What's Changed
- Azure OpenAI improvements - o3 native streaming, improved tool call + response format handling by @krrishdholakia in #8292
- Fix edit team on ui by @krrishdholakia in #8295
- Improve rpm check on keys by @krrishdholakia in #8301
- docs: fix enterprise links by @wagnerjt in #8294
- Add gemini-2.0-flash pricing + model info by @krrishdholakia in #8303
- Add Arize Cookbook for Turning on LiteLLM Proxy by @exiao in #8336
- Add aistudio GEMINI 2.0 to model_prices_and_context_window.json by @dceluis in #8335
- Fix pricing for Gemini 2.0 Flash 001 by @elabbarw in #8320
- [DOCS] Update local_debugging.md by @rokbenko in #8308
- (Bug Fix - Langfuse) - fix for when model response has `choices=[]` by @ishaan-jaff in #8339
- Fixed meta llama 3.3 key for Databricks API by @anton164 in #8093
- fix(utils.py): handle key error in msg validation by @krrishdholakia in #8325
- (bug fix router.py) - safely handle `choices=[]` on llm responses by @ishaan-jaff in #8342
- (QA+UI) - e2e flow for adding assembly ai passthrough endpoints by @ishaan-jaff in #8337
New Contributors
- @exiao made their first contribution in #8336
- @dceluis made their first contribution in #8335
- @rokbenko made their first contribution in #8308
- @anton164 made their first contribution in #8093
Full Changelog: v1.60.5...v1.60.6
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.6
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 217.05167674521235 | 6.288425886864887 | 0.0 | 1880 | 0 | 164.17646499996863 | 2306.284880000021 |
Aggregated | Passed ✅ | 200.0 | 217.05167674521235 | 6.288425886864887 | 0.0 | 1880 | 0 | 164.17646499996863 | 2306.284880000021 |
v1.60.5-dev1
What's Changed
- Azure OpenAI improvements - o3 native streaming, improved tool call + response format handling by @krrishdholakia in #8292
- Fix edit team on ui by @krrishdholakia in #8295
- Improve rpm check on keys by @krrishdholakia in #8301
- docs: fix enterprise links by @wagnerjt in #8294
- Add gemini-2.0-flash pricing + model info by @krrishdholakia in #8303
Full Changelog: v1.60.5...v1.60.5-dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.5-dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 244.25501266865584 | 6.165773096861878 | 0.006687389475989022 | 1844 | 2 | 149.75326000001132 | 28799.14171600001 |
Aggregated | Passed ✅ | 190.0 | 244.25501266865584 | 6.165773096861878 | 0.006687389475989022 | 1844 | 2 | 149.75326000001132 | 28799.14171600001 |
v1.60.5
What's Changed
- Added a guide for users who want to use LiteLLM with AI/ML API. by @waterstark in #7058
- Added compatibility guidance, etc. for xAI Grok model by @zhaohan-dong in #8282
- (Security fix) - remove code block that inserts master key hash into DB by @ishaan-jaff in #8268
- (UI) - Add Assembly AI provider to UI by @ishaan-jaff in #8297
- (feat) - Add Assembly AI to model cost map by @ishaan-jaff in #8298
- fixed issues #8126 and #8127 (#8275) by @ishaan-jaff in #8299
- (Refactor) - migrate bedrock invoke to `BaseLLMHTTPHandler` class by @ishaan-jaff in #8290
New Contributors
- @waterstark made their first contribution in #7058
Full Changelog: v1.60.4...v1.60.5
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.5
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 251.44053604962153 | 6.19421782055854 | 0.0 | 1854 | 0 | 167.35073600000305 | 4496.06190000003 |
Aggregated | Passed ✅ | 210.0 | 251.44053604962153 | 6.19421782055854 | 0.0 | 1854 | 0 | 167.35073600000305 | 4496.06190000003 |
v1.60.4
What's Changed
- Internal User Endpoint - vulnerability fix + response type fix by @krrishdholakia in #8228
- Litellm UI fixes 8123 v2 (#8208) by @ishaan-jaff in #8245
- Update model_prices_and_context_window.json by @superpoussin22 in #8249
- Update model_prices_and_context_window.json by @superpoussin22 in #8256
- (dependency) - pip loosen httpx version requirement by @ishaan-jaff in #8255
- Add hyperbolic deepseek v3 model configurations by @lowjiansheng in #8232
- fix(prometheus.py): fix setting key budget metrics by @krrishdholakia in #8234
- (feat) - add supports tool choice to model cost map by @ishaan-jaff in #8265
- (feat) - track org_id in SpendLogs by @ishaan-jaff in #8253
- (Bug fix) - Langfuse / Callback settings stored in DB by @ishaan-jaff in #8251
- Fix passing top_k parameter for Bedrock Anthropic models (#8131) by @ishaan-jaff in #8269
- (Feat) - Add support for structured output on `bedrock/nova` models + add util `litellm.supports_tool_choice` by @ishaan-jaff in #8264 (see the sketch after this list)
- [BETA] Support OIDC `role` based access to proxy by @krrishdholakia in #8260
- Fix deepseek calling - refactor to use base_llm_http_handler by @krrishdholakia in #8266
- allows dynamic message redaction by @krrishdholakia in #8270
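As a rough illustration of the structured-output support on `bedrock/nova` mentioned above, a JSON-schema `response_format` can be sent through the proxy in the standard OpenAI shape. This is a sketch only; the deployment name `nova-pro` and key `sk-1234` are placeholders, and it assumes that model is configured on your proxy:

```shell
# Request a structured (JSON-schema constrained) response from a
# hypothetical Bedrock Nova deployment named "nova-pro".
curl -s http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
        "model": "nova-pro",
        "response_format": {
          "type": "json_schema",
          "json_schema": {
            "name": "city_temp",
            "schema": {
              "type": "object",
              "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"}
              },
              "required": ["city", "temp_c"]
            }
          }
        },
        "messages": [{"role": "user", "content": "What is a typical winter temperature in Oslo?"}]
      }'
```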
Full Changelog: v1.60.2...v1.60.4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.4
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 210.0 | 243.98647747354212 | 6.187158959524932 | 0.0033407985742575225 | 1852 | 1 | 94.81396500007122 | 3976.009301999966 |
Aggregated | Passed ✅ | 210.0 | 243.98647747354212 | 6.187158959524932 | 0.0033407985742575225 | 1852 | 1 | 94.81396500007122 | 3976.009301999966 |
v1.60.2-dev1
What's Changed
- Internal User Endpoint - vulnerability fix + response type fix by @krrishdholakia in #8228
- Litellm UI fixes 8123 v2 (#8208) by @ishaan-jaff in #8245
- Update model_prices_and_context_window.json by @superpoussin22 in #8249
- Update model_prices_and_context_window.json by @superpoussin22 in #8256
- (dependency) - pip loosen httpx version requirement by @ishaan-jaff in #8255
- Add hyperbolic deepseek v3 model configurations by @lowjiansheng in #8232
- fix(prometheus.py): fix setting key budget metrics by @krrishdholakia in #8234
- (feat) - add supports tool choice to model cost map by @ishaan-jaff in #8265
- (feat) - track org_id in SpendLogs by @ishaan-jaff in #8253
- (Bug fix) - Langfuse / Callback settings stored in DB by @ishaan-jaff in #8251
- Fix passing top_k parameter for Bedrock Anthropic models (#8131) by @ishaan-jaff in #8269
- (Feat) - Add support for structured output on `bedrock/nova` models + add util `litellm.supports_tool_choice` by @ishaan-jaff in #8264
- [BETA] Support OIDC `role` based access to proxy by @krrishdholakia in #8260
- Fix deepseek calling - refactor to use base_llm_http_handler by @krrishdholakia in #8266
- allows dynamic message redaction by @krrishdholakia in #8270
Full Changelog: v1.60.2...v1.60.2-dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.2-dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 251.01715871561 | 6.124144736848293 | 0.0 | 1832 | 0 | 171.2837300000274 | 3691.155395999999 |
Aggregated | Passed ✅ | 220.0 | 251.01715871561 | 6.124144736848293 | 0.0 | 1832 | 0 | 171.2837300000274 | 3691.155395999999 |
v1.60.2
What's Changed
- Control Model Access by IDP 'groups' by @krrishdholakia in #8164
- build(schema.prisma): add new `sso_user_id` to LiteLLM_UserTable by @krrishdholakia in #8167
- Litellm dev contributor prs 01 31 2025 by @krrishdholakia in #8168
- Improved O3 + Azure O3 support by @krrishdholakia in #8181
- test: add more unit testing for team member endpoints by @krrishdholakia in #8170
- Add azure/deepseek-r1 by @Klohto in #8177
- [Bug Fix] - `/vertex_ai/` was not detected as llm_api_route on pass through but `vertex-ai` was by @ishaan-jaff in #8186
- (UI + SpendLogs) - Store SpendLogs in UTC Timezone, Fix filtering logs by start/end time by @ishaan-jaff in #8190
- Azure AI Foundry - Deepseek R1 by @elabbarw in #8188
- fix(main.py): fix passing openrouter specific params by @krrishdholakia in #8184
- Complete o3 model support by @krrishdholakia in #8183
- Easier user onboarding via SSO by @krrishdholakia in #8187
- LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 by @krrishdholakia in #7828
- Added deprecation date for gemini-1.5 models by @yurchik11 in #8210
- docs: Updating the available VoyageAI models in the docs by @fzowl in #8215
- build: ui updates by @krrishdholakia in #8206
- Fix tokens for deepseek by @SmartManoj in #8207
- (UI Fixes for add new model flow) by @ishaan-jaff in #8216
- Update xAI provider and fix some old model config by @zhaohan-dong in #8218
- Support guardrails `mode` as list, fix valid keys error in pydantic, add more testing by @krrishdholakia in #8224
- docs: fix typo in lm_studio.md by @foreign-sub in #8222
- (Feat) - New pass through add assembly ai passthrough endpoints by @ishaan-jaff in #8220
- fix(openai/): allows 'reasoning_effort' param to be passed correctly by @krrishdholakia in #8227
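Since this release rounds out o3 support and fixes `reasoning_effort` pass-through (#8227), a request like the following should now forward the parameter correctly. A minimal sketch; the deployment name `o3-mini` and key `sk-1234` are placeholders for your own proxy config:

```shell
# Pass reasoning_effort through the proxy to a reasoning model deployment.
curl -s http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
        "model": "o3-mini",
        "reasoning_effort": "high",
        "messages": [{"role": "user", "content": "Give me a one-line summary of LiteLLM."}]
      }'
```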
New Contributors
- @Klohto made their first contribution in #8177
- @zhaohan-dong made their first contribution in #8218
- @foreign-sub made their first contribution in #8222
Full Changelog: v1.60.0...v1.60.2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.2
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 170.0 | 187.78487681207412 | 6.365583292626693 | 0.0 | 1905 | 0 | 135.5453470000043 | 3644.0179759999864 |
Aggregated | Passed ✅ | 170.0 | 187.78487681207412 | 6.365583292626693 | 0.0 | 1905 | 0 | 135.5453470000043 | 3644.0179759999864 |
v1.60.0.dev4
What's Changed
- Azure AI Foundry - Deepseek R1 by @elabbarw in #8188
- fix(main.py): fix passing openrouter specific params by @krrishdholakia in #8184
- Complete o3 model support by @krrishdholakia in #8183
- Easier user onboarding via SSO by @krrishdholakia in #8187
- LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 by @krrishdholakia in #7828
- Added deprecation date for gemini-1.5 models by @yurchik11 in #8210
- docs: Updating the available VoyageAI models in the docs by @fzowl in #8215
- build: ui updates by @krrishdholakia in #8206
- Fix tokens for deepseek by @SmartManoj in #8207
- (UI Fixes for add new model flow) by @ishaan-jaff in #8216
Full Changelog: v1.60.0.dev2...v1.60.0.dev4
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.0.dev4
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 150.0 | 179.79463683736907 | 6.359486247668494 | 0.0 | 1900 | 0 | 123.9115270000184 | 3798.7273850000065 |
Aggregated | Passed ✅ | 150.0 | 179.79463683736907 | 6.359486247668494 | 0.0 | 1900 | 0 | 123.9115270000184 | 3798.7273850000065 |
v1.60.0.dev2
What's Changed
- Control Model Access by IDP 'groups' by @krrishdholakia in #8164
- build(schema.prisma): add new `sso_user_id` to LiteLLM_UserTable by @krrishdholakia in #8167
- Litellm dev contributor prs 01 31 2025 by @krrishdholakia in #8168
- Improved O3 + Azure O3 support by @krrishdholakia in #8181
- test: add more unit testing for team member endpoints by @krrishdholakia in #8170
- Add azure/deepseek-r1 by @Klohto in #8177
- [Bug Fix] - `/vertex_ai/` was not detected as llm_api_route on pass through but `vertex-ai` was by @ishaan-jaff in #8186
- (UI + SpendLogs) - Store SpendLogs in UTC Timezone, Fix filtering logs by start/end time by @ishaan-jaff in #8190
Full Changelog: v1.60.0...v1.60.0.dev2
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.60.0.dev2
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 160.0 | 179.3387644765704 | 6.274867330705683 | 0.0 | 1878 | 0 | 134.8906900000202 | 3148.732781000035 |
Aggregated | Passed ✅ | 160.0 | 179.3387644765704 | 6.274867330705683 | 0.0 | 1878 | 0 | 134.8906900000202 | 3148.732781000035 |