v1.69.0-stable - Loadbalance Batch API Models
Deploy this version

Docker:
```shell
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.69.0-stable
```

Pip:
```shell
pip install litellm==1.69.0.post1
```
Key Highlights
LiteLLM v1.69.0-stable brings the following key improvements:
- Loadbalance Batch API Models: Easily load balance across multiple Azure Batch deployments using LiteLLM Managed Files.
- Email Invites 2.0: Send an email invite to new users onboarded to LiteLLM.
- Nscale: LLM API for compliance with European regulations.
- Bedrock /v1/messages: Use Bedrock Anthropic models with Anthropic's /v1/messages (see the example below).
 
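To illustrate the /v1/messages highlight, here is a minimal sketch that points the Anthropic Python SDK at a LiteLLM proxy. The proxy URL, API key, and the `bedrock-claude` model name are placeholders for whatever is configured on your instance, not values from this release.

```python
import anthropic

# Point the Anthropic SDK at the LiteLLM proxy's /v1/messages endpoint.
# base_url and api_key are placeholders; "bedrock-claude" is a hypothetical model_name
# assumed to map to a Bedrock Anthropic model on the proxy.
client = anthropic.Anthropic(base_url="http://localhost:4000", api_key="sk-1234")

response = client.messages.create(
    model="bedrock-claude",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello from a Bedrock Anthropic model"}],
)
print(response.content[0].text)
```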
Batch API Load Balancing
This release brings LiteLLM Managed File support to Batches. This is great for:
- Proxy Admins: You can now control which Batch models users can call.
- Developers: You no longer need to know the Azure deployment name when creating your batch .jsonl files - just specify the model your LiteLLM key has access to (see the sketch below).
 
Over time, we expect LiteLLM Managed Files to be the way most teams use Files across /chat/completions, /batch, /fine_tuning endpoints.
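Below is a minimal sketch of the developer-side flow, assuming a LiteLLM proxy at http://localhost:4000 that exposes a hypothetical `gpt-4o-batch` model name backed by multiple Azure batch deployments; the exact proxy-side configuration is not shown here.

```python
import json
from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM proxy (URL and key are placeholders).
client = OpenAI(base_url="http://localhost:4000/v1", api_key="sk-1234")

# Each request line references the LiteLLM model name ("gpt-4o-batch" is hypothetical),
# not an Azure deployment name - the proxy routes to one of its Azure batch deployments.
request_line = {
    "custom_id": "task-1",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {
        "model": "gpt-4o-batch",
        "messages": [{"role": "user", "content": "Say hello"}],
    },
}
with open("batch_input.jsonl", "w") as f:
    f.write(json.dumps(request_line) + "\n")

# Upload the .jsonl file and create the batch through the proxy.
batch_file = client.files.create(file=open("batch_input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id)
```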
Email Invites
This release brings the following improvements to our email invite integration:
- New templates for user invited and key created events.
- Fixes for using SMTP email providers.
- Native support for Resend API.
- Ability for Proxy Admins to control email events.
 
For LiteLLM Cloud Users, please reach out to us if you want this enabled for your instance.
New Models / Updated Models
- Gemini (VertexAI + Google AI Studio)
- Perplexity:
- Azure OpenAI:
  - Fixed passing through of the azure_ad_token_provider parameter - PR
- OpenAI:
  - Added support for PDF URLs in the 'file' parameter - PR
- Sagemaker:
  - Fixed content length for the sagemaker_chat provider - PR
- Azure AI Foundry:
  - Added cost tracking for the following models - PR
    - DeepSeek V3 0324
    - Llama 4 Scout
    - Llama 4 Maverick
- Bedrock:
- OpenAI: Added reasoning_effort support for o3 models - PR (example after this list)
- Databricks:
  - Fixed issue where the delta could be empty when Databricks uses an external model - PR
- Cerebras: Fixed Llama-3.1-70b model pricing and context window - PR
- Ollama:
- Nscale:
  - Added support for chat and image generation endpoints - PR
 
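As referenced in the OpenAI item above, here is a minimal sketch of passing reasoning_effort through the Python SDK. It assumes OPENAI_API_KEY is set in the environment; the prompt is illustrative only.

```python
import litellm

# reasoning_effort ("low" / "medium" / "high") is forwarded to OpenAI's o3 models.
response = litellm.completion(
    model="o3",
    messages=[{"role": "user", "content": "Give me a one-line summary of load balancing."}],
    reasoning_effort="high",
)
print(response.choices[0].message.content)
```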
LLM API Endpoints
- Messages API:
- Moderations API:
  - Fixed bug to allow using LiteLLM UI credentials for the /moderations API - PR
- Realtime API:
  - Fixed setting 'headers' in scope for websocket auth requests and infinite loop issues - PR
- Files API:
- Batches API:
 
Spend Tracking / Budget Improvements
- Bug Fix - PostgreSQL Integer Overflow Error in DB Spend Tracking - PR
 
Management Endpoints / UI
- Models:
- Logs:
- User Management:
 
Logging / Guardrail Integrations
- Custom Logger API: v2 Custom Callback API (send LLM logs to a custom API) - PR, Get Started
- OpenTelemetry:
  - Fixed OpenTelemetry to follow GenAI semantic conventions + added support for the 'instructions' param for TTS - PR
- Bedrock PII:
  - Added support for PII masking with Bedrock Guardrails - Get Started, PR
- Documentation:
  - Added documentation for StandardLoggingVectorStoreRequest - PR
 
 
Performance / Reliability Improvements
- Python Compatibility:
- Caching:
 
General Proxy Improvements
- Proxy CLI:
- Alerting:
  - Fixed Slack alerting not working when using a DB - PR
- Email Invites:
- General:
  - Fixed bug where duplicate JSON logs were getting emitted - PR
 
 
New Contributors
- @zoltan-ongithub made their first contribution in PR #10568
- @mkavinkumar1 made their first contribution in PR #10548
- @thomelane made their first contribution in PR #10549
- @frankzye made their first contribution in PR #10540
- @aholmberg made their first contribution in PR #10591
- @aravindkarnam made their first contribution in PR #10611
- @xsg22 made their first contribution in PR #10648
- @casparhsws made their first contribution in PR #10635
- @hypermoose made their first contribution in PR #10370
- @tomukmatthews made their first contribution in PR #10638
- @keyute made their first contribution in PR #10652
- @GPTLocalhost made their first contribution in PR #10687
- @husnain7766 made their first contribution in PR #10697
- @claralp made their first contribution in PR #10694
- @mollux made their first contribution in PR #10690
 
