
Sarvam AI API Guide for Developers 2026: Speech, Translation & Indian Language Integration

Sarvam AI provides an API suite for Indian language AI that covers speech-to-text, text-to-speech, translation, transliteration, and chat completion across 22+ Indian languages. This developer guide walks through integrating each Sarvam AI API, with code examples, best practices, and production-oriented patterns for building multilingual Indian applications.

Whether you're adding Hindi voice support to your app, building a multilingual chatbot, or creating a translation pipeline for enterprise content, this guide provides the technical foundation you need.

Setup & Authentication 2026

Install Python SDK 2026

# Install the Sarvam AI Python SDK
pip install sarvam-ai

# Or with pip3
pip3 install sarvam-ai

Authentication 2026

All Sarvam AI APIs use API key authentication via the API-Subscription-Key header.

# Environment variable (recommended)
export SARVAM_API_KEY="your_api_key_here"

# Python usage
import os
api_key = os.environ.get("SARVAM_API_KEY")

Base URL 2026

Base URL: https://api.sarvam.ai

# Headers for all requests
headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}
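The environment-variable approach recommended above can be wrapped in a small helper so that no snippet needs a hardcoded key. This is a minimal sketch: `build_headers` is a hypothetical name, not part of the Sarvam SDK, and it only assumes the `API-Subscription-Key` header and `SARVAM_API_KEY` variable described in this section.

```python
import os


def build_headers(json_body: bool = True) -> dict:
    """Build request headers from the SARVAM_API_KEY environment variable."""
    api_key = os.environ.get("SARVAM_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending an empty key
        raise RuntimeError("SARVAM_API_KEY is not set")

    headers = {"API-Subscription-Key": api_key}
    if json_body:
        # Only JSON endpoints need this. For multipart uploads, leave it out
        # so the HTTP client can set the multipart boundary itself.
        headers["Content-Type"] = "application/json"
    return headers
```

Pass `json_body=False` for file uploads such as the speech-to-text requests later in this guide.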

Quick Connectivity Test 2026

import requests

url = "https://api.sarvam.ai/translate"
headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}
payload = {
    "input": "Hello, how are you?",
    "source_language_code": "en-IN",
    "target_language_code": "hi-IN",
    "model": "mayura:v1"
}

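response = requests.post(url, json=payload, headers=headers, timeout=30)
response.raise_for_status()  # Fail loudly on 4xx/5xx instead of printing an error body
print(response.json())
# Example output: {"translated_text": "नमस्ते, आप कैसे हैं?"}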

Speech-to-Text API (Saarika) 2026

Saarika is Sarvam's automatic speech recognition (ASR) model for transcribing Indian language audio.

REST API - Quick Transcription 2026

For audio under 30 seconds, the REST endpoint returns the transcript in the same response:

import requests

url = "https://api.sarvam.ai/speech-to-text"
headers = {
    "API-Subscription-Key": "YOUR_API_KEY"
}

# Upload audio file
with open("audio_hindi.wav", "rb") as f:
    files = {"file": ("audio.wav", f, "audio/wav")}
    data = {
        "language_code": "hi-IN",
        "model": "saarika:v2"
    }
    response = requests.post(url, headers=headers,
                            files=files, data=data)

result = response.json()
print(result["transcript"])
# Output: "आज का मौसम बहुत अच्छा है"
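The 30-second limit mentioned above can be checked locally before uploading. The sketch below assumes uncompressed WAV input, since the standard-library `wave` module cannot read MP3 or OGG. `choose_endpoint` and `REST_LIMIT_SECONDS` are hypothetical names, and the two paths are the ones used in this guide.

```python
import wave

REST_LIMIT_SECONDS = 30.0  # Limit quoted in this guide for the REST endpoint


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def choose_endpoint(path: str) -> str:
    """Pick the REST or batch path based on clip length."""
    if wav_duration_seconds(path) < REST_LIMIT_SECONDS:
        return "/speech-to-text"
    return "/speech-to-text/batch"
```

For compressed formats, the duration would have to come from a decoder or from metadata recorded at capture time.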

Batch API - Long Audio 2026

For audio longer than 30 seconds, use the Batch API for asynchronous processing:

import time
import requests

# Step 1: Submit batch job
url = "https://api.sarvam.ai/speech-to-text/batch"
headers = {
    "API-Subscription-Key": "YOUR_API_KEY"
}

with open("long_meeting.mp3", "rb") as f:
    files = {"file": ("meeting.mp3", f, "audio/mpeg")}
    data = {
        "language_code": "hi-IN",
        "model": "saarika:v2",
        "with_diarization": "true"
    }
    response = requests.post(url, headers=headers,
                            files=files, data=data)

response.raise_for_status()  # Stop here if the submission was rejected
job_id = response.json()["job_id"]

# Step 2: Poll for results, giving up after 10 minutes
status_url = f"https://api.sarvam.ai/speech-to-text/batch/{job_id}"
deadline = time.monotonic() + 600
while time.monotonic() < deadline:
    status = requests.get(status_url, headers=headers, timeout=30).json()
    if status["status"] == "completed":
        print(status["transcript"])
        break
    time.sleep(5)  # Wait 5 seconds before polling again
else:
    # The loop ended without a break, so the job never reported "completed"
    raise TimeoutError(f"Batch job {job_id} did not complete in time")
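For longer jobs, a fixed 5-second interval can generate a lot of status requests. Below is a sketch of the same polling step with a growing delay and a hard timeout. `poll_until_complete` is a hypothetical helper, not an SDK function. It takes the status fetch as a callable so it can be exercised without network access, and it relies only on the `"completed"` status value used in the example above.

```python
import time
from typing import Callable


def poll_until_complete(fetch_status: Callable[[], dict],
                        timeout: float = 600.0,
                        initial_delay: float = 2.0,
                        max_delay: float = 30.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until it reports "completed" or the timeout passes."""
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        status = fetch_status()
        if status.get("status") == "completed":
            return status
        # Give up if the next wait would run past the deadline
        if time.monotonic() + delay > deadline:
            raise TimeoutError("job did not complete before the timeout")
        sleep(delay)
        delay = min(delay * 2, max_delay)  # Double the wait, up to max_delay
```

In the batch example, `fetch_status` would be `lambda: requests.get(status_url, headers=headers, timeout=30).json()`.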

Supported Audio Formats 2026

Format | Extension | MIME Type
WAV | .wav | audio/wav
MP3 | .mp3 | audio/mpeg
AAC | .aac | audio/aac
OGG | .ogg | audio/ogg
OPUS | .opus | audio/opus
FLAC | .flac | audio/flac
M4A | .m4a | audio/mp4
WebM | .webm | audio/webm
AMR | .amr | audio/amr
WMA | .wma | audio/x-ms-wma
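The table above translates directly into a lookup that picks the MIME type for an upload and rejects unsupported files before any request is sent. This is only a sketch: `mime_type_for` is a hypothetical helper, and the mapping simply mirrors the table in this guide.

```python
from pathlib import Path

# Extension to MIME type, copied from the table in this guide
AUDIO_MIME_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".aac": "audio/aac",
    ".ogg": "audio/ogg",
    ".opus": "audio/opus",
    ".flac": "audio/flac",
    ".m4a": "audio/mp4",
    ".webm": "audio/webm",
    ".amr": "audio/amr",
    ".wma": "audio/x-ms-wma",
}


def mime_type_for(path: str) -> str:
    """Return the MIME type for an audio file, or raise ValueError."""
    extension = Path(path).suffix.lower()
    try:
        return AUDIO_MIME_TYPES[extension]
    except KeyError:
        raise ValueError(f"Unsupported audio format: {extension or path}") from None
```

The result can be used as the third element of the `files` tuple, for example `("audio.mp3", f, mime_type_for("audio.mp3"))`.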

Speech-to-Text Translate API 2026

Combines transcription and translation in a single API call. Automatically detects the spoken language, transcribes it, and translates to English.

url = "https://api.sarvam.ai/speech-to-text-translate"
headers = {
    "API-Subscription-Key": "YOUR_API_KEY"
}

with open("tamil_audio.wav", "rb") as f:
    files = {"file": ("audio.wav", f, "audio/wav")}
    data = {
        "model": "saaras:v2"  # Saaras is Sarvam's speech translation model; Saarika only transcribes
        # No need to specify source language - auto-detected
    }
    response = requests.post(url, headers=headers,
                            files=files, data=data)

result = response.json()
print(f"Detected: {result['language_code']}")
print(f"Transcript: {result['transcript']}")
print(f"Translation: {result['translated_text']}")
# Detected: ta-IN
# Transcript: "இன்று வானிலை மிகவும் நன்றாக உள்ளது"
# Translation: "The weather is very nice today"

Text-to-Speech API (Bulbul) 2026

Bulbul generates natural-sounding speech in Indian languages with multiple voice options.

Basic TTS 2026

url = "https://api.sarvam.ai/text-to-speech"
headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}

payload = {
    "input": "नमस्ते, मेरा नाम सर्वम है। मैं आपकी कैसे मदद कर सकता हूँ?",
    "language_code": "hi-IN",
    "model": "bulbul:v1",
    "voice": "meera",  # Female Hindi voice
    "speed": 1.0
}

response = requests.post(url, json=payload, headers=headers)

# Save audio file
with open("output.wav", "wb") as f:
    f.write(response.content)

Available Voices 2026

Language | Voice Names | Gender
Hindi | meera, arvind | Female, Male
Tamil | amala, karthik | Female, Male
Telugu | lakshmi, ravi | Female, Male
Bengali | ananya, soham | Female, Male
Kannada | kavya, pradeep | Female, Male
Malayalam | devika, arun | Female, Male
Gujarati | diya, harsh | Female, Male
Marathi | sneha, sachin | Female, Male
English | priya, raj | Female, Male

Translation API (Mayura) 2026

Basic Translation 2026

url = "https://api.sarvam.ai/translate"
headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}

payload = {
    "input": "India's digital economy is growing rapidly with UPI transactions crossing 10 billion per month.",
    "source_language_code": "en-IN",
    "target_language_code": "hi-IN",
    "model": "mayura:v1"
}

response = requests.post(url, json=payload, headers=headers)
result = response.json()
print(result["translated_text"])

Multi-Language Batch Translation 2026

# Translate one string into several Indian languages.
# This sends one request per target language and reuses url and headers from above.
source_text = "Welcome to our platform"
target_languages = ["hi-IN", "ta-IN", "te-IN", "bn-IN", "mr-IN", "gu-IN"]

for lang in target_languages:
    payload = {
        "input": source_text,
        "source_language_code": "en-IN",
        "target_language_code": lang,
        "model": "mayura:v1"
    }
    response = requests.post(url, json=payload, headers=headers)
    result = response.json()
    print(f"{lang}: {result['translated_text']}")
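Because the loop above sends its requests one after another, total latency grows with the number of target languages. One way to overlap them is a small thread pool, sketched below. `translate_to_many` and the `translate_one` callable are hypothetical names. In practice `translate_one` would wrap the `requests.post` call shown above, and the worker count should stay within whatever rate limit applies to your account.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def translate_to_many(text: str,
                      target_languages: List[str],
                      translate_one: Callable[[str, str], str],
                      max_workers: int = 4) -> Dict[str, str]:
    """Translate `text` into each target language concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with the languages
        results = pool.map(lambda lang: translate_one(text, lang),
                           target_languages)
        return dict(zip(target_languages, results))
```

Any exception raised by `translate_one` propagates when its result is consumed, so callers still see request failures.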

Transliteration API 2026

Convert text between scripts while preserving pronunciation. Essential for handling romanized Indian language input common in chat and social media.

url = "https://api.sarvam.ai/transliterate"
headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}

# Romanized Hindi to Devanagari
payload = {
    "input": "namaste aap kaise hain",
    "source_language_code": "en-IN",
    "target_language_code": "hi-IN"
}

response = requests.post(url, json=payload, headers=headers)
print(response.json()["transliterated_text"])
# Output: "नमस्ते आप कैसे हैं"
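Chat input is often a mix of romanized and native-script messages, and only the romanized ones need transliteration. A rough way to tell them apart is to check whether every letter is ASCII, as sketched below. Both function names are hypothetical, and `transliterate` stands for a wrapper around the request shown above. Note that this heuristic cannot distinguish romanized Hindi from plain English.

```python
from typing import Callable


def is_romanized(text: str) -> bool:
    """True if the text contains letters and all of them are ASCII."""
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(ch.isascii() for ch in letters)


def to_native_script(text: str, transliterate: Callable[[str], str]) -> str:
    """Transliterate only when the input looks romanized."""
    return transliterate(text) if is_romanized(text) else text
```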

Chat Completion API (Sarvam-M) 2026

Multilingual conversational AI for building chatbots and assistants. Currently free for basic usage.

url = "https://api.sarvam.ai/chat/completions"
headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}

payload = {
    "model": "sarvam-m",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful customer support assistant. Respond in the same language the user writes in."
        },
        {
            "role": "user",
            "content": "मेरा ऑर्डर कहाँ है? ऑर्डर नंबर #12345"
        }
    ],
    "temperature": 0.7,
    "max_tokens": 500
}

response = requests.post(url, json=payload, headers=headers)
result = response.json()
print(result["choices"][0]["message"]["content"])
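The request above is a single turn. For a multi-turn chatbot, earlier messages have to be sent again with every request, and it is usually worth capping how much history is included. The sketch below builds the `messages` list in the format used above. `build_messages` and `max_turns` are hypothetical names, and the sketch assumes history is stored as alternating user and assistant messages.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_messages(system_prompt: str,
                   history: List[Message],
                   user_input: str,
                   max_turns: int = 10) -> List[Message]:
    """Assemble the messages list: system prompt, recent history, new input."""
    # One turn is a user message plus the assistant reply, hence 2 * max_turns.
    # The guard matters because history[-0:] would return the whole list.
    recent = history[-2 * max_turns:] if max_turns > 0 else []
    return [
        {"role": "system", "content": system_prompt},
        *recent,
        {"role": "user", "content": user_input},
    ]
```

After each response, append both the user message and the assistant reply to `history` so the next call sees them.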

Language Codes Reference 2026

Language | Code | Script
Hindi | hi-IN | Devanagari
Bengali | bn-IN | Bengali
Tamil | ta-IN | Tamil
Telugu | te-IN | Telugu
Marathi | mr-IN | Devanagari
Gujarati | gu-IN | Gujarati
Kannada | kn-IN | Kannada
Malayalam | ml-IN | Malayalam
Odia | od-IN | Odia
Punjabi | pa-IN | Gurmukhi
Assamese | as-IN | Bengali-Assamese
Urdu | ur-IN | Perso-Arabic (Nastaliq)
English | en-IN | Latin
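Validating codes against this table before sending a request catches typos without spending a call. The following is a minimal sketch, assuming the codes listed in this guide are the ones your integration uses. `normalize_language_code` is a hypothetical helper.

```python
# Codes copied from the reference table in this guide
SUPPORTED_LANGUAGES = {
    "hi-IN": "Hindi", "bn-IN": "Bengali", "ta-IN": "Tamil",
    "te-IN": "Telugu", "mr-IN": "Marathi", "gu-IN": "Gujarati",
    "kn-IN": "Kannada", "ml-IN": "Malayalam", "od-IN": "Odia",
    "pa-IN": "Punjabi", "as-IN": "Assamese", "ur-IN": "Urdu",
    "en-IN": "English",
}


def normalize_language_code(code: str) -> str:
    """Return the code in canonical 'xx-IN' form, or raise ValueError."""
    parts = code.strip().split("-")
    if len(parts) == 2:
        candidate = f"{parts[0].lower()}-{parts[1].upper()}"
        if candidate in SUPPORTED_LANGUAGES:
            return candidate
    raise ValueError(f"Unsupported language code: {code!r}")
```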

Error Handling 2026

Common Error Codes 2026

Code | Meaning | Solution
401 | Invalid API key | Check your API-Subscription-Key header
402 | Insufficient credits | Top up credits on the dashboard
413 | Audio too large | Use Batch API for files over 30 seconds
415 | Unsupported format | Convert to a supported audio format
429 | Rate limit exceeded | Implement exponential backoff
500 | Server error | Retry with backoff, contact support if persistent
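The table splits into two groups: errors worth retrying (429 and server errors) and errors that need a change on the caller's side. A sketch of that split is below. The hint strings are taken from the table, and both function names are hypothetical.

```python
# Suggested actions, copied from the error table in this guide
ERROR_HINTS = {
    401: "Check your API-Subscription-Key header",
    402: "Top up credits on the dashboard",
    413: "Use Batch API for files over 30 seconds",
    415: "Convert to a supported audio format",
    429: "Implement exponential backoff",
    500: "Retry with backoff, contact support if persistent",
}


def is_retryable(status_code: int) -> bool:
    """Rate limits and server-side failures may succeed on a later attempt."""
    return status_code == 429 or 500 <= status_code < 600


def describe_error(status_code: int) -> str:
    """Return the suggested action for a status code."""
    return ERROR_HINTS.get(status_code,
                           f"Unexpected HTTP status {status_code}")
```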

Robust Error Handling Pattern 2026

import requests
import time

def sarvam_api_call(url, payload, headers, max_retries=3):
    """POST to a Sarvam endpoint, retrying only transient failures."""
    for attempt in range(max_retries):
        is_last = attempt == max_retries - 1
        try:
            response = requests.post(url, json=payload,
                                    headers=headers, timeout=30)
        except (requests.exceptions.Timeout,
                requests.exceptions.ConnectionError):
            if is_last:
                raise
            time.sleep(2 ** attempt)
            continue

        if response.status_code == 200:
            return response.json()
        if response.status_code == 402:
            raise RuntimeError("Insufficient credits")
        # Retry rate limits (429) and server errors (5xx) with exponential
        # backoff. Other 4xx codes (401, 413, 415) will not succeed on retry.
        retryable = response.status_code == 429 or response.status_code >= 500
        if not retryable or is_last:
            raise RuntimeError(f"API error: {response.status_code}")
        time.sleep(2 ** attempt)
    raise RuntimeError("Max retries exceeded")
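When many workers hit a rate limit at the same moment, identical backoff delays make them all retry together. Adding random jitter spreads the retries out. The sketch below computes a "full jitter" schedule, where each delay is drawn uniformly between zero and the capped exponential value. `backoff_delays` is a hypothetical helper, and its defaults are illustrative rather than Sarvam recommendations.

```python
import random
from typing import List, Optional


def backoff_delays(max_retries: int,
                   base: float = 1.0,
                   cap: float = 30.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized delay (in seconds) per retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap, base * (2 ** attempt))  # 1, 2, 4, ... up to cap
        delays.append(rng.uniform(0, ceiling))
    return delays
```

Inside a retry loop, `time.sleep(delays[attempt])` would replace a fixed `2 ** attempt` wait.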

Rate Limits & Optimization 2026

Optimization Tips 2026

  • Cache translations: Store repeated translations to avoid redundant API calls
  • Batch where possible: Group multiple short translations into fewer calls
  • Compress audio: Use OGG/OPUS for smaller file sizes with maintained quality
  • Specify language: Always provide language codes instead of relying on auto-detection
  • Use connection pooling: Reuse HTTP connections for multiple API calls
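As one illustration of the first two tips, repeated strings in a batch (button labels, boilerplate sentences) only need to be translated once. The sketch below deduplicates before calling out and then restores the original order. `translate_unique` is a hypothetical helper, and `translate_one` stands for whatever function performs the actual request.

```python
from typing import Callable, List


def translate_unique(texts: List[str],
                     translate_one: Callable[[str], str]) -> List[str]:
    """Translate each distinct string once and map results back in order."""
    unique = list(dict.fromkeys(texts))  # Deduplicate, preserving first-seen order
    translated = {text: translate_one(text) for text in unique}
    return [translated[text] for text in texts]
```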

Credit-Saving Cache Pattern 2026

import requests

headers = {
    "Content-Type": "application/json",
    "API-Subscription-Key": "YOUR_API_KEY"
}
translation_cache = {}

def translate_with_cache(text, source_lang, target_lang):
    # A tuple key is simpler than hashing a joined string, and it cannot
    # collide when the text itself contains the separator.
    cache_key = (text, source_lang, target_lang)

    if cache_key in translation_cache:
        return translation_cache[cache_key]

    payload = {
        "input": text,
        "source_language_code": source_lang,
        "target_language_code": target_lang,
        "model": "mayura:v1"
    }
    response = requests.post(
        "https://api.sarvam.ai/translate",
        json=payload, headers=headers, timeout=30
    )
    response.raise_for_status()  # Do not cache error responses
    result = response.json()["translated_text"]
    translation_cache[cache_key] = result
    return result
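A plain dictionary grows without bound in a long-running service. One option is a small least-recently-used cache, sketched here with `collections.OrderedDict`. `TranslationCache` is a hypothetical class, and the `translate` callable it wraps is assumed to perform the actual API request.

```python
from collections import OrderedDict
from typing import Callable


class TranslationCache:
    """Bounded LRU cache in front of a translate(text, source, target) callable."""

    def __init__(self, translate: Callable[[str, str, str], str],
                 max_entries: int = 1000):
        self._translate = translate
        self._max_entries = max_entries
        self._entries: "OrderedDict[tuple, str]" = OrderedDict()

    def get(self, text: str, source_lang: str, target_lang: str) -> str:
        key = (text, source_lang, target_lang)
        if key in self._entries:
            self._entries.move_to_end(key)  # Mark as most recently used
            return self._entries[key]
        result = self._translate(text, source_lang, target_lang)
        self._entries[key] = result
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # Evict least recently used
        return result
```

For multi-process deployments, a shared store such as Redis would take the place of the in-memory dictionary.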

Production Best Practices 2026

Security 2026

  • Never hardcode API keys: Use environment variables or secret managers
  • Rotate keys regularly: Generate new keys periodically
  • Use server-side calls: Never expose API keys in frontend JavaScript
  • Set up alerts: Monitor for unusual credit consumption
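A related precaution when adding monitoring and alerts is to keep the key itself out of log lines. The helper below is a small sketch with a hypothetical name. It keeps only the last few characters, which is enough to tell which key was in use.

```python
def mask_key(key: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(key) <= visible:
        return "*" * len(key)  # Too short to reveal anything safely
    return "*" * (len(key) - visible) + key[-visible:]
```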

Architecture Patterns 2026

  • Build a proxy layer: Route all Sarvam API calls through your backend
  • Implement queuing: Use Redis/RabbitMQ for high-volume batch processing
  • Add monitoring: Track API latency, error rates, and credit usage
  • Cache aggressively: Cache translations, transliterations for repeated content
  • Handle failures gracefully: Fall back to English if translation fails
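The last point can be sketched as a thin wrapper that returns the original text when translation fails, along with a flag so the caller knows a fallback happened. `translate_or_fallback` is a hypothetical helper, and `translate` stands for any function that calls the translation endpoint and raises on failure.

```python
import logging
from typing import Callable, Tuple

logger = logging.getLogger("translation")


def translate_or_fallback(text: str,
                          target_lang: str,
                          translate: Callable[[str, str], str]
                          ) -> Tuple[str, bool]:
    """Return (translated_text, True), or (original_text, False) on failure."""
    try:
        return translate(text, target_lang), True
    except Exception:
        # Log with traceback so the failure is visible to monitoring
        logger.exception("Translation to %s failed; serving source text",
                         target_lang)
        return text, False
```

The boolean makes it possible to count fallbacks as a metric instead of silently serving untranslated content.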

End-to-End Voice Pipeline 2026

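def voice_to_voice_pipeline(audio_file, target_language):
    """
    Complete pipeline: Listen in any Indian language,
    understand, respond, and speak back.

    speech_to_text_translate, chat_completion, translate, and
    text_to_speech are not defined here. Each is assumed to be a thin
    wrapper around the corresponding request shown earlier in this guide.
    """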
    # Step 1: Transcribe + Translate to English
    transcript = speech_to_text_translate(audio_file)

    # Step 2: Process with chat completion
    response = chat_completion(
        system="You are a helpful assistant.",
        user=transcript["translated_text"]
    )

    # Step 3: Translate response to target language
    translated = translate(
        text=response, source="en-IN",
        target=target_language
    )

    # Step 4: Generate speech in target language
    audio = text_to_speech(
        text=translated, language=target_language
    )
    return audio
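One way to keep a pipeline like this testable is to pass each stage in as a callable, so the flow can be exercised with stubs before any API key is involved. The sketch below is an illustration under that assumption. `VoicePipeline` and its field names are hypothetical, and each callable would wrap one of the endpoints covered earlier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    speech_to_english: Callable[[bytes], str]   # Audio in, English text out
    chat: Callable[[str], str]                  # English query in, English reply out
    translate: Callable[[str, str], str]        # (text, target_language) in, text out
    synthesize: Callable[[str, str], bytes]     # (text, language) in, audio out

    def run(self, audio: bytes, target_language: str) -> bytes:
        english_query = self.speech_to_english(audio)
        english_reply = self.chat(english_query)
        if target_language == "en-IN":
            localized = english_reply  # Skip a pointless English-to-English call
        else:
            localized = self.translate(english_reply, target_language)
        return self.synthesize(localized, target_language)
```

Swapping a stub for a real wrapper is then a one-line change, and each stage can carry its own retry or caching logic.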

FAQs: Sarvam AI API 2026

How do I get started with Sarvam AI API in 2026?

Sign up at sarvam.ai to get ₹1,000 free API credits. Generate your API key from the dashboard, install the Python SDK with 'pip install sarvam-ai', and make your first API call. The quickstart guide at docs.sarvam.ai covers setup in under 5 minutes with code examples for all major APIs.

What programming languages does Sarvam AI SDK support in 2026?

Sarvam AI provides an official Python SDK and REST APIs that work with any programming language. The REST API can be called from JavaScript, Java, Go, Ruby, PHP, or any language that supports HTTP requests. The Python SDK offers the most streamlined integration with built-in helper functions.

What audio formats does Sarvam AI Speech-to-Text API support in 2026?

Sarvam AI's Saarika STT model supports WAV, MP3, AAC, AIFF, OGG, OPUS, FLAC, MP4/M4A, AMR, WMA, WebM, and PCM audio formats. For REST API, audio should be under 30 seconds. For longer audio, use the Batch API for asynchronous processing.

Can I use Sarvam AI API for real-time applications in 2026?

Yes. Sarvam AI APIs are designed for low-latency use, and the servers are based in India. The REST speech-to-text endpoint returns results in the same response for audio under 30 seconds, and the text-to-speech and translation endpoints are fast enough for interactive use. For streaming use cases, check the WebSocket API documentation.

Key Takeaways: Sarvam AI API 2026

  • Comprehensive API Suite 2026: Sarvam covers STT (Saarika), TTS (Bulbul), Translation (Mayura), Transliteration, and Chat Completion for 22+ Indian languages in one platform.
  • Developer-Friendly 2026: Python SDK, REST APIs, starter notebooks, and comprehensive documentation at docs.sarvam.ai make integration straightforward.
  • Production-Ready 2026: Low-latency India-based servers, multiple audio format support, batch processing for long files, and enterprise SLAs for production workloads.
  • Cost-Effective 2026: Starting at ₹30/hour for STT and ₹1,000 free credits, Sarvam is 2-3x cheaper than global alternatives with better Indian language accuracy.
  • Build Complete Pipelines 2026: Chain STT → Translation → Chat → TTS for end-to-end multilingual voice applications serving India's 1.4 billion population.

Need Help Integrating Indian Language AI?

Distk helps businesses integrate Sarvam AI and other Indian language AI platforms into their products and workflows. From architecture design to production deployment, let's build your multilingual AI stack.
