Transcription

Use the transcription API to upload an audio file and poll for the completed transcript.

Submit a job

    POST /v1/transcription/jobs

    curl -X POST "https://api.voice.formantai.com/v1/transcription/jobs" \
      -H "Authorization: Bearer $FORMANT_API_KEY" \
      -F "audio=@call.wav" \
      -F "languages=hi-IN,en-IN" \
      -F "diarize=true" \
      -F "format=standard"

| Form field | Type | Required | Description |
| --- | --- | --- | --- |
| audio | file | Yes | Audio file to transcribe. |
| languages | string | No | Comma-separated language codes, or unknown for auto-detection. |
| diarize | boolean | No | Include speaker segments when supported. |
| format | string | No | standard or code_mixed. |

    {
      "id": "job_abc123",
      "engine": "sarvam",
      "status": "pending"
    }
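
The same request can be issued from Python. The sketch below uses only the standard library and builds the multipart body by hand. `encode_multipart` and `submit_job` are hypothetical helper names, not part of any published SDK; the endpoint, form field names, and bearer-token header are copied from the curl example above. Optional fields are sent only when provided, and error handling is omitted.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.voice.formantai.com/v1/transcription/jobs"


def encode_multipart(fields, filename, file_bytes, boundary=None):
    """Encode text fields plus one file part named "audio" as multipart/form-data.

    Returns a (body_bytes, content_type_header) tuple.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        text_part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        parts.append(text_part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit_job(path, api_key, languages=None, diarize=None, fmt=None):
    """POST the audio file at `path` and return the parsed JSON response."""
    fields = {}
    if languages is not None:
        fields["languages"] = languages
    if diarize is not None:
        fields["diarize"] = "true" if diarize else "false"
    if fmt is not None:
        fields["format"] = fmt
    with open(path, "rb") as fh:
        body, content_type = encode_multipart(
            fields, os.path.basename(path), fh.read()
        )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `submit_job("call.wav", os.environ["FORMANT_API_KEY"], languages="hi-IN,en-IN", diarize=True, fmt="standard")` mirrors the curl request; the `id` in the returned dictionary is what the polling endpoint expects.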

Poll a job

    GET /v1/transcription/jobs/{job_id}

    curl "https://api.voice.formantai.com/v1/transcription/jobs/job_abc123" \
      -H "Authorization: Bearer $FORMANT_API_KEY"

    {
      "id": "job_abc123",
      "engine": "sarvam",
      "status": "completed",
      "text": "नमस्ते, मैं राहुल बोल रहा हूं.",
      "language": "hi-IN",
      "utterances": [
        {
          "speaker": "speaker_0",
          "text": "नमस्ते, मैं राहुल बोल रहा हूं.",
          "start": 0.2,
          "end": 2.8
        }
      ]
    }
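
Polling usually means repeating the GET request until the status changes. The following is a minimal sketch of such a loop, not a documented client. Only the `pending` and `completed` statuses appear in the examples above, so the failure status names are placeholders, and the 2-second interval and 300-second timeout are arbitrary defaults. The helper names `fetch_job`, `wait_for_job`, and `format_utterances` are hypothetical.

```python
import json
import time
import urllib.request

JOB_URL = "https://api.voice.formantai.com/v1/transcription/jobs/{job_id}"

# Assumption: the documentation shows only "pending" and "completed".
# These failure names are placeholders; adjust them to the real API.
ASSUMED_FAILURE_STATUSES = {"failed", "error"}


def fetch_job(job_id, api_key):
    """GET one job and return the parsed JSON body."""
    request = urllib.request.Request(
        JOB_URL.format(job_id=job_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(job_id, fetch, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch(job_id) until the job completes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status in ASSUMED_FAILURE_STATUSES:
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(interval)


def format_utterances(job):
    """Render each utterance as a "[start-end] speaker: text" line."""
    return "\n".join(
        f"[{u['start']:.1f}-{u['end']:.1f}] {u['speaker']}: {u['text']}"
        for u in job.get("utterances", [])
    )
```

Passing the fetch function as an argument keeps the loop testable without network access; in real use it could be `lambda job_id: fetch_job(job_id, api_key)`.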

Supported language examples

The languages field accepts codes for English and several Indic languages, including:

hi-IN, bn-IN, kn-IN, ml-IN, mr-IN, pa-IN, ta-IN, te-IN, en-IN, gu-IN, as-IN, and ur-IN. Pass unknown to request auto-detection.
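
Since the `languages` form field is a single comma-separated string, a client may want to assemble it from a list. The helper below is an illustrative sketch with a hypothetical name, `languages_field`. It assumes every code follows the `xx-IN` shape of the examples listed here; that shape check is a client-side convenience, not a rule stated by the API, which may accept other codes.

```python
import re

# Assumption: codes look like the listed examples ("hi-IN", "en-IN", ...).
_CODE_SHAPE = re.compile(r"^[a-z]{2,3}-[A-Z]{2}$")


def languages_field(codes):
    """Join language codes into the comma-separated `languages` form value."""
    # Trim whitespace, drop empties, and de-duplicate while keeping order.
    cleaned = list(dict.fromkeys(c.strip() for c in codes if c.strip()))
    if cleaned == ["unknown"]:
        return "unknown"  # auto-detection
    bad = [c for c in cleaned if not _CODE_SHAPE.match(c)]
    if not cleaned or bad:
        raise ValueError(f"unexpected language codes: {bad or codes!r}")
    return ",".join(cleaned)
```

For example, `languages_field(["hi-IN", "en-IN"])` produces the `hi-IN,en-IN` value used in the curl request, while mixing `unknown` with explicit codes is rejected by this sketch.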

Format modes

| Format | Use when |
| --- | --- |
| standard | You want normal transcription output. |
| code_mixed | Audio may switch between English and Indian languages. |