WhisperX

Fast Transcription with Word Timestamps

WhisperX provides lightning-fast transcription with accurate word-level timestamps. Built on OpenAI's Whisper model with additional optimizations for speed and accuracy. Supports 99+ languages.

Endpoint

POST https://api-gpuse.maatrics.com/v1/whisperx/transcribe

Parameters

NameTypeRequiredDescription
urlstringYesURL of the audio/video file
languagestringNoLanguage code (auto-detect if not specified)
alignbooleanNoEnable word-level timestamps (default: true)
diarizebooleanNoAdd speaker labels (default: false)
webhookstringNoURL for completion notification

Supported Languages

enesfrdeitptnlruzhjakoartrpluk+84 more

Request Example

bash
curl -X POST "https://api-gpuse.maatrics.com/v1/whisperx/transcribe" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/podcast.mp3",
    "language": "en",
    "align": true,
    "diarize": false,
    "webhook": "https://your-server.com/webhook"
  }'

Response

json
{
  "job_id": "abc123-def456-ghi789",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z"
}

Completed Result

json
{
  "job_id": "abc123-def456-ghi789",
  "status": "completed",
  "result": {
    "text": "Hello, welcome to our podcast...",
    "segments": [
      {
        "start": 0.0,
        "end": 2.5,
        "text": "Hello, welcome to our podcast.",
        "words": [
          {"word": "Hello,", "start": 0.0, "end": 0.4},
          {"word": "welcome", "start": 0.5, "end": 0.9},
          {"word": "to", "start": 1.0, "end": 1.1},
          {"word": "our", "start": 1.2, "end": 1.4},
          {"word": "podcast.", "start": 1.5, "end": 2.5}
        ]
      }
    ],
    "language": "en",
    "duration_seconds": 3600.0
  },
  "cost": 2.40,
  "processing_time": 180.5
}

Pricing

$0.004per minute of audio

Billed per second. Minimum charge: 1 second.