Pyannote
Speaker Diarization API
Pyannote is a state-of-the-art speaker diarization model that identifies who spoke when in an audio file. It can automatically detect the number of speakers or work with a specified count.
Endpoint
POST https://api-gpuse.maatrics.com/v1/pyannote/diarize
Parameters
| Name | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | URL of the audio file to process |
| webhook | string | No | URL to receive completion notification |
| num_speakers | integer | No | Exact number of speakers (if known) |
| min_speakers | integer | No | Minimum number of speakers |
| max_speakers | integer | No | Maximum number of speakers |
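As a rough illustration of how the parameters above combine into a request body, the following Python sketch assembles the JSON payload and leaves out optional fields that were not set. The helper name `build_diarize_payload` is hypothetical, and the checks it performs (positive counts, `min_speakers` not above `max_speakers`) are client-side assumptions, not documented API behaviour.

```python
import json
from typing import Optional


def build_diarize_payload(
    url: str,
    webhook: Optional[str] = None,
    num_speakers: Optional[int] = None,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> dict:
    """Assemble the JSON body, omitting optional fields that were not set."""
    if not url:
        raise ValueError("url is required")

    counts = {
        "num_speakers": num_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    for name, value in counts.items():
        # Client-side assumption: speaker counts must be positive integers.
        if value is not None and (not isinstance(value, int) or value < 1):
            raise ValueError(f"{name} must be a positive integer")

    # Client-side assumption: the two bounds must not contradict each other.
    if (
        min_speakers is not None
        and max_speakers is not None
        and min_speakers > max_speakers
    ):
        raise ValueError("min_speakers cannot exceed max_speakers")

    payload = {"url": url}
    if webhook is not None:
        payload["webhook"] = webhook
    payload.update({k: v for k, v in counts.items() if v is not None})
    return payload


if __name__ == "__main__":
    body = build_diarize_payload(
        "https://example.com/meeting.wav", min_speakers=1, max_speakers=5
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the POST request with any HTTP client.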
Request Example
```bash
curl -X POST "https://api-gpuse.maatrics.com/v1/pyannote/diarize" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/meeting.wav",
    "webhook": "https://your-server.com/webhook",
    "num_speakers": 2,
    "min_speakers": 1,
    "max_speakers": 5
  }'
```
Response
Initial response when job is created:
```json
{
  "job_id": "5d2aee8b-c35b-4fdc-af7d-3309b19b7420",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z",
  "estimated_duration": "2 minutes"
}
```
Completed Result
Response when job is completed (via webhook or polling):
```json
{
  "job_id": "5d2aee8b-c35b-4fdc-af7d-3309b19b7420",
  "status": "completed",
  "result": {
    "segments": [
      {
        "speaker": "SPEAKER_00",
        "start": 0.0,
        "end": 2.5,
        "text": null
      },
      {
        "speaker": "SPEAKER_01",
        "start": 2.7,
        "end": 5.2,
        "text": null
      },
      {
        "speaker": "SPEAKER_00",
        "start": 5.5,
        "end": 8.1,
        "text": null
      }
    ],
    "num_speakers": 2,
    "duration_seconds": 120.5
  },
  "cost": 0.012,
  "processing_time": 45.2
}
```
Pricing
$0.006 per minute of audio
Billed per second. Minimum charge: 1 second.
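To make the per-second billing concrete, here is a minimal Python sketch that estimates the charge from an audio duration. The function name `estimate_cost` is hypothetical, and the sketch assumes fractional seconds are billed pro rata; the exact rounding the service applies is not stated here, so treat the figures as approximate.

```python
PRICE_PER_MINUTE_USD = 0.006  # rate stated in the pricing section
MIN_BILLED_SECONDS = 1        # minimum charge stated in the pricing section


def estimate_cost(duration_seconds: float) -> float:
    """Estimate the charge in USD for audio of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    # Assumption: partial seconds are billed pro rata, with a 1-second floor.
    billed_seconds = max(duration_seconds, MIN_BILLED_SECONDS)
    return billed_seconds * PRICE_PER_MINUTE_USD / 60


if __name__ == "__main__":
    # 120.5 seconds matches duration_seconds in the completed result example.
    print(round(estimate_cost(120.5), 3))
```

Under these assumptions, 120.5 seconds of audio comes to roughly $0.012, which agrees with the `cost` field in the completed result example.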