Now with YOLO v11 support

State-of-the-art AI Models. One API.

Access Pyannote, WhisperX, YOLO, and MediaPipe through simple REST endpoints. No GPU management. No infrastructure headaches. Just code.

Speaker Diarization

Identify who spoke when in audio

Request
cURL
curl -X POST "https://api-gpuse.maatrics.com/v1/pyannote/diarize" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/audio.wav"}'
Response
JSON
{
  "job_id": "abc123",
  "status": "processing",
  "webhook": "https://your-server.com/webhook"
}
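
The same call can be made from Python. The sketch below mirrors the cURL example using only the standard library. The helper names `build_diarize_request` and `submit` are illustrative and not part of any official SDK, and the response fields are assumed to match the sample JSON above.

```python
import json
import urllib.request

API_BASE = "https://api-gpuse.maatrics.com"  # host taken from the cURL example


def build_diarize_request(api_key: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    body = json.dumps({"url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/pyannote/diarize",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def submit(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs a real key and network)."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling `submit(build_diarize_request("YOUR_API_KEY", "https://example.com/audio.wav"))` should return a dictionary like the sample response, with `job_id` and `status` keys, assuming the API behaves as shown above.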

Powerful AI Models

Choose from our collection of production-ready models, each optimized for speed and accuracy.

Pyannote

Speaker diarization: identify who spoke when in your audio files with state-of-the-art accuracy.

POST /v1/pyannote/diarize
Starting at $0.006/min

WhisperX

Lightning-fast transcription with word-level timestamps. Supports 99+ languages.

POST /v1/whisperx/transcribe
Starting at $0.004/min

YOLO v11

Real-time object detection in images and videos. Detect 80+ object classes instantly.

POST /v1/yolo-v11/detect
Starting at $0.002/image

MediaPipe

Face mesh and pose estimation. Get 468 facial landmarks and 33 body keypoints.

POST /v1/mediapipe/face-mesh
Starting at $0.001/image
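
For a rough budget, the listed rates can be turned into a quick estimate. This is a minimal sketch that assumes the "starting at" price applies to your usage and that audio is prorated linearly by the second; actual billing may differ. The function names are illustrative.

```python
from decimal import Decimal

# "Starting at" rates copied from the model cards above (USD).
RATE_PER_MINUTE = {
    "pyannote": Decimal("0.006"),
    "whisperx": Decimal("0.004"),
}
RATE_PER_IMAGE = {
    "yolo-v11": Decimal("0.002"),
    "mediapipe": Decimal("0.001"),
}


def audio_cost(model: str, seconds: int) -> Decimal:
    """Estimated cost of processing `seconds` of audio (assumes per-second proration)."""
    return RATE_PER_MINUTE[model] * Decimal(seconds) / Decimal(60)


def image_cost(model: str, images: int) -> Decimal:
    """Estimated cost of processing `images` images."""
    return RATE_PER_IMAGE[model] * images
```

Under these assumptions, a 90-minute recording (5,400 seconds) comes to about $0.54 for diarization plus $0.36 for transcription, and 1,000 images through YOLO v11 come to about $2.00.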

Why GPUSE?

Everything you need to integrate AI into your applications.

GPU Powered

All models run on dedicated NVIDIA GPUs for maximum performance.

Webhook Support

Get notified instantly when your job completes via webhooks.
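
On the receiving side, a webhook endpoint can be very small. The sketch below uses Python's built-in `http.server`. The payload schema is not documented on this page, so the code assumes the webhook body carries the same `job_id` and `status` fields as the sample API response; `parse_job_event`, `WebhookHandler`, and `serve` are illustrative names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_job_event(raw_body: bytes) -> tuple[str, str]:
    """Return (job_id, status) from a webhook body.

    ASSUMPTION: the payload has "job_id" and "status" fields, like the
    sample API response; the real schema may differ.
    """
    event = json.loads(raw_body)
    return event["job_id"], event["status"]


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            job_id, status = parse_job_event(self.rfile.read(length))
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or missing fields: reject the delivery.
            self.send_response(400)
            self.end_headers()
            return
        print(f"job {job_id} is now {status}")
        self.send_response(204)  # acknowledge with no response body
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the handler until interrupted (call this from your own entry point)."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

In production you would also want to verify that each delivery really comes from the API, using whatever signing or secret mechanism the service provides.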

Enterprise Ready

SOC 2 compliant infrastructure with 99.9% uptime SLA.

Pay As You Go

No subscriptions. Only pay for what you use, down to the second.

Ready to get started?

Start processing your first file in under 5 minutes. No credit card required.