Video & Audio Processing

AI-powered transcription using Whisper with caption cleanup and export to VTT/SRT formats.

Capabilities

Uses faster-whisper for accurate, timestamped speech-to-text.
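As a rough illustration of the transcription step, the sketch below calls faster-whisper and collects timestamped cues. The `Cue` type, the `collect_cues` and `transcribe` helpers, the `"small"` model size, and the CPU/int8 settings are assumptions made for this example, not details taken from this project.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Cue:
    """One caption cue: start/end in seconds plus the spoken text (hypothetical type)."""
    start: float
    end: float
    text: str


def collect_cues(segments: Iterable) -> List[Cue]:
    """Copy start, end, and text from segment-like objects, trimming whitespace."""
    return [Cue(s.start, s.end, s.text.strip()) for s in segments]


def transcribe(path: str, model_size: str = "small") -> List[Cue]:
    """Transcribe a media file with faster-whisper (third-party, imported lazily)."""
    from faster_whisper import WhisperModel

    # Model size, device, and compute type are illustrative defaults.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus run metadata.
    segments, _info = model.transcribe(path)
    return collect_cues(segments)
```

Calling `transcribe("talk.mp4")` would return a list of cues that can then be cleaned up and exported.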

AI-based cleanup corrects technical terminology, punctuation, and formatting in the captions.

Exports captions to the standard VTT and SRT formats supported by most video players.
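To make the two export formats concrete, here is a minimal sketch that renders a list of cues as SRT and as WebVTT. The `(start, end, text)` tuple layout and the function names are assumptions for illustration; the project's actual exporter may differ.

```python
from typing import List, Tuple

# (start seconds, end seconds, text); a hypothetical cue layout for this sketch.
Cue = Tuple[float, float, str]


def _timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (SRT uses ',', WebVTT uses '.')."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """SRT: numbered cues, comma before milliseconds, blank line between cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{_timestamp(start, ',')} --> {_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues: List[Cue]) -> str:
    """WebVTT: 'WEBVTT' header, dot before milliseconds, cue numbers optional."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        timing = f"{_timestamp(start, '.')} --> {_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)
```

The two formats differ mainly in the header line, the cue numbering, and the millisecond separator, which is why one timestamp helper can serve both.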

AI-generated descriptions for visual content (coming soon).

Supported formats: MP4, MOV, AVI, MKV, WebM, MP3, WAV, M4A, OGG
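A client could check a file against the format list above before uploading. The sketch below assumes that each listed format corresponds to the file extension of the same name; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, and the service itself may validate files differently.

```python
from pathlib import Path

# Extensions derived from the format list above (assumed one-to-one mapping).
SUPPORTED_EXTENSIONS = {
    ".mp4", ".mov", ".avi", ".mkv", ".webm",  # video
    ".mp3", ".wav", ".m4a", ".ogg",           # audio
}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension (case-insensitive) is in the list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```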

curl -X POST http://localhost:8000/api/v1/transcribe \
  -F "[email protected]" \
  -F "output_format=vtt"
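
The same request can be sketched in Python using only the standard library. The endpoint URL and the `output_format` field come from the command above; the upload field name `file`, the helper names, and returning the raw response bytes are assumptions, since the response shape is not described here.

```python
import os
import urllib.request
import uuid
from typing import Dict, Tuple

API_URL = "http://localhost:8000/api/v1/transcribe"


def build_multipart(fields: Dict[str, str], file_field: str,
                    filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + data + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe_file(path: str, output_format: str = "vtt",
                    url: str = API_URL) -> bytes:
    """POST a media file to the transcription endpoint; return the raw body."""
    with open(path, "rb") as fh:
        data = fh.read()
    # "file" is an assumed field name; adjust it to match the actual API.
    body, content_type = build_multipart(
        {"output_format": output_format}, "file", os.path.basename(path), data
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```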