Video & Audio Processing
AI-powered transcription using Whisper with caption cleanup and export to VTT/SRT formats.
Capabilities
Transcription
Speech-to-text with timestamps, powered by faster-whisper.
Caption Cleanup
AI fixes technical terminology, punctuation, and formatting.
VTT/SRT Export
Export to standard caption formats for any video player.
Audio Description
AI-generated descriptions for visual content (coming soon).
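As a rough illustration of the VTT/SRT export step, the sketch below turns timestamped segments into both caption formats. It is not the product's actual exporter. The function names (`format_timestamp`, `to_vtt`, `to_srt`) are hypothetical, and segments are assumed to be plain `(start, end, text)` tuples with times in seconds. faster-whisper segments expose `start`, `end`, and `text`, so they could be mapped to this shape.

```python
def format_timestamp(seconds, separator="."):
    """Render seconds as HH:MM:SS<sep>mmm. VTT uses '.', SRT uses ','."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_vtt(segments):
    """Build a WebVTT document from (start, end, text) tuples."""
    lines = ["WEBVTT", ""]  # WebVTT files begin with this header line
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # blank line ends the cue
    return "\n".join(lines)


def to_srt(segments):
    """Build an SRT document from (start, end, text) tuples."""
    lines = []
    for index, (start, end, text) in enumerate(segments, start=1):
        lines.append(str(index))  # SRT cues are numbered from 1
        lines.append(
            f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        )
        lines.append(text.strip())
        lines.append("")
    return "\n".join(lines)
```

The only differences between the two outputs here are the `WEBVTT` header, the cue numbering in SRT, and the millisecond separator.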
Supported Formats
MP4, MOV, AVI, MKV, WebM, MP3, WAV, M4A, OGG
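A client could check a file against this list before uploading. The sketch below assumes formats are identified by file extension, which may not match how the service actually detects them. `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names.

```python
from pathlib import Path

# Extensions derived from the format list above (assumed mapping).
SUPPORTED_EXTENSIONS = {
    ".mp4", ".mov", ".avi", ".mkv", ".webm",  # video containers
    ".mp3", ".wav", ".m4a", ".ogg",           # audio formats
}


def is_supported(filename):
    """Return True if the file's extension is in the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```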
API Usage
curl -X POST http://localhost:8000/api/v1/transcribe \
-F "[email protected]" \
-F "output_format=vtt"
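The same request can be sketched in Python using only the standard library. The endpoint and the `output_format` field are taken from the curl example. The upload field name (`file`), the helper names, and the raw-text return value are assumptions, since the response schema is not documented here.

```python
import uuid
import urllib.request
from pathlib import Path

# Endpoint taken from the curl example above.
API_URL = "http://localhost:8000/api/v1/transcribe"


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, output_format="vtt", file_field="file"):
    """POST a media file and return the raw response body as text.

    file_field="file" is an assumption; use the name your server expects.
    """
    media = Path(path)
    body, content_type = build_multipart(
        {"output_format": output_format},
        file_field,
        media.name,
        media.read_bytes(),
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe("talk.mp4", output_format="srt")` would request SRT output instead, assuming the server accepts `srt` as a value for that field.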