transcribe_audio
Transcribe an audio file and return the full text with timestamps. Results are cached automatically.
Model Sizes
| Model | Speed | Accuracy |
|---|---|---|
| tiny (default) | Fastest | Excellent |
| base | Fast | Excellent |
| small | Medium | Superior |
| medium | Slow | Outstanding |
| large | Slowest | Maximum |
Use `tiny` for nearly everything. Only upgrade to a larger model for heavy accents, poor audio quality, or lyrics.
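As a minimal sketch of this guidance, the helper below builds a request that defaults to `tiny` and rejects any size not listed in the table. `build_request` and `MODEL_SIZES` are hypothetical names used for illustration, not part of the tool; the `audio_path` and `model_size` keys follow the example request in this section.

```python
import json

# Model sizes listed in the table above, fastest first.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_request(audio_path: str, model_size: str = "tiny") -> dict:
    """Build a transcribe_audio request payload (hypothetical helper)."""
    if model_size not in MODEL_SIZES:
        raise ValueError(
            f"unknown model_size {model_size!r}; expected one of {MODEL_SIZES}"
        )
    return {"audio_path": audio_path, "model_size": model_size}


if __name__ == "__main__":
    # Default request: tiny is used unless a larger model is requested explicitly.
    print(json.dumps(build_request("/Users/you/Downloads/podcast.webm"), indent=2))
```

Keeping the default in one place means callers only name a model when the audio actually calls for it.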
Example
Request:
{
  "audio_path": "/Users/you/Downloads/podcast.webm",
  "model_size": "tiny"
}
Response:
{
  "text": "Full transcription text...",
  "language": "en",
  "duration": 1076.12,
  "duration_formatted": "17:56",
  "segment_count": 430,
  "cached": false,
  "model_used": "tiny"
}
Caching
- Transcriptions are cached by file content hash + model size
- Same file, same model = instant cache hit
- Same file, different model = new transcription
- Modified file = new transcription (hash changes)
- A Markdown file is also saved to `~/.augent/cache/transcriptions/`
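The rules above amount to keying the cache on the file's content hash plus the model size. The sketch below illustrates that idea under stated assumptions: SHA-256 and the `<hash>:<model>` key layout are illustrative choices that the source does not confirm, and `cache_key` is a hypothetical name.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(audio_path: str, model_size: str) -> str:
    """Derive a cache key from file content and model size.

    Assumption: SHA-256 and the "<hash>:<model>" layout are illustrative;
    the tool's actual hash function and key format may differ.
    """
    digest = hashlib.sha256()
    with open(audio_path, "rb") as f:
        # Read in chunks so large audio files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return f"{digest.hexdigest()}:{model_size}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        audio = Path(tmp) / "clip.webm"
        audio.write_bytes(b"fake audio bytes")
        key = cache_key(str(audio), "tiny")
        assert key == cache_key(str(audio), "tiny")  # same file, same model: hit
        assert key != cache_key(str(audio), "base")  # different model: miss
        audio.write_bytes(b"edited audio bytes")
        assert key != cache_key(str(audio), "tiny")  # modified file: miss
```

Because the key depends on content rather than path, renaming or moving a file would still hit the cache in this sketch, while any edit to the bytes would not.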