
transcribe_audio

Transcribe an audio file and return the full text with timestamps. Results are cached automatically.


Model Sizes

Model    Speed     Accuracy
tiny     Fastest   Excellent (default)
base     Fast      Excellent
small    Medium    Superior
medium   Slow      Outstanding
large    Slowest   Maximum

Use tiny for nearly everything. Upgrade to a larger model only for heavy accents, poor audio quality, or lyrics.


Example

Request:

{ "audio_path": "/Users/you/Downloads/podcast.webm", "model_size": "tiny" }

Response:

{
  "text": "Full transcription text...",
  "language": "en",
  "duration": 1076.12,
  "duration_formatted": "17:56",
  "segment_count": 430,
  "cached": false,
  "model_used": "tiny"
}
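
For readers calling the tool from a script, the request and response shapes above can be handled roughly as follows. This is a minimal sketch, not the tool's own client code: `build_request` and `format_duration` are hypothetical helpers, and the M:SS formatting rule (fractional seconds truncated) is an assumption inferred from the sample response, where 1076.12 seconds appears as "17:56".

```python
import json

# Model sizes as listed under "Model Sizes".
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_request(audio_path: str, model_size: str = "tiny") -> str:
    """Hypothetical helper: build the JSON arguments for transcribe_audio."""
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model_size: {model_size!r}")
    return json.dumps({"audio_path": audio_path, "model_size": model_size})


def format_duration(seconds: float) -> str:
    """Format seconds as M:SS, assuming fractional seconds are truncated."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


# The sample response from the example above, parsed as JSON.
response = json.loads(
    '{"text": "Full transcription text...", "language": "en", '
    '"duration": 1076.12, "duration_formatted": "17:56", '
    '"segment_count": 430, "cached": false, "model_used": "tiny"}'
)

print(build_request("/Users/you/Downloads/podcast.webm"))
print(format_duration(response["duration"]), response["cached"])
```

Validating `model_size` before sending the request catches typos locally instead of relying on however the tool reports an unknown size.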

Caching

  • Transcriptions are cached by file content hash + model size
  • Same file, same model = instant cache hit
  • Same file, different model = new transcription
  • Modified file = new transcription (hash changes)
  • A markdown file is also saved to ~/.augent/cache/transcriptions/
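
The caching rules above can be illustrated with a small sketch of a content-addressed cache key. The source only says the key combines a file content hash and the model size; the choice of SHA-256, the `cache_key` function name, and the key layout are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(audio_path: str, model_size: str) -> str:
    """Hypothetical cache key: SHA-256 of the file bytes plus the model size."""
    digest = hashlib.sha256()
    with open(audio_path, "rb") as f:
        # Read in 1 MiB chunks so large audio files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return f"{digest.hexdigest()}_{model_size}"


with tempfile.TemporaryDirectory() as tmp:
    audio = Path(tmp) / "clip.webm"
    audio.write_bytes(b"fake audio bytes")

    key_tiny = cache_key(str(audio), "tiny")
    same_again = cache_key(str(audio), "tiny")   # same file, same model
    key_small = cache_key(str(audio), "small")   # same file, different model

    audio.write_bytes(b"fake audio bytes, edited")
    key_modified = cache_key(str(audio), "tiny")  # modified file

print(key_tiny == same_again)    # cache hit
print(key_tiny == key_small)     # new transcription
print(key_tiny == key_modified)  # new transcription
```

Because the key depends on the file's bytes rather than its path, renaming or moving a file would still produce a cache hit under this scheme, while any edit to the audio would not.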