Saves an MP3 file to your Desktop (or to a custom directory set with output_dir). No account required.

Example

Request:
{
  "text": "Augent lets Claude transcribe, search, and analyze audio files locally.",
  "voice": "af_heart"
}
Response:
{
  "file_path": "/Users/you/Desktop/tts_20260208_143256.mp3",
  "voice": "af_heart",
  "language": "American English",
  "duration": 5.82,
  "duration_formatted": "0:05",
  "sample_rate": 24000,
  "text_length": 89
}
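
The response reports the clip length twice: duration in seconds and duration_formatted as minutes and seconds. In the example, 5.82 seconds is shown as 0:05, which suggests the fractional part is dropped rather than rounded. A minimal Python sketch of that conversion, assuming truncation; the helper name format_duration is hypothetical and not part of the tool:

```python
def format_duration(seconds: float) -> str:
    """Format a duration in seconds as M:SS, dropping the fractional part."""
    # Assumption: the tool truncates instead of rounding, so 5.82 becomes 5.
    whole_seconds = int(seconds)
    minutes, secs = divmod(whole_seconds, 60)
    return f"{minutes}:{secs:02d}"


print(format_duration(5.82))  # 0:05
print(format_duration(75.4))  # 1:15
```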

Parameters

| Parameter | Required | Default | Description |
|---|---|---|---|
| text | No | | Text to convert to speech. Either text or file_path is required. |
| file_path | No | | Path to a notes file to read aloud. Strips markdown formatting, skips metadata, generates MP3, and embeds an audio player in the file. |
| job_id | No | | Check status of a running TTS job. Pass the job_id returned from a previous call. |
| voice | No | af_heart | Voice ID (see voices below) |
| output_dir | No | ~/Desktop | Directory to save the MP3 file |
| output_filename | No | auto-generated | Custom filename for the output |
| speed | No | 1.0 | Speech speed multiplier |
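
Because every parameter is individually optional but one of text or file_path must be present, it can help to validate a request before sending it. The sketch below assembles a request payload from the parameters above. The function name build_tts_request is hypothetical, and the check that speed is positive is an assumption, since no valid range is documented.

```python
from typing import Any, Dict, Optional


def build_tts_request(
    text: Optional[str] = None,
    file_path: Optional[str] = None,
    voice: str = "af_heart",
    output_dir: Optional[str] = None,
    output_filename: Optional[str] = None,
    speed: float = 1.0,
) -> Dict[str, Any]:
    """Assemble a request payload, requiring text or file_path."""
    if not text and not file_path:
        raise ValueError("Either text or file_path is required.")
    if speed <= 0:
        # Assumption: a speed multiplier must be positive.
        raise ValueError("speed must be a positive multiplier.")

    request: Dict[str, Any] = {"voice": voice, "speed": speed}
    # Only include optional fields the caller actually set, so the
    # tool's own defaults apply to everything else.
    optional_fields = {
        "text": text,
        "file_path": file_path,
        "output_dir": output_dir,
        "output_filename": output_filename,
    }
    for key, value in optional_fields.items():
        if value:
            request[key] = value
    return request


print(build_tts_request(text="Hello from Augent.", speed=1.25))
```

Omitting unset fields keeps the payload close to the example request, which passes only text and voice.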

Voices

American English

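| Voice | Gender |
|---|---|
| af_heart | Female (default) |
| af_alloy | Female |
| af_aoede | Female |
| af_bella | Female |
| af_jessica | Female |
| af_kore | Female |
| af_nicole | Female |
| af_nova | Female |
| af_river | Female |
| af_sarah | Female |
| af_sky | Female |
| am_adam | Male |
| am_echo | Male |
| am_eric | Male |
| am_fenrir | Male |
| am_liam | Male |
| am_michael | Male |
| am_onyx | Male |
| am_puck | Male |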

British English

| Voice | Gender |
|---|---|
| bf_emma | Female |
| bf_isabella | Female |
| bf_lily | Female |
| bm_daniel | Male |
| bm_fable | Male |
| bm_george | Male |
| bm_lewis | Male |

Other Languages

| Language | Voices |
|---|---|
| Spanish | ef_dora, em_alex |
| French | ff_siwis |
| Hindi | hf_alpha, hf_beta, hm_omega, hm_psi |
| Italian | if_sara, im_nicola |
| Japanese | jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro, jm_kumo |
| Brazilian Portuguese | pf_dora, pm_alex |
| Mandarin Chinese | zf_xiaobei, zf_xiaoni, zf_xiaoxiao, zf_xiaoyi, zm_yunjian, zm_yunxi, zm_yunxia, zm_yunyang |

Notes

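Voice ID format: the first letter is the language (a American English, b British English, e Spanish, f French, h Hindi, i Italian, j Japanese, p Brazilian Portuguese, z Mandarin Chinese), the second letter is the gender (f female, m male), and the part after the underscore is the name.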
The Kokoro model (~350MB) downloads automatically on first use and is cached locally. After that, it works offline.
Generate notes from a video with take_notes, then read the summary back as audio. One prompt, two tools.
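
The voice ID format described in these notes can be decoded mechanically. The following Python sketch splits an ID into language, gender, and name. The language codes are taken from the voice tables on this page; the names parse_voice_id and VoiceInfo are hypothetical and not part of the tool.

```python
from typing import NamedTuple

# Language codes as they appear in the voice tables on this page.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Brazilian Portuguese",
    "z": "Mandarin Chinese",
}
GENDERS = {"f": "Female", "m": "Male"}


class VoiceInfo(NamedTuple):
    language: str
    gender: str
    name: str


def parse_voice_id(voice_id: str) -> VoiceInfo:
    """Split an ID such as 'af_heart' into language, gender, and name."""
    prefix, separator, name = voice_id.partition("_")
    if len(prefix) != 2 or not separator or not name:
        raise ValueError(f"Unexpected voice ID format: {voice_id!r}")
    language_code, gender_code = prefix
    if language_code not in LANGUAGES or gender_code not in GENDERS:
        raise ValueError(f"Unknown language or gender code in {voice_id!r}")
    return VoiceInfo(LANGUAGES[language_code], GENDERS[gender_code], name)


print(parse_voice_id("af_heart"))
print(parse_voice_id("jm_kumo"))
```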