Augent remembers everything it transcribes. The first run processes the audio. Every later operation on the same file and model is served from the cache instead of being recomputed.

Cache key

Every transcription is keyed by file hash + model size:
SHA256(file_content):model_size
  • Same file, same model = instant cache hit
  • Same file, different model = new transcription
  • Modified file = different hash = new transcription
The file content is read in 8 KB chunks and hashed with SHA-256. Because the key depends only on the content, renaming or moving a file doesn’t invalidate the cache; only changing the content does.
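
The keying scheme above can be sketched in a few lines of Python. This is a minimal illustration, not Augent's actual code: the names `file_hash` and `cache_key` are hypothetical, and the sketch assumes the hash is rendered as a hex digest.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 8 * 1024  # 8 KB, the chunk size described above


def file_hash(path: Path) -> str:
    """Hash a file's content with SHA-256, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(path: Path, model_size: str) -> str:
    """Build a key of the form SHA256(file_content):model_size (hypothetical helper)."""
    return f"{file_hash(path)}:{model_size}"
```

Because only the bytes of the file feed the hash, the path never enters the key. Reading in chunks keeps memory use flat even for long recordings.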

Storage location

Everything lives in ~/.augent/memory/:
| File | Purpose |
| --- | --- |
| transcriptions.db | SQLite database with all cached data |
| transcriptions/*.md | One markdown file per transcription (human-readable, Obsidian-compatible) |
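
To see what is actually on disk, a small read-only inspection script can help. The sketch below makes no assumption about the database schema: it only lists table names from SQLite's built-in `sqlite_master` catalog and the markdown files next to the database. The function name `inspect_memory` is hypothetical.

```python
import sqlite3
from pathlib import Path


def inspect_memory(memory_dir: Path = Path.home() / ".augent" / "memory") -> dict:
    """List the tables in transcriptions.db and the markdown files beside it."""
    db_path = memory_dir / "transcriptions.db"
    tables = []
    if db_path.exists():
        # mode=ro opens the database read-only, so inspection cannot modify the cache.
        conn = sqlite3.connect(db_path.resolve().as_uri() + "?mode=ro", uri=True)
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            ).fetchall()
            tables = [name for (name,) in rows]
        finally:
            conn.close()

    md_dir = memory_dir / "transcriptions"
    md_files = sorted(p.name for p in md_dir.glob("*.md")) if md_dir.is_dir() else []
    return {"tables": tables, "markdown_files": md_files}
```

Calling `inspect_memory()` with no arguments looks in the default location; pass another directory to inspect a copy.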

What gets cached

| Data | Cache key | Storage |
| --- | --- | --- |
| Transcriptions | file_hash:model_size | SQLite + .md file |
| Embeddings | file_hash:embedding_model | SQLite (numpy BLOB) |
| Diarization | file_hash:num_speakers | SQLite |
| Source URLs | file_hash | SQLite |
| Audio separation | file_hash:model:stem_mode | Filesystem (~/.augent/separated/) |
Each type of data is cached independently. You can diarize with different speaker counts without re-transcribing. You can run semantic search without re-computing embeddings on subsequent queries.
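
The independence of the cache entries can be illustrated with a toy in-memory cache. This is a sketch under stated assumptions, not Augent's storage layer: the class `IndependentCache`, its method names, and the placeholder hash `abc123` are all hypothetical, and a plain dict stands in for SQLite.

```python
from typing import Any, Callable


class IndependentCache:
    """Toy cache where each data type keeps its own keys (hypothetical)."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, str], Any] = {}
        self.computations = 0  # counts how often real work was done

    def get_or_compute(self, kind: str, key: str, compute: Callable[[], Any]) -> Any:
        entry = (kind, key)
        if entry not in self._store:
            # Cache miss: do the work once and remember the result.
            self._store[entry] = compute()
            self.computations += 1
        return self._store[entry]


cache = IndependentCache()
file_hash = "abc123"  # placeholder standing in for a real SHA-256 digest

# One transcription, then two diarization runs with different speaker counts.
cache.get_or_compute("transcription", f"{file_hash}:base", lambda: "transcript text")
cache.get_or_compute("diarization", f"{file_hash}:2", lambda: ["speaker A", "speaker B"])
cache.get_or_compute("diarization", f"{file_hash}:3", lambda: ["A", "B", "C"])

# Asking for the transcription again is a hit; nothing is recomputed.
cache.get_or_compute("transcription", f"{file_hash}:base", lambda: "transcript text")
```

After these four calls `cache.computations` is 3: changing the speaker count added a diarization entry but never touched the transcription entry.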

Model caching

Whisper models stay loaded in memory between tool calls. The MCP server is a long-lived process — once a model is loaded for the first transcription, subsequent transcriptions with the same model size are faster because there’s no model loading overhead. The sentence-transformer model (all-MiniLM-L6-v2, ~80MB) is also loaded once and kept in memory.
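
The load-once behavior in a long-lived process can be sketched with `functools.lru_cache`. This is an illustration only: `get_model` is a hypothetical name, the dict it returns is a stand-in for a real model object, and how Augent keeps its models in memory is not specified here.

```python
from functools import lru_cache

load_count = 0  # tracks how many times the expensive load actually ran


@lru_cache(maxsize=None)
def get_model(model_size: str) -> dict:
    """Return the model for a given size, loading it only on first use."""
    global load_count
    load_count += 1
    # Stand-in for an expensive load from disk; a real loader would go here.
    return {"model_size": model_size}


first = get_model("base")   # loads the model
second = get_model("base")  # served from memory, same object as `first`
other = get_model("tiny")   # a different size triggers a separate load
```

The cache lives only as long as the process does, which is why a persistent server benefits from it and a one-shot script would not.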

Translations

When you translate a non-English transcription, the English version is stored as a sibling (eng) markdown file alongside the original. Both appear in the Memory Explorer and Web UI.

Managing memory

| Tool | What it does |
| --- | --- |
| list_memories | Browse all stored transcriptions with titles, durations, dates |
| memory_stats | Total count, duration, and storage size |
| clear_memory | Delete all cached data |
Or use the Web UI to browse, search, and delete individual transcriptions.