Cache key
Every transcription is keyed by file hash + model size:
- Same file, same model = instant cache hit
- Same file, different model = new transcription
- Modified file = different hash = new transcription
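A minimal sketch of how such a key can be derived (the helper name and the `hash:size` key format are illustrative assumptions, not the project's actual code):

```python
import hashlib

def cache_key(path: str, model_size: str) -> str:
    """Derive a cache key from the file's content hash and the model size.

    Hashing the bytes (not the path) means a modified file produces a new
    key, while a renamed-but-identical file still hits the cache.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large audio files don't load into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return f"{h.hexdigest()}:{model_size}"
```

Because the key depends on both parts, changing either the file contents or the model size forces a fresh transcription.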
Storage location
Everything lives in ~/.augent/memory/:
| File | Purpose |
|---|---|
| transcriptions.db | SQLite database with all cached data |
| transcriptions/*.md | One markdown file per transcription (human-readable, Obsidian-compatible) |
What gets cached
| Data | Cache key | Storage |
|---|---|---|
| Transcriptions | file_hash:model_size | SQLite + .md file |
| Embeddings | file_hash:embedding_model | SQLite (numpy BLOB) |
| Diarization | file_hash:num_speakers | SQLite |
| Source URLs | file_hash | SQLite |
| Audio separation | file_hash:model:stem_mode | Filesystem (~/.augent/separated/) |
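The SQLite side of this layout can be sketched as a simple lookup/store pair. The table and column names below are assumptions for illustration; the real schema may differ:

```python
import sqlite3

def get_or_none(db: sqlite3.Connection, file_hash: str, model_size: str):
    """Look up a cached transcription; None signals a cache miss."""
    row = db.execute(
        "SELECT text FROM transcriptions WHERE file_hash = ? AND model_size = ?",
        (file_hash, model_size),
    ).fetchone()
    return row[0] if row else None

def store(db: sqlite3.Connection, file_hash: str, model_size: str, text: str):
    """Insert or overwrite the cached transcription for this key."""
    db.execute(
        "INSERT OR REPLACE INTO transcriptions (file_hash, model_size, text) "
        "VALUES (?, ?, ?)",
        (file_hash, model_size, text),
    )
    db.commit()
```

Each cached artifact in the table above follows the same pattern: a composite key column pair (or triple) plus a payload column.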
Model caching
Whisper models stay loaded in memory between tool calls. The MCP server is a long-lived process — once a model is loaded for the first transcription, subsequent transcriptions with the same model size are faster because there’s no model loading overhead. The sentence-transformer model (all-MiniLM-L6-v2, ~80MB) is also loaded once and kept in memory.
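The keep-loaded behavior amounts to a process-level model cache. A minimal sketch, assuming a generic `loader` callable standing in for the real model constructor:

```python
# Process-level cache: lives as long as the server process does.
_models: dict[str, object] = {}

def get_model(size: str, loader) -> object:
    """Return the already-loaded model for `size`, loading it on first use.

    `loader` is a stand-in for the real model constructor (e.g. a Whisper
    load function); it is only invoked on a cache miss, so every later
    call with the same size skips the loading overhead entirely.
    """
    if size not in _models:
        _models[size] = loader(size)
    return _models[size]
```

The same pattern covers the sentence-transformer model: one cache entry, loaded on first use, reused for every embedding afterwards.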
Translations
When you translate a non-English transcription, the English version is stored as a sibling (eng) markdown file alongside the original. Both appear in the Memory Explorer and the Web UI.
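One way to derive such a sibling filename (the exact naming scheme here is an assumption based on the "(eng)" convention above):

```python
from pathlib import Path

def english_sibling(md_path: str) -> Path:
    """Derive the sibling filename for the English translation.

    'talk.md' becomes 'talk(eng).md' in the same directory, so the
    translation sits next to the original markdown file.
    """
    p = Path(md_path)
    return p.with_name(f"{p.stem}(eng){p.suffix}")
```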
Managing memory
| Tool | What it does |
|---|---|
list_memories | Browse all stored transcriptions with titles, durations, dates |
memory_stats | Total count, duration, and storage size |
clear_memory | Delete all cached data |
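A stats tool like the above can be a single aggregate query over the database. A sketch, assuming a `duration_seconds` column that the real schema may name differently:

```python
import sqlite3

def memory_stats(db: sqlite3.Connection) -> dict:
    """Aggregate totals of the kind memory_stats reports.

    COALESCE guards against SUM returning NULL on an empty table.
    """
    count, total = db.execute(
        "SELECT COUNT(*), COALESCE(SUM(duration_seconds), 0) FROM transcriptions"
    ).fetchone()
    return {"count": count, "total_duration_seconds": total}
```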

