Common questions about Augent. For setup help, see Quick start. For tool details, see MCP Tools.

Table of contents

General
Transcription
Search
Memory
Privacy and security
Performance
Configuration
Compatibility

General

What is Augent?

Augent is a complete audio processing pipeline exposed as MCP tools. It downloads, transcribes, indexes, and searches audio and video content entirely on your machine. Give it URLs or files, get structured answers back.

Who is Augent for?

Anyone whose work touches audio or video. Researchers, developers, legal teams, educators, analysts, content creators, journalists. If you need answers from content without sitting through it, Augent handles it.

Is Augent free?

Yes. Augent is open source under the MIT license. No API keys, no subscriptions, no usage limits.

Does anything get sent to an external server?

No. Everything runs locally: transcription, search, and embeddings all happen on your machine. The only network calls Augent makes are when you ask it to download audio from a URL.

Transcription

How does transcription work?

Augent uses faster-whisper, a fast local implementation of OpenAI’s Whisper model. Everything runs on your machine.

Which model size should I use?

tiny is the default and handles almost everything: tutorials, interviews, lectures, podcasts, even audio with background music. Use small or above for heavy accents, very poor audio quality, or song lyrics.

Model    Speed     Accuracy
tiny     Fastest   Excellent (default)
base     Fast      Excellent
small    Medium    Superior
medium   Slow      Outstanding
large    Slowest   Maximum

Do I need a GPU?

No. Augent runs on CPU by default. If you have a CUDA-compatible GPU, it will use it automatically for faster transcription.

What languages are supported?

Whisper supports 99+ languages. Augent auto-detects the language from the audio.

What audio formats can I use?

MP3, WAV, M4A, FLAC, OGG, WebM, and any other format FFmpeg can handle.

What sites can I download audio from?

1,000+ sites. YouTube, Vimeo, TikTok, Twitter/X, SoundCloud, Twitch, and anything else yt-dlp supports.

Search

What is the difference between keyword search and deep search?

Keyword search finds exact word matches. It is fast and precise. Deep search finds matches by meaning. It uses embeddings to understand what was said, even when the exact words don’t match your query.
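
The contrast between the two search modes can be sketched in a few lines. This is an illustration only, not Augent's implementation: the segments, the function names, and the hand-written 2-D vectors standing in for real embeddings are all assumptions made for the example.

```python
import math

# Hypothetical timestamped segments (start time in seconds, text).
SEGMENTS = [
    (0.0, "Welcome to the quarterly pricing review"),
    (12.5, "Our costs went up so we are charging more"),
    (30.0, "Next, the hiring plan for the summer"),
]

# Toy 2-D vectors standing in for embeddings (assumption: axis 0 is roughly
# "money", axis 1 is roughly "people"). Real embeddings have hundreds of dimensions.
TOY_VECTORS = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9)]


def keyword_search(segments, keyword):
    """Return segments that contain the keyword as an exact word (case-insensitive)."""
    needle = keyword.lower()
    return [(t, text) for t, text in segments if needle in text.lower().split()]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(segments, vectors, query_vector, top_k=1):
    """Return the top_k segments whose vectors are closest to the query vector."""
    ranked = sorted(
        zip(segments, vectors),
        key=lambda pair: cosine(pair[1], query_vector),
        reverse=True,
    )
    return [segment for segment, _ in ranked[:top_k]]


exact = keyword_search(SEGMENTS, "pricing")
by_meaning = semantic_search(SEGMENTS, TOY_VECTORS, (1.0, 0.0), top_k=2)
```

Here the keyword search returns only the segment that literally contains "pricing", while the similarity ranking also surfaces the "charging more" segment, which shares the topic but not the word.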

How does proximity search work?

It finds where two keywords appear near each other. For example, “pricing” near “competitor” returns only the moments where both concepts come up together.
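
As a rough sketch of the idea, proximity matching can be expressed as a distance check between word positions. The function name, the `max_distance` parameter, and the sample sentence are hypothetical; Augent's actual matching rules and distance units are not described here.

```python
def proximity_search(words, first, second, max_distance=10):
    """Return (i, j) index pairs where `first` and `second` are within max_distance words."""
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [j for j, word in enumerate(words) if word == second]
    return [
        (i, j)
        for i in first_positions
        for j in second_positions
        if abs(i - j) <= max_distance
    ]


# Hypothetical transcript text, lowercased and split into words.
text = (
    "our pricing is lower than every competitor but hiring is slow "
    "and no competitor is hiring"
)
words = text.lower().split()

# "competitor" appears twice, but only the first occurrence is within 5 words of "pricing".
hits = proximity_search(words, "pricing", "competitor", max_distance=5)
```

With a window of five words, only the first "competitor" counts as a hit; the second one is too far from "pricing" to be reported.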

Can I search multiple files at once?

Yes. batch_search searches multiple audio files in parallel. No file limit.
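
One common way to search several files in parallel is a thread pool that runs the same search against each file. The sketch below assumes that approach and uses in-memory strings in place of real transcripts; the helper names and file names are hypothetical, and the real batch_search may be implemented differently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical transcripts keyed by file name; real ones would come from transcription.
TRANSCRIPTS = {
    "standup.mp3": "we shipped the pricing page today",
    "interview.wav": "tell me about your last project",
    "webinar.m4a": "pricing changes roll out next month",
}


def search_one(name, keyword):
    """Report whether one transcript contains the keyword as an exact word."""
    text = TRANSCRIPTS[name]
    return name, keyword.lower() in text.lower().split()


def batch_search_sketch(names, keyword, max_workers=4):
    """Search every named transcript concurrently and return the names that match."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `names`.
        results = pool.map(lambda name: search_one(name, keyword), names)
        return [name for name, found in results if found]


matches = batch_search_sketch(list(TRANSCRIPTS), "pricing")
```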

Can I search across everything I have ever transcribed?

Yes. search_memory queries all your stored transcriptions at once. No file path needed, no limit on how many files it searches.

Memory

How does memory work?

Every transcription is stored by file hash in a local SQLite database. The first time you process a file, it transcribes. Every time after that, results are instant.
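
The "transcribe once, reuse afterwards" behavior can be sketched as a hash-keyed lookup in SQLite. Everything specific below is an assumption for illustration: the SHA-256 hash, the table layout, the class name, and the stand-in transcribe function are not taken from Augent's source.

```python
import hashlib
import sqlite3


def file_hash(data: bytes) -> str:
    """Hash the file contents (SHA-256 is an assumption) so renamed files still hit the cache."""
    return hashlib.sha256(data).hexdigest()


class TranscriptCache:
    """Minimal content-addressed transcript store backed by SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS transcriptions (hash TEXT PRIMARY KEY, text TEXT)"
        )
        self.transcribe_calls = 0  # counts cache misses, for demonstration

    def get_or_transcribe(self, data: bytes, transcribe):
        key = file_hash(data)
        row = self.db.execute(
            "SELECT text FROM transcriptions WHERE hash = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no transcription needed
        self.transcribe_calls += 1
        text = transcribe(data)  # cache miss: do the slow work once
        self.db.execute("INSERT INTO transcriptions VALUES (?, ?)", (key, text))
        self.db.commit()
        return text


cache = TranscriptCache()
fake_transcribe = lambda data: "hello world"  # stand-in for a real transcriber
first = cache.get_or_transcribe(b"audio bytes", fake_transcribe)
second = cache.get_or_transcribe(b"audio bytes", fake_transcribe)
```

The second call returns the stored text without invoking the transcriber again, which is why repeat requests for the same file come back immediately.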

Where is memory stored?

~/.augent/memory/. Transcriptions in transcriptions.db, markdown copies in transcriptions/.

Does memory persist between sessions?

Yes. Memory is permanent until you clear it.

How do I clear memory?

Use the clear_memory tool or run augent memory clear from the CLI.

Is there a storage limit?

No. Memory grows as you transcribe. A typical transcription takes a few KB.

Privacy and security

Is my data private?

Yes. Audio stays local, transcriptions stay local, search stays local. Nothing leaves your device. The only network activity is when you ask Augent to download audio from a URL you provide.

Can I use Augent fully offline?

Yes, for everything except downloading new audio from URLs. Transcription, search, and all analysis tools work without an internet connection.

Performance

How fast is transcription?

With the tiny model on a modern machine, Augent transcribes faster than real-time. A 1-hour file typically takes a few minutes.

Why is the first search slow and every search after it instant?

The first search triggers transcription if the file hasn’t been processed before. Once it is in memory, every search after that queries the stored transcript instantly.

Configuration

Can I customize default settings?

Yes. Create ~/.augent/config.yaml to set defaults for model size, output directories, clip padding, TTS voice, and more. Per-call arguments always override config values. No config file is required, because all values have sensible defaults. See Configuration.
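
The precedence described above (built-in defaults, then the config file, then per-call arguments) can be sketched as a layered dictionary merge. The key names `model_size` and `output_dir`, the `output_dir` default, and the function name are hypothetical; see Configuration for the real keys.

```python
# Hypothetical built-in defaults; only "tiny" as the default model comes from this FAQ.
DEFAULTS = {"model_size": "tiny", "output_dir": "transcripts"}


def resolve_settings(defaults, config, call_args):
    """Merge settings so that call arguments beat config values, which beat defaults."""
    settings = dict(defaults)
    settings.update(config)
    # Arguments left as None mean "not provided", so they do not override anything.
    settings.update({k: v for k, v in call_args.items() if v is not None})
    return settings


config_file = {"model_size": "small"}  # values a user might put in the config file
resolved = resolve_settings(
    DEFAULTS, config_file, {"model_size": "medium", "output_dir": None}
)
```

In this example the per-call `model_size` wins over the config value, while `output_dir` falls back to the default because neither the config nor the call supplied one.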

Can I hide tools I don’t need?

Yes. Add tool names to disabled_tools in your config file. They are removed from the tool list entirely and cannot be called. See Configuration.
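
The effect of disabled_tools can be pictured as a filter applied before the tool list is exposed. This is a sketch under assumptions: the config is modeled as a plain dictionary, the helper names are hypothetical, and only the tool names and the disabled_tools key come from this page.

```python
# Tool names mentioned elsewhere in this FAQ; the full tool list is longer.
ALL_TOOLS = ["batch_search", "search_memory", "clear_memory"]


def visible_tools(all_tools, config):
    """Return the tools that remain after removing everything in disabled_tools."""
    disabled = set(config.get("disabled_tools", []))
    return [tool for tool in all_tools if tool not in disabled]


def call_tool(name, all_tools, config):
    """Refuse to run a tool that is disabled or unknown (hypothetical dispatcher)."""
    if name not in visible_tools(all_tools, config):
        raise LookupError(f"tool not available: {name}")
    return f"called {name}"


config = {"disabled_tools": ["clear_memory"]}
remaining = visible_tools(ALL_TOOLS, config)
```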

Compatibility

What operating systems are supported?

macOS and Linux are supported natively. Windows is supported via WSL2 or pip install.

What Python version do I need?

Python 3.10 or above.

What MCP clients work with Augent?

Any MCP client. Claude Code, Codex, and OpenClaw are tested and documented. Any other MCP-compatible client works the same way.

Does Augent have a CLI?

Yes. Full CLI for terminal workflows. Run augent --help to see all commands.

Does Augent have a web UI?

Yes. Run augent-web and open http://127.0.0.1:8282. Runs 100% locally.