tonebox
Recording

Everything you say,
kept. On your Mac.

Tonebox records your calls and meetings, transcribes them with Whisper on-device, and turns months of conversations into a searchable, answerable archive. No cloud. No account. Your voice never leaves the machine.

Private beta · macOS 15+ · Apple Silicon & Intel

Audio never leaves your Mac
Whisper on-device
Bring your own LLM
tonebox · recording
REC · 14:32
MacBook Microphone + System Audio
−25 dB
Live transcript on · Whisper large-v3
−17 dB
Live caption

“…so the way I'd frame the renewal pitch is around the time-to-value story we saw with the Q3 cohort — let's pull those numbers before Friday.”

How it works

One session, from capture to answers.

Tonebox treats recording, transcription, summarising, and asking as one continuous workflow — each step feeds the next without you copying anything between apps.

00:04
Capture

Hotkey-trigger a recording, dictate into any app, or drop in audio you already have. Mic + system audio both work.

00:31
Transcribe

Whisper runs on-device with speaker diarisation. CoreML acceleration on Apple Silicon, CPU fallback on Intel.

12:48
Summarise

Auto-summarise sessions into decisions, action items, and follow-ups. Plug in your own LLM.

38:00
Search & ask

Full-text and semantic search across every session, with cited answers — your spoken notes become a knowledge base.

Founder dinner notes · transcript
Summary · 38 min · 3 speakers

Decision: open New York office this quarter. Marc to confirm the Tuesday sublease by Friday; Renata will brief Atlas on Monday so they hear it from us first.

RENATA
02:14

I want us to leave dinner with one decision — should we open the New York office this quarter or push to Q1?

MARC
02:26

Pushing buys us a cleaner runway, but we lose the lead we already met with on Tuesday. They were ready to sign a sublease.

RENATA
02:40

Right — and Atlas already cited the New York presence as part of why they renewed. That's a real signal.

Transcribe

A transcript that's actually navigable.

Every session lives in a single document — diarised, summarised, and tied back to the audio it came from. No copy-pasting between Otter, Notion, and Apple Voice Memos.

  • Speaker diarisation

    Per-session diarisation runs alongside transcription so quotes carry attribution from the start.

  • Auto-summary card

    Each session gets a structured summary — decisions, action items, owners — refreshed every time the transcript changes.

  • Click-to-play timestamps

    Jump to any moment in the recording from the transcript line. Bookmark moments live with ⌘⇧M.

Ask

Your spoken notes, finally interrogable.

Ask anything across the conversations you've had — meetings, dictation, calls, voice memos. Tonebox answers in plain English with the relevant clips cited underneath.

  • Searches across every session

    Embedded once, queryable forever. Tonebox keeps a local vector index of your transcripts so answers stay grounded in what was actually said.

  • Cited back to the audio

    Every answer carries inline citations that link to the exact transcript line and timestamp, so it is auditable and never made up.

  • Bring your own model

    Pick the LLM that fits your privacy bar — Claude, GPT, or a local Llama via Ollama. Your audio stays on disk either way.
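
As a rough illustration of the three points above, here is a minimal Python sketch of searching a local vector index and returning hits with a citation attached. It is not Tonebox's implementation: the `Segment` type, the `search` function, and the three-dimensional toy vectors are all hypothetical, and a real index would use embeddings produced by an actual model.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript line plus its embedding (hypothetical structure)."""
    session: str          # session title
    speaker: str
    start_s: int          # offset into the recording, in seconds
    text: str
    vector: list[float]   # toy 3-d embedding, made up for illustration


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def timestamp(seconds: int) -> str:
    """Format an offset in seconds as MM:SS."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def search(index: list[Segment], query_vec: list[float], k: int = 2) -> list[dict]:
    """Return the k most similar segments, each with a citation string."""
    ranked = sorted(index, key=lambda s: cosine(s.vector, query_vec), reverse=True)
    return [
        {
            "citation": f"{s.speaker} · {s.session} · {timestamp(s.start_s)}",
            "quote": s.text,
        }
        for s in ranked[:k]
    ]


# Two abbreviated lines from the sample transcript, with invented vectors.
index = [
    Segment("Founder dinner notes", "RENATA", 134,
            "I want us to leave dinner with one decision.", [0.1, 0.9, 0.2]),
    Segment("Founder dinner notes", "MARC", 146,
            "They were ready to sign a sublease.", [0.9, 0.1, 0.3]),
]

hits = search(index, query_vec=[1.0, 0.0, 0.2], k=1)
print(hits[0]["citation"])
print(hits[0]["quote"])
```

Because each hit keeps its session, speaker, and timestamp, an answer built from these hits can point back to the exact line it came from.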

Ask Tonebox
What did Marc say about the New York sublease? · ⌘ K
Answer · 2 sources · Streaming…

MARC · Founder dinner notes · 02:26
“Pushing buys us a cleaner runway, but we lose the lead we already met with on Tuesday.”
MARC · Renewal sync — Atlas · 18:09
“The Manhattan space was the only one inside the board cap — everything else trips the threshold we set.”
Features

Built for the way voice work actually flows.

Tonebox isn't a transcription web app with extra buttons — it's a desktop tool that lives next to the apps you already use, with hotkeys, system-wide dictation, and local storage at the core.

Push-to-talk dictation

Hold a hotkey and speak — Tonebox types into whatever app is focused. Works system-wide.

Mic + system audio

Record both sides of a Zoom call, the room mic, or both at once. No virtual audio drivers.

Mark moments live

⌘⇧M drops a pin during a recording so the moment is one click away in the transcript later.

Smart sessions

Auto-named, auto-grouped sessions with project folders, tags, and full-text search across the lot.

Auto-summaries

Decisions, action items, and follow-ups extracted from each session — re-summarises when you edit.

Semantic search

Local embeddings index every word so you can find clips by meaning, not just keyword.

Import anything

Drop in audio, PDFs, Word docs, web articles, even YouTube links — everything joins the same searchable library.

End-to-end local

Everything lives in a folder on your Mac. No cloud account, no upload, telemetry off by default.

Jira & agents built in

Push extracted tasks to Jira, or let your AI agents query the library over MCP. Your archive works for you.

Privacy

Local-first, by construction.

Voice work is some of the most sensitive content you produce — calls with customers, dinners with co-founders, dictation that contains things you haven't even decided yet. Tonebox treats that as the default, not the upgrade.

Default-on guarantees
  • No analytics or telemetry pings unless you opt in.
  • No account, no email, no login wall.
  • No background uploads. The app works fully offline.
  • Cloud LLMs are opt-in and per-request: you see exactly what's sent.
Audio never leaves your Mac

Recordings, transcripts, and embeddings live in a folder you can move, back up, or delete. There is no cloud copy, ever.

Whisper runs on-device

Transcription is local — Apple Silicon CoreML acceleration, Intel CPU fallback. No audio is shipped to a server to be transcribed.

Your LLM, your call

Want full local? Point the summariser at Ollama. Need quality? Use Claude or GPT with your own API key. We don't proxy your data.

Signed, notarized, sandboxed keys

The app is Developer ID-signed and notarized by Apple; your API keys live in the macOS Keychain, never in plaintext files.
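
To make the "you see exactly what's sent" idea concrete, here is a hedged Python sketch of assembling a single LLM request from selected transcript snippets and previewing it before anything leaves the machine. It is an assumption about how such a step could look, not Tonebox's code: `build_request` and `preview` are hypothetical helpers and the model name `llama3` is only an example. The endpoint and the `model`, `prompt`, and `stream` fields follow Ollama's local generate API.

```python
import json

# Ollama's default local endpoint; no request is actually sent in this sketch.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(question: str, snippets: list[str], model: str = "llama3") -> dict:
    """Assemble the JSON body from the selected snippets only (hypothetical helper)."""
    context = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    prompt = (
        "Answer using only the numbered transcript snippets and cite them.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return {"model": model, "prompt": prompt, "stream": False}


def preview(payload: dict) -> str:
    """Render the exact payload so the user can inspect it before sending."""
    return json.dumps(payload, indent=2, ensure_ascii=False)


payload = build_request(
    "What did Marc say about the New York sublease?",
    ["MARC 02:26: They were ready to sign a sublease."],
)
print(preview(payload))
```

Keeping payload construction separate from sending means the same preview can be shown whether the request goes to a local model or to a cloud provider.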

Compare

Why people pick Tonebox over the alternatives.

We respect the tools below — Otter is great in the browser, Voice Memos is dead simple. Tonebox solves a different problem: keeping your voice work fast, local, and queryable on a single Mac.

Feature | Tonebox | Otter.ai | Voice Memos | MacWhisper-style apps
Records on-device | | | |
Transcribes locally (Whisper) | | | |
System-wide dictation hotkey | | | | partial
Speaker diarisation | | | | partial
Auto-summaries with custom prompts | | partial | | partial
Ask across all sessions (RAG) | | partial | |
Bring-your-own LLM | | | | partial
No cloud account required | | | |
Tasks, Jira sync & MCP agents | | | |
FAQ

Honest answers to the questions we get most.

Anything missing? Reach out at contact and we'll add it.

How do I get Tonebox?
Tonebox is in private early access. Leave your email in the form below and we'll send your invite — a signed, notarized build with auto-updates — as seats open up. We sequence invites by use case, so a sentence about what you'd use it for genuinely helps.
What does it cost?
Nothing during early access. We want pricing to come from real usage, not guesses — early-access users will hear about any pricing plans first, and won't be surprised by them.
Do you upload my recordings anywhere?
No. Recordings, transcripts, and embeddings stay in a local folder on your Mac. The only network traffic is when you explicitly ask a cloud LLM to summarise or answer — and even then, you choose the provider and only the relevant transcript snippets are sent.
What runs on-device versus in the cloud?
Recording, transcription (Whisper), diarisation, search index, and full-text/semantic search all run on-device. Summaries and the “ask” surface optionally call out to an LLM you pick — Anthropic, OpenAI, or a local model via Ollama.
What hardware do I need?
A Mac running macOS 15 (Sequoia) or later. Apple Silicon is recommended for fast Whisper transcription via CoreML, but Intel Macs work too with the CPU fallback.
Can I record both sides of a Zoom or Teams call?
Yes. Tonebox captures the system audio output alongside your microphone in a single session, with both sides diarised in the transcript. No virtual audio drivers required. It can even auto-start when it detects a meeting beginning.
What about Windows or Linux?
Not yet. Tonebox is macOS-only at launch — the dictation and system-audio plumbing are deeply Mac-specific. Windows and Linux are on the roadmap but not soon.
Is there an iOS or Android app?
Not at launch. The mobile companion is in development; for now Tonebox is a Mac-only desktop app.
Can my team share recordings?
Tonebox is a single-user, local-first app today. End-to-end encrypted sync and sharing are built and in testing — they'll roll out to early-access users first, opt-in, never required.
Get early access

Your words are worth keeping.

Tonebox is in private early access while we polish the beta with a small group. Leave your email and we'll send your invite — with the signed build and five-minute setup — as seats open up.

No newsletter, no spam. One email when your invite is ready.