Local vs cloud AI assistants
This is a practical comparison, not an ideological one. Both approaches work. They differ in where captured data sits, which AI runs where, and what you need to explain to your security team.
What runs where in Overshow
Capture, transcription, search, and chat all run on the laptop.
- Screen capture: hybrid event-driven on macOS, with a 0.5 FPS floor (at least one frame every two seconds). Apple Vision does the OCR.
- Audio: microphone and system audio, transcribed on-device with FluidAudio Parakeet TDT v3 in fixed 20-30 second windows.
- Speaker identification: FluidAudio Sortformer fastV2_1 produces 4-speaker streaming diarisation per meeting; profiles are linked locally.
- Search: SQLite FTS5 for full text, plus EmbeddingGemma 300M for semantic search.
- Chat and Ask: Gemma 4 E2B via MLX Swift on Apple Silicon.
Nothing in that list requires a network call to process your captured content.
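To make the search item concrete, here is a minimal sketch of full-text plus semantic search running against a local SQLite database, with no network call. It is an illustration under stated assumptions, not Overshow's implementation: the table name `captures`, the helper functions, and the three-dimensional toy vectors are all hypothetical, and a real setup would get its vectors from an embedding model such as EmbeddingGemma. It also assumes the SQLite build bundled with Python includes FTS5, which is common but not guaranteed.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(rows):
    """Store text in an FTS5 table and keep vectors in a dict keyed by rowid.

    `rows` is a list of (text, embedding) pairs. The embeddings here are
    placeholders; a real pipeline would compute them with a local model.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE captures USING fts5(body)")
    vectors = {}
    for rowid, (text, vec) in enumerate(rows, start=1):
        db.execute("INSERT INTO captures(rowid, body) VALUES (?, ?)", (rowid, text))
        vectors[rowid] = vec
    return db, vectors


def keyword_search(db, query, limit=5):
    """Full-text search, best match first (FTS5 exposes a `rank` column)."""
    sql = (
        "SELECT rowid, body FROM captures "
        "WHERE captures MATCH ? ORDER BY rank LIMIT ?"
    )
    return db.execute(sql, (query, limit)).fetchall()


def semantic_search(vectors, query_vec, limit=5):
    """Brute-force nearest neighbours by cosine similarity."""
    scored = [(rowid, cosine(vec, query_vec)) for rowid, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    rows = [
        ("quarterly invoice for acme", [1.0, 0.0, 0.0]),
        ("standup notes about deployment", [0.0, 1.0, 0.0]),
        ("billing reminder email", [0.9, 0.1, 0.0]),
    ]
    db, vectors = build_index(rows)
    print(keyword_search(db, "invoice"))
    print(semantic_search(vectors, [1.0, 0.0, 0.0], limit=2))
```

The keyword query only finds rows containing the literal term, while the vector query can also surface the billing reminder because its toy embedding sits close to the invoice one. Both lookups read from data held on the same machine.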
What leaves the device, and when
The desktop app is local. The account does sit in the cloud, because billing, device registration, and SSO cannot work without it. What stays on the machine is the captured content itself.
Optional paths where data does leave:
- MCP server: exposes approved tools to AI clients you explicitly configure (Claude Desktop, Cursor, Jan, LM Studio, Ollama). Most tools are retrieval-only; commitment and meeting-classification tools can update local metadata. Requires the app to be running, and you approve what connects.
- Exports: you run them manually when you want to.
- Cloud assistant integrations (if you add them): require explicit consent and default to local-first.
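The approval rules described for the MCP server can be pictured as a small gate: a tool call goes through only if the app is running and the user has approved that tool for that client. The sketch below is a hypothetical model of that idea, not the product's code. The class names, the `updates_local_metadata` flag, and the tool names `search_captures` and `update_commitment` are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tool:
    name: str
    # False means retrieval-only; True means the tool may change local metadata.
    updates_local_metadata: bool = False


@dataclass
class ToolGate:
    app_running: bool = False
    tools: dict = field(default_factory=dict)      # tool name -> Tool
    approvals: dict = field(default_factory=dict)  # client name -> set of tool names

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def approve(self, client: str, tool_name: str) -> None:
        """Record that the user approved `tool_name` for `client`."""
        if tool_name not in self.tools:
            raise KeyError(f"unknown tool: {tool_name}")
        self.approvals.setdefault(client, set()).add(tool_name)

    def allows(self, client: str, tool_name: str) -> bool:
        """A call is allowed only while the app runs and the pair is approved."""
        if not self.app_running:
            return False
        return tool_name in self.approvals.get(client, set())


if __name__ == "__main__":
    gate = ToolGate()
    gate.register(Tool("search_captures"))
    gate.register(Tool("update_commitment", updates_local_metadata=True))
    gate.approve("Claude Desktop", "search_captures")

    gate.app_running = True
    print(gate.allows("Claude Desktop", "search_captures"))    # approved pair
    print(gate.allows("Claude Desktop", "update_commitment"))  # never approved
    print(gate.allows("Cursor", "search_captures"))            # client not approved
```

Keeping the metadata-writing flag on the tool definition makes it easy to show the user which approvals are read-only and which are not before they grant them.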
Where cloud-first is a better fit
Cloud-first assistants are usually the right choice when:
- You want the most capable frontier models available without buying Apple Silicon.
- You do not capture screens or audio, only chat transcripts.
- Your security posture already permits sending work content to the provider.
Where local-first is a better fit
Local-first is usually the right choice when:
- You record screens or meeting audio and would rather not upload that material to a third party.
- You work across clients, vendors, or regulated environments and need the boundary to be the laptop.
- You want semantic search over your own history without an external index service.
Questions to ask before deciding
- Does the assistant capture your screen or audio? If yes, where is it stored?
- Can the assistant run transcription and embeddings without a cloud call?
- What does exporting or deleting everything look like?
- What does a security reviewer have to approve: an API or a database file?