Screen capture
Event-driven screen capture on macOS, multi-monitor support, native OCR, accessibility-first text extraction, filtering, and what is stored for search.
Last updated: 2 April 2026
What screen capture is for
Overshow records what you see on your displays so you can find it later: exact phrases from documents, terminal output, chat threads, and browser content. Capture runs on your device; readable text and structured metadata are stored locally for full-text and semantic search. Overshow does not fabricate screen content: it captures what was on screen and indexes what it found.
This page explains how capture is scheduled, how text is obtained (accessibility and OCR), how privacy controls apply, and what actually lands in your database.
No screen images or video files are persisted. Only text, metadata, and signals derived at capture time are stored, under your organisation's retention and pause/resume choices.
How capture works
On macOS, capture is event-driven. The system listens for accessibility signals (window focus changes, content updates, UI transitions) and triggers captures in response. There is no fixed polling interval; frames are taken when meaningful changes actually happen. This keeps CPU usage low during quiet periods and capture responsive during active work.
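The trigger logic can be sketched as a small event handler that fires only on meaningful, content-changing events rather than on a timer. This is an illustrative Python sketch, not Overshow's implementation; the event kinds and the `content_digest` field are hypothetical stand-ins for the accessibility signals described above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class UIEvent:
    kind: str            # e.g. "focus_changed", "content_updated" (hypothetical names)
    window_id: str
    content_digest: str  # digest of the surface's visible state


class EventDrivenScheduler:
    """Trigger a capture only when an observed event implies a real change."""

    MEANINGFUL = {"focus_changed", "content_updated", "ui_transition"}

    def __init__(self, capture: Callable[[UIEvent], None]):
        self.capture = capture
        self._last_digest: dict[str, str] = {}

    def on_event(self, event: UIEvent) -> bool:
        # Ignore event kinds that never change visible content.
        if event.kind not in self.MEANINGFUL:
            return False
        # Skip if this window's content is unchanged since the last capture.
        if self._last_digest.get(event.window_id) == event.content_digest:
            return False
        self._last_digest[event.window_id] = event.content_digest
        self.capture(event)
        return True
```

Because nothing runs between events, an idle desktop costs essentially nothing; a burst of edits in one window produces a burst of captures for that window only.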
On Windows, capture is polling-based so periodic sampling continues even when the platform provides few observable events.
macOS versus Windows
| Aspect | macOS | Windows |
|---|---|---|
| Capture model | Event-driven via accessibility signals | Polling-based |
| Text priority | Accessibility-first, OCR fallback | OCR and platform text APIs |
| OCR engine | Apple Vision | Windows Media OCR |
Why event-driven on macOS?
Event-driven capture reacts when the desktop changes in ways the system can observe, which reduces wasted work and keeps latency low for active sessions. Because macOS exposes rich accessibility events, Overshow captures precisely when content updates rather than sampling on a timer. This avoids gaps in your searchable history without requiring constant full-screen grabs at a fixed rate.
Multi-monitor capture and layout
You can capture multiple monitors with per-monitor inclusion in settings. The system tracks monitor layout, including x/y positions, so metadata can reflect which display hosted a given surface. This helps when you filter or review results by physical arrangement (for example, a laptop panel versus an external display).
Decide which monitor counts as "primary" for your workflow; per-monitor toggles in Settings → Recording let you omit displays you never want indexed (for example, a wallboard or a shared TV).
Changes to which monitors are included require an app relaunch to apply consistently across the capture stack.
Text extraction: OCR and platforms
When pixels are captured, Overshow runs native OCR on the device:
| Platform | Engine | Notes |
|---|---|---|
| macOS | Apple Vision | Broad language support |
| Windows | Windows Media OCR | On-device text recognition aligned with the platform stack |
OCR reads visible UI: window chrome where exposed, document text, labels, and other rendered glyphs. Input frames are downscaled before OCR, balancing accuracy and throughput on typical hardware.
OCR reflects what was rendered on screen. Very small text, heavy effects, or unusual fonts may produce imperfect text; search still benefits from partial matches and, where enabled, embeddings over the captured text.
Accessibility-first text on macOS
On macOS, Overshow prefers accessibility text as the primary source where it is available and useful, and falls back to OCR when accessibility does not yield sufficient content. That reduces reliance on bitmaps for standard controls and many document surfaces.
A quality gate filters low-value snapshots so they do not bloat search or embedding indexes. Deduplication drops identical snapshots so repeated static UI does not multiply storage and search noise.
Together, accessibility-first sourcing and quality gating keep the index aligned with meaningful UI state changes rather than every redundant paint.
What the quality gate filters out
Low-value snapshots (empty chrome, repetitive noise, or windows with little extractable text) are dropped before they reach search indexes. The gate is conservative about search utility: it aims to keep recall high for real work while avoiding index pollution. If you ever expect a sparse UI to appear and it does not, check whether the window exposes meaningful text or whether filtering applies.
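As an illustration, a recall-biased gate might combine a minimum-text threshold with a repetitiveness check. The function and thresholds below are hypothetical, not the product's actual rules:

```python
def passes_quality_gate(text: str,
                        min_chars: int = 20,
                        min_unique_ratio: float = 0.3) -> bool:
    """Drop empty chrome and repetitive noise; keep anything with
    plausible search utility (biased towards recall)."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        return False  # little extractable text
    tokens = stripped.split()
    if not tokens:
        return False
    # Repetitive noise (the same token over and over) fails this ratio.
    unique_ratio = len(set(tokens)) / len(tokens)
    return unique_ratio >= min_unique_ratio
```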
Per-window capture and monitor ownership
Capture attributes content to windows where possible, with pre-capture monitor ownership so metadata ties a window to the correct display in multi-monitor setups. That supports filtering and inspection by app, title, and display context.
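Monitor ownership can be modelled as a largest-overlap test of the window's rectangle against each display's rectangle in the global layout. A minimal sketch with hypothetical types:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def overlap_area(a: Rect, b: Rect) -> int:
    # Width and height of the intersection, clamped at zero.
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return max(dx, 0) * max(dy, 0)


def owning_monitor(window: Rect, monitors: dict[str, Rect]) -> Optional[str]:
    """Attribute a window to the display hosting most of its area."""
    best, best_area = None, 0
    for name, rect in monitors.items():
        area = overlap_area(window, rect)
        if area > best_area:
            best, best_area = name, area
    return best
```

A window straddling two displays is attributed to whichever one shows the larger share of it, which matches how most users would describe where the window "is".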
Window filtering and ignored apps
Window filtering matches against window or app identifiers you configure. Patterns are evaluated before capture, so excluded surfaces are not processed.
Category-based exclusion applies in addition to your patterns: password managers are always excluded and cannot be disabled. Optional category blocks include banking, health, and adult content. These rules reduce the chance of highly sensitive material entering the index even if a custom pattern misses an edge case.
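The pre-capture evaluation order (fixed categories first, then user patterns, matched case-insensitively against app and window identifiers) can be sketched as follows. The app names and patterns here are illustrative examples, not the product's built-in lists:

```python
import fnmatch

# Hypothetical examples of the always-on password-manager category.
ALWAYS_EXCLUDED_APPS = {"1password", "bitwarden", "keepassxc"}


def should_capture(app: str, title: str, user_patterns: list[str]) -> bool:
    """Evaluate exclusions before capture, so blocked surfaces
    are never processed at all."""
    app_l, title_l = app.lower(), title.lower()
    # Fixed category: password managers are always excluded.
    if app_l in ALWAYS_EXCLUDED_APPS:
        return False
    # User patterns match app or window identifiers, case-insensitively.
    for pattern in user_patterns:
        p = pattern.lower()
        if fnmatch.fnmatch(app_l, p) or fnmatch.fnmatch(title_l, p):
            return False
    return True
```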
Capture scope: focused versus full coverage
| Mode | Typical use |
|---|---|
| Focused only (desktop default) | Indexes foreground or otherwise "focused" surfaces so background clutter and unrelated windows contribute less |
| Full coverage | Broader inclusion across eligible visible content where policy and filters allow |
Choose the mode that matches how aggressively you want peripheral windows to appear in search. Focused only suits most knowledge work; full coverage suits workflows where background panes matter.
Duplicate frame detection
Consecutive or near-identical frames are detected automatically. When a frame is deemed a duplicate of a recent one, redundant OCR and indexing work is skipped, saving CPU and storage while preserving meaningful updates when the screen actually changes.
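A simplified version of this check hashes each incoming frame and skips all downstream work when the digest was seen recently. Real near-duplicate detection would use a perceptual hash rather than the exact hash shown here; this sketch is illustrative only:

```python
import hashlib


class DuplicateDetector:
    """Skip OCR and indexing for frames identical to a recently seen one."""

    def __init__(self, history: int = 8):
        self.history = history
        self._recent: list[bytes] = []

    def is_duplicate(self, frame_bytes: bytes) -> bool:
        digest = hashlib.sha256(frame_bytes).digest()
        if digest in self._recent:
            return True
        # Remember this frame, keeping only a short sliding window.
        self._recent.append(digest)
        if len(self._recent) > self.history:
            self._recent.pop(0)
        return False
```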
Browser URL detection
Where the platform allows it, Overshow can read the address bar via accessibility. Captured URLs enrich metadata so you can find sessions by site or path alongside on-page text.
UI monitoring via accessibility snapshots
Accessibility snapshots capture structured information about controls, roles, and text where exposed by applications. That UI monitoring layer supplements OCR for richer search context (for example, labels near fields, list items, or tool-specific chrome). Quality gating ensures only worthwhile snapshots feed search indexes.
PII removal before persistence
Before text is written to storage, PII removal runs on captured text. Common sensitive patterns are redacted according to the product's redaction rules. This reduces accidental retention of highly sensitive literals; it is not a substitute for organisational policy, pause/resume, or category exclusions where those apply.
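As a rough illustration of pattern-based redaction (the product's actual redaction rules are its own and more extensive than these three patterns):

```python
import re

# Illustrative patterns only; real rules are product-defined.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace sensitive literals with typed placeholders
    before anything is written to storage."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Because this runs before persistence, a redacted literal never reaches the search index, though the surrounding context remains searchable.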
What is stored (and what is not)
| Stored | Not stored |
|---|---|
| Text extracted or read via AX/OCR | Full-resolution screenshots or video files on disk |
| Metadata (app, window, monitor, timestamps, URLs where captured) | Raw frame archives for replay |
| Derived signals for search and deduplication | Unbounded image libraries of your desktop |
Lightweight event-driven processing drops image payloads after OCR where applicable, retaining text and metadata only for the indexing path.
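The shape of that pipeline: pixels exist only as a transient input, and the record that reaches storage carries text and metadata alone. A hypothetical sketch (the type names are illustrative, not Overshow's):

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Frame:
    pixels: bytes   # transient payload: never written to disk
    app: str
    title: str
    timestamp: float


@dataclass
class IndexRecord:
    text: str       # what full-text and semantic search see
    app: str
    title: str
    timestamp: float


def process_frame(frame: Frame, ocr: Callable[[bytes], str]) -> IndexRecord:
    """Run OCR, then keep only text and metadata; the pixel payload
    goes out of scope here and is never persisted."""
    text = ocr(frame.pixels)
    return IndexRecord(text=text, app=frame.app,
                       title=frame.title, timestamp=frame.timestamp)
```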
Configuration in Settings → Recording
Most screen-related options live under Settings → Recording: monitors, ignored windows/apps, capture scope, and related toggles. Exact labels may evolve with releases; the tables below summarise intent.
What gets indexed
| Source type | What is stored for search |
|---|---|
| Accessibility text (macOS) | Primary text where quality gate passes; deduplicated |
| OCR from captured frames | Recognised text after processing |
| Accessibility snapshots | Structured UI text and labels that pass the quality gate |
| Browser address bar | URL strings where detection succeeds |
| Window and monitor metadata | App name, titles, monitor attribution, timing |
| Embeddings (when enabled) | Representations of accepted text for semantic find operations |
Configuration reference
| Setting area | Purpose | Relaunch required? |
|---|---|---|
| Included monitors | Which displays participate in capture | Yes |
| Ignored windows / patterns | Case-insensitive exclusions, pre-capture and pre-OCR | Yes |
| Capture scope | Focused only versus full coverage | Follow in-app guidance; yes for monitor/window changes |
| Category exclusions | Password managers (fixed on), optional banking/health/adult | As per product UI |
| Pause / resume | Temporarily stop all capture | No |
When you must relaunch the app
You need a full app relaunch after changing:
- Monitor selection or layout participation
- Ignored windows / exclusion patterns (and some related capture wiring)
Audio device changes are documented on the audio page; screen-specific relaunch rules above reflect how the embedded capture stack initialises hardware and window graphs.
Tips for the best capture quality
- Prefer legible scaling: browser and IDE zoom that keeps body text readable improves OCR when AX text is thin.
- Avoid extreme sub-pixel UI: fractional scaling that makes glyphs fuzzy can reduce OCR accuracy.
- Stabilise window titles: consistent titles help filtering and later review in the data inspector.
- Use focused mode if background chat or personal windows should rarely appear in search results.
- Maintain ignore lists for apps that should never contribute, beyond category defaults your organisation accepts.
- Relaunch after display topology changes (docking, rearranging monitors) so attribution stays accurate.
- Verify permissions: screen recording and accessibility permissions must remain granted for AX-first paths and capture to function as designed.
When OCR is doing more of the work
Electron apps, canvas-heavy tools, or remote-desktop clients sometimes expose limited accessibility text. In those cases OCR carries more of the load: keep contrast reasonable and fonts at comfortable sizes. Native macOS applications usually produce cleaner text with less bitmap dependency.
Organisation and compliance reminders
Pair technical controls with policy: train users on pause during regulated conversations, align ignore lists with data classification, and review exclusion categories against your risk register. Overshow's defaults favour privacy, but your procedures determine acceptable use.
Related documentation
- Audio transcription for how microphone capture complements screen text.
- Privacy: capture controls for pause/resume and high-level data handling.
- Exclusion categories for category-based blocking in depth.