Question detection

Detect questions from screen text, audio, and UI context; retrieve candidate answers from your capture history; review, deduplicate, and work through suggestions in the question inbox.

Last updated: 28 April 2026

Overview

Question detection finds questions in your workflow (on screen, in speech, and in structured UI) and retrieves candidate answers from historical captures using time-filtered semantic search. You review suggestions in a question inbox: accept when a candidate fits, dismiss when it does not, or leave items pending while you decide.

The pipeline runs through background workers. Suggestions appear in the question inbox for review rather than interrupting your current task.

Question detection is temporarily disabled in current builds. The question_worker flag is force-off because the detector did not clear our dogfood quality bar; the surface is hidden until the worker is rebuilt. The settings toggle, OVERSHOW_FLAG_QUESTION_WORKER=1 env var, and any stale user override are all clamped to off. The page below describes the intended behaviour once the rebuild lands.

Overshow surfaces material already in your index. It does not fabricate authoritative answers without captured support; low-signal or low-confidence matches may not appear until better evidence exists.

Question sources

| Source | What is analysed | Typical signals |
| --- | --- | --- |
| OCR screen text | Visible text from captured frames | Written questions in editors, browsers, documents, chat panes |
| Audio transcriptions | Speech-to-text from capture windows | Spoken questions in meetings and solo work |
| UI accessibility snapshots | Accessibility trees where enabled | Labels, headings, and control text that encode questions or prompts |

Together these cover most knowledge-work surfaces where questions appear explicitly; quality tracks your capture settings, app coverage, and transcription configuration.

How candidate answers are found

For each detected question, the system runs semantic search over historical captures with time bounds so suggestions favour recent, relevant context rather than decade-old noise. Answer similarity scoring ranks candidates so the inbox prioritises stronger matches.

Similarity and ranking

Embedding-based similarity compares the question’s representation to chunks of past captures. Higher scores indicate phrasing or topic overlap; the pipeline still applies time filters so a strong but ancient document does not displace a moderate match from last week without justification. Weak matches may be suppressed entirely to avoid false confidence.
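
The ranking behaviour described here can be sketched roughly as follows. This is a minimal illustration, not Overshow's implementation: the `Chunk` type, the `rank_candidates` function, the 30-day window, and the 0.5 minimum score are all hypothetical values chosen for the example.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Chunk:
    """A hypothetical slice of a past capture with a precomputed embedding."""
    text: str
    embedding: list[float]
    captured_at: datetime


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(question_vec, chunks, now,
                    max_age=timedelta(days=30), min_score=0.5, limit=3):
    """Apply the time bound, score what remains, and drop weak matches."""
    cutoff = now - max_age
    scored = [
        (cosine(question_vec, chunk.embedding), chunk)
        for chunk in chunks
        if chunk.captured_at >= cutoff  # time filter: ancient captures are skipped
    ]
    # Suppress weak matches entirely rather than showing them with a low score.
    kept = [(score, chunk) for score, chunk in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:limit]
```

With these assumed settings, a chunk captured 400 days ago is never scored, and a recent chunk whose similarity falls below `min_score` is left out instead of being shown as a weak suggestion.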

Relationship to full-text search

Semantic retrieval complements keyword search: synonyms and rephrased questions still find related notes, while exact phrases benefit from lexical overlap in the underlying index where configured. The question worker focuses on short-lived, contextual suggestions rather than replacing the main search experience.

Why time filtering matters

Knowledge work is seasonal: last month’s project matters more than a generic match from years ago. Time filtering keeps suggestions aligned with how you are working now, while still allowing deep history when it scores highly.
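
One way to get "recent first, but deep history when it scores highly" is a soft recency weight in place of a hard cutoff. The sketch below is an assumption for illustration only; the page does not say how Overshow combines time and similarity, and the `recency_weighted` name, the 30-day half-life, and the 0.5 floor are hypothetical.

```python
from datetime import datetime, timedelta


def recency_weighted(similarity: float, captured_at: datetime, now: datetime,
                     half_life: timedelta = timedelta(days=30),
                     floor: float = 0.5) -> float:
    """Scale a similarity score by an exponential recency decay.

    The weight starts at 1.0 for a brand-new capture, halves its distance to
    `floor` every `half_life`, and never drops below `floor`, so an old capture
    can still win if its raw similarity is high enough.
    """
    age = max(now - captured_at, timedelta(0))
    decay = 0.5 ** (age / half_life)  # timedelta / timedelta gives a float
    return similarity * (floor + (1.0 - floor) * decay)
```

Under these assumed parameters, a 60-day-old capture with similarity 0.95 still outranks a week-old capture with similarity 0.6, while a 60-day-old capture with similarity 0.7 does not.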

Question lifecycle

Questions move through controlled states so you can triage without losing auditability:

| State | Meaning |
| --- | --- |
| Detected / pending | A question was found; candidates may be loading or awaiting your review. |
| Resolved | You accepted a candidate answer or the system treated the thread as closed in line with product rules. |
| Dismissed | You rejected suggestions for this question; it leaves the active review queue. |

The inbox is the canonical place to accept or dismiss suggestions and review edge cases in the full list.

Batch dismiss noisy or duplicate items so the pending queue stays actionable; accepting good answers reinforces trust in what your index actually contains.
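
The lifecycle can be modelled as a small state machine. The sketch below is illustrative only: the `QuestionState` enum and the helper names are hypothetical, and treating resolved and dismissed as terminal states is an assumption the page does not confirm.

```python
from enum import Enum


class QuestionState(Enum):
    PENDING = "pending"      # detected, awaiting review
    RESOLVED = "resolved"    # a candidate answer was accepted
    DISMISSED = "dismissed"  # suggestions were rejected


# Assumption: only pending questions can change state.
ALLOWED = {
    QuestionState.PENDING: {QuestionState.RESOLVED, QuestionState.DISMISSED},
    QuestionState.RESOLVED: set(),
    QuestionState.DISMISSED: set(),
}


def transition(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def batch_dismiss(states, question_ids):
    """Dismiss the selected pending questions; leave everything else untouched."""
    updated = dict(states)
    for qid in question_ids:
        if updated.get(qid) is QuestionState.PENDING:
            updated[qid] = transition(updated[qid], QuestionState.DISMISSED)
    return updated
```

Keeping the allowed transitions in one table makes every state change checkable, which is the kind of auditability the lifecycle is meant to preserve.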

Deduplication

Without deduplication, small UI changes or repeated phrasing would spam the inbox. Overshow deduplicates using:

  • A content hash for exact or near-exact repeats.
  • Embedding similarity to collapse paraphrases and minor edits.
  • An app-scoped window so the same phrase in unrelated apps can still surface separately when appropriate.
  • A time window so legitimate revisits across days are not incorrectly suppressed.

The result is a steadier stream of distinct questions worth reviewing.

The defaults keep the most recent 1000 questions in memory, suppress matches for 24 hours, compare up to 100 recent neighbours per new question, and treat 0.85 cosine similarity as a paraphrase. Local operators can tune those controls with OVERSHOW_DEDUP_CACHE_SIZE, OVERSHOW_DEDUP_WINDOW_HOURS, OVERSHOW_DEDUP_MAX_SIMILARITY_CHECKS, and OVERSHOW_DEDUP_SIMILARITY_THRESHOLD.
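
The following sketch shows how the four signals and the documented defaults could fit together. The environment variable names and default values come from this page; everything else is an assumption for illustration, including the `DedupConfig` and `Deduplicator` names, the value formats of the variables, the whitespace and case normalisation before hashing, and the choice to count only similarity comparisons against the neighbour limit.

```python
import hashlib
import math
import os
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DedupConfig:
    cache_size: int = 1000              # most recent questions kept in memory
    window_hours: float = 24.0          # how long a match is suppressed
    max_similarity_checks: int = 100    # recent neighbours compared per question
    similarity_threshold: float = 0.85  # cosine similarity treated as a paraphrase

    @classmethod
    def from_env(cls, env=os.environ):
        # Variable names are from the documentation; the value formats are assumed.
        return cls(
            cache_size=int(env.get("OVERSHOW_DEDUP_CACHE_SIZE", cls.cache_size)),
            window_hours=float(env.get("OVERSHOW_DEDUP_WINDOW_HOURS", cls.window_hours)),
            max_similarity_checks=int(
                env.get("OVERSHOW_DEDUP_MAX_SIMILARITY_CHECKS", cls.max_similarity_checks)
            ),
            similarity_threshold=float(
                env.get("OVERSHOW_DEDUP_SIMILARITY_THRESHOLD", cls.similarity_threshold)
            ),
        )


def content_hash(text):
    # Lower-case and collapse whitespace so trivial edits hash identically.
    return hashlib.sha256(" ".join(text.lower().split()).encode("utf-8")).hexdigest()


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Deduplicator:
    def __init__(self, config):
        self.config = config
        self.recent = deque(maxlen=config.cache_size)  # oldest entries fall off

    def is_duplicate(self, text, embedding, app, now: datetime):
        digest = content_hash(text)
        cutoff = now - timedelta(hours=self.config.window_hours)
        checks = 0
        for seen_hash, seen_vec, seen_app, seen_at in reversed(self.recent):
            if seen_app != app or seen_at < cutoff:
                continue  # other apps and expired entries never suppress
            if seen_hash == digest:
                return True
            if checks < self.config.max_similarity_checks:
                checks += 1
                if cosine(embedding, seen_vec) >= self.config.similarity_threshold:
                    return True
        self.recent.append((digest, embedding, app, now))
        return False
```

In this sketch a suppressed duplicate is not recorded again, so the suppression window runs from the first sighting. Whether the real worker refreshes the window on each repeat is not stated on this page.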

Background processing

Question processing uses background workers that advance jobs through a multi-stage pipeline. Stages separate detection, retrieval, scoring, and notification so failures in one stage can retry without corrupting the whole pipeline.

This architecture keeps the desktop responsive while still scanning new captures continuously.
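
The stage separation can be illustrated with a small runner that retries a failing stage without touching the output of earlier stages. This is a sketch under stated assumptions: `run_pipeline`, the job dictionary, and the retry count of three are hypothetical and do not describe the actual worker.

```python
from typing import Callable

# A stage is a name plus a function that takes a job dict and returns a new one.
Stage = tuple[str, Callable[[dict], dict]]


def run_pipeline(job: dict, stages: list[Stage], max_attempts: int = 3) -> dict:
    """Run stages in order, retrying each one independently on failure."""
    for name, stage in stages:
        for attempt in range(1, max_attempts + 1):
            try:
                # Each attempt works on a copy, so a failed attempt cannot
                # leave half-written fields on the job.
                job = stage(dict(job))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    raise RuntimeError(
                        f"stage {name!r} failed after {attempt} attempts"
                    ) from exc
    return job
```

Because each stage only sees the completed output of the previous one, a timeout during retrieval can be retried without re-running detection.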

| Stage | Responsibility |
| --- | --- |
| Ingest | New screen text, transcript segments, and UI snapshots become searchable. |
| Detect | Heuristics flag interrogative or prompt-like spans per source. |
| Deduplicate | Checks collapse repeated or paraphrased questions within a time window. |
| Retrieve and score | Semantic search with time bounds; answer similarity ordering. |
| Notify | Inbox updates for pending questions. |

Why polling instead of only push triggers

Because the worker polls for new work at its own pace rather than reacting to every capture event, it tolerates bursty capture (many frames or transcript windows within seconds) without over-subscribing resources. Stages can back off under load while preserving order.
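
A polling tick with bounded batches and backoff might look like the sketch below. It is an illustration under assumptions, not the worker's code: `drain_tick`, the batch size, the delay values, and the convention that a handler returns `False` when its stage is overloaded are all hypothetical.

```python
from collections import deque


def drain_tick(queue: deque, handle, batch_size: int, delay: float,
               base_delay: float = 1.0, max_delay: float = 30.0) -> float:
    """Process up to batch_size items in FIFO order.

    Returns the number of seconds to wait before the next tick.
    """
    for _ in range(min(batch_size, len(queue))):
        item = queue[0]       # peek, so a busy stage leaves the item at the head
        if not handle(item):  # handler signals overload by returning False
            return min(delay * 2, max_delay)  # back off, capped at max_delay
        queue.popleft()
    return base_delay         # healthy tick: return to the base polling rate
```

A burst of 25 captures is then worked off over several ticks of at most `batch_size` items each, and an overloaded stage doubles the wait while the head of the queue keeps its place.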

Configuration and feature availability

  • Question detection is currently force-off (see the warning at the top of this page). When the worker is rebuilt, the settings toggle and OVERSHOW_FLAG_QUESTION_WORKER env var will become the levers for enabling it again.
  • Ignored windows and excluded apps reduce signals: questions in those surfaces may not be detected.
  • Transcription stride and audio devices affect spoken questions; weak or muted audio produces fewer candidates.

How quality improves over time

Question detection is index-dependent. As you capture more OCR, transcripts, and UI context, semantic search has richer targets, deduplication becomes more meaningful, and similarity scoring separates strong from weak matches more clearly. Early adopters often see a noticeable step change after a few dense workdays with capture enabled.

Tips for useful questions and answers

  • Work in captured apps for the material you might need again; if it was never indexed, no candidate can surface.
  • Speak clearly in meetings when you want spoken questions picked up; background noise and overlap reduce transcription quality.
  • Dismiss wrong candidates promptly; that keeps mental load low and prevents mistaking stale suggestions for current truth.
  • Accept strong matches to build confidence in retrieval; the product does not “train” on your clicks the way a cloud ranker would, but accepting good matches helps you calibrate what the index can and cannot answer.
  • Review the inbox after large imports or new projects so deduplication windows do not hide genuinely new threads you care about.

Review workflow in practice

| Step | You | Product |
| --- | --- | --- |
| 1. Capture | Work normally in indexed apps | OCR, audio windows, and AX snapshots accumulate |
| 2. Detect | (no action) | Workers find question-like spans and enqueue review items |
| 3. Suggest | Open inbox | Candidates appear ranked by similarity and recency |
| 4. Decide | Accept or dismiss | State moves to resolved or dismissed; pending clears when you act |
| 5. Continue | Repeat | Deduplication prevents the same wording from refilling the queue all day |

Accepting does not send content to the cloud by itself; it records your choice against local state so the product reflects what you found useful.

Related concepts

Question detection complements meetings, search, and daily summaries: meetings supply long-form context, search is deliberate retrieval, and question detection is proactive nudging at the moment phrasing looks like a question.

If your role is support or incident response, pair question detection with strong capture of ticketing and runbooks: answers only appear when those artefacts were visible or spoken while capture was active.