Voice OS · 2026

Barge-In and Interruption Handling

Barge-in is the ability of a voice AI to detect that the user has started speaking while the AI is still talking, then to gracefully stop and process the new input. Without barge-in, voice AI feels like a one-way broadcast: you ask a question, you wait through the entire answer whether it is what you wanted or not. With proper barge-in, voice AI feels like a real conversation, where you can correct, redirect, or agree mid-sentence. Barge-in is technically demanding because it requires distinguishing the user's voice from the AI's own audio output.

WHAT TO LOOK FOR

The three things that actually matter

1. Acoustic echo cancellation

When the AI's voice plays through the user's speakers, the microphone hears it. Acoustic echo cancellation (AEC) subtracts the known TTS audio from the input stream so the voice activity detection (VAD) layer only sees the user's actual voice. Without AEC, TTS playback over open speakers would keep triggering false interruptions.
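
One common way to subtract a known playback signal is an adaptive filter such as NLMS (normalized least mean squares). The sketch below is an illustration under stated assumptions, not a description of Lucy OS1's implementation: the function name `nlms_echo_cancel`, the tap count, and the step size are all hypothetical, and a production system would more likely rely on a tuned AEC library or the platform's built-in echo canceller.

```python
from typing import List


def nlms_echo_cancel(
    mic: List[float],
    reference: List[float],
    taps: int = 8,       # assumed echo-path length, in samples
    step: float = 0.5,   # assumed adaptation rate
    eps: float = 1e-8,   # avoids division by zero on silence
) -> List[float]:
    """Return the mic signal with an adaptive estimate of the echo removed.

    mic:       samples captured by the microphone (user voice + speaker echo)
    reference: the TTS samples that were sent to the speaker
    """
    weights = [0.0] * taps   # running estimate of the speaker-to-mic path
    history = [0.0] * taps   # recent reference samples, newest first
    residual: List[float] = []
    for mic_sample, ref_sample in zip(mic, reference):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate   # what is left after subtraction
        energy = sum(x * x for x in history) + eps
        # NLMS update: move the weights toward the observed echo path.
        weights = [w + step * error * x / energy
                   for w, x in zip(weights, history)]
        residual.append(error)
    return residual
```

The returned residual is what a VAD stage would inspect. While only the AI is speaking, it should shrink toward silence once the filter has adapted.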

2. Immediate TTS halt

When barge-in is detected, the TTS playback must stop within 100 milliseconds. Anything longer feels rude, like the AI is finishing its sentence before listening. Buffered audio frames must be discarded, not played out.
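
The "discard, do not play out" rule can be sketched as a playback queue that drops everything it holds the moment barge-in fires. The class and method names here are hypothetical, and the sketch assumes the audio output callback pulls one frame at a time.

```python
from collections import deque
from typing import Deque, Optional


class PlaybackQueue:
    """Holds synthesized audio frames that are waiting to be played."""

    def __init__(self) -> None:
        self._frames: Deque[bytes] = deque()
        self._halted = False

    def enqueue(self, frame: bytes) -> bool:
        """Queue a frame; frames that arrive after a halt are rejected."""
        if self._halted:
            return False
        self._frames.append(frame)
        return True

    def next_frame(self) -> Optional[bytes]:
        """Called by the audio callback; None means play silence."""
        if self._halted or not self._frames:
            return None
        return self._frames.popleft()

    def halt(self) -> int:
        """Stop playback on barge-in and return how many frames were dropped."""
        self._halted = True
        dropped = len(self._frames)
        self._frames.clear()
        return dropped

    def reset(self) -> None:
        """Re-arm the queue before the next response starts."""
        self._halted = False
```

If frames were 20 ms long (an assumed figure), a 100 millisecond budget would leave room for at most five frames between detection and silence, so any buffering inside the audio device counts against the same budget.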

3. LLM cancellation

The LLM is often still generating when barge-in occurs. The generation must be cancelled cleanly to free inference resources and to prevent the cancelled response from being sent to TTS. This requires the orchestrator to be aware of in-flight generations.
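
A minimal sketch of an orchestrator that tracks its in-flight generation, using Python's `asyncio` task cancellation. `fake_llm_stream` stands in for a real streaming LLM client, and `Orchestrator` is a hypothetical name; neither reflects a specific product API.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def fake_llm_stream() -> AsyncIterator[str]:
    """Stand-in for a streaming LLM client (hypothetical)."""
    for token in ["The", " meeting", " is", " at", " three", " o'clock"]:
        await asyncio.sleep(0.01)   # simulated per-token latency
        yield token


class Orchestrator:
    def __init__(self) -> None:
        self.in_flight: Optional[asyncio.Task] = None
        self.sent_to_tts: List[str] = []

    async def _generate(self) -> None:
        async for token in fake_llm_stream():
            self.sent_to_tts.append(token)   # would be forwarded to TTS

    def start_generation(self) -> None:
        self.in_flight = asyncio.create_task(self._generate())

    async def on_barge_in(self) -> None:
        task = self.in_flight
        if task is None or task.done():
            return
        task.cancel()
        try:
            await task                      # wait until cancellation lands
        except asyncio.CancelledError:
            pass
        self.in_flight = None


async def demo() -> List[str]:
    orch = Orchestrator()
    orch.start_generation()
    await asyncio.sleep(0.025)              # user starts talking mid-answer
    await orch.on_barge_in()
    spoken = list(orch.sent_to_tts)
    await asyncio.sleep(0.05)
    assert orch.sent_to_tts == spoken       # nothing reaches TTS after cancel
    return spoken
```

Because the orchestrator owns the task handle, it can cancel the generation and also confirm that no further tokens reach TTS afterwards.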

TLDR: Lucy OS1 supports natural barge-in. The microphone stays open while Lucy speaks, and acoustic echo cancellation subtracts Lucy's own voice from the input stream before it reaches the VAD layer. When the user starts speaking, TTS playback halts within 100 milliseconds, the LLM generation cancels, and the new utterance enters the pipeline as a fresh turn that has access to whatever Lucy was saying when interrupted. You can correct, redirect, or finish Lucy's sentence for her, and the conversation continues smoothly.

Why Lucy OS1

Context preservation

When the user interrupts, the partial AI response must be kept in the conversation history so the new turn can refer to it. 'No, I meant the other one' only makes sense if the AI knows what it was saying when interrupted.
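
One way to keep the partial response available is to store only the text that was actually spoken and flag it as interrupted. The data model below is a sketch with hypothetical names (`Turn`, `Conversation`, the `[interrupted]` marker), not a documented Lucy OS1 format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Turn:
    role: str                 # "user" or "assistant"
    text: str
    interrupted: bool = False


@dataclass
class Conversation:
    turns: List[Turn] = field(default_factory=list)

    def add_user_turn(self, text: str) -> None:
        self.turns.append(Turn("user", text))

    def record_interrupted_reply(self, spoken_so_far: str) -> None:
        """Keep only what was spoken before the user cut in."""
        self.turns.append(Turn("assistant", spoken_so_far, interrupted=True))

    def as_prompt(self) -> str:
        """Render the history so the next LLM call can see the cut-off reply."""
        lines = []
        for turn in self.turns:
            marker = " [interrupted]" if turn.interrupted else ""
            lines.append(f"{turn.role}{marker}: {turn.text}")
        return "\n".join(lines)
```

With the cut-off reply in the history, a follow-up such as 'No, I meant the other one' has something concrete to refer to.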

Interruption thresholds

Not every user vocalization is an interruption. Backchannel responses like 'mm-hmm' and 'right' should not stop the AI. The barge-in detector uses duration and content thresholds to filter these out, which prevents overly twitchy stopping.
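
The duration and content thresholds can be sketched as a small filter. The 300 millisecond figure and the example words come from the FAQ on this page; the function name and the exact matching rule are assumptions made for illustration.

```python
import re

MIN_INTERRUPT_MS = 300                         # figure quoted in the FAQ
BACKCHANNELS = {"mm-hmm", "yeah", "right"}     # examples from this page


def is_interruption(duration_ms: int, transcript: str) -> bool:
    """Return True only for speech that should stop the AI."""
    # Duration threshold: very short sounds are never treated as barge-in.
    if duration_ms < MIN_INTERRUPT_MS:
        return False
    words = re.findall(r"[a-z'-]+", transcript.lower())
    if not words:
        return False
    # Content threshold: an utterance made up only of backchannel
    # words lets the AI keep speaking.
    return any(word not in BACKCHANNELS for word in words)
```

For example, `is_interruption(500, "mm-hmm")` is rejected on content, `is_interruption(200, "wait")` is rejected on duration, and `is_interruption(600, "No, stop")` passes both checks.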

Graceful resume

Sometimes the user interrupts to add information rather than redirect. 'Oh, and also include Sarah' should let the AI incorporate the addition and continue. The session manager must distinguish redirect from amend and handle each case.
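
How the session manager tells the two cases apart is not specified here. Purely as an illustration, the sketch below uses a keyword heuristic; the cue lists and names are hypothetical, and a real system might instead ask the LLM or a trained classifier to decide.

```python
import re
from enum import Enum


class InterruptKind(Enum):
    AMEND = "amend"          # add to the current request, then continue
    REDIRECT = "redirect"    # abandon the current response


# Hypothetical cue lists, chosen only for illustration.
REDIRECT_OPENERS = {"no", "stop", "wait", "actually"}
AMEND_CUES = {"also", "plus", "too"}


def classify_interruption(utterance: str) -> InterruptKind:
    words = re.findall(r"[a-z']+", utterance.lower())
    # A correction word at the very start suggests a change of direction.
    if any(word in REDIRECT_OPENERS for word in words[:2]):
        return InterruptKind.REDIRECT
    if any(word in AMEND_CUES for word in words):
        return InterruptKind.AMEND
    # Assumed default: when unsure, stop and listen.
    return InterruptKind.REDIRECT
```

Under this heuristic, 'Oh, and also include Sarah' is classified as an amend, while 'No, I meant the other one' is classified as a redirect.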

QUICK COMPARISON

Lucy OS1 vs most AI tools

Capability | Lucy OS1 | Most AI tools
Memory across sessions | ✓ Permanent, never resets | ✗ Resets after every session
Voice quality | ✓ Lucy OS1 Natural Voice (best-in-class) | ✗ Basic STT, struggles with noise
Calendar awareness | ✓ Reads Google Calendar in real time | ✗ No calendar access
Available 24/7 | Always on, any device | Available but stateless each time
Gets personal over time | ✓ Builds your context continuously | ✗ Starts from zero every session

GET STARTED

How to use Lucy OS1

1. Create your free account

No credit card required. Sign in with your Google account and you're inside in under a minute.

2. Connect your Google Calendar

Lucy reads your upcoming events before every conversation, so it already knows your day before you say a word.

3. Start talking about barge-in and interruption handling

Speak naturally. Lucy listens, responds by voice, and begins building context from your very first exchange. The more you use it, the better it gets.

Frequently Asked Questions

Why does Siri stop listening when it speaks?
Siri historically closes the microphone during TTS playback to avoid having to handle echo cancellation and false barge-in detection. The result is that you cannot interrupt; you must wait for the full response, then start a new query. This is why Siri feels less conversational than newer voice AIs.

How fast does barge-in halt need to be?
The TTS playback must stop within 100 milliseconds of detected user speech. Longer than that and the AI sounds like it is finishing its sentence before deigning to listen. Under 100 milliseconds and the halt feels natural, like the AI noticed you immediately.

Can barge-in work without acoustic echo cancellation?
Only if the user is wearing headphones, which prevents the microphone from hearing the AI's voice. Without headphones, the speaker output bleeds into the microphone and AEC is required to prevent the AI from interrupting itself.

What happens to the cancelled response after a barge-in?
The partial response is logged in the conversation history so the LLM can reference it on the next turn. This is important for follow-ups like 'finish that thought' or 'no, I wanted the other option'.

How does barge-in interact with multi-turn tool use?
If the user interrupts during a tool call, the call still completes in the background but its result is held until the user finishes the new turn. Cancelling tool calls mid-execution can cause inconsistent state, especially for write operations.

Are short backchannel sounds treated as interruptions?
No. Sounds under 300 milliseconds or matching common backchannel patterns like 'mm-hmm', 'yeah', 'right' are filtered out so the AI keeps speaking. This is the same convention humans use in conversation.
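
The tool-use answer above describes a call that completes in the background while its result is held back. That pattern can be sketched with an `asyncio.Event` acting as a gate. `ToolCallGate` and `write_calendar_event` are hypothetical names used only for this example.

```python
import asyncio
from typing import Any, Coroutine, List, Optional


class ToolCallGate:
    """Lets a tool call finish in the background but holds its result."""

    def __init__(self) -> None:
        self._task: Optional[asyncio.Task] = None
        self._user_turn_done = asyncio.Event()
        self._user_turn_done.set()           # no interruption in progress

    def start(self, call: Coroutine[Any, Any, Any]) -> None:
        self._task = asyncio.create_task(call)

    def on_barge_in(self) -> None:
        # Deliberately do not cancel the tool task; only close the gate.
        self._user_turn_done.clear()

    def on_user_turn_finished(self) -> None:
        self._user_turn_done.set()

    async def result(self) -> Any:
        assert self._task is not None, "start() must be called first"
        value = await self._task             # the call runs to completion
        await self._user_turn_done.wait()    # delivery waits for the user
        return value


async def demo() -> List[str]:
    log: List[str] = []

    async def write_calendar_event() -> str:
        await asyncio.sleep(0.01)            # simulated write operation
        log.append("tool finished")
        return "event created"

    async def deliver(gate: ToolCallGate) -> None:
        log.append("delivered: " + await gate.result())

    gate = ToolCallGate()
    gate.start(write_calendar_event())
    gate.on_barge_in()                       # user interrupts mid-call
    waiter = asyncio.create_task(deliver(gate))
    await asyncio.sleep(0.03)                # tool is done, result still held
    log.append("user turn finished")
    gate.on_user_turn_finished()
    await waiter
    return log
```

In this sketch the write operation is never cancelled, so it cannot be left half-applied; only the delivery of its result is delayed until the user's new turn ends.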
