Barge-in is the ability of a voice AI to detect that the user has started speaking while the AI is still talking, then to gracefully stop and process the new input. Without barge-in, voice AI feels like a one-way broadcast: you ask a question, you wait through the entire answer whether it is what you wanted or not. With proper barge-in, voice AI feels like a real conversation, where you can correct, redirect, or agree mid-sentence. Barge-in is technically demanding because it requires distinguishing the user's voice from the AI's own audio output.
WHAT TO LOOK FOR
Acoustic echo cancellation
When the AI's voice plays through the user's speakers, the microphone hears it. AEC subtracts the known TTS audio from the input stream so the VAD layer only sees the user's actual voice. Without AEC, every TTS playback would trigger a false interruption.
Immediate TTS halt
When barge-in is detected, the TTS playback must stop within 100 milliseconds. Anything longer feels rude, like the AI is finishing its sentence before listening. Buffered audio frames must be discarded, not played out.
LLM cancellation
The LLM is often still generating when barge-in occurs. The generation must be cancelled cleanly to free inference resources and to prevent the cancelled response from being sent to TTS. This requires the orchestrator to be aware of in-flight generations.
TLDR:Lucy OS1 supports natural barge-in. The microphone stays open while Lucy speaks, and the VAD layer uses acoustic echo cancellation to subtract Lucy's own voice from the input stream. When the user starts speaking, TTS playback halts within 100 milliseconds, the LLM generation cancels, and the new utterance enters the pipeline as a fresh turn that has access to whatever Lucy was saying when interrupted. You can correct, redirect, or finish Lucy's sentence for her, and the conversation continues smoothly.
When the AI's voice plays through the user's speakers, the microphone hears it. AEC subtracts the known TTS audio from the input stream so the VAD layer only sees the user's actual voice. Without AEC, every TTS playback would trigger a false interruption.
When barge-in is detected, the TTS playback must stop within 100 milliseconds. Anything longer feels rude, like the AI is finishing its sentence before listening. Buffered audio frames must be discarded, not played out.
The LLM is often still generating when barge-in occurs. The generation must be cancelled cleanly to free inference resources and to prevent the cancelled response from being sent to TTS. This requires the orchestrator to be aware of in-flight generations.
When the user interrupts, the partial AI response must be kept in the conversation history so the new turn can refer to it. 'No, I meant the other one' only makes sense if the AI knows what it was saying when interrupted.
Not every user vocalization is an interruption. Backchannel responses like 'mm-hmm' and 'right' should not stop the AI. The barge-in detector uses duration and content thresholds to filter these out, which prevents overly twitchy stopping.
Sometimes the user interrupts to add information rather than redirect. 'Oh, and also include Sarah' should let the AI incorporate the addition and continue. The session manager must distinguish redirect from amend and handle each case.
QUICK COMPARISON
| Capability | Lucy OS1 | Most AI tools |
|---|---|---|
| Memory across sessions | ✓ Permanent, never resets | ✗ Resets after every session |
| Voice quality | ✓ Lucy OS1 Natural Voice (best-in-class) | ✗ Basic STT, struggles with noise |
| Calendar awareness | ✓ Reads Google Calendar in real time | ✗ No calendar access |
| Available 24/7 | Always on, any device | Available but stateless each time |
| Gets personal over time | ✓ Builds your context continuously | ✗ Starts from zero every session |
Voice-first AI with memory and calendar integration. Free to try.
Start TalkingFree tier available. No credit card required.
GET STARTED
Create your free account
No credit card required. Sign in with your Google account and you're inside in under a minute.
Connect your Google Calendar
Lucy reads your upcoming events before every conversation, so it already knows your day before you say a word.
Start talking about barge-in and interruption handling
Speak naturally. Lucy listens, responds by voice, and begins building context from your very first exchange. The more you use it, the better it gets.
Welcome