But what if the user interrupts?
When the user interrupts mid-response, the webhook request that was generating the assistant’s reply is abruptly terminated. Unless we’ve already written something to memory, the assistant’s partial message could be lost. In practice, this happens a lot with voice agents — users cut off the model to ask something new before the previous response finishes.
If we don’t handle this carefully, our in-memory state drifts out of sync with what actually happened in the conversation. You might not even realize it, and just assume the LLM is being a silly billy.
So what do I need to do?
When a new user webhook arrives, persist in this order:
- Store the user message right away so the turn is anchored in history.
- Insert the assistant placeholder before you start streaming tokens back.
- If the turn completes, remove the placeholder and append the final messages with the same `turn_id`.
- If the turn is interrupted instead, the placeholder remains, capturing the interrupted turn.
- The next `message` webhook includes `interruption_context`, which tells us which `assistant_turn_id` was cut off.
- You can reconcile by marking that entry as interrupted.
Example Interruption Handling
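Here is a minimal sketch of the whole flow as a Cloudflare Worker. The webhook fields (`conversation_id`, `turn_id`, `interruption_context.assistant_turn_id`) mirror what’s described above; the in-memory store and the helper names are illustrative stand-ins for whatever persistence you actually use, and `generateReply` stands in for your model call.

```ts
// Sketch only: the store and helpers below are stand-ins, not Layercode APIs.
// The webhook fields mirror the payload described above.

interface WebhookBody {
  conversation_id: string;
  turn_id: string;
  text: string;
  interruption_context?: { assistant_turn_id: string };
}

type StoredMessage = {
  turn_id: string;
  role: "user" | "assistant";
  content: string;
  status?: "pending" | "complete" | "interrupted";
};

// Stand-in for durable storage (KV, D1, a Durable Object, your database, ...).
const store = new Map<string, StoredMessage[]>();

function history(conversationId: string): StoredMessage[] {
  if (!store.has(conversationId)) store.set(conversationId, []);
  return store.get(conversationId)!;
}

async function handleTurn(body: WebhookBody): Promise<string> {
  const messages = history(body.conversation_id);

  // 0. Reconcile first: if the previous assistant turn was cut off, mark its
  //    placeholder as interrupted before doing any expensive work.
  const ctx = body.interruption_context;
  if (ctx) {
    const cutOff = messages.find(
      (m) => m.role === "assistant" && m.turn_id === ctx.assistant_turn_id,
    );
    if (cutOff && cutOff.status === "pending") cutOff.status = "interrupted";
  }

  // 1. Store the user message immediately so the turn is anchored in history.
  messages.push({ turn_id: body.turn_id, role: "user", content: body.text });

  // 2. Insert the assistant placeholder before streaming a single token.
  const placeholder: StoredMessage = {
    turn_id: body.turn_id,
    role: "assistant",
    content: "",
    status: "pending",
  };
  messages.push(placeholder);

  // 3. Generate the reply. If the user interrupts, this request is cancelled
  //    and step 4 never runs; the pending placeholder survives for the next
  //    webhook to reconcile.
  const reply = await generateReply(body.text);

  // 4. Finalize: fill in the placeholder under the same turn_id.
  placeholder.content = reply;
  placeholder.status = "complete";
  return reply;
}

// Stand-in for your model call (see the streaming sketch further down).
async function generateReply(userText: string): Promise<string> {
  return `You said: ${userText}`;
}

export default {
  async fetch(request: Request): Promise<Response> {
    const body = (await request.json()) as WebhookBody;
    return new Response(await handleTurn(body));
  },
};
```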
Why doesn’t the assistant finish the turn?
When a user interrupts, Layercode immediately cancels the webhook request that was streaming the assistant response. Because the request terminates, your worker never has a chance to finalize the response or append it to history.
There is currently no back-channel for Layercode to notify your backend gracefully — cancelling the request is the only interruption signal we can provide. This is why persisting the placeholder before you stream tokens is essential.
Do I get an `AbortSignal`?
Layercode does not propagate a custom `AbortSignal` into your AI SDK calls. Instead, the framework relies on the platform aborting the request (Cloudflare Workers receive the native `ExecutionContext` cancellation). Make sure any long-running model or fetch calls can tolerate the request being torn down mid-stream; the placeholder you stored lets you recover once the next webhook arrives.
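For instance, with the Vercel AI SDK you can shape the streaming loop so finalization only happens after the stream completes; an abort mid-stream then simply leaves the pending placeholder behind. A sketch, assuming `streamText` from the AI SDK (the model choice and the delta forwarding are illustrative):

```ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";

// Sketch: only finalize after the stream completes. If the platform tears the
// request down mid-stream, we never reach the return, and the pending
// placeholder (stored before this call) is what the next webhook reconciles.
async function streamReply(userText: string): Promise<string> {
  const result = streamText({
    model: openai("gpt-4o-mini"), // any streaming-capable model works here
    prompt: userText,
  });

  let text = "";
  for await (const delta of result.textStream) {
    text += delta; // forward each delta to the Layercode response stream here
  }
  return text; // reached only on a completed turn; the caller finalizes now
}
```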
What about multiple interruptions in a row?
Even if a user interrupts several turns back-to-back, Layercode only sends `interruption_context` for the immediately previous assistant turn. Persist that context as soon as the new webhook starts (before any expensive work) so it survives if another interruption happens quickly afterward. The placeholder pattern above keeps your transcript accurate even during rapid-fire interrupts.
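In code, that just means the reconciliation write is the first statement in your handler. A sketch, reusing the hypothetical `WebhookBody` shape from the example above; the helper names here are also hypothetical stand-ins, with `markInterrupted` representing a single cheap storage write:

```ts
// Hypothetical helpers; each is one or two storage writes in practice.
declare function markInterrupted(conversationId: string, turnId: string): Promise<void>;
declare function storeUserMessageAndPlaceholder(body: WebhookBody): Promise<void>;
declare function streamAndFinalize(body: WebhookBody): Promise<void>;

// Sketch: persist the interruption context before anything expensive. If the
// user interrupts again moments later, this write has already landed.
async function onMessageWebhook(body: WebhookBody): Promise<void> {
  if (body.interruption_context) {
    await markInterrupted(
      body.conversation_id,
      body.interruption_context.assistant_turn_id,
    );
  }

  // Everything below may never run if another interruption cancels us,
  // but the transcript is already consistent up to this point.
  await storeUserMessageAndPlaceholder(body);
  await streamAndFinalize(body);
}
```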
Stored Message Shape and `turn_id`
Every stored message (user and assistant) includes a `turn_id` corresponding to the webhook event that created it.
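For illustration, the two rows for a single interrupted turn might look like this, reusing the hypothetical `StoredMessage` shape from the sketch above (ids and content are made up):

```ts
// Both rows share the turn's turn_id; the assistant row was never finalized.
const rows: StoredMessage[] = [
  { turn_id: "turn_41", role: "user", content: "What's the weather tomorrow?" },
  { turn_id: "turn_41", role: "assistant", content: "", status: "interrupted" },
];
```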
Persistence Notes
- There is currently no deduplication or idempotency handling in Layercode, so you will need to write your own logic to filter duplicate deliveries.
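One sketch of that filtering, assuming messages are keyed by `(turn_id, role)` within a conversation so a redelivered webhook updates in place instead of appending a duplicate (again using the hypothetical `StoredMessage` shape from above):

```ts
// Sketch: upsert by (turn_id, role) so webhook redelivery can't duplicate rows.
function upsertMessage(messages: StoredMessage[], incoming: StoredMessage): void {
  const existing = messages.find(
    (m) => m.turn_id === incoming.turn_id && m.role === incoming.role,
  );
  if (existing) {
    Object.assign(existing, incoming); // redelivery: overwrite the earlier copy
  } else {
    messages.push(incoming);
  }
}
```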
TL;DR
✅ Always store user messages immediately.
✅ Add a placeholder assistant message before streaming.
✅ Replace or mark the placeholder when the turn finishes or is interrupted.
✅ Never rely on the webhook completing — it might abort anytime.
✅ Keep `turn_id` and `conversation_id` consistent for reconciliation.