Browser Test

Browser Test is the local loop for validating a voice agent before binding it to hardware.

What you see in the UI

Browser Test has four main areas:

Run this agent: connection and microphone controls.
Live transcript / Assistant: text and audio for the latest turn.
Audio pipeline: timing spans for STT, LLM, and TTS work.
Runtime events: connection, VAD, transcript, assistant, and error logs.

Before you connect

Make sure the platform API, web console, and voice runtime are running:

# terminal 1
pnpm run api

# terminal 2
pnpm run dev

# terminal 3
pnpm run runtime

The platform API and runtime load the root .env. Leave POSTGRES_URI empty for local defaults, and add provider keys there before testing live audio. The Docker Compose stack wires the local service hostnames for you.

Run this agent

Select an agent in Agent Studio.
Review its pipeline in Configuration.
Open Browser Test.
Click Connect to open an authenticated runtime WebSocket.
Click Start mic.
Speak naturally.
Click Stop mic and wait for VAD/STT to close the turn.

Use Force reply only when you want the runtime to respond to the current final transcript without waiting for normal turn close. Use Reset to clear the runtime conversation and local panel state.
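The order of the steps above can be pictured as a small state model. This is a minimal sketch for reasoning about which control applies when, not the console's implementation: the `Phase` and `Action` names and the `nextPhase` function are hypothetical, and Force reply and Reset are left out because this page does not say how they affect connection or mic state.

```typescript
// Hypothetical model of the Connect / Start mic / Stop mic controls.
// The real console may track more state than this.
type Phase = "disconnected" | "connected" | "recording";
type Action = "connect" | "start_mic" | "stop_mic";

// Returns the next phase, or the unchanged phase when the action does not apply.
function nextPhase(phase: Phase, action: Action): Phase {
  switch (action) {
    case "connect":
      return phase === "disconnected" ? "connected" : phase;
    case "start_mic":
      // The mic can only start once the socket is open.
      return phase === "connected" ? "recording" : phase;
    default:
      // "stop_mic": return to the connected, idle state.
      return phase === "recording" ? "connected" : phase;
  }
}

// Applies a sequence of actions starting from the disconnected phase.
function runActions(actions: Action[]): Phase {
  let phase: Phase = "disconnected";
  for (const action of actions) {
    phase = nextPhase(phase, action);
  }
  return phase;
}
```

In this model, `runActions(["start_mic"])` stays in the disconnected phase, which mirrors why Connect comes before Start mic in the steps.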

Live transcript and assistant output

The transcript card shows:

interim speech from the STT stream
final user turn text after speech finalization

The assistant card shows:

clean assistant text
optional XML-like response chunks
generated TTS chunks and replay controls
current audio playback or PCM stream status

When Show chunks is enabled, the panel maps assistant <chunk> output to the generated audio chunks, which makes TTS timing and chunk quality easier to debug.
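The chunk-to-audio mapping can be sketched as follows. This is an illustration under stated assumptions, not the panel's actual code: it assumes chunks arrive as plain `<chunk>...</chunk>` tags without attributes and that audio chunks pair with text chunks by order. The `extractChunks`, `pairChunks`, and `ChunkPair` names and the `audioId` field are hypothetical.

```typescript
// Hypothetical pairing of one text chunk with one generated audio chunk.
interface ChunkPair {
  index: number;
  text: string;
  audioId: string | null; // null when no audio chunk exists for this text chunk
}

// Extracts the text inside each <chunk>...</chunk> tag (assumed tag format).
function extractChunks(raw: string): string[] {
  const pattern = /<chunk>([\s\S]*?)<\/chunk>/g;
  const chunks: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(raw)) !== null) {
    chunks.push((match[1] ?? "").trim());
  }
  return chunks;
}

// Pairs text chunks with audio chunk ids by position (assumed ordering).
function pairChunks(texts: string[], audioIds: string[]): ChunkPair[] {
  return texts.map((text, index) => ({
    index,
    text,
    audioId: audioIds[index] ?? null,
  }));
}
```

A pair with a `null` audio id would point at a text chunk that never produced audio, which is the kind of gap the Show chunks view makes visible.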

Audio pipeline timeline

The timeline groups runtime spans into speech input, LLM, and speech output. It helps answer whether a slow turn came from STT finalization, LLM generation, TTS, or browser playback.
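To make the "where did the time go" question concrete, here is a minimal sketch that sums span durations per group and reports the slowest one. The `Span` shape, the stage names, and both functions are assumptions for illustration; the runtime's real span fields may differ, and browser playback is not modeled here.

```typescript
// Hypothetical span shape; field names are assumptions, not the runtime's schema.
type Stage = "speech_input" | "llm" | "speech_output";

interface Span {
  stage: Stage;
  name: string;
  startMs: number;
  endMs: number;
}

// Sums span durations per stage. Overlapping spans in one stage are counted twice.
function totalsByStage(spans: Span[]): Record<Stage, number> {
  const totals: Record<Stage, number> = { speech_input: 0, llm: 0, speech_output: 0 };
  for (const span of spans) {
    totals[span.stage] += Math.max(0, span.endMs - span.startMs);
  }
  return totals;
}

// Returns the stage with the largest total duration.
function slowestStage(spans: Span[]): Stage {
  const totals = totalsByStage(spans);
  const stages = Object.keys(totals) as Stage[];
  return stages.reduce((worst, stage) => (totals[stage] > totals[worst] ? stage : worst));
}
```

Comparing totals like this across a fast turn and a slow turn is one way to tell whether the extra latency sits in STT finalization, LLM generation, or TTS.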

Runtime event log

The runtime log captures connection state, Deepgram readiness, VAD events, final transcripts, assistant text, generated audio chunks, resets, and errors. Keep this log in bug reports when voice behavior changes.

Runtime authorization

Before the socket opens, the console asks the platform API for a short-lived voice-session token for the selected agent. The runtime verifies that token with the platform API and loads the authorized agent config from the API. If you edit or switch agents while connected, the browser reconnects with a newly minted token.

The runtime WebSocket defaults to:

ws://localhost:8787/voice

Override it with:

VITE_RUNTIME_WS_URL=ws://localhost:8787/voice
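The default-plus-override behavior can be sketched as a small resolver. This is an assumption about how such a lookup might be written, not the console's actual code: `resolveRuntimeWsUrl` and `DEFAULT_RUNTIME_WS_URL` are hypothetical names, and treating a blank value as unset is a choice made for this sketch.

```typescript
// Default taken from this page; the constant name is hypothetical.
const DEFAULT_RUNTIME_WS_URL = "ws://localhost:8787/voice";

// Returns the override when VITE_RUNTIME_WS_URL is set and non-blank,
// otherwise falls back to the documented default.
function resolveRuntimeWsUrl(env: Record<string, string | undefined>): string {
  const override = (env["VITE_RUNTIME_WS_URL"] ?? "").trim();
  return override.length > 0 ? override : DEFAULT_RUNTIME_WS_URL;
}
```

In a Vite app, variables prefixed with `VITE_` are exposed on `import.meta.env`, so a call such as `resolveRuntimeWsUrl(import.meta.env)` would be one plausible way to use this.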

Common checks

If the page does not load agent data, confirm pnpm run api is still running and Postgres is reachable.
If Connect fails, confirm that pnpm run api and pnpm run runtime are both running and that OPENDOT_RUNTIME_INTERNAL_SECRET is set to the same value for both.
If the mic does not start, check browser microphone permissions.
If text appears but audio does not play, review the TTS encoding and browser delivery settings.
If turns close too quickly, tune VAD endpointing in Configuration.
