Browser Test is the local loop for validating a voice agent before binding it to hardware.
Figure: OpenDot Browser Test screen.

What you see in the UI

Browser Test has four main areas:
  • Run this agent: connection and microphone controls.
  • Live transcript / Assistant: text and audio for the latest turn.
  • Audio pipeline: timing spans for STT, LLM, and TTS work.
  • Runtime events: connection, VAD, transcript, assistant, and error logs.

Before you connect

Start the platform API, web console, and voice runtime, each in its own terminal:
# terminal 1: platform API
pnpm run api

# terminal 2: web console
pnpm run dev

# terminal 3: voice runtime
pnpm run runtime
The platform API and runtime load the root .env. Leave POSTGRES_URI empty for local defaults, and add provider keys there before testing live audio. The Docker Compose stack wires the local service hostnames for you.

Run this agent

  1. Select an agent in Agent Studio.
  2. Review its pipeline in Configuration.
  3. Open Browser Test.
  4. Click Connect to open an authenticated runtime WebSocket.
  5. Click Start mic.
  6. Speak naturally.
  7. Click Stop mic and wait for VAD/STT to close the turn.
Use Force reply only when you want the runtime to respond to the current final transcript without waiting for the turn to close normally. Use Reset to clear the runtime conversation and local panel state.

Live transcript and assistant output

The transcript card shows:
  • interim speech from the STT stream
  • final user turn text after speech finalization
The assistant card shows:
  • clean assistant text
  • optional XML-like response chunks
  • generated TTS chunks and replay controls
  • current audio playback or PCM stream status
When Show chunks is enabled, the panel maps assistant <chunk> output to the generated audio chunks, which makes TTS timing and chunk quality easier to debug.
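
The chunk-to-audio mapping described above can be sketched roughly as follows. This is an illustrative sketch, not the console's actual code: the `AudioChunk` and `ChunkRow` shapes, the plain `<chunk>...</chunk>` tag form, and the index-based pairing are all assumptions made for the example.

```typescript
// Hypothetical shapes: the real runtime payloads may differ.
interface AudioChunk {
  index: number;      // assumed: position of the chunk in the TTS output
  durationMs: number; // assumed: playback length of the generated audio
}

interface ChunkRow {
  index: number;
  text: string;
  audio: AudioChunk | undefined; // undefined when no matching audio exists
}

// Extract the text inside each <chunk>...</chunk> pair, in document order.
function parseChunks(assistantOutput: string): string[] {
  const pattern = /<chunk>([\s\S]*?)<\/chunk>/g;
  const texts: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(assistantOutput)) !== null) {
    texts.push(match[1].trim());
  }
  return texts;
}

// Pair each text chunk with the audio chunk that has the same index (assumed pairing rule).
function mapChunksToAudio(assistantOutput: string, audio: AudioChunk[]): ChunkRow[] {
  return parseChunks(assistantOutput).map((text, index) => ({
    index,
    text,
    audio: audio.find((candidate) => candidate.index === index),
  }));
}

// Two text chunks, but audio has only been generated for the first one.
const chunkRows = mapChunksToAudio(
  "<chunk>Hello there.</chunk><chunk>How can I help?</chunk>",
  [{ index: 0, durationMs: 640 }],
);
```

A row whose `audio` is undefined is the kind of gap worth checking when a reply's text appears complete but part of it is never spoken.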

Audio pipeline timeline

The timeline groups runtime spans into speech input, LLM, and speech output. It helps answer whether a slow turn came from STT finalization, LLM generation, TTS, or browser playback.
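
To make the "which stage was slow" question concrete, here is a minimal sketch that totals span durations per group and picks the largest. The `Span` shape, the stage identifiers, and the span names are hypothetical; the runtime's real span format is not described in this page.

```typescript
// Hypothetical span shape; field names and stage identifiers are illustrative only.
type Stage = "speech_input" | "llm" | "speech_output";

interface Span {
  name: string;    // illustrative label such as "stt.finalize"
  stage: Stage;    // which timeline group the span belongs to
  startMs: number; // start time relative to the turn
  endMs: number;   // end time relative to the turn
}

// Sum the duration of every span per stage. Overlapping spans in the same
// stage are counted twice, which is acceptable for a rough comparison.
function totalsByStage(spans: Span[]): Record<Stage, number> {
  const totals: Record<Stage, number> = { speech_input: 0, llm: 0, speech_output: 0 };
  for (const span of spans) {
    totals[span.stage] += Math.max(0, span.endMs - span.startMs);
  }
  return totals;
}

// Return the stage with the largest total duration (first one wins on ties).
function slowestStage(spans: Span[]): Stage {
  const totals = totalsByStage(spans);
  const stages: Stage[] = ["speech_input", "llm", "speech_output"];
  let slowest: Stage = "speech_input";
  for (const stage of stages) {
    if (totals[stage] > totals[slowest]) slowest = stage;
  }
  return slowest;
}

const exampleSpans: Span[] = [
  { name: "stt.finalize", stage: "speech_input", startMs: 0, endMs: 180 },
  { name: "llm.generate", stage: "llm", startMs: 180, endMs: 1100 },
  { name: "tts.synthesize", stage: "speech_output", startMs: 1100, endMs: 1450 },
];
const stageTotals = totalsByStage(exampleSpans);
const bottleneck = slowestStage(exampleSpans); // "llm" for this sample data
```

Browser playback delay would not show up in totals like these unless the runtime also reports a span for it, so a turn that looks fast here but sounds late points at delivery or playback.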

Runtime event log

The runtime log captures connection state, Deepgram readiness, VAD events, final transcripts, assistant text, generated audio chunks, resets, and errors. Include this log in bug reports when voice behavior changes.

Runtime authorization

Before the socket opens, the console asks the platform API for a short-lived voice-session token for the selected agent. The runtime verifies that token with the platform API and loads the authorized agent config from the API. If you edit or switch agents while connected, the browser reconnects with a newly minted token.

The runtime WebSocket defaults to:
ws://localhost:8787/voice
Override it with:
VITE_RUNTIME_WS_URL=ws://localhost:8787/voice
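
The default-plus-override behavior can be sketched as a small resolver. This is an assumption about how a client might read the setting, not the console's actual code: `resolveRuntimeWsUrl` is a hypothetical helper, and the scheme check is an extra safeguard added for the example. In a Vite app the caller would typically pass `import.meta.env`.

```typescript
// Default documented for the runtime WebSocket.
const DEFAULT_RUNTIME_WS_URL = "ws://localhost:8787/voice";

// Hypothetical helper: takes a plain record so it can run outside Vite.
function resolveRuntimeWsUrl(env: Record<string, string | undefined>): string {
  const rawOverride = env["VITE_RUNTIME_WS_URL"];
  const override = rawOverride === undefined ? "" : rawOverride.trim();
  // An unset or blank variable falls back to the documented default.
  const candidate = override.length > 0 ? override : DEFAULT_RUNTIME_WS_URL;
  // Extra safeguard for this sketch: reject non-WebSocket schemes early.
  if (!/^wss?:\/\//.test(candidate)) {
    throw new Error(`Runtime URL must start with ws:// or wss://, got: ${candidate}`);
  }
  return candidate;
}

const defaultRuntimeUrl = resolveRuntimeWsUrl({});
const overriddenRuntimeUrl = resolveRuntimeWsUrl({
  VITE_RUNTIME_WS_URL: "wss://runtime.example.test/voice", // placeholder host
});
```

Failing fast on a malformed URL makes a misconfigured override show up as a clear error instead of a silent Connect failure.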

Common checks

  • If the page does not load agent data, confirm pnpm run api is still running and Postgres is reachable.
  • If Connect fails, confirm that pnpm run api and pnpm run runtime are both running and use the same OPENDOT_RUNTIME_INTERNAL_SECRET.
  • If the mic does not start, check browser microphone permissions.
  • If text appears but audio does not play, review the TTS encoding and browser delivery settings.
  • If turns close too quickly, tune VAD endpointing in Configuration.