foundry-agent/invocations-ws/invocations-ws.md
# Invocations WebSocket (`invocations_ws`) Protocol

Build, deploy, and connect to Foundry hosted agents that expose a **duplex WebSocket** endpoint instead of an HTTP request/response surface. Use this for real-time, bidirectional workloads: voice agents, live transcripts, custom streaming protocols, and signaling for out-of-band media transports.

> ℹ️ **Preview.** `invocations_ws` is in public preview. For current region availability see [Foundry Hosted Agents — region availability](https://learn.microsoft.com/azure/foundry/agents/concepts/hosted-agents#region-availability). Every upgrade must carry the preview flag, either as the `foundry_features=HostedAgents=V1Preview` query parameter or as the `Foundry-Features: HostedAgents=V1Preview` request header.

## Quick Reference

| Property | Value |
|----------|-------|
| Agent type | Hosted (Bring Your Own container) only |
| Protocol id (`agent.yaml`) | `invocations_ws` |
| Recommended version | `1.0.0` |
| Container route | `WS /invocations_ws` (served by `azure-ai-agentserver-invocations`; the host binds the port and probes for you) |
| Foundry-side URL | `wss://{account}.services.ai.azure.com/api/projects/agents/endpoint/protocols/invocations_ws?project_name={project}&agent_name={agentName}&agent_session_id={sessionId}&foundry_features=HostedAgents=V1Preview` |
| Auth | `Authorization: Bearer <Entra token>` for scope `https://ai.azure.com/.default` |
| Wire format | Developer-defined (binary frames, JSON text frames, protobuf, raw PCM, anything) |
| Session affinity | Per-connection, keyed by the `agent_session_id` query parameter (optional; auto-generated if omitted) |
| Multi-turn / state | Agent-managed inside the container; platform does **not** store history |

## When to Use This Skill

- Build or operate a hosted real-time voice agent (audio in / audio out, control frames)
- Bridge an out-of-band media transport (WebRTC, SFU, telephony) to a Foundry-hosted bot via WebSocket signaling
- Stream events bidirectionally that don't fit `responses` (OpenAI-compatible) or `invocations` (single bytes-in/bytes-out HTTP)
- Connect a browser or native client to an already-deployed `invocations_ws` agent

> ℹ️ For HTTP-based invocation (single request/response, OpenAI `responses` API, or custom HTTP `invocations`), use the [`invoke`](../invoke/invoke.md) skill instead.

## Protocol Comparison

| Aspect | `responses` | `invocations` | `invocations_ws` |
|--------|-------------|---------------|------------------|
| Transport | HTTPS | HTTPS | WebSocket (`wss://`) |
| Lifetime | Per request | Per request | Long-lived duplex |
| Wire format | OpenAI-compatible JSON | Raw bytes (developer-defined) | Frames, developer-defined |
| History | Platform via `conversationId` | Agent-managed | Agent-managed via `agent_session_id` |
| Streaming | `stream: true` (SSE) | Agent-controlled | Native duplex |
| Best for | Chat | Webhooks / classifiers / protocol bridges | Voice, signaling, real-time |

## Workflow

### Step 1: Author the Container

Use the `azure-ai-agentserver-invocations` host (the same package that serves HTTP `/invocations`) and register a WebSocket handler with `@app.ws_handler`. The host runs the server, binds the port, exposes `/readiness`, handles `await websocket.accept()`, runs Ping/Pong keep-alive (default 30s), maps uncaught handler exceptions to close code `1011`, and emits the structured close event used by `azd ai agent monitor`.
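
Because the host already accepts the connection and manages keep-alive, a handler body only has to move frames. The following is a minimal echo sketch of such a body, not the sample's actual implementation. It assumes the object passed to the handler behaves like Starlette's `WebSocket` (`receive()`, `send_text()`, `send_bytes()`); the name `run_bot` is illustrative.

```python
async def run_bot(websocket) -> None:
    """Echo every frame back to the caller until the peer disconnects.

    Assumption: `websocket` is a Starlette-style WebSocket that the host
    has already accepted, so no accept() call is made here.
    """
    while True:
        message = await websocket.receive()  # raw ASGI message dict
        if message["type"] == "websocket.disconnect":
            break  # peer closed the connection; leave the loop cleanly
        if message.get("text") is not None:
            await websocket.send_text(message["text"])    # text frame
        elif message.get("bytes") is not None:
            await websocket.send_bytes(message["bytes"])  # binary frame
```

A real agent would replace the echo branches with its own framing (for example, JSON control frames as text and audio as binary).
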
You can register `@app.invocation_handler` (HTTP `POST /invocations`) and `@app.ws_handler` (WebSocket `GET /invocations_ws`) on the same `app`.

```python
from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.websockets import WebSocket

app = InvocationAgentServerHost()

@app.ws_handler  # GET /invocations_ws (WebSocket upgrade)
async def ws(websocket: WebSocket) -> None:
    await run_bot(websocket)  # your duplex protocol lives here

app.run()
```

Inside the handler, read the session id from `FOUNDRY_AGENT_SESSION_ID` (env var set by the host), or fall back to the `agent_session_id` query parameter. The container does **not** see the `Authorization` header; APIM and the Agents service strip it after validation, so don't depend on it and don't accept an `authorization` query parameter.

> ⚠️ **You define the wire format.** The platform forwards frames as-is in both directions. There is no schema validation, no OpenAPI registration, no platform-managed history. Document your protocol for callers.

See [Invocations WebSocket Protocol Guide](references/invocations-ws-protocol.md) for the framing model, the `agent_session_id` query parameter, control-vs-data frame patterns, and discovery guidance.

### Step 2: Declare the Protocol in `agent.yaml`

```yaml
kind: hosted
name: my-ws-agent
protocols:
  - protocol: invocations_ws
    version: 1.0.0
resources:
  cpu: "1"      # voice/media: at least 1 vCPU / 2 GiB; up to 2 vCPU / 4 GiB
  memory: 2Gi
environment_variables:
  - name: SOME_SECRET
    value: ${SOME_SECRET}
# Resolve every secret from the azd environment; do not bake values into the image.
```

The matching `agent.manifest.yaml` declares the same `protocol: invocations_ws` under `template.protocols`.

> ⚠️ The default `azd` scaffold uses `0.25 cpu / 0.5Gi`, which is too small for most real-time workloads.
> Bump `resources` before deploying.

### Step 3: Deploy via `azd`

Use the standard hosted-agent flow from the [`deploy`](../deploy/deploy.md) skill:

```bash
mkdir -p ~/azd-deploys/my-ws-agent && cd ~/azd-deploys/my-ws-agent
azd ai agent init -m <path>/agent.manifest.yaml -p <project-resource-id> --no-prompt
# azd env set ... for every variable referenced in agent.yaml
azd deploy my-ws-agent
```

Once the agent is `Running`, the Foundry endpoint is reachable at the URL pattern in the Quick Reference table above.

### Step 4: Connect a Client

Connect to the Foundry-side WebSocket directly:

1. **Mint an Entra token** for the audience `https://ai.azure.com`:

   ```bash
   az account get-access-token --resource https://ai.azure.com --query accessToken -o tsv
   ```

2. **Build the upstream URL.** The `agent_session_id` query parameter is **optional**: if you omit it, the platform generates one; supply your own (URL-safe; see [Session Management](../invoke/references/session-management.md) for ID format) only when you need to resume an existing session. The preview flag is required:

   ```
   wss://{account}.services.ai.azure.com/api/projects/agents/endpoint/protocols/invocations_ws
     ?project_name={project}
     &agent_name={agentName}
     &agent_session_id={your-id}   # optional
     &foundry_features=HostedAgents=V1Preview
   ```

   You can alternatively pass the preview flag as the `Foundry-Features: HostedAgents=V1Preview` request header on the upgrade.

3. **Open the WebSocket** with header `Authorization: Bearer <token>`. Browser code typically needs a small server-side proxy because the browser `WebSocket` constructor cannot set headers.

4. **Speak your protocol.** Send and receive whatever your container expects.

### Step 5: Multi-turn / Session State

There is no platform-managed history. To correlate frames across reconnects or keep per-user state, reuse the same `agent_session_id` and key your state off it inside the container.
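
One way to key state off the session id is sketched below, under stated assumptions. `resolve_session_id`, `SessionState`, and `SessionStore` are hypothetical names rather than part of the host package, and the in-memory dictionary only survives while the same container process stays alive. Use an external store if state must outlive a restart.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, List, Mapping, Optional


def resolve_session_id(
    query_params: Mapping[str, str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer the host-provided env var, then the query parameter."""
    return (
        environ.get("FOUNDRY_AGENT_SESSION_ID")
        or query_params.get("agent_session_id")
    )


@dataclass
class SessionState:
    """Whatever the agent wants to remember; here, just a list of turns."""
    turns: List[str] = field(default_factory=list)


class SessionStore:
    """In-memory map from agent_session_id to agent-managed state."""

    def __init__(self) -> None:
        self._sessions: Dict[str, SessionState] = {}

    def get(self, session_id: str) -> SessionState:
        # Create the state on first use, return the same object afterwards.
        return self._sessions.setdefault(session_id, SessionState())
```

Inside a handler, `resolve_session_id(websocket.query_params)` would supply the key (assuming a Starlette `WebSocket`), and a module-level `SessionStore` would return the same `SessionState` when a client reconnects with the same id.
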
See [Session Management](../invoke/references/session-management.md).

### Step 6: Observe and Troubleshoot

Stream container logs while testing:

```bash
azd ai agent monitor my-ws-agent --follow
# scope to a single connection
azd ai agent monitor my-ws-agent --session-id <agent_session_id> --follow
```

The same `agent_session_id` can be used to stream container logs (see the [`troubleshoot`](../troubleshoot/troubleshoot.md) skill for deeper diagnostics).

## Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| HTTP 401 / 403 on WS upgrade | Missing or stale Entra token | Re-run `az account get-access-token --resource https://ai.azure.com`; ensure the caller has Foundry data-plane RBAC |
| HTTP 404 on upgrade | Wrong `agent_name` / `project_name`, missing preview flag, or unsupported region | Verify with `agent_get`; ensure `foundry_features=HostedAgents=V1Preview` is on the URL (or `Foundry-Features` header); confirm region per [Hosted Agents region availability](https://learn.microsoft.com/azure/foundry/agents/concepts/hosted-agents#region-availability) |
| WS closes immediately after accept | Container handler raised inside the request | Check logs via `azd ai agent monitor`; typical causes are missing env vars or unreachable backend services |
| Browser cannot connect directly | Browser `WebSocket` cannot set `Authorization` | Run a thin server-side proxy that injects the token before forwarding |
| Frames received but no response | Wire-format mismatch | Confirm both ends use the same framing (binary vs text, codec, sample rate, schema). The platform does **not** validate or transcode frames |
| Cold-start delay on first connect | Container initialising (VAD, model load, etc.) | Expected; subsequent connections to the same container are fast |
| State lost across reconnect | Different `agent_session_id` used | Reuse the same `agent_session_id` query parameter to preserve agent-managed state |

## Reference Samples

End-to-end working samples (server container + browser portal) live in the [`foundry-samples`](https://github.com/microsoft-foundry/foundry-samples) repo under:

```
samples/python/hosted-agents/bring-your-own/invocations_ws/
```

Each sub-folder shows a different media-path strategy (audio entirely over the WebSocket vs. WebSocket as signaling-only for an out-of-band media transport). Pick the one whose architecture matches your latency, NAT-traversal, and operational constraints.

## Additional Resources

- [Invocations WebSocket Protocol Guide](references/invocations-ws-protocol.md)
- [Session Management](../invoke/references/session-management.md)
- [Foundry Hosted Agents](https://learn.microsoft.com/en-us/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry)
- [`invoke` skill](../invoke/invoke.md): HTTP-based `responses` and `invocations` protocols
- [`deploy` skill](../deploy/deploy.md): package and deploy hosted-agent containers
- [`troubleshoot` skill](../troubleshoot/troubleshoot.md): diagnose hosted-agent runtime failures
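
As a companion to Step 4, the URL and header construction can be sketched in Python using only the standard library. `build_ws_url` and `auth_headers` are hypothetical helper names, not part of any SDK. The account, project, and agent values are placeholders, and the token is assumed to come from `az account get-access-token` as shown in Step 4.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

PREVIEW_FLAG = "HostedAgents=V1Preview"


def build_ws_url(
    account: str,
    project: str,
    agent_name: str,
    session_id: Optional[str] = None,
) -> str:
    """Assemble the Foundry-side wss:// URL from the Quick Reference pattern."""
    params = {"project_name": project, "agent_name": agent_name}
    if session_id:
        # Optional: omit to let the platform generate a session id.
        params["agent_session_id"] = session_id
    params["foundry_features"] = PREVIEW_FLAG
    base = (
        f"wss://{account}.services.ai.azure.com"
        "/api/projects/agents/endpoint/protocols/invocations_ws"
    )
    # safe="=" keeps the flag value in the same literal form the docs show.
    return f"{base}?{urlencode(params, safe='=')}"


def auth_headers(token: str) -> Dict[str, str]:
    """Headers to send on the WebSocket upgrade request."""
    return {"Authorization": f"Bearer {token}"}
```

Pass the resulting URL and headers to any WebSocket client library that can set request headers on the upgrade; browser clients still need the server-side proxy described in Step 4.
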