# Invocations WebSocket (`invocations_ws`) Protocol

Build, deploy, and connect to Foundry hosted agents that expose a **duplex WebSocket** endpoint instead of an HTTP request/response surface. Use this for real-time, bidirectional workloads: voice agents, live transcripts, custom streaming protocols, and signaling for out-of-band media transports.

> ℹ️ **Preview.** `invocations_ws` is in public preview. For current region availability see [Foundry Hosted Agents: region availability](https://learn.microsoft.com/azure/foundry/agents/concepts/hosted-agents#region-availability). Every upgrade must carry the preview flag, either the `foundry_features=HostedAgents=V1Preview` query parameter or the `Foundry-Features: HostedAgents=V1Preview` request header.

## Quick Reference

| Property | Value |
|----------|-------|
| Agent type | Hosted (Bring Your Own container) only |
| Protocol id (`agent.yaml`) | `invocations_ws` |
| Recommended version | `1.0.0` |
| Container route | `WS /invocations_ws` (served by `azure-ai-agentserver-invocations`; the host binds the port and probes for you) |
| Foundry-side URL | `wss://{account}.services.ai.azure.com/api/projects/agents/endpoint/protocols/invocations_ws?project_name={project}&agent_name={agentName}&agent_session_id={sessionId}&foundry_features=HostedAgents=V1Preview` |
| Auth | `Authorization: Bearer <Entra token>` for scope `https://ai.azure.com/.default` |
| Wire format | Developer-defined (binary frames, JSON text frames, protobuf, raw PCM; anything) |
| Session affinity | Per-connection, keyed by the `agent_session_id` query parameter (optional; auto-generated if omitted) |
| Multi-turn / state | Agent-managed inside the container; platform does **not** store history |

## When to Use This Skill

- Build or operate a hosted real-time voice agent (audio in / audio out, control frames)
- Bridge an out-of-band media transport (WebRTC, SFU, telephony) to a Foundry-hosted bot via WebSocket signaling
- Stream events bidirectionally that don't fit `responses` (OpenAI-compatible) or `invocations` (single bytes-in/bytes-out HTTP)
- Connect a browser or native client to an already-deployed `invocations_ws` agent

> ℹ️ For HTTP-based invocation (single request/response, OpenAI `responses` API, or custom HTTP `invocations`), use the [`invoke`](../invoke/invoke.md) skill instead.

## Protocol Comparison

| Aspect | `responses` | `invocations` | `invocations_ws` |
|--------|-------------|---------------|------------------|
| Transport | HTTPS | HTTPS | WebSocket (`wss://`) |
| Lifetime | Per request | Per request | Long-lived duplex |
| Wire format | OpenAI-compatible JSON | Raw bytes (developer-defined) | Frames, developer-defined |
| History | Platform via `conversationId` | Agent-managed | Agent-managed via `agent_session_id` |
| Streaming | `stream: true` (SSE) | Agent-controlled | Native duplex |
| Best for | Chat | Webhooks / classifiers / protocol bridges | Voice, signaling, real-time |

## Workflow

### Step 1: Author the Container

Use the `azure-ai-agentserver-invocations` host (the same package that serves HTTP `/invocations`) and register a WebSocket handler with `@app.ws_handler`. The host runs the server, binds the port, exposes `/readiness`, handles `await websocket.accept()`, runs Ping/Pong keep-alive (default 30s), maps uncaught handler exceptions to close code `1011`, and emits the structured close event used by `azd ai agent monitor`.
You can register `@app.invocation_handler` (HTTP `POST /invocations`) and `@app.ws_handler` (WebSocket `GET /invocations_ws`) on the same `app`.

```python
from azure.ai.agentserver.invocations import InvocationAgentServerHost
from starlette.websockets import WebSocket

app = InvocationAgentServerHost()


async def run_bot(websocket: WebSocket) -> None:
    # Placeholder protocol (an assumption for illustration): echo each text frame back.
    async for message in websocket.iter_text():
        await websocket.send_text(message)


@app.ws_handler  # GET /invocations_ws (WebSocket upgrade)
async def ws(websocket: WebSocket) -> None:
    await run_bot(websocket)  # your duplex protocol lives here


app.run()
```

Inside the handler, read the session id from `FOUNDRY_AGENT_SESSION_ID` (env var set by the host), or fall back to the `agent_session_id` query parameter. The container does **not** see the `Authorization` header: APIM and the Agents service strip it after validation, so don't depend on it and don't accept an `authorization` query parameter.

> ⚠️ **You define the wire format.** The platform forwards frames as-is in both directions. There is no schema validation, no OpenAPI registration, no platform-managed history. Document your protocol for callers.

See [Invocations WebSocket Protocol Guide](references/invocations-ws-protocol.md) for the framing model, the `agent_session_id` query parameter, control-vs-data frame patterns, and discovery guidance.

### Step 2: Declare the Protocol in `agent.yaml`

```yaml
kind: hosted
name: my-ws-agent
protocols:
  - protocol: invocations_ws
    version: 1.0.0
resources:
  cpu: "1"  # voice/media: at least 1 vCPU / 2 GiB; up to 2 vCPU / 4 GiB
  memory: 2Gi
environment_variables:
  - name: SOME_SECRET
    value: ${SOME_SECRET}
# Resolve every secret from the azd environment; do not bake values into the image.
```

The matching `agent.manifest.yaml` declares the same `protocol: invocations_ws` under `template.protocols`.

> ⚠️ The default `azd` scaffold uses `0.25 cpu / 0.5Gi`, which is too small for most real-time workloads.
Bump `resources` before deploying.

### Step 3: Deploy via `azd`

Use the standard hosted-agent flow from the [`deploy`](../deploy/deploy.md) skill:

```bash
mkdir ~/azd-deploys/my-ws-agent && cd ~/azd-deploys/my-ws-agent
azd ai agent init -m <path>/agent.manifest.yaml -p <project-resource-id> --no-prompt
# azd env set ... for every variable referenced in agent.yaml
azd deploy my-ws-agent
```

Once `Running`, the Foundry endpoint is reachable at the URL pattern in the Quick Reference table above.

### Step 4: Connect a Client

Connect to the Foundry-side WebSocket directly:

1. **Mint an Entra token** for the audience `https://ai.azure.com`:

   ```bash
   az account get-access-token --resource https://ai.azure.com --query accessToken -o tsv
   ```

2. **Build the upstream URL.** The `agent_session_id` query parameter is **optional**: if you omit it, the platform generates one; supply your own (URL-safe; see [Session Management](../invoke/references/session-management.md) for ID format) only when you need to resume an existing session. The preview flag is required:

   ```
   wss://{account}.services.ai.azure.com/api/projects/agents/endpoint/protocols/invocations_ws
   ?project_name={project}
   &agent_name={agentName}
   &agent_session_id={your-id} # optional
   &foundry_features=HostedAgents=V1Preview
   ```

   You can alternatively pass the preview flag as the `Foundry-Features: HostedAgents=V1Preview` request header on the upgrade.

3. **Open the WebSocket** with header `Authorization: Bearer <token>`. Browser code typically needs a small server-side proxy because the browser `WebSocket` constructor cannot set headers.

4. **Speak your protocol.** Send and receive whatever your container expects.

### Step 5: Multi-turn / Session State

There is no platform-managed history. To correlate frames across reconnects or keep per-user state, reuse the same `agent_session_id` and key your state off it inside the container.
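
As a rough sketch of keying state off the session id inside the container, the helper below resolves the id in the order Step 1 describes (the `FOUNDRY_AGENT_SESSION_ID` environment variable first, then the `agent_session_id` query parameter) and keeps per-session state in a process-local dictionary. The names `resolve_session_id` and `SessionStore`, and the in-memory storage itself, are illustrative assumptions and not part of `azure-ai-agentserver-invocations`; state that has to outlive the container process would need an external store.

```python
import os
from typing import Any, Mapping, Optional


def resolve_session_id(
    query_params: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Prefer the host-provided env var, then fall back to the query parameter."""
    env = os.environ if env is None else env
    return env.get("FOUNDRY_AGENT_SESSION_ID") or query_params.get("agent_session_id")


class SessionStore:
    """Hypothetical process-local store: one mutable dict per session id."""

    def __init__(self) -> None:
        self._sessions: dict[str, dict[str, Any]] = {}

    def get(self, session_id: str) -> dict[str, Any]:
        # setdefault returns the existing dict on reconnect, or creates a new one.
        return self._sessions.setdefault(session_id, {"turns": 0})


store = SessionStore()

# First connection for session "abc" (placeholder id).
sid = resolve_session_id({"agent_session_id": "abc"}, env={})
store.get(sid)["turns"] += 1

# A reconnect that reuses the same id sees the earlier state.
state_after_reconnect = store.get(resolve_session_id({"agent_session_id": "abc"}, env={}))
state_after_reconnect["turns"] += 1

# A different id starts from scratch.
fresh_state = store.get("other")
```

Inside a `@app.ws_handler` function, Starlette's `websocket.query_params` mapping could be passed as `query_params`.
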
See [Session Management](../invoke/references/session-management.md).

### Step 6: Observe and Troubleshoot

Stream container logs while testing:

```bash
azd ai agent monitor my-ws-agent --follow
# scope to a single connection
azd ai agent monitor my-ws-agent --session-id <agent_session_id> --follow
```

The same `agent_session_id` can be used to stream container logs (see the [`troubleshoot`](../troubleshoot/troubleshoot.md) skill for deeper diagnostics).

## Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| HTTP 401 / 403 on WS upgrade | Missing or stale Entra token | Re-run `az account get-access-token --resource https://ai.azure.com`; ensure the caller has Foundry data-plane RBAC |
| HTTP 404 on upgrade | Wrong `agent_name` / `project_name`, missing preview flag, or unsupported region | Verify with `agent_get`; ensure `foundry_features=HostedAgents=V1Preview` is on the URL (or `Foundry-Features` header); confirm region per [Hosted Agents region availability](https://learn.microsoft.com/azure/foundry/agents/concepts/hosted-agents#region-availability) |
| WS closes immediately after accept | Container handler raised inside the request | Check logs via `azd ai agent monitor`; typical causes are missing env vars or unreachable backend services |
| Browser cannot connect directly | Browser `WebSocket` cannot set `Authorization` | Run a thin server-side proxy that injects the token before forwarding |
| Frames received but no response | Wire-format mismatch | Confirm both ends use the same framing (binary vs text, codec, sample rate, schema). The platform does **not** validate or transcode frames |
| Cold-start delay on first connect | Container initialising (VAD, model load, etc.) | Expected; subsequent connections to the same container are fast |
| State lost across reconnect | Different `agent_session_id` used | Reuse the same `agent_session_id` query parameter to preserve agent-managed state |

## Reference Samples

End-to-end working samples (server container + browser portal) live in the [`foundry-samples`](https://github.com/microsoft-foundry/foundry-samples) repo under:

```
samples/python/hosted-agents/bring-your-own/invocations_ws/
```

Each sub-folder shows a different media-path strategy (audio entirely over the WebSocket vs. WebSocket as signaling-only for an out-of-band media transport). Pick the one whose architecture matches your latency, NAT-traversal, and operational constraints.

## Additional Resources

- [Invocations WebSocket Protocol Guide](references/invocations-ws-protocol.md)
- [Session Management](../invoke/references/session-management.md)
- [Foundry Hosted Agents](https://learn.microsoft.com/en-us/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry)
- [`invoke` skill](../invoke/invoke.md): HTTP-based `responses` and `invocations` protocols
- [`deploy` skill](../deploy/deploy.md): package and deploy hosted-agent containers
- [`troubleshoot` skill](../troubleshoot/troubleshoot.md): diagnose hosted-agent runtime failures
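
As a closing sketch of the client side of Step 4, the helpers below assemble the Foundry-side URL from the Quick Reference pattern and build the `Authorization` header for the upgrade. The function names and the sample values (`contoso`, `my-project`, `session-123`) are placeholders chosen for illustration; opening the socket itself is left to whichever WebSocket client library you use.

```python
from typing import Optional
from urllib.parse import urlencode


def build_invocations_ws_url(
    account: str,
    project: str,
    agent_name: str,
    session_id: Optional[str] = None,
) -> str:
    """Assemble the Foundry-side wss:// URL from the Quick Reference pattern."""
    params = {"project_name": project, "agent_name": agent_name}
    if session_id is not None:
        # Optional: the platform generates a session id when this is omitted.
        params["agent_session_id"] = session_id
    # The preview flag is required on every upgrade.
    params["foundry_features"] = "HostedAgents=V1Preview"
    base = (
        f"wss://{account}.services.ai.azure.com"
        "/api/projects/agents/endpoint/protocols/invocations_ws"
    )
    # safe="=" keeps the flag value in the documented HostedAgents=V1Preview form.
    return f"{base}?{urlencode(params, safe='=')}"


def auth_headers(token: str) -> dict[str, str]:
    """Header for the WebSocket upgrade; the token comes from the az CLI call in Step 4."""
    return {"Authorization": f"Bearer {token}"}


url = build_invocations_ws_url("contoso", "my-project", "my-ws-agent", "session-123")
```

Passing `session_id=None` leaves `agent_session_id` off the URL, which matches the "auto-generated if omitted" behaviour described in the Quick Reference table.
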