Deploy, evaluate, and manage AI agents end-to-end on Microsoft Azure AI Foundry
foundry-agent/troubleshoot/troubleshoot.md
# Foundry Agent Troubleshoot

Troubleshoot and debug Foundry agents by collecting hosted-agent session logs, discovering observability connections, and querying Application Insights telemetry.

## Quick Reference

| Property | Value |
|----------|-------|
| Agent types | Prompt (LLM-based), Hosted |
| MCP servers | `azure` |
| Key Foundry MCP tools | `agent_get` |
| Related skills | `trace` (telemetry analysis) |
| Preferred query tool | `monitor_resource_log_query` (Azure MCP); preferred over `azure-kusto` for App Insights |
| CLI references | `az cognitiveservices account connection`, `az rest`, `curl` |

## When to Use This Skill

- Agent is not responding or returning errors
- Hosted agent version is not becoming active
- Need to view hosted-agent session logs
- Diagnose latency or timeout issues
- Query Application Insights for agent traces and exceptions
- Investigate agent runtime failures

## MCP Tools

| Tool | Description | Parameters |
|------|-------------|------------|
| `agent_get` | Get agent details to determine type and inspect agent/version status | `projectEndpoint` (required), `agentName` (optional) |

## Workflow

### Step 1: Collect Agent Information

Use the project endpoint and agent name from the project context (see [Common Project Context Resolution](../../SKILL.md#agent-common-project-context-resolution)). Ask the user only for values not already resolved:
- **Project endpoint**: AI Foundry project endpoint URL
- **Agent name**: Name of the agent to troubleshoot

### Step 2: Determine Agent Type

Use `agent_get` with `projectEndpoint` and `agentName` to retrieve the agent definition. Check the `kind` field:
- `"hosted"` → Proceed to Step 3
- `"prompt"` → Skip to Step 4 (Discover Observability Connections)

### Step 3: Retrieve Logs (Hosted Agents Only)

Hosted-agent logs are scoped to individual **sessions** (sandbox instances).

> ℹ️ **`invocations_ws` agents:** the `sessionId` used by these REST endpoints is the **client-supplied `agent_session_id`** that the WebSocket client put on the upgrade URL, not a value issued by `session_create`. If the user has the WS client logs, pull the `agent_session_id` from there and pass it as `sessionId` below. See the [invocations-ws skill](../invocations-ws/invocations-ws.md) for the WS URL contract.

1. **Check agent version status**: Use `agent_get` to verify the agent version status is `active`. If it is not active, the agent may still be provisioning or may have failed to become active.

2. **List sessions**: Hosted-agent logs require a `sessionId`. If the user does not have one, list available sessions:
   ```bash
   az rest --method GET \
     --url "<projectEndpoint>/agents/<agentName>/sessions?api-version=2025-11-15-preview" \
     --headers "Foundry-Features=HostedAgents=V1Preview" \
     --resource "https://ai.azure.com"
   ```

3. **Retrieve session logs**: The log stream endpoint uses Server-Sent Events (SSE). Use `curl` with a timeout:
   ```bash
   TOKEN=$(az account get-access-token --resource "https://ai.azure.com" --query accessToken -o tsv)
   curl -s --max-time 15 \
     -H "Authorization: Bearer $TOKEN" \
     -H "Accept: text/event-stream" \
     -H "Foundry-Features: HostedAgents=V1Preview" \
     "<projectEndpoint>/agents/<agentName>/sessions/<sessionId>:logstream?api-version=2025-11-15-preview"
   ```

   > ⚠️ **404 is expected** if the session sandbox has not been created yet. Advise the user to send a message to the agent first to trigger sandbox creation, then retry.

4. **Interpret the logs**: Each SSE frame is `event: log\ndata: {...}\n\n`:
   - **Preamble** (first event): JSON with `session_state`, `session_id`, `agent`, `version`, `last_accessed`
   - **Log lines** (subsequent events): JSON with `stream` (`stdout`/`stderr`/`status`), `message`, and `timestamp`
   - **Error events**: `event: error` frames indicate server-side errors within the session sandbox

Present the logs to the user and highlight any errors or warnings found.

### Step 4: Discover Observability Connections

List the project connections to find Application Insights or Azure Monitor resources using the Azure CLI command documented at:
[az cognitiveservices account connection](https://learn.microsoft.com/en-us/cli/azure/cognitiveservices/account/connection?view=azure-cli-latest)

Refer to the documentation above for the exact command syntax and parameters. Look for connections of type `ApplicationInsights` or `AzureMonitor` in the output.

If no observability connection is found, inform the user and suggest setting up Application Insights for the project. Ask if they want to proceed without telemetry data.

### Step 5: Query Application Insights Telemetry

Use **`monitor_resource_log_query`** (Azure MCP tool) to run KQL queries against the Application Insights resource discovered in Step 4. This is preferred over delegating to the `azure-kusto` skill. Pass the App Insights resource ID and the KQL query directly.

> ⚠️ **Always pass `subscription` explicitly** to Azure MCP tools like `monitor_resource_log_query`; they don't extract it from resource IDs.

Use `* contains "<response_id>"` or `* contains "<agent_name>"` filters to narrow down results to the specific agent instance.

### Step 6: Summarize Findings

Present a summary to the user including:
- **Agent type and status**: hosted or prompt; hosted agent version status when relevant
- **Log errors**: key errors from hosted-agent session logs
- **Telemetry insights**: exceptions, failed requests, latency trends
- **Recommended actions**: specific steps to resolve identified issues

## Error Handling

| Error | Cause | Resolution |
|-------|-------|------------|
| Agent not found | Invalid agent name or project endpoint | Use `agent_get` to list available agents and verify name |
| Hosted agent not active | Hosted agent is still provisioning or failed | Check that the ACR image was pushed correctly and agent identity permissions are assigned; wait and re-check status |
| Session logs 404 | Session sandbox has not been created yet | The sandbox is created on first invocation; send a message to the agent to trigger sandbox creation, then retry |
| SSE error event | Server-side error within the session sandbox | Check the error event `data` field for details |
| No session ID | User does not know which session to troubleshoot | List sessions via REST API (see Step 3) |
| No observability connection | Application Insights not configured for the project | Suggest configuring Application Insights for the Foundry project |
| Kusto query failed | Invalid cluster/database or insufficient permissions | Verify Application Insights resource details and reader permissions |
| No telemetry data | Agent not instrumented or too recent | Check if Application Insights SDK is configured; data may take a few minutes to appear |

## Additional Resources

- [Foundry Hosted Agents](https://learn.microsoft.com/azure/ai-foundry/agents/concepts/hosted-agents?view=foundry)
- [Account Connection CLI Reference](https://learn.microsoft.com/en-us/cli/azure/cognitiveservices/account/connection?view=azure-cli-latest)
- [KQL Quick Reference](https://learn.microsoft.com/azure/data-explorer/kusto/query/kql-quick-reference)
- [Foundry Samples](https://github.com/microsoft-foundry/foundry-samples)
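
As an illustration of the SSE frame layout described in Step 3 (`event: log` or `event: error`, followed by a `data:` line and a blank line), the following is a minimal sketch for picking payloads out of a saved log stream. The function names `sse_payloads` and `sse_errors` are hypothetical helpers, not part of any Azure tooling, and the sketch assumes one `data:` line per frame with a space after the field name and LF line endings.

```bash
# Sketch only: helper names are illustrative, not part of any Azure tool.
# Assumes frames shaped like "event: <type>" then "data: {...}" then a blank line.

# Print the payload of every `data:` line read from stdin.
sse_payloads() {
  sed -n 's/^data: *//p'
}

# Print only the payloads that belong to `event: error` frames.
sse_errors() {
  awk '
    /^event:/ { is_error = ($2 == "error") }   # remember the current frame type
    /^data:/  { if (is_error) { sub(/^data: */, ""); print } }
    /^$/      { is_error = 0 }                 # a blank line ends the frame
  '
}
```

One possible use is to redirect the `curl` output from Step 3 into a file and then run `sse_errors < session.log` to surface only the error frames before reading the full log.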
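
To make the `* contains` filter from Step 5 concrete, here is a hedged sketch that composes a KQL query string which could then be passed to `monitor_resource_log_query`. The function name `build_agent_kql`, the 24-hour default window, the 50-row limit, and the table names (`traces`, `exceptions`, `requests`, `dependencies`, which assume the classic Application Insights schema) are all assumptions for illustration; adjust them to the resource actually discovered in Step 4.

```bash
# Sketch only: function name, defaults, and table names are assumptions.
# Composes a KQL query that narrows telemetry to one agent or response ID.
build_agent_kql() {
  local needle="$1"          # response ID or agent name to search for
  local window="${2:-24h}"   # look-back window as a KQL timespan literal
  cat <<EOF
union traces, exceptions, requests, dependencies
| where timestamp > ago(${window})
| where * contains "${needle}"
| order by timestamp desc
| take 50
EOF
}
```

For example, `build_agent_kql "<agent_name>" 1h` prints a query limited to the last hour. The search value is interpolated as-is, so a value containing double quotes would need escaping first.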