Foundry Agent Trace Analysis
Analyze production traces for Foundry agents using Application Insights and GenAI OpenTelemetry semantic conventions. This skill provides structured KQL-powered workflows for a selected agent root and environment: searching conversations, diagnosing failures, and identifying latency bottlenecks.
When to Use This Skill
USE FOR: analyze agent traces, search agent conversations, find failing traces, slow traces, latency analysis, trace search, conversation history, agent errors in production, debug agent responses, App Insights traces, GenAI telemetry, trace correlation, span tree, production trace analysis, evaluation results, evaluation scores, eval run results, find by response ID, get agent trace by conversation ID, agent evaluation scores from App Insights.
USE THIS SKILL INSTEAD OF azure-monitor or azure-applicationinsights when querying Foundry agent traces, evaluations, or GenAI telemetry. This skill has correct GenAI OTel attribute mappings and tested KQL templates that those general tools lack.
⚠️ DO NOT manually write KQL queries for GenAI trace analysis without reading this skill first. This skill provides tested query templates with correct GenAI OTel attribute mappings, proper span correlation logic, environment-aware scoping, and conversation-level aggregation patterns.
Quick Reference
| Property | Value |
|---|---|
| Data source | Application Insights (App Insights) |
| Query language | KQL (Kusto Query Language) |
| Related skills | troubleshoot (hosted-agent logs), eval-datasets (trace harvesting) |
| Preferred query tool | monitor_resource_log_query (Azure MCP) - use for App Insights KQL queries |
| OTel conventions | GenAI Spans, Agent Spans |
| Local metadata | selected .foundry/agent-metadata*.yaml file |
Entry Points
| User Intent | Start At |
|---|---|
| "Search agent conversations" / "Find traces" | Search Traces |
| "Tell me about response ID X" / "Look up response ID" | Search Traces - Search by Response ID |
| "Why is my agent failing?" / "Find errors" | Analyze Failures |
| "My agent is slow" / "Latency analysis" | Analyze Latency |
| "Show me this conversation" / "Trace detail" | Conversation Detail |
| "Find eval results for response ID" / "eval scores from traces" | Eval Correlation |
| "What KQL do I need?" | KQL Templates |
Before Starting — Resolve App Insights Connection
- Resolve the target agent root, selected metadata file, and environment from .foundry/agent-metadata*.yaml.
- Check environments.<env>.observability.applicationInsightsConnectionString or environments.<env>.observability.applicationInsightsResourceId in the selected metadata file.
- If observability settings are missing, use project_connection_list to discover App Insights linked to the Foundry project, then persist the chosen resource back to environments.<env>.observability in the selected metadata file before querying.
- Confirm the selected App Insights resource and environment with the user before querying.
- Use monitor_resource_log_query (Azure MCP tool) to execute KQL queries against the App Insights resource. This is preferred over delegating to the azure-kusto skill. Pass the App Insights resource ID and the KQL query directly.
| Metadata field | Purpose | Example |
|---|---|---|
| environments.<env>.observability.applicationInsightsConnectionString | App Insights connection string | InstrumentationKey=...;IngestionEndpoint=... |
| environments.<env>.observability.applicationInsightsResourceId | ARM resource ID | /subscriptions/.../Microsoft.Insights/components/... |
⚠️ Always pass subscription explicitly to Azure MCP tools like monitor_resource_log_query - they do not extract it from resource IDs.
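The resolution steps above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation: it assumes the metadata file has already been parsed into a plain dict, and the helper names resolve_observability and subscription_from_resource_id are hypothetical. Only the two observability keys and the /subscriptions/... resource ID shape come from this document.

```python
import re
from typing import Optional

# Assumption: the selected .foundry/agent-metadata*.yaml file has already been
# loaded into a dict. Both function names below are hypothetical helpers.

def resolve_observability(metadata: dict, env: str) -> Optional[dict]:
    """Return the App Insights settings for `env`, or None if none are set."""
    environments = metadata.get("environments") or {}
    obs = (environments.get(env) or {}).get("observability") or {}
    resource_id = obs.get("applicationInsightsResourceId")
    connection_string = obs.get("applicationInsightsConnectionString")
    if not resource_id and not connection_string:
        # Caller should fall back to discovery via project_connection_list.
        return None
    return {"resourceId": resource_id, "connectionString": connection_string}


# ARM resource IDs start with /subscriptions/<subscription-id>/...
_SUBSCRIPTION_RE = re.compile(r"^/subscriptions/([^/]+)/", re.IGNORECASE)

def subscription_from_resource_id(resource_id: str) -> Optional[str]:
    """Extract the subscription ID so it can be passed explicitly to the tool."""
    match = _SUBSCRIPTION_RE.match(resource_id)
    return match.group(1) if match else None
```

Extracting the subscription up front keeps the explicit subscription argument in sync with the resource ID that is passed to monitor_resource_log_query.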
Behavioral Rules
- Always display the KQL query. Before executing any KQL query, display it in a code block. Never run a query silently.
- Keep environment visible. Include the selected environment and agent name in each search summary, and include the derived agent version when the query can recover it from telemetry.
- Start broad, then narrow. Begin with conversation-level summaries, then drill into specific conversations or spans on user request.
- Use time ranges. Always scope queries with a time range (default: last 24 hours). Ask the user for the range if not specified.
- Explain GenAI attributes. When displaying results, translate OTel attribute names to human-readable labels (for example, gen_ai.operation.name -> "Operation").
- Link to conversation detail. When showing search or failure results, offer to drill into any specific conversation.
- Scope to the selected environment. App Insights may contain traces from multiple agents or environments. Filter with the selected environment's agent name first, then add an environment tag filter if the telemetry emits one.
- Resolve hosted-agent identity from requests first. For hosted agents, prefer requests-scoped gen_ai.agent.name or azure.ai.agentserver.agent_name as the Foundry-facing filter. When gen_ai.agent.id is emitted in <agent-name>:<version> format, parse it to surface agentVersion, but do not treat dependencies.gen_ai.agent.name as the top-level hosted-agent name.
- Use operation_Id to fan out hosted-agent traces. After isolating the hosted-agent requests rows, materialize their operation_Id values and join other telemetry tables on operation_Id. When conversation IDs are sparse, use coalesce(gen_ai.conversation.id, azure.ai.agentserver.conversation_id, operation_Id) so every row still rolls up to a stable conversation key.
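The last two rules describe logic that normally lives inside the KQL itself. The Python sketch below mirrors it for post-processing rows that have already been returned, under stated assumptions: parse_agent_id and conversation_key are hypothetical helper names, each row is assumed to be a flat mapping keyed by the attribute names listed above, the version is assumed to follow the last colon, and empty strings are treated as missing.

```python
from typing import Mapping, Optional, Tuple

def parse_agent_id(agent_id: str) -> Tuple[str, Optional[str]]:
    """Split '<agent-name>:<version>' into (name, version).

    Returns (agent_id, None) when the value is not in that format.
    Assumption: the version is whatever follows the last colon.
    """
    name, sep, version = agent_id.rpartition(":")
    if not sep or not name or not version:
        return agent_id, None
    return name, version


# Fallback order taken from the coalesce(...) expression in the rule above.
_CONVERSATION_FIELDS = (
    "gen_ai.conversation.id",
    "azure.ai.agentserver.conversation_id",
    "operation_Id",
)

def conversation_key(row: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the first non-empty conversation identifier for a row."""
    for field in _CONVERSATION_FIELDS:
        value = row.get(field)
        if value:
            return value
    return None
```

For example, a row that carries only operation_Id still gets a key, so it can be grouped with the rest of its trace instead of being dropped from conversation-level summaries.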