foundry-agent/trace/references/analyze-latency.md
# Analyze Latency — Find and Diagnose Slow Traces

Identify slow agent traces, find bottleneck spans, and correlate with token usage.

## Step 1 — Find Slow Conversations

> ⚠️ **Hosted agents:** `gen_ai.agent.name` on `dependencies` holds the **code-level class name** (e.g., `BingSearchAgent`), NOT the Foundry agent name. To scope by Foundry name, use the [Hosted Agent Variant](#hosted-agent-variant--latency) below.

```kql
dependencies
| where timestamp > ago(24h)
| where customDimensions["gen_ai.operation.name"] == "invoke_agent"
| project timestamp, duration, success,
    agentName = tostring(customDimensions["gen_ai.agent.name"]),
    conversationId = tostring(customDimensions["gen_ai.conversation.id"]),
    operation_Id
| summarize
    totalDuration = sum(duration),
    spanCount = count(),
    hasErrors = countif(success == false) > 0
    by conversationId, operation_Id
| where totalDuration > 5000
| order by totalDuration desc
| take 50
```

> **Default threshold:** 5 seconds. Ask the user for their latency threshold if not specified.

## Step 2 — Latency Distribution (P50/P95/P99)

```kql
dependencies
| where timestamp > ago(24h)
| where customDimensions["gen_ai.operation.name"] in ("chat", "invoke_agent")
| summarize
    p50 = percentile(duration, 50),
    p95 = percentile(duration, 95),
    p99 = percentile(duration, 99),
    avg = avg(duration),
    count = count()
    by operation = tostring(customDimensions["gen_ai.operation.name"]),
       model = tostring(customDimensions["gen_ai.request.model"])
| order by p95 desc
```

Present as:

| Operation | Model | P50 (ms) | P95 (ms) | P99 (ms) | Avg (ms) | Count |
|-----------|-------|----------|----------|----------|----------|-------|

## Step 3 — Bottleneck Breakdown

For a specific slow conversation, break down time spent per span type:

```kql
dependencies
| where operation_Id == "<operation_id>"
| extend operation = tostring(customDimensions["gen_ai.operation.name"])
| summarize
    totalDuration = sum(duration),
    spanCount = count(),
    avgDuration = avg(duration)
    by operation, name
| order by totalDuration desc
```

Common bottleneck patterns:
- **`chat` spans dominate** → LLM inference is slow (consider a smaller model or caching)
- **`execute_tool` spans dominate** → Tool execution is slow (optimize the tool implementation)
- **`invoke_agent` has long gaps** → Orchestration overhead (check the agent framework)

## Step 4 — Token Usage vs Latency Correlation

```kql
dependencies
| where timestamp > ago(24h)
| where customDimensions["gen_ai.operation.name"] == "chat"
| extend
    inputTokens = toint(customDimensions["gen_ai.usage.input_tokens"]),
    outputTokens = toint(customDimensions["gen_ai.usage.output_tokens"])
| where isnotempty(inputTokens)
| project duration, inputTokens, outputTokens,
    model = tostring(customDimensions["gen_ai.request.model"]),
    operation_Id
| order by duration desc
| take 100
```

High token counts often correlate with high latency. If confirmed, suggest:
- Reduce system prompt length
- Limit the conversation history window
- Use a faster model for simpler queries

## Hosted Agent Variant — Latency

For hosted agents, scope by Foundry agent name via `requests`, then join to `dependencies`:

```kql
let reqIds = requests
| where timestamp > ago(24h)
| where customDimensions["gen_ai.agent.name"] == "<foundry-agent-name>"
| distinct id;
dependencies
| where timestamp > ago(24h)
| where operation_ParentId in (reqIds)
| where customDimensions["gen_ai.operation.name"] in ("chat", "invoke_agent")
| summarize
    p50 = percentile(duration, 50),
    p95 = percentile(duration, 95),
    p99 = percentile(duration, 99),
    avg = avg(duration),
    count = count()
    by operation = tostring(customDimensions["gen_ai.operation.name"]),
       model = tostring(customDimensions["gen_ai.request.model"])
| order by p95 desc
```
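The Step 4 hypothesis ("high token counts drive latency") can be sanity-checked offline once the query rows are exported from Log Analytics. A minimal Python sketch, using made-up sample rows in place of real query output — the `rows` values and variable names here are illustrative assumptions, not data from any real workspace:

```python
import statistics

# Hypothetical rows exported from the Step 4 query: (duration_ms, input_tokens)
rows = [(1200, 800), (3400, 2600), (900, 500), (5100, 4100), (2200, 1700)]

durations = [d for d, _ in rows]
tokens = [t for _, t in rows]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

r = pearson(tokens, durations)
print(f"token/latency correlation r = {r:.2f}")
```

A value of `r` near 1 supports the token-count explanation and points at the Step 4 mitigations; a value near 0 suggests looking elsewhere (tool spans or orchestration overhead from Step 3) for the latency source.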