Configure Azure API Management as an AI Gateway with caching, token limits, and content safety
references/troubleshooting.md
# AI Gateway Troubleshooting

Common issues when using Azure API Management as an AI Gateway.

---

## Authentication Issues

### 401 Unauthorized from Backend

**Symptom**: APIM returns `401` when calling Azure OpenAI.

**Causes & Solutions**:

| Cause | Fix |
|-------|-----|
| Managed identity not enabled on APIM | `az apim update --name <apim> --resource-group <rg> --set identity.type=SystemAssigned` |
| Missing RBAC role | `az role assignment create --assignee <apim-principal-id> --role "Cognitive Services User" --scope <aoai-resource-id>` |
| Wrong auth resource | Ensure `resource="https://cognitiveservices.azure.com"` (not the endpoint URL) |
| RBAC propagation delay | Wait 5-10 minutes after role assignment |

**Diagnostic**:

```bash
# Verify identity is enabled
az apim show --name <apim> --resource-group <rg> --query "identity" -o json

# Check role assignments
AOAI_ID=$(az cognitiveservices account show --name <aoai> --resource-group <rg> --query id -o tsv)
az role assignment list --scope "$AOAI_ID" --query "[?principalType=='ServicePrincipal'].{role:roleDefinitionName, principal:principalId}" -o table
```

---

## Rate Limiting Issues

### 429 Token Limit Exceeded

**Symptom**: Requests blocked with `429 Too Many Requests` from `azure-openai-token-limit` policy.

**Solutions**:

1. **Increase limit**: Raise `tokens-per-minute` value
2. **Add more backends**: Load balance across regions for higher aggregate TPM
3. **Enable semantic caching**: Reduce actual token consumption by serving cached responses
4. **Switch counter-key**: Use per-user instead of global to prevent one user from exhausting the pool

```xml
<!-- Per-user instead of global -->
<azure-openai-token-limit
    tokens-per-minute="50000"
    counter-key="@(context.Request.Headers.GetValueOrDefault("X-User-Id", context.Subscription.Id))"
    estimate-prompt-tokens="true" />
```

### 429 from Azure OpenAI (Not APIM)

**Symptom**: Backend returns `429` even though APIM token limits are not exceeded.

**Cause**: Azure OpenAI's own TPM quota is exhausted.

**Solutions**:

1. Increase Azure OpenAI deployment TPM quota in the portal
2. Add load balancing across multiple Azure OpenAI instances
3. Use retry with backoff:

```xml
<retry condition="@(context.Response.StatusCode == 429)" count="3" interval="10">
    <forward-request />
</retry>
```

---

## Semantic Caching Issues

### No Cache Hits

**Symptom**: Semantic cache is configured but cache hit rate is 0%.

**Causes & Solutions**:

| Cause | Fix |
|-------|-----|
| `score-threshold` too strict | Loosen the threshold. In `azure-openai-semantic-cache-lookup`, smaller values require closer similarity, so raise the value in small steps to allow more matches |
| Embeddings backend misconfigured | Verify backend URL and auth |
| Redis not configured | Deploy Azure Cache for Redis Enterprise with RediSearch |
| Streaming requests | Semantic caching doesn't work with `"stream": true` |

**Verify caching is working**:

```bash
# Check cache-related headers in response
curl -v -X POST "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -d '{"messages": [{"role": "user", "content": "What is Azure?"}], "max_tokens": 100}'

# Look for: x-cache-status header in response
```

### Cache Returns Stale Data

**Solution**: Reduce `duration` in `azure-openai-semantic-cache-store`:

```xml
<!-- Shorter TTL for frequently changing knowledge -->
<azure-openai-semantic-cache-store duration="300" /> <!-- 5 minutes -->
```

---

## Content Safety Issues

### False Positives (Legitimate Content Blocked)

**Symptom**: Normal business content is being blocked by content safety policy.

**Solutions**:

1. **Increase thresholds** (less strict):

```xml
<llm-content-safety backend-id="contentsafety-backend">
    <category name="Hate" threshold="5" /> <!-- Was 4, now less strict -->
    <category name="Sexual" threshold="5" />
    <category name="SelfHarm" threshold="5" />
    <category name="Violence" threshold="5" />
</llm-content-safety>
```

2. **Log blocked content** for review:

```xml
<on-error>
    <choose>
        <when condition="@(context.LastError.Source == "llm-content-safety")">
            <trace source="content-safety" severity="warning">
                @{
                    return new JObject(
                        new JProperty("blocked", true),
                        new JProperty("subscription", context.Subscription.Id),
                        new JProperty("timestamp", DateTime.UtcNow)
                    ).ToString();
                }
            </trace>
            <return-response>
                <set-status code="400" reason="Content Filtered" />
                <set-body>{"error": "Content filtered by safety policy"}</set-body>
            </return-response>
        </when>
    </choose>
</on-error>
```

### Content Safety Backend Error

**Symptom**: `500` error from `llm-content-safety` policy.

**Causes**:

| Cause | Fix |
|-------|-----|
| Content Safety resource not deployed | Deploy Azure AI Content Safety resource |
| Backend URL wrong | Check `contentsafety-backend` URL matches resource endpoint |
| Missing RBAC | Grant APIM "Cognitive Services User" on the Content Safety resource |
| Region mismatch | Content Safety must be in a supported region |

---

## Backend Configuration Issues

### Backend Not Found

**Symptom**: `500` error with "Backend not found" message.

```bash
# Verify backend exists
az apim backend list --service-name <apim> --resource-group <rg> \
  --query "[].{id:name, url:url}" -o table

# Check backend ID matches policy reference
```

### Timeout on AI Requests

**Symptom**: Requests time out, especially for large context windows or complex prompts.

**Solution**: Set an explicit `forward-request` timeout in `<backend>`:

```xml
<backend>
    <!-- Timeout is in seconds; size it for your longest expected AI request -->
    <forward-request timeout="120" />
</backend>
```

---

## Diagnostic Tools

### APIM Tracing

Enable request tracing for debugging policy flow:

```bash
# Get tracing subscription key
az apim subscription list --service-name <apim> --resource-group <rg> \
  --query "[?displayName=='Built-in all-access subscription'].primaryKey" -o tsv

# Send request with tracing
curl -X POST "${GATEWAY_URL}/..." \
  -H "Ocp-Apim-Trace: true" \
  -H "Ocp-Apim-Subscription-Key: <built-in-key>"
```

### Application Insights

If APIM is connected to Application Insights:

```kql
// Failed AI gateway requests
requests
| where success == false
| where url contains "openai"
| project timestamp, resultCode, duration, url
| order by timestamp desc
| take 20

// Token metrics over time
customMetrics
| where name == "Total Tokens"
| summarize TotalTokens = sum(value) by bin(timestamp, 1h)
| render timechart

// Content safety blocks
traces
| where message contains "content-safety"
| project timestamp, message, customDimensions
| order by timestamp desc
```

### Health Check

Quick validation that the AI Gateway is functioning:

```bash
# 1. Check APIM is running
az apim show --name <apim> --resource-group <rg> --query "provisioningState" -o tsv
# Expected: Succeeded

# 2. Check backends
az apim backend list --service-name <apim> --resource-group <rg> -o table

# 3. Test endpoint
curl -s -o /dev/null -w "%{http_code}" "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "ping"}], "max_tokens": 5}'
# Expected: 200
```

---

## References

- [APIM Diagnostics](https://learn.microsoft.com/azure/api-management/diagnose-solve-problems)
- [AI Gateway Monitoring](https://learn.microsoft.com/azure/api-management/genai-gateway-capabilities#monitoring-and-analytics)
- [APIM Error Handling](https://learn.microsoft.com/azure/api-management/api-management-error-handling-policies)
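
---

## Quick Triage Sketch

The status codes covered in this guide can be mapped back to the section that discusses them. The helper below is a minimal sketch, not part of the skill package: `triage_status` is a hypothetical function name. The `400` mapping assumes the `on-error` handler from the content safety section is in place, since that handler is what produces a `400 Content Filtered` response.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map an HTTP status code returned by the gateway
# to the section of this guide that covers it.
triage_status() {
  case "$1" in
    200) echo "OK" ;;
    # Assumes the on-error handler that returns 400 "Content Filtered" is configured
    400) echo "Content Safety Issues" ;;
    401) echo "Authentication Issues" ;;
    # Could originate from the APIM token limit or from Azure OpenAI's own quota
    429) echo "Rate Limiting Issues" ;;
    # Either a missing backend or a failing content safety backend
    500) echo "Backend Configuration Issues or Content Safety Backend Error" ;;
    # Anything else: start with tracing and Application Insights
    *)   echo "Diagnostic Tools" ;;
  esac
}
```

One way to use it is to pass in the code captured by the health check's `curl -w "%{http_code}"` call, for example `triage_status "$code"`, and start reading at the section it prints.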