Azure AI Gateway

Design and configure Azure API Management as an AI Gateway for LLM traffic routing and rate limiting

Files

Skill: n/a
Size: 39.5 KB
Entrypoint: SKILL.md
Format: git-repo

---
name: azure-aigateway
description: "Configure Azure API Management as an AI Gateway for AI models, MCP tools, and agents. WHEN: semantic caching, token limit, content safety, load balancing, AI model governance, MCP rate limiting, jailbreak detection, add Azure OpenAI backend, add AI Foundry model, test AI gateway, LLM policies, configure AI backend, token metrics, AI cost control, convert API to MCP, import OpenAPI to gateway."
license: MIT
metadata:
  author: Microsoft
  version: "0.0.0-placeholder"
compatibility: Requires Azure CLI (az) for configuration and testing
---

# Azure AI Gateway

Configure Azure API Management (APIM) as an AI Gateway for governing AI models, MCP tools, and agents.

> **To deploy APIM**, use the **azure-prepare** skill. See [APIM deployment guide](https://learn.microsoft.com/azure/api-management/get-started-create-service-instance).

## When to Use This Skill

| Category | Triggers |
|----------|----------|
| **Model Governance** | "semantic caching", "token limits", "load balance AI", "track token usage" |
| **Tool Governance** | "rate limit MCP", "protect my tools", "configure my tool", "convert API to MCP" |
| **Agent Governance** | "content safety", "jailbreak detection", "filter harmful content" |
| **Configuration** | "add Azure OpenAI backend", "configure my model", "add AI Foundry model" |
| **Testing** | "test AI gateway", "call OpenAI through gateway" |

---

## Quick Reference

| Policy | Purpose | Details |
|--------|---------|---------|
| `azure-openai-token-limit` | Cost control | [Model Policies](references/policies.md#token-rate-limiting) |
| `azure-openai-semantic-cache-lookup/store` | 60-80% cost savings | [Model Policies](references/policies.md#semantic-caching) |
| `azure-openai-emit-token-metric` | Observability | [Model Policies](references/policies.md#token-metrics) |
| `llm-content-safety` | Safety & compliance | [Agent Policies](references/policies.md#content-safety) |
| `rate-limit-by-key` | MCP/tool protection | [Tool Policies](references/policies.md#request-rate-limiting) |

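The two limiting policies in the table count different things: `azure-openai-token-limit` budgets tokens per minute, while `rate-limit-by-key` budgets requests per renewal period. The sketch below writes one illustrative fragment of each to a local file so they can be compared side by side. The file name, the counter keys, and every numeric value are assumptions chosen for illustration, not values taken from this skill's reference files.

```bash
# Sketch only: file name and all limits below are illustrative assumptions.
LIMIT_FRAGMENT_FILE="${LIMIT_FRAGMENT_FILE:-limit-fragments.xml}"

# A quoted heredoc delimiter keeps the @(...) policy expressions literal.
cat > "$LIMIT_FRAGMENT_FILE" <<'EOF'
<!-- Model traffic: cap tokens consumed per subscription per minute -->
<azure-openai-token-limit
    counter-key="@(context.Subscription.Id)"
    tokens-per-minute="5000"
    estimate-prompt-tokens="false"
    remaining-tokens-variable-name="remainingTokens" />

<!-- MCP/tool traffic: cap calls per client IP per 60-second window -->
<rate-limit-by-key
    calls="10"
    renewal-period="60"
    counter-key="@(context.Request.IpAddress)" />
EOF

echo "Wrote ${LIMIT_FRAGMENT_FILE}"
```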
---

## Get Gateway Details

```bash
# Get gateway URL
az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv

# List backends (AI models)
az apim backend list --service-name <apim-name> --resource-group <rg> \
  --query "[].{id:name, url:url}" -o table

# Get subscription key
az apim subscription keys list \
  --service-name <apim-name> --resource-group <rg> --subscription-id <sub-id>
```

---

## Test AI Endpoint

```bash
GATEWAY_URL=$(az apim show --name <apim-name> --resource-group <rg> --query "gatewayUrl" -o tsv)

curl -X POST "${GATEWAY_URL}/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <key>" \
  -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
```

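When the same test is repeated for several deployments, it can help to build the URL in one place and print only the HTTP status. The following is a minimal sketch: `build_chat_url` and `chat_status` are hypothetical helper names, and the host `contoso.azure-api.net` and deployment `my-deployment` in the usage comment are placeholders. Defining the functions makes no network call; only `chat_status` contacts the gateway when invoked.

```bash
# Hypothetical helpers (not part of the Azure CLI or curl).

# Compose the chat-completions URL from gateway URL, deployment, and API version.
build_chat_url() {
  local gateway_url="${1%/}"          # drop one trailing slash, if present
  local deployment="$2"
  local api_version="${3:-2024-02-01}"
  printf '%s/openai/deployments/%s/chat/completions?api-version=%s\n' \
    "$gateway_url" "$deployment" "$api_version"
}

# Send the same test request as above and print only the HTTP status code.
chat_status() {
  local url="$1" key="$2"
  curl -sS -o /dev/null -w '%{http_code}\n' -X POST "$url" \
    -H "Content-Type: application/json" \
    -H "Ocp-Apim-Subscription-Key: ${key}" \
    -d '{"messages": [{"role": "user", "content": "Hello"}], "max_tokens": 100}'
}

# Usage (placeholders):
#   url=$(build_chat_url "https://contoso.azure-api.net/" "my-deployment")
#   chat_status "$url" "<key>"
```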
---

## Common Tasks

### Add AI Backend

See [references/patterns.md](references/patterns.md#pattern-1-add-ai-model-backend) for full steps.

```bash
# Discover AI resources
az cognitiveservices account list --query "[?kind=='OpenAI']" -o table

# Create backend
az apim backend create --service-name <apim> --resource-group <rg> \
  --backend-id openai-backend --protocol http --url "https://<aoai>.openai.azure.com/openai"

# Grant access (managed identity)
az role assignment create --assignee <apim-principal-id> \
  --role "Cognitive Services User" --scope <aoai-resource-id>
```

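The role assignment above needs two values that the preceding commands do not print: the principal ID of the APIM managed identity and the resource ID of the Azure OpenAI account. One way to look them up is sketched here, assuming the APIM instance has a system-assigned identity enabled. The function names are hypothetical; the `az apim show` and `az cognitiveservices account show` queries are the only Azure CLI calls used.

```bash
# Hypothetical helpers; the function names are not part of the Azure CLI.

# Print the principal ID of the APIM instance's system-assigned identity.
get_apim_principal_id() {
  az apim show --name "$1" --resource-group "$2" \
    --query "identity.principalId" -o tsv
}

# Print the full resource ID of the Azure OpenAI (Cognitive Services) account.
get_aoai_resource_id() {
  az cognitiveservices account show --name "$1" --resource-group "$2" \
    --query "id" -o tsv
}

# Fail early with a readable message when a lookup returns nothing,
# for example when no managed identity is enabled on the APIM instance.
require_value() {
  local label="$1" value="$2"
  if [ -z "$value" ]; then
    echo "Missing ${label}" >&2
    return 1
  fi
  printf '%s\n' "$value"
}

# Usage (placeholders):
#   principal_id=$(require_value "APIM principal ID" "$(get_apim_principal_id <apim> <rg>)")
#   scope=$(require_value "AOAI resource ID" "$(get_aoai_resource_id <aoai> <rg>)")
```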
### Apply AI Governance Policy

Recommended policy order in `<inbound>`:

1. **Authentication** - Managed identity to backend
2. **Semantic Cache Lookup** - Check cache before calling AI
3. **Token Limits** - Cost control
4. **Content Safety** - Filter harmful content
5. **Backend Selection** - Load balancing
6. **Metrics** - Token usage tracking

See [references/policies.md](references/policies.md#combining-policies) for a complete example.

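The ordering above can be sketched as a policy skeleton. This is an illustrative outline, not the complete example from `references/policies.md`: the backend IDs `embeddings-backend` and `content-safety-backend`, the metric namespace `ai-gateway`, the output file name, and all thresholds and limits are assumptions, and `openai-backend` is the backend ID used in the Add AI Backend task. The `score-threshold` value in particular is a placeholder; check the policy reference for how the value is interpreted before tuning it. The outbound `azure-openai-semantic-cache-store` is included because a cache lookup only returns hits once responses are being stored.

```bash
# Sketch only: IDs, namespace, thresholds, and file name are illustrative assumptions.
POLICY_FILE="${POLICY_FILE:-ai-gateway-policy.xml}"

# A quoted heredoc delimiter keeps the @(...) policy expression literal.
cat > "$POLICY_FILE" <<'EOF'
<policies>
    <inbound>
        <base />
        <!-- 1. Authentication: managed identity token for the backend -->
        <authentication-managed-identity resource="https://cognitiveservices.azure.com" />
        <!-- 2. Semantic cache lookup: check the cache before calling the model -->
        <azure-openai-semantic-cache-lookup
            score-threshold="0.8"
            embeddings-backend-id="embeddings-backend"
            embeddings-backend-auth="system-assigned" />
        <!-- 3. Token limits: cost control -->
        <azure-openai-token-limit
            counter-key="@(context.Subscription.Id)"
            tokens-per-minute="5000"
            estimate-prompt-tokens="false" />
        <!-- 4. Content safety: filter harmful content -->
        <llm-content-safety backend-id="content-safety-backend" shield-prompt="true">
            <categories output-type="EightSeverityLevels">
                <category name="Hate" threshold="4" />
                <category name="Violence" threshold="4" />
            </categories>
        </llm-content-safety>
        <!-- 5. Backend selection -->
        <set-backend-service backend-id="openai-backend" />
        <!-- 6. Metrics: token usage tracking -->
        <azure-openai-emit-token-metric namespace="ai-gateway">
            <dimension name="API ID" />
        </azure-openai-emit-token-metric>
    </inbound>
    <backend>
        <base />
    </backend>
    <outbound>
        <base />
        <!-- Store responses so later lookups can hit (duration in seconds) -->
        <azure-openai-semantic-cache-store duration="60" />
    </outbound>
    <on-error>
        <base />
    </on-error>
</policies>
EOF

echo "Wrote ${POLICY_FILE}"
```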
---

## Troubleshooting

| Issue | Solution |
|-------|----------|
| Token limit 429 | Increase `tokens-per-minute` or add load balancing |
| No cache hits | Lower `score-threshold` to 0.7 |
| Content false positives | Increase category thresholds (5-6) |
| Backend auth 401 | Grant APIM "Cognitive Services User" role |

See [references/troubleshooting.md](references/troubleshooting.md) for details.

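For the two rows that surface as HTTP status codes, a test script can map the status to the matching hint from the table. The helper below is a hypothetical sketch; `troubleshooting_hint` is not an Azure tool, and it only covers the 429 and 401 rows. A 401 can also come from the gateway itself when the subscription key is missing or invalid, so the hint mentions both causes.

```bash
# Hypothetical helper: map an HTTP status code to a hint from the table above.
troubleshooting_hint() {
  case "$1" in
    429)
      echo "Token limit reached: increase tokens-per-minute or add load balancing"
      ;;
    401)
      # APIM also answers 401 for a missing or invalid subscription key.
      echo "Auth failed: check the subscription key, then grant APIM the \"Cognitive Services User\" role"
      ;;
    *)
      echo "No hint for HTTP $1; see references/troubleshooting.md"
      ;;
  esac
}

# Usage: troubleshooting_hint 429
```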
---

## References

- [**Detailed Policies**](references/policies.md) - Full policy examples
- [**Configuration Patterns**](references/patterns.md) - Step-by-step patterns
- [**Troubleshooting**](references/troubleshooting.md) - Common issues
- [AI-Gateway Samples](https://github.com/Azure-Samples/AI-Gateway)
- [GenAI Gateway Docs](https://learn.microsoft.com/azure/api-management/genai-gateway-capabilities)

## SDK Quick References

- **Content Safety**: [Python](references/sdk/azure-ai-contentsafety-py.md) | [TypeScript](references/sdk/azure-ai-contentsafety-ts.md)
- **API Management**: [Python](references/sdk/azure-mgmt-apimanagement-py.md) | [.NET](references/sdk/azure-mgmt-apimanagement-dotnet.md)
