Deploy, evaluate, and manage AI agents end-to-end on Microsoft Azure AI Foundry
models/deploy-model/SKILL.md
---
name: deploy-model
description: "Unified Azure OpenAI model deployment skill with intelligent intent-based routing. Handles quick preset deployments, fully customized deployments (version/SKU/capacity/RAI policy), and capacity discovery across regions and projects. USE FOR: deploy model, deploy gpt, create deployment, model deployment, deploy openai model, set up model, provision model, find capacity, check model availability, where can I deploy, best region for model, capacity analysis. DO NOT USE FOR: listing existing deployments (use foundry_models_deployments_list MCP tool), deleting deployments, agent creation (use agent/create), project creation (use project/create)."
license: MIT
metadata:
  author: Microsoft
  version: "1.0.0"
---

# Deploy Model

Unified entry point for all Azure OpenAI model deployment workflows. Analyzes user intent and routes to the appropriate deployment mode.

## Quick Reference

| Mode | When to Use | Sub-Skill |
|------|-------------|-----------|
| **Preset** | Quick deployment, no customization needed | [preset/SKILL.md](preset/SKILL.md) |
| **Customize** | Full control: version, SKU, capacity, RAI policy | [customize/SKILL.md](customize/SKILL.md) |
| **Capacity Discovery** | Find where you can deploy with specific capacity | [capacity/SKILL.md](capacity/SKILL.md) |

## Intent Detection

Analyze the user's prompt and route to the correct mode:

```
User Prompt
│
├─ Simple deployment (no modifiers)
│     "deploy gpt-4o", "set up a model"
│     └─> PRESET mode
│
├─ Customization keywords present
│     "custom settings", "choose version", "select SKU",
│     "set capacity to X", "configure content filter",
│     "PTU deployment", "with specific quota"
│     └─> CUSTOMIZE mode
│
├─ Capacity/availability query
│     "find where I can deploy", "check capacity",
│     "which region has X capacity", "best region for 10K TPM",
│     "where is this model available"
│     └─> CAPACITY DISCOVERY mode
│
└─ Ambiguous (has capacity target + deploy intent)
      "deploy gpt-4o with 10K capacity to best region"
      └─> CAPACITY DISCOVERY first → then PRESET or CUSTOMIZE
```

### Routing Rules

| Signal in Prompt | Route To | Reason |
|------------------|----------|--------|
| Just model name, no options | **Preset** | User wants quick deployment |
| "custom", "configure", "choose", "select" | **Customize** | User wants control |
| "find", "check", "where", "which region", "available" | **Capacity** | User wants discovery |
| Specific capacity number + "best region" | **Capacity → Preset** | Discover, then deploy quickly |
| Specific capacity number + "custom" keywords | **Capacity → Customize** | Discover, then deploy with options |
| "PTU", "provisioned throughput" | **Customize** | PTU requires SKU selection |
| "optimal region", "best region" (no capacity target) | **Preset** | Region optimization is preset's specialty |

### Multi-Mode Chaining

Some prompts require two modes in sequence.

**Pattern: Capacity → Deploy.** When a user specifies a capacity requirement AND wants deployment:

1. Run **Capacity Discovery** to find regions/projects with sufficient quota
2. Present the findings to the user
3. Ask: "Would you like to deploy with **quick defaults** or **customize settings**?"
4. Route to **Preset** or **Customize** based on the answer

> 💡 **Tip:** If unsure which mode the user wants, default to **Preset** (quick deployment). Users who want customization will typically use explicit keywords like "custom", "configure", or "with specific settings".

## Project Selection (All Modes)

Before any deployment, resolve which project to deploy to. This applies to **all** modes (preset, customize, and after capacity discovery).

### Resolution Order

1. **Check the `PROJECT_RESOURCE_ID` env var** — if set, use it as the default
2. **Check the user prompt** — if the user named a specific project or region, use that
3. **If neither** — query the user's projects and suggest the current one

### Confirmation Step (Required)

**Always confirm the target before deploying.** Show the user what will be used and give them a chance to change it:

```
Deploying to:
  Project:  <project-name>
  Region:   <region>
  Resource: <resource-group>

Is this correct? Or choose a different project:
1. ✅ Yes, deploy here (default)
2. 📋 Show me other projects in this region
3. 🌍 Choose a different region
```

If the user picks option 2, show the top 5 projects in that region:

```
Projects in <region>:
1. project-alpha (rg-alpha)
2. project-beta (rg-beta)
3. project-gamma (rg-gamma)
...
```

> ⚠️ **Never deploy without showing the user which project will be used.** This prevents accidental deployments to the wrong resource.

## Pre-Deployment Validation (All Modes)

Before presenting any deployment options (SKU, capacity), always validate both of the following:

1. **The model supports the SKU** — query the model catalog to confirm the selected model+version supports the target SKU:

   ```bash
   az cognitiveservices model list --location <region> --subscription <sub-id> -o json
   ```

   Filter for the model and extract `.model.skus[].name` to get the supported SKUs.

2. **The subscription has available quota** — check that the user's subscription has unallocated quota for the SKU+model combination:

   ```bash
   az cognitiveservices usage list --location <region> --subscription <sub-id> -o json
   ```

   Match by the usage name pattern `OpenAI.<SKU>.<model-name>` (e.g., `OpenAI.GlobalStandard.gpt-4o`). Compute `available = limit - currentValue`.

> ⚠️ **Warning:** Only present options that pass both checks. Do NOT show hardcoded SKU lists — always query dynamically. SKUs with 0 available quota should be shown as ❌ informational items, not selectable options.

> 💡 **Quota management:** For quota increase requests, usage monitoring, and troubleshooting quota errors, defer to the [quota skill](../../quota/quota.md) instead of duplicating that guidance inline.

## Prerequisites

All deployment modes require:

- Azure CLI installed and authenticated (`az login`)
- An active Azure subscription with deployment permissions
- An Azure AI Foundry project resource ID (set the `PROJECT_RESOURCE_ID` env var, or the agent will help discover it)

## Sub-Skills

- **[preset/SKILL.md](preset/SKILL.md)** — Quick deployment to the optimal region with sensible defaults
- **[customize/SKILL.md](customize/SKILL.md)** — Interactive guided flow with full configuration control
- **[capacity/SKILL.md](capacity/SKILL.md)** — Discover available capacity across regions and projects
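The routing rules in "Intent Detection" above can be sketched as a small keyword matcher. This is only an illustrative sketch, not part of the skill: the keyword tuples and the `route` helper are assumptions distilled from the routing table, and a real agent would apply fuzzier intent analysis than substring matching.

```python
import re

# Illustrative keyword lists distilled from the routing table above;
# a real agent would use richer intent analysis than substring checks.
CUSTOMIZE_KW = ("custom", "configure", "choose", "select", "set capacity",
                "ptu", "provisioned throughput")
DISCOVERY_KW = ("find", "check", "where", "which region", "available")
DEPLOY_KW = ("deploy", "set up", "provision", "create")


def route(prompt: str) -> list[str]:
    """Return the ordered list of modes to run for a user prompt."""
    p = prompt.lower()
    wants_custom = any(k in p for k in CUSTOMIZE_KW)
    # A concrete capacity target: any digit next to a capacity/TPM mention.
    has_cap = bool(re.search(r"\d", p)) and ("tpm" in p or "capacity" in p)

    if has_cap and any(k in p for k in DEPLOY_KW):
        # Ambiguous prompt: discover capacity first, then deploy.
        return ["capacity", "customize" if wants_custom else "preset"]
    if any(k in p for k in DISCOVERY_KW):
        return ["capacity"]
    if wants_custom:
        return ["customize"]
    return ["preset"]  # default: quick deployment


print(route("deploy gpt-4o"))                                   # ['preset']
print(route("deploy gpt-4o with 10K capacity to best region"))  # ['capacity', 'preset']
```

Note that the default branch implements the tip above: anything without explicit customization or discovery signals falls through to Preset.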
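The three-step resolution order in "Project Selection" above amounts to a fallback chain, which might look like the following sketch. `prompt_project` and `list_projects` are hypothetical stand-ins for the agent's prompt parsing and project-listing tools; the real lookup would go through the Foundry MCP tools or `az` queries.

```python
import os


def resolve_project(prompt_project=None, list_projects=lambda: []):
    """Suggest a target project via the env var → prompt → query fallback chain.

    `prompt_project` and `list_projects` are hypothetical stand-ins for the
    agent's actual prompt parsing and project-listing tools.
    """
    # 1. PROJECT_RESOURCE_ID env var wins as the default target.
    env = os.environ.get("PROJECT_RESOURCE_ID")
    if env:
        return env
    # 2. A project (or region) the user named in the prompt.
    if prompt_project:
        return prompt_project
    # 3. Otherwise query the user's projects and suggest the current/first one.
    projects = list_projects()
    return projects[0] if projects else None
```

Whatever this resolves to is only a suggestion: per the required confirmation step, the agent must still show the resolved project to the user before deploying.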
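The two pre-deployment validation queries return JSON the agent has to filter. The sketch below shows that filtering on hand-written sample payloads: the values are invented for illustration, and only the field shapes (`.model.skus[].name`, `name.value`, `limit`, `currentValue`) follow the fields the validation steps reference.

```python
import json

# Hand-written sample payloads shaped after the fields referenced above;
# real data comes from the two `az cognitiveservices` commands shown earlier.
model_list = json.loads("""
[
  {"model": {"name": "gpt-4o", "version": "2024-08-06",
             "skus": [{"name": "Standard"}, {"name": "GlobalStandard"}]}}
]
""")
usage_list = json.loads("""
[
  {"name": {"value": "OpenAI.GlobalStandard.gpt-4o"}, "limit": 450, "currentValue": 100},
  {"name": {"value": "OpenAI.Standard.gpt-4o"},       "limit": 300, "currentValue": 300}
]
""")


def supported_skus(models, name, version):
    """Check 1: SKUs the catalog lists for this model+version."""
    for entry in models:
        m = entry["model"]
        if m["name"] == name and m["version"] == version:
            return [s["name"] for s in m["skus"]]
    return []


def available_quota(usages, sku, model):
    """Check 2: limit - currentValue for OpenAI.<SKU>.<model>, or 0 if absent."""
    target = f"OpenAI.{sku}.{model}"
    for u in usages:
        if u["name"]["value"] == target:
            return u["limit"] - u["currentValue"]
    return 0


for sku in supported_skus(model_list, "gpt-4o", "2024-08-06"):
    avail = available_quota(usage_list, sku, "gpt-4o")
    # Only SKUs with avail > 0 are selectable; 0-quota SKUs show as ❌ info items.
    print(f"{'✅' if avail > 0 else '❌'} {sku}: {avail} available")
```

With these samples, `Standard` is supported but has zero remaining quota, so it would be listed as a ❌ informational item while `GlobalStandard` remains selectable.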