Deploy, evaluate, and manage AI agents end-to-end on Microsoft Azure AI Foundry
models/deploy-model/SKILL.md
---
name: deploy-model
description: "Unified Azure OpenAI model deployment skill with intelligent intent-based routing. Handles quick preset deployments, fully customized deployments (version/SKU/capacity/RAI policy), and capacity discovery across regions and projects. USE FOR: deploy model, deploy gpt, create deployment, model deployment, deploy openai model, set up model, provision model, find capacity, check model availability, where can I deploy, best region for model, capacity analysis. DO NOT USE FOR: listing existing deployments (use foundry_models_deployments_list MCP tool), deleting deployments, agent creation (use agent/create), project creation (use project/create)."
license: MIT
metadata:
  author: Microsoft
  version: "1.0.0"
---

# Deploy Model

Unified entry point for all Azure OpenAI model deployment workflows. Analyzes user intent and routes to the appropriate deployment mode.

## Quick Reference

| Mode | When to Use | Sub-Skill |
|------|-------------|-----------|
| **Preset** | Quick deployment, no customization needed | [preset/SKILL.md](preset/SKILL.md) |
| **Customize** | Full control: version, SKU, capacity, RAI policy | [customize/SKILL.md](customize/SKILL.md) |
| **Capacity Discovery** | Find where you can deploy with specific capacity | [capacity/SKILL.md](capacity/SKILL.md) |

## Intent Detection

Analyze the user's prompt and route to the correct mode:

```
User Prompt
│
├─ Simple deployment (no modifiers)
│     "deploy gpt-4o", "set up a model"
│     └─> PRESET mode
│
├─ Customization keywords present
│     "custom settings", "choose version", "select SKU",
│     "set capacity to X", "configure content filter",
│     "PTU deployment", "with specific quota"
│     └─> CUSTOMIZE mode
│
├─ Capacity/availability query
│     "find where I can deploy", "check capacity",
│     "which region has X capacity", "best region for 10K TPM",
│     "where is this model available"
│     └─> CAPACITY DISCOVERY mode
│
└─ Ambiguous (has capacity target + deploy intent)
      "deploy gpt-4o with 10K capacity to best region"
      └─> CAPACITY DISCOVERY first → then PRESET or CUSTOMIZE
```

### Routing Rules

| Signal in Prompt | Route To | Reason |
|------------------|----------|--------|
| Just model name, no options | **Preset** | User wants quick deployment |
| "custom", "configure", "choose", "select" | **Customize** | User wants control |
| "find", "check", "where", "which region", "available" | **Capacity** | User wants discovery |
| Specific capacity number + "best region" | **Capacity → Preset** | Discover, then deploy quickly |
| Specific capacity number + "custom" keywords | **Capacity → Customize** | Discover, then deploy with options |
| "PTU", "provisioned throughput" | **Customize** | PTU requires SKU selection |
| "optimal region", "best region" (no capacity target) | **Preset** | Region optimization is preset's specialty |

### Multi-Mode Chaining

Some prompts require two modes in sequence.

**Pattern: Capacity → Deploy.** When a user specifies a capacity requirement AND wants deployment:

1. Run **Capacity Discovery** to find regions/projects with sufficient quota
2. Present the findings to the user
3. Ask: "Would you like to deploy with **quick defaults** or **customize settings**?"
4. Route to **Preset** or **Customize** based on the answer

> 💡 **Tip:** If unsure which mode the user wants, default to **Preset** (quick deployment). Users who want customization will typically use explicit keywords like "custom", "configure", or "with specific settings".

## Project Selection (All Modes)

Before any deployment, resolve which project to deploy to. This applies to **all** modes (preset, customize, and after capacity discovery).

### Resolution Order

1. **Check the `PROJECT_RESOURCE_ID` env var** — if set, use it as the default
2. **Check the user prompt** — if the user named a specific project or region, use that
3. **If neither** — query the user's projects and suggest the current one

### Confirmation Step (Required)

**Always confirm the target before deploying.** Show the user what will be used and give them a chance to change it:

```
Deploying to:
  Project:  <project-name>
  Region:   <region>
  Resource: <resource-group>

Is this correct? Or choose a different project:
1. ✅ Yes, deploy here (default)
2. 📋 Show me other projects in this region
3. 🌍 Choose a different region
```

If the user picks option 2, show the top 5 projects in that region:

```
Projects in <region>:
1. project-alpha (rg-alpha)
2. project-beta (rg-beta)
3. project-gamma (rg-gamma)
...
```

> ⚠️ **Never deploy without showing the user which project will be used.** This prevents accidental deployments to the wrong resource.

## Pre-Deployment Validation (All Modes)

Before presenting any deployment options (SKU, capacity), always validate both of the following:

1. **The model supports the SKU** — query the model catalog to confirm the selected model+version supports the target SKU:

   ```bash
   az cognitiveservices model list --location <region> --subscription <sub-id> -o json
   ```

   Filter for the model and extract `.model.skus[].name` to get the supported SKUs.

2. **The subscription has available quota** — check that the user's subscription has unallocated quota for the SKU+model combination:

   ```bash
   az cognitiveservices usage list --location <region> --subscription <sub-id> -o json
   ```

   Match by the usage name pattern `OpenAI.<SKU>.<model-name>` (e.g., `OpenAI.GlobalStandard.gpt-4o`). Compute `available = limit - currentValue`.

> ⚠️ **Warning:** Only present options that pass both checks. Do NOT show hardcoded SKU lists — always query dynamically. SKUs with 0 available quota should be shown as ❌ informational items, not selectable options.

> 💡 **Quota management:** For quota increase requests, usage monitoring, and troubleshooting quota errors, defer to the [quota skill](../../quota/quota.md) instead of duplicating that guidance inline.

## Prerequisites

All deployment modes require:

- Azure CLI installed and authenticated (`az login`)
- An active Azure subscription with deployment permissions
- An Azure AI Foundry project resource ID (set the `PROJECT_RESOURCE_ID` env var, or the agent will help discover it)

## Sub-Skills

- **[preset/SKILL.md](preset/SKILL.md)** — Quick deployment to the optimal region with sensible defaults
- **[customize/SKILL.md](customize/SKILL.md)** — Interactive guided flow with full configuration control
- **[capacity/SKILL.md](capacity/SKILL.md)** — Discover available capacity across regions and projects
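The routing rules in "Intent Detection" above can be sketched as a small keyword matcher. This is only an illustrative sketch, not part of the skill: the keyword tuples and the `route` helper are assumptions distilled from the routing table, and a real agent would apply fuzzier intent analysis than substring matching.

```python
import re

# Illustrative keyword lists distilled from the routing table above;
# a real agent would use richer intent analysis than substring checks.
CUSTOMIZE_KW = ("custom", "configure", "choose", "select", "set capacity",
                "ptu", "provisioned throughput")
DISCOVERY_KW = ("find", "check", "where", "which region", "available")
DEPLOY_KW = ("deploy", "set up", "provision", "create")


def route(prompt: str) -> list[str]:
    """Return the ordered list of modes to run for a user prompt."""
    p = prompt.lower()
    wants_custom = any(k in p for k in CUSTOMIZE_KW)
    # A concrete capacity target: any digit next to a capacity/TPM mention.
    has_cap = bool(re.search(r"\d", p)) and ("tpm" in p or "capacity" in p)

    if has_cap and any(k in p for k in DEPLOY_KW):
        # Ambiguous prompt: discover capacity first, then deploy.
        return ["capacity", "customize" if wants_custom else "preset"]
    if any(k in p for k in DISCOVERY_KW):
        return ["capacity"]
    if wants_custom:
        return ["customize"]
    return ["preset"]  # default: quick deployment


print(route("deploy gpt-4o"))                                   # ['preset']
print(route("deploy gpt-4o with 10K capacity to best region"))  # ['capacity', 'preset']
```

Note that the default branch implements the tip above: anything without explicit customization or discovery signals falls through to Preset.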
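The three-step resolution order in "Project Selection" above amounts to a fallback chain, which might look like the following sketch. `prompt_project` and `list_projects` are hypothetical stand-ins for the agent's prompt parsing and project-listing tools; the real lookup would go through the Foundry MCP tools or `az` queries.

```python
import os


def resolve_project(prompt_project=None, list_projects=lambda: []):
    """Suggest a target project via the env var → prompt → query fallback chain.

    `prompt_project` and `list_projects` are hypothetical stand-ins for the
    agent's actual prompt parsing and project-listing tools.
    """
    # 1. PROJECT_RESOURCE_ID env var wins as the default target.
    env = os.environ.get("PROJECT_RESOURCE_ID")
    if env:
        return env
    # 2. A project (or region) the user named in the prompt.
    if prompt_project:
        return prompt_project
    # 3. Otherwise query the user's projects and suggest the current/first one.
    projects = list_projects()
    return projects[0] if projects else None
```

Whatever this resolves to is only a suggestion: per the required confirmation step, the agent must still show the resolved project to the user before deploying.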
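The two pre-deployment validation queries return JSON the agent has to filter. The sketch below shows that filtering on hand-written sample payloads: the values are invented for illustration, and only the field shapes (`.model.skus[].name`, `name.value`, `limit`, `currentValue`) follow the fields the validation steps reference.

```python
import json

# Hand-written sample payloads shaped after the fields referenced above;
# real data comes from the two `az cognitiveservices` commands shown earlier.
model_list = json.loads("""
[
  {"model": {"name": "gpt-4o", "version": "2024-08-06",
             "skus": [{"name": "Standard"}, {"name": "GlobalStandard"}]}}
]
""")
usage_list = json.loads("""
[
  {"name": {"value": "OpenAI.GlobalStandard.gpt-4o"}, "limit": 450, "currentValue": 100},
  {"name": {"value": "OpenAI.Standard.gpt-4o"},       "limit": 300, "currentValue": 300}
]
""")


def supported_skus(models, name, version):
    """Check 1: SKUs the catalog lists for this model+version."""
    for entry in models:
        m = entry["model"]
        if m["name"] == name and m["version"] == version:
            return [s["name"] for s in m["skus"]]
    return []


def available_quota(usages, sku, model):
    """Check 2: limit - currentValue for OpenAI.<SKU>.<model>, or 0 if absent."""
    target = f"OpenAI.{sku}.{model}"
    for u in usages:
        if u["name"]["value"] == target:
            return u["limit"] - u["currentValue"]
    return 0


for sku in supported_skus(model_list, "gpt-4o", "2024-08-06"):
    avail = available_quota(usage_list, sku, "gpt-4o")
    # Only SKUs with avail > 0 are selectable; 0-quota SKUs show as ❌ info items.
    print(f"{'✅' if avail > 0 else '❌'} {sku}: {avail} available")
```

With these samples, `Standard` is supported but has zero remaining quota, so it would be listed as a ❌ informational item while `GlobalStandard` remains selectable.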