models/deploy-model/customize/SKILL.md
---
name: customize
description: "Interactive guided deployment flow for Azure OpenAI models with full customization control. Step-by-step selection of model version, SKU (GlobalStandard/Standard/ProvisionedManaged), capacity, RAI policy (content filter), and advanced options (dynamic quota, priority processing, spillover). USE FOR: custom deployment, customize model deployment, choose version, select SKU, set capacity, configure content filter, RAI policy, deployment options, detailed deployment, advanced deployment, PTU deployment, provisioned throughput. DO NOT USE FOR: quick deployment to optimal region (use preset)."
license: MIT
metadata:
  author: Microsoft
  version: "1.0.1"
---

# Customize Model Deployment

Interactive guided workflow for deploying Azure OpenAI models with full customization control over version, SKU, capacity, content filtering, and advanced options.

## Quick Reference

| Property | Description |
|----------|-------------|
| **Flow** | Interactive step-by-step guided deployment |
| **Customization** | Version, SKU, Capacity, RAI Policy, Advanced Options |
| **SKU Support** | GlobalStandard, Standard, ProvisionedManaged, DataZoneStandard |
| **Best For** | Precise control over deployment configuration |
| **Authentication** | Azure CLI (`az login`) |
| **Tools** | Azure CLI, MCP tools (optional) |

## When to Use This Skill

Use this skill when you need **precise control** over deployment configuration:

- ✅ **Choose specific model version** (not just latest)
- ✅ **Select deployment SKU** (GlobalStandard vs Standard vs PTU)
- ✅ **Set exact capacity** within available range
- ✅ **Configure content filtering** (RAI policy selection)
- ✅ **Enable advanced features** (dynamic quota, priority processing, spillover)
- ✅ **PTU deployments** (Provisioned Throughput Units)

**Alternative:** Use `preset` for quick deployment to the best available region with automatic configuration.

### Comparison: customize vs preset

| Feature | customize | preset |
|---------|---------------------|----------------------------|
| **Focus** | Full customization control | Optimal region selection |
| **Version Selection** | User chooses from available | Uses latest automatically |
| **SKU Selection** | User chooses (GlobalStandard/Standard/PTU) | GlobalStandard only |
| **Capacity** | User specifies exact value | Auto-calculated (50% of available) |
| **RAI Policy** | User selects from options | Default policy only |
| **Region** | Current region first, falls back to all regions if no capacity | Checks capacity across all regions upfront |
| **Use Case** | Precise deployment requirements | Quick deployment to best region |

## Prerequisites

- Azure subscription with Cognitive Services Contributor or Owner role
- Azure AI Foundry project resource ID (format: `/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}`)
- Azure CLI installed and authenticated (`az login`)
- Optional: Set `PROJECT_RESOURCE_ID` environment variable

## Workflow Overview

### Complete Flow (14 Phases)

```
1. Verify Authentication
2. Get Project Resource ID
3. Verify Project Exists
4. Get Model Name (if not provided)
5. List Model Versions → User Selects
6. List SKUs for Version → User Selects
7. Get Capacity Range → User Configures
7b. If no capacity: Cross-Region Fallback → Query all regions → User selects region/project
8. List RAI Policies → User Selects
9. Configure Advanced Options (if applicable)
10. Configure Version Upgrade Policy
11. Generate Deployment Name
12. Review Configuration
13. Execute Deployment & Monitor
```

### Fast Path (Defaults)

If the user accepts all defaults (latest version, GlobalStandard SKU, recommended capacity, default RAI policy, standard upgrade policy), deployment completes in ~5 interactions.

---

## Phase Summaries

> ⚠️ **MUST READ:** Before executing any phase, load [references/customize-workflow.md](references/customize-workflow.md) for the full scripts and implementation details. The summaries below describe *what* each phase does; the reference file contains the *how* (CLI commands, quota patterns, capacity formulas, cross-region fallback logic).

| Phase | Action | Key Details |
|-------|--------|-------------|
| **1. Verify Auth** | Check `az account show`; prompt `az login` if needed | Verify correct subscription is active |
| **2. Get Project ID** | Read `PROJECT_RESOURCE_ID` env var or prompt user | ARM resource ID format required |
| **3. Verify Project** | Parse resource ID, call `az cognitiveservices account show` | Extracts subscription, RG, account, project, region |
| **4. Get Model** | List models via `az cognitiveservices account list-models` | User selects from available or enters custom name |
| **5. Select Version** | Query versions for chosen model | Recommend latest; user picks from list |
| **6. Select SKU** | Query model catalog + subscription quota, show only deployable SKUs | ⚠️ Never hardcode SKU lists; always query live data |
| **7. Configure Capacity** | Query capacity API, validate min/max/step, user enters value | Cross-region fallback if no capacity in current region |
| **8. Select RAI Policy** | Present content filter options | Default: `Microsoft.DefaultV2` |
| **9. Advanced Options** | Dynamic quota (GlobalStandard), priority processing (PTU), spillover | SKU-dependent availability |
| **10. Upgrade Policy** | Choose: OnceNewDefaultVersionAvailable / OnceCurrentVersionExpired / NoAutoUpgrade | Default: auto-upgrade on new default |
| **11. Deployment Name** | Auto-generate unique name, allow custom override | Validates format: `^[\w.-]{2,64}$` |
| **12. Review** | Display full config summary, confirm before proceeding | User approves or cancels |
| **13. Deploy & Monitor** | `az cognitiveservices account deployment create`, poll status | Timeout after 5 min; show endpoint + portal link |

---

## Error Handling

### Common Issues and Resolutions

| Error | Cause | Resolution |
|-------|-------|------------|
| **Model not found** | Invalid model name | List available models with `az cognitiveservices account list-models` |
| **Version not available** | Version not supported for SKU | Select different version or SKU |
| **Insufficient quota** | Capacity > available quota | Skill auto-searches all regions; fails only if no region has quota |
| **SKU not supported** | SKU not available in region | Cross-region fallback searches other regions automatically |
| **Capacity out of range** | Invalid capacity value | **PREVENTED**: Skill validates min/max/step at input (Phase 7) |
| **Deployment name exists** | Name conflict | Auto-incremented name generation |
| **Authentication failed** | Not logged in | Run `az login` |
| **Permission denied** | Insufficient permissions | Assign Cognitive Services Contributor role |
| **Capacity query fails** | API/permissions/network error | **DEPLOYMENT BLOCKED**: Will not proceed without valid quota data |

### Troubleshooting Commands

```bash
# Check deployment status
az cognitiveservices account deployment show --name <account> --resource-group <rg> --deployment-name <name>

# List all deployments
az cognitiveservices account deployment list --name <account> --resource-group <rg> -o table

# Check quota usage for the account
az cognitiveservices account list-usage --name <account> --resource-group <rg>

# Delete failed deployment
az cognitiveservices account deployment delete --name <account> --resource-group <rg> --deployment-name <name>
```

---

## Selection Guides & Advanced Topics

> For SKU comparison tables, PTU sizing formulas, and advanced option details, load [references/customize-guides.md](references/customize-guides.md).

**SKU selection:** GlobalStandard (production/HA) → Standard (dev/test) → ProvisionedManaged (high-volume/guaranteed throughput) → DataZoneStandard (data residency).

**Capacity:** TPM-based SKUs range from 1K (dev) to 100K+ (large production). PTU-based SKUs use the formula: `(Input TPM × 0.001) + (Output TPM × 0.002) + (Requests/min × 0.1)`.

**Advanced options:** Dynamic quota (GlobalStandard only), priority processing (PTU only, extra cost), spillover (overflow to backup deployment).

---

## Related Skills

- **preset** - Quick deployment to best region with automatic configuration
- **microsoft-foundry** - Parent skill for all Azure AI Foundry operations
- **[quota](../../../quota/quota.md)** - For quota viewing, increase requests, and troubleshooting quota errors, defer to this skill instead of duplicating guidance
- **rbac** - Manage permissions and access control

---

## Notes

- Set `PROJECT_RESOURCE_ID` environment variable to skip prompt
- Not all SKUs available in all regions; capacity varies by subscription/region/model
- Custom RAI policies can be configured in Azure Portal
- Automatic version upgrades occur during maintenance windows
- Use Azure Monitor and Application Insights for production deployments
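
### Example: Local Pre-Checks (Sketch)

The PTU formula quoted under Selection Guides and the deployment-name format from Phase 11 can both be checked locally before any Azure call. The following is a minimal sketch, not the skill's actual implementation: the function names `estimate_ptu` and `valid_deployment_name` are hypothetical, the regex assumes `\w` means ASCII letters, digits, and underscore, and the PTU result is only a rough estimate that does not account for the minimum or step sizes a given model may impose.

```shell
#!/bin/sh
# Hypothetical helper: rough PTU estimate using the formula from this document:
#   (Input TPM x 0.001) + (Output TPM x 0.002) + (Requests/min x 0.1)
# Arguments: input tokens/min, output tokens/min, requests/min.
estimate_ptu() {
  awk -v in_tpm="$1" -v out_tpm="$2" -v rpm="$3" \
    'BEGIN { printf "%.1f\n", in_tpm * 0.001 + out_tpm * 0.002 + rpm * 0.1 }'
}

# Hypothetical helper: check a deployment name against ^[\w.-]{2,64}$.
# Assumption: \w is treated as ASCII [A-Za-z0-9_].
# Returns 0 (success) when the name matches, non-zero otherwise.
valid_deployment_name() {
  printf '%s' "$1" | grep -Eq '^[A-Za-z0-9_.-]{2,64}$'
}

# Example: 100K input TPM, 50K output TPM, 60 requests/min
estimate_ptu 100000 50000 60   # prints 206.0

# Example: validate a candidate name before calling `deployment create`
if valid_deployment_name "gpt-4o-prod"; then
  echo "name ok"
else
  echo "name rejected"
fi
```

Running these checks first keeps obviously invalid input from reaching `az cognitiveservices account deployment create`. The live capacity API queried in Phase 7 remains the source of truth for what can actually be deployed.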