Build and deploy AI applications on Azure AI Foundry using Microsoft's model catalog and AI services
models/deploy-model/customize/SKILL.md
---
name: customize
description: "Interactive guided deployment flow for Azure OpenAI models with full customization control. Step-by-step selection of model version, SKU (GlobalStandard/Standard/ProvisionedManaged), capacity, RAI policy (content filter), and advanced options (dynamic quota, priority processing, spillover). USE FOR: custom deployment, customize model deployment, choose version, select SKU, set capacity, configure content filter, RAI policy, deployment options, detailed deployment, advanced deployment, PTU deployment, provisioned throughput. DO NOT USE FOR: quick deployment to optimal region (use preset)."
license: MIT
metadata:
  author: Microsoft
  version: "1.0.1"
---

# Customize Model Deployment

Interactive guided workflow for deploying Azure OpenAI models with full customization control over version, SKU, capacity, content filtering, and advanced options.

## Quick Reference

| Property | Description |
|----------|-------------|
| **Flow** | Interactive step-by-step guided deployment |
| **Customization** | Version, SKU, Capacity, RAI Policy, Advanced Options |
| **SKU Support** | GlobalStandard, Standard, ProvisionedManaged, DataZoneStandard |
| **Best For** | Precise control over deployment configuration |
| **Authentication** | Azure CLI (`az login`) |
| **Tools** | Azure CLI, MCP tools (optional) |

## When to Use This Skill

Use this skill when you need **precise control** over deployment configuration:

- ✅ **Choose specific model version** (not just latest)
- ✅ **Select deployment SKU** (GlobalStandard vs Standard vs PTU)
- ✅ **Set exact capacity** within available range
- ✅ **Configure content filtering** (RAI policy selection)
- ✅ **Enable advanced features** (dynamic quota, priority processing, spillover)
- ✅ **PTU deployments** (Provisioned Throughput Units)

**Alternative:** Use `preset` for quick deployment to the best available region with automatic configuration.

### Comparison: customize vs preset

| Feature | customize | preset |
|---------|---------------------|----------------------------|
| **Focus** | Full customization control | Optimal region selection |
| **Version Selection** | User chooses from available | Uses latest automatically |
| **SKU Selection** | User chooses (GlobalStandard/Standard/PTU) | GlobalStandard only |
| **Capacity** | User specifies exact value | Auto-calculated (50% of available) |
| **RAI Policy** | User selects from options | Default policy only |
| **Region** | Current region first, falls back to all regions if no capacity | Checks capacity across all regions upfront |
| **Use Case** | Precise deployment requirements | Quick deployment to best region |

## Prerequisites

- Azure subscription with Cognitive Services Contributor or Owner role
- Azure AI Foundry project resource ID (format: `/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}`)
- Azure CLI installed and authenticated (`az login`)
- Optional: Set `PROJECT_RESOURCE_ID` environment variable

## Workflow Overview

### Complete Flow (14 Phases)

```
1. Verify Authentication
2. Get Project Resource ID
3. Verify Project Exists
4. Get Model Name (if not provided)
5. List Model Versions → User Selects
6. List SKUs for Version → User Selects
7. Get Capacity Range → User Configures
7b. If no capacity: Cross-Region Fallback → Query all regions → User selects region/project
8. List RAI Policies → User Selects
9. Configure Advanced Options (if applicable)
10. Configure Version Upgrade Policy
11. Generate Deployment Name
12. Review Configuration
13. Execute Deployment & Monitor
```

### Fast Path (Defaults)

If user accepts all defaults (latest version, GlobalStandard SKU, recommended capacity, default RAI policy, standard upgrade policy), deployment completes in ~5 interactions.

---

## Phase Summaries

> ⚠️ **MUST READ:** Before executing any phase, load [references/customize-workflow.md](references/customize-workflow.md) for the full scripts and implementation details. The summaries below describe *what* each phase does; the reference file contains the *how* (CLI commands, quota patterns, capacity formulas, cross-region fallback logic).

| Phase | Action | Key Details |
|-------|--------|-------------|
| **1. Verify Auth** | Check `az account show`; prompt `az login` if needed | Verify correct subscription is active |
| **2. Get Project ID** | Read `PROJECT_RESOURCE_ID` env var or prompt user | ARM resource ID format required |
| **3. Verify Project** | Parse resource ID, call `az cognitiveservices account show` | Extracts subscription, RG, account, project, region |
| **4. Get Model** | List models via `az cognitiveservices account list-models` | User selects from available or enters custom name |
| **5. Select Version** | Query versions for chosen model | Recommend latest; user picks from list |
| **6. Select SKU** | Query model catalog + subscription quota, show only deployable SKUs | ⚠️ Never hardcode SKU lists; always query live data |
| **7. Configure Capacity** | Query capacity API, validate min/max/step, user enters value | Cross-region fallback if no capacity in current region |
| **8. Select RAI Policy** | Present content filter options | Default: `Microsoft.DefaultV2` |
| **9. Advanced Options** | Dynamic quota (GlobalStandard), priority processing (PTU), spillover | SKU-dependent availability |
| **10. Upgrade Policy** | Choose: OnceNewDefaultVersionAvailable / OnceCurrentVersionExpired / NoAutoUpgrade | Default: auto-upgrade on new default |
| **11. Deployment Name** | Auto-generate unique name, allow custom override | Validates format: `^[\w.-]{2,64}$` |
| **12. Review** | Display full config summary, confirm before proceeding | User approves or cancels |
| **13. Deploy & Monitor** | `az cognitiveservices account deployment create`, poll status | Timeout after 5 min; show endpoint + portal link |

---

## Error Handling

### Common Issues and Resolutions

| Error | Cause | Resolution |
|-------|-------|------------|
| **Model not found** | Invalid model name | List available models with `az cognitiveservices account list-models` |
| **Version not available** | Version not supported for SKU | Select different version or SKU |
| **Insufficient quota** | Capacity > available quota | Skill auto-searches all regions; fails only if no region has quota |
| **SKU not supported** | SKU not available in region | Cross-region fallback searches other regions automatically |
| **Capacity out of range** | Invalid capacity value | **PREVENTED**: Skill validates min/max/step at input (Phase 7) |
| **Deployment name exists** | Name conflict | Auto-incremented name generation |
| **Authentication failed** | Not logged in | Run `az login` |
| **Permission denied** | Insufficient permissions | Assign Cognitive Services Contributor role |
| **Capacity query fails** | API/permissions/network error | **DEPLOYMENT BLOCKED**: Will not proceed without valid quota data |

### Troubleshooting Commands

```bash
# Check deployment status
az cognitiveservices account deployment show --name <account> --resource-group <rg> --deployment-name <name>

# List all deployments
az cognitiveservices account deployment list --name <account> --resource-group <rg> -o table

# Check quota usage
az cognitiveservices usage list --name <account> --resource-group <rg>

# Delete failed deployment
az cognitiveservices account deployment delete --name <account> --resource-group <rg> --deployment-name <name>
```

---

## Selection Guides & Advanced Topics

> For SKU comparison tables, PTU sizing formulas, and advanced option details, load [references/customize-guides.md](references/customize-guides.md).

**SKU selection:** GlobalStandard (production/HA) → Standard (dev/test) → ProvisionedManaged (high-volume/guaranteed throughput) → DataZoneStandard (data residency).

**Capacity:** TPM-based SKUs range from 1K (dev) to 100K+ (large production). PTU-based use formula: `(Input TPM × 0.001) + (Output TPM × 0.002) + (Requests/min × 0.1)`.

**Advanced options:** Dynamic quota (GlobalStandard only), priority processing (PTU only, extra cost), spillover (overflow to backup deployment).

---

## Related Skills

- **preset** - Quick deployment to best region with automatic configuration
- **microsoft-foundry** - Parent skill for all Azure AI Foundry operations
- **[quota](../../../quota/quota.md)** - For quota viewing, increase requests, and troubleshooting quota errors, defer to this skill instead of duplicating guidance
- **rbac** - Manage permissions and access control

---

## Notes

- Set `PROJECT_RESOURCE_ID` environment variable to skip prompt
- Not all SKUs available in all regions; capacity varies by subscription/region/model
- Custom RAI policies can be configured in Azure Portal
- Automatic version upgrades occur during maintenance windows
- Use Azure Monitor and Application Insights for production deployments
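
Phase 3 parses the project resource ID in the format listed under Prerequisites. As a minimal sketch of that parsing step (not the script from `references/customize-workflow.md`), the Bash function below splits the ID with a regular expression. The function name `parse_project_id`, the output variable names, and the sample ID are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split a Foundry project resource ID into its parts.
# Sets SUBSCRIPTION_ID, RESOURCE_GROUP, ACCOUNT_NAME and PROJECT_NAME,
# or returns 1 if the ID does not match the documented format.
parse_project_id() {
  local id=$1
  local re='^/subscriptions/([^/]+)/resourceGroups/([^/]+)/providers/Microsoft\.CognitiveServices/accounts/([^/]+)/projects/([^/]+)$'
  if [[ $id =~ $re ]]; then
    SUBSCRIPTION_ID=${BASH_REMATCH[1]}
    RESOURCE_GROUP=${BASH_REMATCH[2]}
    ACCOUNT_NAME=${BASH_REMATCH[3]}
    PROJECT_NAME=${BASH_REMATCH[4]}
    return 0
  fi
  return 1
}

# Made-up sample ID, for illustration only.
sample_id="/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/demo-rg/providers/Microsoft.CognitiveServices/accounts/demo-account/projects/demo-project"

if parse_project_id "$sample_id"; then
  echo "account=$ACCOUNT_NAME project=$PROJECT_NAME rg=$RESOURCE_GROUP"
else
  echo "not a valid project resource ID" >&2
fi
```

The region is not part of the ID, so it would still have to come from the `az cognitiveservices account show` call that Phase 3 describes.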
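
Phase 7 validates the requested capacity against the min/max/step values returned by the capacity query. The sketch below shows one way such a check could look. It assumes that valid values are the minimum plus a whole number of steps, which this document does not confirm; the function name `validate_capacity` and the example range are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: validate a requested capacity against a range.
# Arguments: requested, minimum, maximum, step (all positive whole numbers).
# Assumption: valid values are min, min+step, min+2*step, ... up to max.
validate_capacity() {
  local requested=$1 min=$2 max=$3 step=$4
  if ! [[ $requested =~ ^[1-9][0-9]*$ ]]; then
    echo "capacity must be a positive whole number" >&2
    return 1
  fi
  if (( requested < min || requested > max )); then
    echo "capacity must be between $min and $max" >&2
    return 1
  fi
  if (( (requested - min) % step != 0 )); then
    echo "capacity must be $min plus a multiple of $step" >&2
    return 1
  fi
  return 0
}

# Made-up range: min 10, max 100, step 10.
validate_capacity 50 10 100 10 && echo "50 accepted"
validate_capacity 55 10 100 10 || echo "55 rejected"
```

Rejecting the value at input time, before any deployment call, is what lets the Error Handling table list "Capacity out of range" as prevented.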
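
Phase 11 validates deployment names against `^[\w.-]{2,64}$` and resolves name conflicts by auto-incrementing. The following sketch illustrates both ideas under stated assumptions: `\w` is spelled out as ASCII letters, digits, and underscore because Bash regular expressions have no portable `\w`, and the `-2`, `-3` suffix scheme is a guess at what "auto-incremented" could mean. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# \w is written out as A-Za-z0-9_ (an ASCII-only assumption).
NAME_PATTERN='^[A-Za-z0-9_.-]{2,64}$'

# Returns 0 if the name matches the documented format.
is_valid_deployment_name() {
  [[ $1 =~ $NAME_PATTERN ]]
}

# Hypothetical helper: print the first free name among base, base-2, base-3, ...
# The first argument is the base name; the rest are names already in use.
next_free_name() {
  local base=$1
  shift
  local candidate=$base n=2 existing taken
  while :; do
    taken=0
    for existing in "$@"; do
      if [[ $existing == "$candidate" ]]; then
        taken=1
        break
      fi
    done
    if (( taken == 0 )); then
      printf '%s\n' "$candidate"
      return 0
    fi
    candidate="${base}-${n}"
    n=$(( n + 1 ))
  done
}

if is_valid_deployment_name "gpt-4o"; then echo "valid"; fi
next_free_name "gpt-4o" "gpt-4o" "gpt-4o-2"   # prints gpt-4o-3
```

In practice the list of existing names would come from `az cognitiveservices account deployment list`, and the generated name should itself be re-checked with `is_valid_deployment_name`, since a suffix can push a long base name past 64 characters.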
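
The PTU estimate quoted under Selection Guides, `(Input TPM × 0.001) + (Output TPM × 0.002) + (Requests/min × 0.1)`, can be evaluated with a small helper. The sketch below uses `awk` because Bash has no floating-point arithmetic. The function name `estimate_ptu` and the workload figures are hypothetical, and the coefficients are taken from this document as-is.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate PTUs using the formula quoted in this skill.
# Arguments: input tokens per minute, output tokens per minute, requests per minute.
estimate_ptu() {
  awk -v in_tpm="$1" -v out_tpm="$2" -v rpm="$3" \
    'BEGIN { printf "%.1f\n", in_tpm * 0.001 + out_tpm * 0.002 + rpm * 0.1 }'
}

# Made-up workload: 20,000 input TPM, 5,000 output TPM, 60 requests/min.
# 20000*0.001 + 5000*0.002 + 60*0.1 = 20 + 10 + 6
estimate_ptu 20000 5000 60   # prints 36.0
```

The result is only a starting estimate; the value actually deployed still has to satisfy the min/max/step range that Phase 7 obtains from the capacity query.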