Source from repo
Build and deploy AI applications on Azure AI Foundry using Microsoft's model catalog and AI services
finetuning/SKILL.md
---
name: finetuning
description: "Fine-tune models on Azure AI Foundry using SFT (supervised), DPO (preference), or RFT (reinforcement with graders). Covers dataset preparation, training job submission, deployment, and evaluation. USE FOR: fine-tune, SFT, DPO, RFT, training data, grader, distillation, fine-tuned model, training job, large file upload, calibrate grader, deploy fine-tuned model, evaluate fine-tuned model. DO NOT USE FOR: general model deployment without fine-tuning (use deploy-model), agent creation (use agents), prompt optimization without training (use prompt-optimizer)."
license: MIT
metadata:
  author: Microsoft
  version: "0.0.0-placeholder"
---

# Fine-Tuning on Azure AI Foundry

Fine-tune models using SFT (supervised), DPO (preference), or RFT (reinforcement with graders). Covers dataset prep, training, deployment, and evaluation.

## When to Use

Use this sub-skill when the user asks about:
- Fine-tuning a model (SFT, DPO, or RFT)
- Preparing, validating, or formatting training data
- Submitting, monitoring, or diagnosing training jobs
- Calibrating graders or pass thresholds for RFT
- Deploying or evaluating a fine-tuned model
- Choosing between training types (SFT vs DPO vs RFT)
- Distillation, synthetic data generation, or dataset quality scoring
- Large file uploads for training data
- Cleaning up fine-tuning resources (files, deployments)

**Do NOT use for:** General model deployment without fine-tuning (use deploy-model), agent creation (use agents), prompt optimization without training (use prompt-optimizer).

## Workflows

| Stage | Guide |
|-------|-------|
| **Quick start** | [workflows/quickstart.md](workflows/quickstart.md) |
| **Full pipeline** | [workflows/full-pipeline.md](workflows/full-pipeline.md) |
| **Create data** | [workflows/dataset-creation.md](workflows/dataset-creation.md) |
| **Iterate** | [workflows/iterative-training.md](workflows/iterative-training.md) |
| **Diagnose** | [workflows/diagnose-poor-results.md](workflows/diagnose-poor-results.md) |

## References

| Topic | File |
|-------|------|
| SFT vs DPO vs RFT | [references/training-types.md](references/training-types.md) |
| Hyperparameters | [references/hyperparameters.md](references/hyperparameters.md) |
| Data formats | [references/dataset-formats.md](references/dataset-formats.md) |
| Grader design (RFT) | [references/grader-design.md](references/grader-design.md) |
| Reward hacking | [references/reward-hacking.md](references/reward-hacking.md) |
| Agentic RFT (tools) | [references/agentic-rft.md](references/agentic-rft.md) |
| Deployment | [references/deployment.md](references/deployment.md) |
| Training curves | [references/training-curves.md](references/training-curves.md) |
| Evaluation | [references/evaluation.md](references/evaluation.md) |
| Vision fine-tuning | [references/vision-fine-tuning.md](references/vision-fine-tuning.md) |
| Large file uploads | [references/large-file-uploads.md](references/large-file-uploads.md) |
| Platform gotchas | [references/platform-gotchas.md](references/platform-gotchas.md) |

## Scripts

| Script | Purpose |
|--------|---------|
| `scripts/submit_training.py` | Submit SFT/DPO/RFT jobs |
| `scripts/monitor_training.py` | Poll job until completion |
| `scripts/calibrate_grader.py` | Find optimal RFT pass_threshold |
| `scripts/check_training.py` | Analyze curves, list checkpoints |
| `scripts/deploy_model.py` | Deploy via ARM REST API |
| `scripts/evaluate_model.py` | LLM judge evaluation |
| `scripts/convert_dataset.py` | Convert between SFT/DPO/RFT formats |
| `scripts/generate_distillation_data.py` | Generate synthetic training data |
| `scripts/score_dataset.py` | Quality scoring on training data |
| `scripts/cleanup.py` | Delete old files and deployments |
| `scripts/validate/` | Data validators (SFT, DPO, RFT) + stats |

## Rules

1. **Always baseline first**: evaluate the base model before fine-tuning
2. **Validate data** before submitting: run `scripts/validate/validate_sft.py`
3. **Calibrate RFT graders**: target 25-50% failure rate on the base model
4. **Evaluate checkpoints**: don't blindly deploy the final one
5. **Measure token cost** alongside accuracy when comparing models

## Quick Reference

| Task | Command |
|------|---------|
| Validate SFT data | `python scripts/validate/validate_sft.py data.jsonl` |
| Submit SFT job | `python scripts/submit_training.py --model gpt-4.1-mini --training-file train.jsonl --validation-file val.jsonl --type sft` |
| Monitor job | `python scripts/monitor_training.py --job-id ftjob-xxx` |
| Analyze curves | `python scripts/check_training.py --job-id ftjob-xxx` |
| Deploy model | `python scripts/deploy_model.py --model-id ft:gpt-4.1-mini:... --name my-eval` |
| Evaluate model | `python scripts/evaluate_model.py --deployment-name my-eval --test-file test.jsonl` |

## Error Handling

| Error | Cause | Fix |
|-------|-------|-----|
| "API version not supported" | Older `openai` SDK on `/v1/` endpoint | Upgrade to `openai>=1.0` |
| "does not support fine-tuning with Standard TrainingType" | OSS model needs `globalStandard` | Use `--use-rest` flag or script auto-falls back |
| Job stuck in post-training eval | Under-provisioned tool endpoint (RFT) | Scale to S2+, enable Always On |
| "DeploymentNotReady" after ARM succeeds | ARM/data-plane race condition | Delete and recreate deployment, wait 5 min |
| Content safety block at deployment | PII-dense training data | Remove problematic document types |
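
The grader-calibration rule (target a 25-50% failure rate on the base model) can be illustrated with a small threshold search. This is a minimal sketch under stated assumptions, not the logic of `scripts/calibrate_grader.py`, whose internals are not shown here: it assumes grader scores lie in [0, 1], that a sample fails when its score is below `pass_threshold`, and the helper names `failure_rate` and `pick_pass_threshold` are hypothetical.

```python
from typing import Optional, Sequence


def failure_rate(scores: Sequence[float], threshold: float) -> float:
    """Fraction of samples whose grader score falls below the threshold."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return sum(1 for s in scores if s < threshold) / len(scores)


def pick_pass_threshold(
    scores: Sequence[float],
    low: float = 0.25,
    high: float = 0.50,
    steps: int = 100,
) -> Optional[float]:
    """Scan thresholds 0.00..1.00 and return the first one whose failure
    rate is closest to the middle of [low, high]; None if none qualifies."""
    target = (low + high) / 2
    best: Optional[float] = None
    best_gap: Optional[float] = None
    for i in range(steps + 1):
        threshold = i / steps
        rate = failure_rate(scores, threshold)
        if low <= rate <= high:
            gap = abs(rate - target)
            if best_gap is None or gap < best_gap:
                best, best_gap = threshold, gap
    return best


# Hypothetical base-model grader scores, for illustration only.
base_scores = [0.1, 0.2, 0.35, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]
threshold = pick_pass_threshold(base_scores)
```

A `None` result means no threshold on the grid puts the base model's failure rate inside the band, which may suggest the grader is too easy or too hard for the base model rather than a threshold problem.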