Deploy, evaluate, and manage AI agents end-to-end on Microsoft Azure AI Foundry
finetuning/SKILL.md
---
name: finetuning
description: "Fine-tune models on Azure AI Foundry using SFT (supervised), DPO (preference), or RFT (reinforcement with graders). Covers dataset preparation, training job submission, deployment, and evaluation. USE FOR: fine-tune, SFT, DPO, RFT, training data, grader, distillation, fine-tuned model, training job, large file upload, calibrate grader, deploy fine-tuned model, evaluate fine-tuned model. DO NOT USE FOR: general model deployment without fine-tuning (use deploy-model), agent creation (use agents), prompt optimization without training (use prompt-optimizer)."
license: MIT
metadata:
  author: Microsoft
  version: "0.0.0-placeholder"
---

# Fine-Tuning on Azure AI Foundry

Fine-tune models using SFT (supervised), DPO (preference), or RFT (reinforcement with graders). Covers dataset prep, training, deployment, and evaluation.

## When to Use

Use this sub-skill when the user asks about:
- Fine-tuning a model (SFT, DPO, or RFT)
- Preparing, validating, or formatting training data
- Submitting, monitoring, or diagnosing training jobs
- Calibrating graders or pass thresholds for RFT
- Deploying or evaluating a fine-tuned model
- Choosing between training types (SFT vs DPO vs RFT)
- Distillation, synthetic data generation, or dataset quality scoring
- Large file uploads for training data
- Cleaning up fine-tuning resources (files, deployments)

**Do NOT use for:** General model deployment without fine-tuning (use deploy-model), agent creation (use agents), prompt optimization without training (use prompt-optimizer).

## Workflows

| Stage | Guide |
|-------|-------|
| **Quick start** | [workflows/quickstart.md](workflows/quickstart.md) |
| **Full pipeline** | [workflows/full-pipeline.md](workflows/full-pipeline.md) |
| **Create data** | [workflows/dataset-creation.md](workflows/dataset-creation.md) |
| **Iterate** | [workflows/iterative-training.md](workflows/iterative-training.md) |
| **Diagnose** | [workflows/diagnose-poor-results.md](workflows/diagnose-poor-results.md) |

## References

| Topic | File |
|-------|------|
| SFT vs DPO vs RFT | [references/training-types.md](references/training-types.md) |
| Hyperparameters | [references/hyperparameters.md](references/hyperparameters.md) |
| Data formats | [references/dataset-formats.md](references/dataset-formats.md) |
| Grader design (RFT) | [references/grader-design.md](references/grader-design.md) |
| Reward hacking | [references/reward-hacking.md](references/reward-hacking.md) |
| Agentic RFT (tools) | [references/agentic-rft.md](references/agentic-rft.md) |
| Deployment | [references/deployment.md](references/deployment.md) |
| Training curves | [references/training-curves.md](references/training-curves.md) |
| Evaluation | [references/evaluation.md](references/evaluation.md) |
| Vision fine-tuning | [references/vision-fine-tuning.md](references/vision-fine-tuning.md) |
| Large file uploads | [references/large-file-uploads.md](references/large-file-uploads.md) |
| Platform gotchas | [references/platform-gotchas.md](references/platform-gotchas.md) |

## Scripts

| Script | Purpose |
|--------|---------|
| `scripts/submit_training.py` | Submit SFT/DPO/RFT jobs |
| `scripts/monitor_training.py` | Poll job until completion |
| `scripts/calibrate_grader.py` | Find optimal RFT pass_threshold |
| `scripts/check_training.py` | Analyze curves, list checkpoints |
| `scripts/deploy_model.py` | Deploy via ARM REST API |
| `scripts/evaluate_model.py` | LLM judge evaluation |
| `scripts/convert_dataset.py` | Convert between SFT/DPO/RFT formats |
| `scripts/generate_distillation_data.py` | Generate synthetic training data |
| `scripts/score_dataset.py` | Quality scoring on training data |
| `scripts/cleanup.py` | Delete old files and deployments |
| `scripts/validate/` | Data validators (SFT, DPO, RFT) + stats |

## Rules

1. **Always baseline first**: evaluate the base model before fine-tuning
2. **Validate data** before submitting: run `scripts/validate/validate_sft.py`
3. **Calibrate RFT graders**: target 25-50% failure rate on the base model
4. **Evaluate checkpoints**: don't blindly deploy the final one
5. **Measure token cost** alongside accuracy when comparing models

## Quick Reference

| Task | Command |
|------|---------|
| Validate SFT data | `python scripts/validate/validate_sft.py data.jsonl` |
| Submit SFT job | `python scripts/submit_training.py --model gpt-4.1-mini --training-file train.jsonl --validation-file val.jsonl --type sft` |
| Monitor job | `python scripts/monitor_training.py --job-id ftjob-xxx` |
| Analyze curves | `python scripts/check_training.py --job-id ftjob-xxx` |
| Deploy model | `python scripts/deploy_model.py --model-id ft:gpt-4.1-mini:... --name my-eval` |
| Evaluate model | `python scripts/evaluate_model.py --deployment-name my-eval --test-file test.jsonl` |

## Error Handling

| Error | Cause | Fix |
|-------|-------|-----|
| "API version not supported" | Older `openai` SDK on `/v1/` endpoint | Upgrade to `openai>=1.0` |
| "does not support fine-tuning with Standard TrainingType" | OSS model needs `globalStandard` | Use `--use-rest` flag or script auto-falls back |
| Job stuck in post-training eval | Under-provisioned tool endpoint (RFT) | Scale to S2+, enable Always On |
| "DeploymentNotReady" after ARM succeeds | ARM/data-plane race condition | Delete and recreate deployment, wait 5 min |
| Content safety block at deployment | PII-dense training data | Remove problematic document types |
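
The grader calibration rule in the Rules section (target a 25-50% failure rate on the base model) can be illustrated with a small sketch. This is not the logic of `scripts/calibrate_grader.py`, which is not shown here. The function names, the candidate thresholds, and the sample scores are all hypothetical, and the sketch assumes a sample passes when its grader score is greater than or equal to the threshold.

```python
from typing import Optional, Sequence


def failure_rate(scores: Sequence[float], threshold: float) -> float:
    """Fraction of base-model samples whose grader score is below the threshold.

    Assumption: a sample "passes" when score >= threshold.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s < threshold) / len(scores)


def pick_pass_threshold(
    scores: Sequence[float],
    candidates: Sequence[float],
    low: float = 0.25,
    high: float = 0.50,
) -> Optional[float]:
    """Pick the candidate whose failure rate lies in [low, high].

    Among qualifying candidates, prefer the one closest to the middle of
    the band. Returns None when no candidate lands inside the band.
    """
    target = (low + high) / 2
    in_band = []
    for t in candidates:
        rate = failure_rate(scores, t)
        if low <= rate <= high:
            in_band.append((abs(rate - target), t))
    return min(in_band)[1] if in_band else None


# Hypothetical grader scores for ten base-model outputs (illustrative only).
base_scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.8, 0.9, 0.95]

threshold = pick_pass_threshold(base_scores, [0.3, 0.5, 0.7, 0.9])
print(threshold)  # 0.5: four of ten samples fail, a 40% failure rate
```

A threshold that nearly every base-model sample passes, or nearly every sample fails, leaves little signal for reinforcement training to improve on, which is presumably why the rule targets a middle band. When the function returns `None`, the candidate grid is too coarse or the grader needs redesigning.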