Build and deploy AI applications on Azure AI Foundry using Microsoft's model catalog and AI services
finetuning/workflows/full-pipeline.md
# Full Pipeline Workflow

End-to-end fine-tuning on Azure AI Foundry in 9 phases.

## Prerequisites

- Azure AI Foundry resource with fine-tuning enabled
- Python 3.10+ with `openai` and `requests`
- Azure CLI (`az`) authenticated
- A clear task definition: what should the model do differently after fine-tuning?

## Phase 1: Define the Task

Answer before touching data or models:

1. **What task?** (e.g., "translate natural language to Python code")
2. **What does good output look like?** Write 5 examples by hand.
3. **What does bad output look like?** Write 3 anti-examples.
4. **How will you measure success?** Define evaluation dimensions (see `references/grader-design.md`).
5. **Which base model?** Pick 1-3 candidates from the supported model list.

## Phase 2: Prepare the Dataset

### Option A: You Have Data
1. Convert to SFT JSONL format (see `references/dataset-formats.md`)
2. Split: 80% train, 10% validation, 10% held-out test
3. Remove or fix low-quality examples

### Option B: Synthetic Data
1. Generate using LLM prompts (see `workflows/dataset-creation.md`)
2. Convert to SFT JSONL with `scripts/convert_dataset.py`

### Option C: Hybrid (Seed + Synthetic)
1. Use existing data as seed, generate synthetic variations
2. Merge, deduplicate, and quality-filter

**Checkpoint**: You should have `training.jsonl`, `validation.jsonl`, and `test.jsonl` (never used for training).

## Phase 3: Establish Baselines

1. Deploy base model (or use existing deployment)
2. Record scores: this is your "zero" that every fine-tune must beat

## Phase 4: Choose Training Type

See `references/training-types.md` for the full decision framework.

| Condition | Training Type |
|-----------|--------------|
| Have input-output pairs | SFT |
| Can write a grading function | RFT (reasoning models only) |
| Need style alignment | DPO |

Most projects start with SFT. Move to RFT/DPO only if SFT isn't sufficient.

## Phase 5: Upload and Submit Training

Use `scripts/submit_training.py` or the API directly. See `references/hyperparameters.md` for starting HP values.

**Foundry CLI** alternative (no Python):
```bash
azd ai finetuning jobs submit -f ./fine-tune-job.yaml
```

## Phase 6: Monitor and Analyze

1. Wait for completion or use `scripts/monitor_training.py`
2. Analyze training curves with `scripts/check_training.py`
3. Read `references/training-curves.md` to interpret results
4. Check for overfitting; consider deploying an earlier checkpoint if detected

## Phase 7: Evaluate Fine-Tuned Model

1. Deploy fine-tuned model (see `references/deployment.md` for format/SKU)
2. Compare against baseline and previous experiments
3. Delete deployment after evaluation

## Phase 8: Iterate

Follow `workflows/iterative-training.md`:
- Adjust hyperparameters based on training curves
- Try different data subsets or augmentations
- Test different base models
- Track everything in your leaderboard

## Phase 9: Ship

When the model convincingly beats baseline:
1. Deploy with production-appropriate capacity
2. Monitor with Application Insights
3. Periodically re-evaluate against test set for regression
4. Retrain as new data becomes available
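
## Appendix: Sketch of the Phase 2 Split

The 80/10/10 split from Phase 2 (Option A) can be sketched in plain Python as below. This is a minimal illustration, not the skill's own tooling: the helper names `split_dataset` and `write_jsonl`, the fixed seed, and the assumption that each example is one JSON object are all hypothetical. The actual record schema is whatever `references/dataset-formats.md` specifies.

```python
import json
import random
from pathlib import Path


def split_dataset(records, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle records deterministically and split into train/validation/test.

    The test split receives whatever remains after train and validation,
    so the three splits always cover every record exactly once.
    """
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducible splits
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def write_jsonl(path, records):
    """Write one JSON object per line (JSONL)."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

To produce the three files named in the Phase 2 checkpoint, load your examples into a list, call `split_dataset`, and pass each split to `write_jsonl` with `training.jsonl`, `validation.jsonl`, and `test.jsonl` respectively. Keeping the seed fixed means the held-out test set stays the same across experiments, so Phase 3 baselines and Phase 7 results remain comparable.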