Source from repo

Microsoft Foundry Skill

Build and deploy AI applications on Azure AI Foundry using Microsoft's model catalog and AI services

microsoftGitHub microsoftOfficialSource repo Original GitHub link Publisher page

Files

155

Skill

n/a

Size

976.3 KB

Entrypoint

SKILL.md

Format

git-repo

Open file

finetuning/workflows/full-pipeline.md

Syntax-highlighted preview of this file as included in the skill package.

Rendered Source

markdown93 linesFree

finetuning/workflows/full-pipeline.md

1# Full Pipeline Workflow
2 
3End-to-end fine-tuning on Azure AI Foundry in 9 phases.
4 
5## Prerequisites
6 
7- Azure AI Foundry resource with fine-tuning enabled
8- Python 3.10+ with `openai` and `requests`
9- Azure CLI (`az`) authenticated
10- A clear task definition: what should the model do differently after fine-tuning?
11 
12## Phase 1: Define the Task
13 
14Answer before touching data or models:
15 
161. **What task?** (e.g., "translate natural language to Python code")
172. **What does good output look like?** Write 5 examples by hand.
183. **What does bad output look like?** Write 3 anti-examples.
194. **How will you measure success?** Define evaluation dimensions (see `references/grader-design.md`).
205. **Which base model?** Pick 1-3 candidates from the supported model list.
21 
22## Phase 2: Prepare the Dataset
23 
24### Option A: You Have Data
251. Convert to SFT JSONL format (see `references/dataset-formats.md`)
262. Split: 80% train, 10% validation, 10% held-out test
273. Remove or fix low-quality examples
28 
29### Option B: Synthetic Data
301. Generate using LLM prompts (see `workflows/dataset-creation.md`)
312. Convert to SFT JSONL with `scripts/convert_dataset.py`
32 
33### Option C: Hybrid (Seed + Synthetic)
341. Use existing data as seed, generate synthetic variations
352. Merge, deduplicate, and quality-filter
36 
37**Checkpoint**: You should have `training.jsonl`, `validation.jsonl`, and `test.jsonl` (never used for training).
38 
39## Phase 3: Establish Baselines
40 
411. Deploy base model (or use existing deployment)
422. Record scores — this is your "zero" that every fine-tune must beat
43 
44## Phase 4: Choose Training Type
45 
46See `references/training-types.md` for the full decision framework.
47 
48| Condition | Training Type |
49|-----------|--------------|
50| Have input-output pairs | SFT |
51| Can write a grading function | RFT (reasoning models only) |
52| Need style alignment | DPO |
53 
54Most projects start with SFT. Move to RFT/DPO only if SFT isn't sufficient.
55 
56## Phase 5: Upload and Submit Training
57 
58Use `scripts/submit_training.py` or the API directly. See `references/hyperparameters.md` for starting HP values.
59 
60**Foundry CLI** alternative (no Python):
61```bash
62azd ai finetuning jobs submit -f ./fine-tune-job.yaml
63```
64 
65## Phase 6: Monitor and Analyze
66 
671. Wait for completion or use `scripts/monitor_training.py`
682. Analyze training curves with `scripts/check_training.py`
693. Read `references/training-curves.md` to interpret results
704. Check for overfitting — consider deploying an earlier checkpoint if detected
71 
72## Phase 7: Evaluate Fine-Tuned Model
73 
741. Deploy fine-tuned model (see `references/deployment.md` for format/SKU)
752. Compare against baseline and previous experiments
763. Delete deployment after evaluation
77 
78## Phase 8: Iterate
79 
80Follow `workflows/iterative-training.md`:
81- Adjust hyperparameters based on training curves
82- Try different data subsets or augmentations
83- Test different base models
84- Track everything in your leaderboard
85 
86## Phase 9: Ship
87 
88When the model convincingly beats baseline:
891. Deploy with production-appropriate capacity
902. Monitor with Application Insights
913. Periodically re-evaluate against test set for regression
924. Retrain as new data becomes available
93

Preparing the source view

Microsoft Foundry Skill

finetuning/workflows/full-pipeline.md