Microsoft Foundry Skill

Build and deploy AI applications on Azure AI Foundry using Microsoft's model catalog and AI services

finetuning/references/agentic-rft.md

# Agentic RFT: Tool Calling

Train reasoning models (o4-mini) for agentic scenarios where the model invokes external tools during chain-of-thought reasoning.

> ⚠️ **Access required**: Agentic RFT with tool calling and GPT-5 RFT are behind feature flags. You must request access through the Azure AI Foundry portal or your Microsoft account team. o4-mini RFT without tools is generally available.

## Tool Definition Format

```python
# Each entry registers one tool the model may call during a rollout.
tools = [
    {
        # Tool name exposed to the model
        "name": "search",
        # Endpoint that serves calls to this tool
        "server_url": "https://your-function-app.azurewebsites.net/api/tools",
        # Headers sent with requests to the endpoint
        "headers": {
            "Authorization": "Bearer <your-key>"
        }
    },
    {
        "name": "get_by_id",
        "server_url": "https://your-function-app.azurewebsites.net/api/tools",
        "headers": {
            "Authorization": "Bearer <your-key>"
        }
    }
]
```
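When several tools share one endpoint and key, as in the definition above, the list can be generated instead of copied by hand. The following is a minimal sketch; the helper name `build_tools` and the `TOOL_API_KEY` environment variable are hypothetical and not part of the skill.

```python
import os


def build_tools(names, server_url, api_key):
    """Return one tool entry per name, all pointing at the same endpoint."""
    return [
        {
            "name": name,
            "server_url": server_url,
            # A fresh dict per tool, so entries can be edited independently.
            "headers": {"Authorization": f"Bearer {api_key}"},
        }
        for name in names
    ]


# TOOL_API_KEY is an assumed variable name; the placeholder is used if it is unset.
tools = build_tools(
    ["search", "get_by_id"],
    "https://your-function-app.azurewebsites.net/api/tools",
    os.environ.get("TOOL_API_KEY", "<your-key>"),
)
```

Reading the key from the environment keeps the secret out of source control; any secret store would serve the same purpose.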

## Submitting an Agentic RFT Job

```python
# Assumes `client` is an authenticated OpenAI-compatible client, `train` and
# `valid` are uploaded file objects, `grader` is your grader definition, and
# `tools` is the list from "Tool Definition Format".
job = client.fine_tuning.jobs.create(
    model="o4-mini-2025-04-16",
    training_file=train.id,
    validation_file=valid.id,
    method={
        "type": "reinforcement",
        "reinforcement": {
            "grader": grader,
            "tools": tools,
            # Set next to `hyperparameters`, not inside it
            "max_episode_steps": 10,
            "hyperparameters": {
                "eval_interval": 5,
                "eval_samples": 10,
                "compute_multiplier": 1.5,
                "reasoning_effort": "medium"
            }
        }
    }
)
```
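After submission, the job usually needs to be polled until it finishes. The sketch below assumes the client exposes `fine_tuning.jobs.retrieve` and that jobs end in one of the statuses `succeeded`, `failed`, or `cancelled`, as in the OpenAI fine-tuning API. The helper name `wait_for_job` and the polling interval are illustrative choices.

```python
import time

# Assumed terminal statuses, following the OpenAI fine-tuning API.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=60, sleep=time.sleep):
    """Poll a fine-tuning job until it reaches a terminal status."""
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        # `sleep` is injectable so the loop can be exercised without waiting.
        sleep(poll_seconds)
```

A typical call would be `final = wait_for_job(client, job.id)`, followed by a check of `final.status`.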

## Tool Response Format

Your tool endpoint must return a JSON object of this shape:

```json
{
    "type": "function_call_output",
    "call_id": "call_12345xyz",
    "output": "The result of the tool call...",
    "id": "fc_12345xyz"
}
```
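On the endpoint side, the response above can be assembled by a small helper. This is a sketch under assumptions: the skill does not show how the incoming request carries `call_id` and `id`, so the hypothetical `make_tool_response` takes them as arguments. Serializing non-string results to a JSON string is likewise an assumption, made because `output` is a string in the example.

```python
import json


def make_tool_response(call_id, output, item_id):
    """Wrap a tool result in the response shape shown above."""
    if not isinstance(output, str):
        # Assumption: structured results are passed to the model as JSON text.
        output = json.dumps(output)
    return {
        "type": "function_call_output",
        "call_id": call_id,
        "output": output,
        "id": item_id,
    }


response = make_tool_response("call_12345xyz", {"hits": 3}, "fc_12345xyz")
print(json.dumps(response, indent=4))
```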

## Tool Endpoint Requirements

| Constraint | Limit |
|-----------|-------|
| Recommended throughput | 50 QPS |
| Max input payload | 1 MB |
| Max return payload | 1 MB (413 error if exceeded) |
| Timeout | 10 minutes |
| Parallel calls | Supported; your endpoint must handle race conditions |
| Retry on 5xx | 3 attempts, then the rollout is discarded |
| On 4xx | Error is serialized and shown to the model |

**Infrastructure**: Enable Always On, provision sufficient compute (S2 or higher), and run multiple instances. Under-provisioned endpoints can cause jobs to hang during post-training eval.
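Because an oversized return payload is rejected with a 413, an endpoint may want to trim long results before responding. The sketch below assumes the limit applies to the serialized JSON body and reads "1 MB" conservatively as 1,000,000 bytes; neither detail is confirmed by the table above. The helper names `payload_size` and `fit_output` are hypothetical.

```python
import json

# Assumption: "1 MB" is read conservatively as 10**6 bytes.
MAX_PAYLOAD_BYTES = 1_000_000


def payload_size(payload):
    """Size in bytes of the payload once serialized as JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fit_output(payload, limit=MAX_PAYLOAD_BYTES, marker="...[truncated]"):
    """Return the payload, with `output` shortened if it exceeds the limit.

    Assumes the envelope plus the marker alone already fits within `limit`.
    """
    if payload_size(payload) <= limit:
        return payload
    trimmed = dict(payload)  # shallow copy; the caller's dict is untouched
    text = payload["output"]
    lo, hi = 0, len(text)
    # Binary search for the longest prefix of `output` that still fits.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        trimmed["output"] = text[:mid] + marker
        if payload_size(trimmed) <= limit:
            lo = mid
        else:
            hi = mid - 1
    trimmed["output"] = text[:lo] + marker
    return trimmed
```

Truncating with a visible marker lets the model see that the result was cut. Returning a 4xx instead is another option, since the table states that 4xx errors are shown to the model.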

## RFT Hyperparameters

| Parameter | Description | Recommended Start |
|-----------|-------------|-------------------|
| `reasoning_effort` | One of `"low"`, `"medium"`, `"high"` | `"medium"` |
| `compute_multiplier` | Scales rollouts per step | `1.5` |
| `learning_rate_multiplier` | Scales the learning rate | `1.0` |
| `n_epochs` | Number of passes over the training data | `2–3` |
| `eval_interval` | Run an eval every N steps | `5` |
| `eval_samples` | Validation examples per eval | `10` |
| `max_episode_steps` | Max tool calls + reasoning steps per rollout | `5–10` |

**Notes:** A higher `learning_rate_multiplier` increases output verbosity without improving accuracy. A `compute_multiplier` of 1.5 balances rollout quality and training time. The platform may stop training early, before all epochs complete.
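The recommended starting values can be kept in one place and overridden per experiment. This is a minimal sketch: the helper `build_hyperparameters` is hypothetical, `n_epochs` starts at the low end of the 2–3 range, and `max_episode_steps` is left out because the job example sets it beside `hyperparameters` rather than inside it.

```python
# Recommended starting values from the table above.
RECOMMENDED_START = {
    "reasoning_effort": "medium",
    "compute_multiplier": 1.5,
    "learning_rate_multiplier": 1.0,
    "n_epochs": 2,
    "eval_interval": 5,
    "eval_samples": 10,
}
VALID_EFFORT = {"low", "medium", "high"}


def build_hyperparameters(**overrides):
    """Merge overrides into the recommended starting values and validate."""
    unknown = set(overrides) - set(RECOMMENDED_START)
    if unknown:
        raise ValueError(f"unknown hyperparameters: {sorted(unknown)}")
    params = {**RECOMMENDED_START, **overrides}
    if params["reasoning_effort"] not in VALID_EFFORT:
        raise ValueError(f"invalid reasoning_effort: {params['reasoning_effort']!r}")
    return params


hyperparameters = build_hyperparameters(n_epochs=3)
```

Rejecting unknown keys locally catches typos such as `n_epoch` before a job is submitted.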

## When to Use Agentic RFT

- Model needs to **decide when to call tools** (not just follow instructions)
- Task involves **multi-step reasoning** with external data lookups
- Model needs to learn **tool selection**: choosing the right tool for the job
- Standard RFT (without tools) can't capture the agentic behavior
