Microsoft Foundry Skill

Deploy, evaluate, and manage AI agents end-to-end on Microsoft Azure AI Foundry

finetuning/references/agentic-rft.md

# Agentic RFT — Tool Calling

Train reasoning models (o4-mini) for agentic scenarios where the model invokes external tools during chain-of-thought reasoning.

> ⚠️ **Access required**: Agentic RFT with tool calling and GPT-5 RFT are behind feature flags. You must request access through the Azure AI Foundry portal or your Microsoft account team. o4-mini RFT without tools is generally available.

## Tool Definition Format

```python
tools = [
    {
        "name": "search",
        "server_url": "https://your-function-app.azurewebsites.net/api/tools",
        "headers": {
            "Authorization": "Bearer <your-key>"
        }
    },
    {
        "name": "get_by_id",
        "server_url": "https://your-function-app.azurewebsites.net/api/tools",
        "headers": {
            "Authorization": "Bearer <your-key>"
        }
    }
]
```

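The two entries above differ only in `name`. A minimal sketch of generating such a list, assuming every tool is served from the same endpoint and the key is read from an environment variable rather than hard-coded; `build_tools` and `TOOL_API_KEY` are hypothetical names used for illustration, not part of any SDK.

```python
import os


def build_tools(names, server_url, api_key):
    """Return one tool definition per name, all pointing at the same endpoint."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return [
        {
            "name": name,
            "server_url": server_url,
            # Each entry gets its own headers dict so later edits to one
            # tool do not leak into the others.
            "headers": {"Authorization": f"Bearer {api_key}"},
        }
        for name in names
    ]


# TOOL_API_KEY is an assumed variable name; the fallback mirrors the placeholder above.
tools = build_tools(
    ["search", "get_by_id"],
    "https://your-function-app.azurewebsites.net/api/tools",
    os.environ.get("TOOL_API_KEY", "<your-key>"),
)
```

The resulting `tools` list has the same shape as the hand-written one and can be passed to the job in the same way.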
## Submitting an Agentic RFT Job

```python
# Assumes `client` is an already-configured client, `train` and `valid` are
# uploaded file objects, and `grader` and `tools` are defined beforehand.
job = client.fine_tuning.jobs.create(
    model="o4-mini-2025-04-16",
    training_file=train.id,
    validation_file=valid.id,
    method={
        "type": "reinforcement",
        "reinforcement": {
            "grader": grader,
            "tools": tools,
            "max_episode_steps": 10,
            "hyperparameters": {
                "eval_interval": 5,
                "eval_samples": 10,
                "compute_multiplier": 1.5,
                "reasoning_effort": "medium"
            }
        }
    }
)
```

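Once a job is submitted, its status can be polled until it finishes. A minimal sketch, assuming `client` exposes the standard `fine_tuning.jobs.retrieve` call and that `succeeded`, `failed`, and `cancelled` are the terminal statuses; `wait_for_job` and the polling interval are illustrative choices, not something the source prescribes.

```python
import time

# Statuses assumed to be terminal for a fine-tuning job.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(client, job_id, poll_seconds=60, sleep=time.sleep):
    """Poll a fine-tuning job until it reaches a terminal status, then return it."""
    while True:
        job = client.fine_tuning.jobs.retrieve(job_id)
        if job.status in TERMINAL_STATUSES:
            return job
        # `sleep` is injectable so the loop can be exercised without waiting.
        sleep(poll_seconds)
```

With the job from the example above, this would be called as `final = wait_for_job(client, job.id)`, after which `final.status` can be inspected.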
## Tool Response Format

Your tool endpoint must return:

```json
{
    "type": "function_call_output",
    "call_id": "call_12345xyz",
    "output": "The result of the tool call...",
    "id": "fc_12345xyz"
}
```

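A response of this shape can be assembled with a small helper. This is a sketch only: it assumes the incoming request supplies the `call_id` and `id` values to echo back (the request format is not shown here), and `make_tool_response` is a hypothetical name.

```python
import json


def make_tool_response(call_id, output, item_id):
    """Build a response body in the shape shown above."""
    if not isinstance(output, str):
        # The example shows a string output, so other values are JSON-encoded here.
        output = json.dumps(output)
    return {
        "type": "function_call_output",
        "call_id": call_id,
        "output": output,
        "id": item_id,
    }


body = make_tool_response("call_12345xyz", {"hits": 3}, "fc_12345xyz")
print(json.dumps(body, indent=4))
```

Encoding structured results as a JSON string keeps `output` a plain string, matching the example; whether the platform accepts other types is not stated in this reference.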
## Tool Endpoint Requirements

| Constraint | Limit |
|-----------|-------|
| Recommended throughput | 50 QPS |
| Max input payload | 1 MB |
| Max return payload | 1 MB (413 error if exceeded) |
| Timeout | 10 minutes |
| Parallel calls | Supported — handle race conditions |
| Retry on 5xx | 3 attempts, then rollout discarded |
| On 4xx | Error serialized and shown to model |

**Infrastructure**: Enable Always On, provision sufficient compute (S2+), and run multiple instances. Under-provisioned endpoints can cause jobs to hang during post-training evaluation.

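Because a return payload over 1 MB is rejected with a 413 error, a tool can trim its output before responding. A minimal sketch, assuming the limit applies to the JSON-encoded response body and reading "1 MB" conservatively as 1,000,000 bytes; `fit_output`, `encoded_size`, and the truncation marker are hypothetical.

```python
import json

# Assumption: "1 MB" is read conservatively as 1,000,000 bytes.
MAX_RETURN_BYTES = 1_000_000


def encoded_size(payload):
    """Size in bytes of the payload once serialized to JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fit_output(response, limit=MAX_RETURN_BYTES, marker="...[truncated]"):
    """Return a copy of `response` whose JSON encoding is at most `limit` bytes."""
    if encoded_size(response) <= limit:
        return dict(response)
    text = response["output"]
    lo, hi = 0, len(text)
    # Binary search for the longest prefix of the output that still fits.
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if encoded_size({**response, "output": text[:mid] + marker}) <= limit:
            lo = mid
        else:
            hi = mid - 1
    trimmed = {**response, "output": text[:lo] + marker}
    if encoded_size(trimmed) > limit:
        raise ValueError("limit is too small for the response envelope")
    return trimmed
```

Truncating on the tool side keeps the rollout alive with a partial result; returning a shorter summary instead may suit some tools better.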
## RFT Hyperparameters

| Parameter | Description | Recommended Start |
|-----------|-------------|-------------------|
| `reasoning_effort` | `"low"`, `"medium"`, `"high"` | `"medium"` |
| `compute_multiplier` | Scales rollouts per step | `1.5` |
| `learning_rate_multiplier` | Scales the learning rate | `1.0` |
| `n_epochs` | Data passes | `2–3` |
| `eval_interval` | Eval every N steps | `5` |
| `eval_samples` | Validation examples per eval | `10` |
| `max_episode_steps` | Max tool calls + reasoning steps per rollout | `5–10` |

**Notes:** Raising the learning rate increases output verbosity without improving accuracy. A compute multiplier of 1.5 balances rollout quality against training time. The platform may stop training early, before all epochs complete.

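The recommended starting values can be collected in one place so that individual runs override only what they change. A sketch under stated assumptions: `rft_hyperparameters` is a hypothetical helper, the positivity checks are an assumption rather than documented platform rules, and `max_episode_steps` is left out because the job example passes it next to `hyperparameters`, not inside it.

```python
VALID_EFFORTS = ("low", "medium", "high")


def rft_hyperparameters(
    reasoning_effort="medium",
    compute_multiplier=1.5,
    learning_rate_multiplier=1.0,
    n_epochs=2,
    eval_interval=5,
    eval_samples=10,
):
    """Return a hyperparameters dict seeded with the recommended starting values."""
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {VALID_EFFORTS}")
    params = {
        "compute_multiplier": compute_multiplier,
        "learning_rate_multiplier": learning_rate_multiplier,
        "n_epochs": n_epochs,
        "eval_interval": eval_interval,
        "eval_samples": eval_samples,
    }
    # Assumed sanity check: every numeric setting should be positive.
    for name, value in params.items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    return {"reasoning_effort": reasoning_effort, **params}


hyperparameters = rft_hyperparameters(n_epochs=3)
```

The returned dict can be supplied as the `hyperparameters` value when creating the job.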
## When to Use Agentic RFT

- Model needs to **decide when to call tools** (not just follow instructions)
- Task involves **multi-step reasoning** with external data lookups
- Model needs to learn **tool selection** — choosing the right tool for the job
- Standard RFT (without tools) can't capture the agentic behavior