Deploy, evaluate, and manage AI agents end-to-end on Microsoft Azure AI Foundry
foundry-agent/agent-optimizer/references/eval-yaml.md
# eval.yaml Guidance

Create `eval.yaml` directly when the conversation or `.foundry/agent-metadata*.yaml` already selected the dataset/evaluators. Otherwise ask whether to run `azd ai agent eval generate` or let optimize use built-in defaults.

## Include

```yaml
name: <suite-or-optimization-name>
agent:
  name: <agent-name>
  kind: hosted
  version: "<agent-version>"
  model: <baseline-model-deployment-name>
  config: .agent_configs/baseline/metadata.yaml
dataset:
  local_uri: <path-to-jsonl>
  # name: <foundry-dataset-name>
  # version: "<dataset-version>"
# validation_dataset:
#   name: <validation-dataset-name>
#   version: "<validation-version>"
evaluators:
  - <evaluator-name>
  - name: <custom-evaluator-name>
    version: "<evaluator-version>"
    local_uri: <local-evaluator-json>
options:
  eval_model: <existing-chat-model-deployment-name>
  optimization_model: <allowed-optimizer-model-deployment-name>
  max_candidates: 4
  optimization_config:
    model_search_space:
      - <target-model-deployment-name>
```

Use existing model deployments for `agent.model` and `options.eval_model`; do not assume `gpt-4o`.

For `options.optimization_model`, first verify that the target Foundry project has a deployment whose name is in this allowlist:

- `GPT-5`
- `GPT-5.1`
- `GPT-5.2`
- `GPT-5.4`
- `GPT-5.5`
- `DeepSeek-V4-Pro`
- `DeepSeek-V-3.2`

If none exist, ask the user to deploy one before configuring optimization. Use `options.optimization_config.model_search_space` only for target model candidates that exist in the project; it may include the baseline model when the user wants it compared.

## Generate evals when inputs are missing

Prefer `eval generate` over older init flows:

```bash
azd ai agent eval generate --dataset <path-to-jsonl>
azd ai agent eval generate --reset-defaults
```

After generation, run `azd ai agent optimize --optimize-model <allowed-optimizer-model-deployment-name>` from the azd project; optimize auto-detects the generated `eval.yaml`.

## Skip

Do not add these fields unless the user explicitly asks and understands the tradeoff:

- `target_attributes`
- `budget`
- `min_improvement`
- `pass_threshold`
- `keep_versions`
- `generation_instruction`
- `max_samples`
- `trace_days`
- legacy `dataset_file`, `dataset_reference`, or `validation_reference` when writing a new file

Keep `target_attributes` omitted so azd can auto-detect optimizable attributes.

## Source mapping

| Source | eval.yaml field |
|--------|-----------------|
| effective azd context | `agent.name`, `agent.version`, `agent.kind` |
| baseline config | `agent.model`, `agent.config` |
| selected local dataset JSONL | `dataset.local_uri` |
| selected remote/local dataset | `dataset.name`, `dataset.version`, `dataset.local_uri` |
| selected validation dataset | `validation_dataset` |
| selected Foundry/local evaluators | `evaluators[]` |
| selected judge/eval deployment | `options.eval_model` |
| selected optimizer deployment | `options.optimization_model` |
| selected target model candidates | `options.optimization_config.model_search_space` |

Treat older `dataset_file`, `dataset_reference`, `validation_reference`, `max_iterations`, and `optimization_config.model` as legacy inputs when reading existing files, but write new files with the current contract above.
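
## Example: linting a draft before writing it

The rules in this guide can be checked mechanically before a new `eval.yaml` is written. The following is a minimal sketch and is not part of azd: `lint_eval_config`, `walk_keys`, and the `deployments` argument are hypothetical names, the config is assumed to be already parsed into a plain dict, and skip-listed or legacy keys are flagged at any nesting depth because this guide does not say where they would appear.

```python
from typing import Any, Iterator, List, Set

# Allowlist for options.optimization_model, copied from this guide.
ALLOWED_OPTIMIZER_MODELS = frozenset({
    "GPT-5", "GPT-5.1", "GPT-5.2", "GPT-5.4", "GPT-5.5",
    "DeepSeek-V4-Pro", "DeepSeek-V-3.2",
})
# Fields to omit unless the user explicitly asks (see "Skip").
SKIP_FIELDS = frozenset({
    "target_attributes", "budget", "min_improvement", "pass_threshold",
    "keep_versions", "generation_instruction", "max_samples", "trace_days",
})
# Legacy inputs that should not appear in newly written files.
LEGACY_FIELDS = frozenset({
    "dataset_file", "dataset_reference", "validation_reference", "max_iterations",
})


def walk_keys(node: Any, prefix: str = "") -> Iterator[str]:
    """Yield a dotted path for every mapping key, descending into lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield path
            yield from walk_keys(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from walk_keys(item, prefix)


def lint_eval_config(config: dict, deployments: Set[str]) -> List[str]:
    """Return warnings for a parsed eval.yaml dict.

    `deployments` is the caller-supplied set of model deployment names that
    exist in the target project (how it is obtained is out of scope here).
    """
    warnings: List[str] = []
    agent = config.get("agent") or {}
    options = config.get("options") or {}
    search_space = (options.get("optimization_config") or {}).get("model_search_space") or []

    optimizer = options.get("optimization_model")
    if optimizer is not None and optimizer not in ALLOWED_OPTIMIZER_MODELS:
        warnings.append(f"options.optimization_model: {optimizer!r} is not in the allowlist")

    # Every referenced model should be an existing deployment.
    referenced = [
        ("agent.model", agent.get("model")),
        ("options.eval_model", options.get("eval_model")),
        ("options.optimization_model", optimizer),
    ]
    referenced += [("options.optimization_config.model_search_space", m) for m in search_space]
    for path, name in referenced:
        if name is not None and name not in deployments:
            warnings.append(f"{path}: deployment {name!r} not found in project")

    for path in walk_keys(config):
        leaf = path.rsplit(".", 1)[-1]
        if leaf in SKIP_FIELDS:
            warnings.append(f"{path}: omit unless the user explicitly asks")
        elif leaf in LEGACY_FIELDS or path.endswith("optimization_config.model"):
            warnings.append(f"{path}: legacy field, use the current contract")
    return warnings


if __name__ == "__main__":
    draft = {
        "agent": {"model": "my-baseline"},
        "options": {"eval_model": "my-judge", "optimization_model": "gpt-4o", "budget": 10},
    }
    for message in lint_eval_config(draft, deployments={"my-baseline", "my-judge"}):
        print(message)
```

An empty list means the draft follows the contract described here. Any warning is a prompt to go back to the user, not a reason to rewrite the value silently.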