Build and deploy AI applications on Azure AI Foundry using Microsoft's model catalog and AI services
foundry-agent/agent-optimizer/references/eval-yaml.md
# eval.yaml Guidance

Create `eval.yaml` directly when the conversation or `.foundry/agent-metadata*.yaml` already selected the dataset/evaluators. Otherwise ask whether to run `azd ai agent eval generate` or let optimize use built-in defaults.

## Include

```yaml
name: <suite-or-optimization-name>
agent:
  name: <agent-name>
  kind: hosted
  version: "<agent-version>"
  model: <baseline-model-deployment-name>
  config: .agent_configs/baseline/metadata.yaml
dataset:
  local_uri: <path-to-jsonl>
  # name: <foundry-dataset-name>
  # version: "<dataset-version>"
# validation_dataset:
#   name: <validation-dataset-name>
#   version: "<validation-version>"
evaluators:
  - <evaluator-name>
  - name: <custom-evaluator-name>
    version: "<evaluator-version>"
    local_uri: <local-evaluator-json>
options:
  eval_model: <existing-chat-model-deployment-name>
  optimization_model: <allowed-optimizer-model-deployment-name>
  max_candidates: 4
  optimization_config:
    model_search_space:
      - <target-model-deployment-name>
```

Use existing model deployments for `agent.model` and `options.eval_model`; do not assume `gpt-4o`.

For `options.optimization_model`, first verify that the target Foundry project has a deployment whose name is in this allowlist:

- `GPT-5`
- `GPT-5.1`
- `GPT-5.2`
- `GPT-5.4`
- `GPT-5.5`
- `DeepSeek-V4-Pro`
- `DeepSeek-V-3.2`

If none exist, ask the user to deploy one before configuring optimization.
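
The allowlist check above can be sketched as a small helper. This is an illustrative sketch only: the function name `pick_optimizer_deployment` and its input list are hypothetical, the helper does not query Foundry itself, and exact case-sensitive matching on the deployment name is an assumption rather than documented behavior.

```python
# Allowlist copied from the guidance above.
ALLOWED_OPTIMIZER_MODELS = (
    "GPT-5",
    "GPT-5.1",
    "GPT-5.2",
    "GPT-5.4",
    "GPT-5.5",
    "DeepSeek-V4-Pro",
    "DeepSeek-V-3.2",
)


def pick_optimizer_deployment(deployment_names):
    """Return the first deployment name found on the allowlist, or None.

    `deployment_names` is a hypothetical list of deployment names already
    fetched from the target Foundry project by some other means.
    Matching is exact and case-sensitive (an assumption).
    """
    allowed = set(ALLOWED_OPTIMIZER_MODELS)
    for name in deployment_names:
        if name in allowed:
            return name
    return None


# Hypothetical deployment names, for illustration only.
choice = pick_optimizer_deployment(["my-chat-deployment", "GPT-5.1"])
if choice is None:
    print("No allowed optimizer deployment found; ask the user to deploy one.")
else:
    print(f"options.optimization_model: {choice}")
```
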
Use `options.optimization_config.model_search_space` only for target model candidates that exist in the project; it may include the baseline model when the user wants it compared.

## Generate evals when inputs are missing

Prefer `eval generate` over older init flows:

```bash
azd ai agent eval generate --dataset <path-to-jsonl>
azd ai agent eval generate --reset-defaults
```

After generation, run `azd ai agent optimize --optimize-model <allowed-optimizer-model-deployment-name>` from the azd project; optimize auto-detects the generated `eval.yaml`.

## Skip

Do not add these fields unless the user explicitly asks and understands the tradeoff:

- `target_attributes`
- `budget`
- `min_improvement`
- `pass_threshold`
- `keep_versions`
- `generation_instruction`
- `max_samples`
- `trace_days`
- legacy `dataset_file`, `dataset_reference`, or `validation_reference` when writing a new file

Keep `target_attributes` omitted so azd can auto-detect optimizable attributes.

## Source mapping

| Source | eval.yaml field |
|--------|-----------------|
| effective azd context | `agent.name`, `agent.version`, `agent.kind` |
| baseline config | `agent.model`, `agent.config` |
| selected local dataset JSONL | `dataset.local_uri` |
| selected remote/local dataset | `dataset.name`, `dataset.version`, `dataset.local_uri` |
| selected validation dataset | `validation_dataset` |
| selected Foundry/local evaluators | `evaluators[]` |
| selected judge/eval deployment | `options.eval_model` |
| selected optimizer deployment | `options.optimization_model` |
| selected target model candidates | `options.optimization_config.model_search_space` |

Treat older `dataset_file`, `dataset_reference`, `validation_reference`, `max_iterations`, and `optimization_config.model` as legacy inputs when reading existing files, but write new files with the current contract above.
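
As a rough aid for the skip list and legacy rules above, the following sketch scans an already-parsed `eval.yaml` mapping and reports any skipped or legacy field names it finds. The helper names `walk_paths` and `review_eval_config` are hypothetical. This guidance does not say where these fields sit in the file, so the sketch searches every nesting level, which is an assumption. It takes a plain Python dict so it stays independent of any particular YAML parser.

```python
# Field names copied from the "Skip" list and the legacy note above.
SKIP_FIELDS = {
    "target_attributes", "budget", "min_improvement", "pass_threshold",
    "keep_versions", "generation_instruction", "max_samples", "trace_days",
}
LEGACY_FIELDS = {
    "dataset_file", "dataset_reference", "validation_reference",
    "max_iterations",
}
# Legacy dotted path; matched at any nesting depth (an assumption).
LEGACY_PATH = "optimization_config.model"


def walk_paths(node, prefix=""):
    """Yield the dotted path of every mapping key, descending into lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield path
            yield from walk_paths(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from walk_paths(item, prefix)


def review_eval_config(config):
    """Return (skip_hits, legacy_hits) as sorted lists of dotted paths."""
    skip_hits, legacy_hits = [], []
    for path in walk_paths(config):
        leaf = path.rsplit(".", 1)[-1]
        if leaf in SKIP_FIELDS:
            skip_hits.append(path)
        is_legacy_path = path == LEGACY_PATH or path.endswith("." + LEGACY_PATH)
        if leaf in LEGACY_FIELDS or is_legacy_path:
            legacy_hits.append(path)
    return sorted(skip_hits), sorted(legacy_hits)


# Hypothetical parsed config mixing current, skipped, and legacy fields.
example = {
    "name": "demo-suite",
    "agent": {"name": "demo-agent", "model": "baseline-deployment"},
    "dataset_file": "data.jsonl",
    "options": {
        "eval_model": "judge-deployment",
        "budget": 3,
        "optimization_config": {"model": "old-style-target"},
    },
}
skip_hits, legacy_hits = review_eval_config(example)
print("skip-list fields present:", skip_hits)
print("legacy fields present:", legacy_hits)
```

Under these assumptions, `agent.model` and `model_search_space` are not flagged, because only the `optimization_config.model` path and the listed field names count as hits.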