Coding Agents
Coding agent targets evaluate AI coding assistants and CLI-based agents. These targets require a judge_target to run LLM-based evaluators.
Claude
```yaml
targets:
  - name: claude_agent
    provider: claude
    workspace_template: ./workspace-templates/my-project
    judge_target: azure_base
```

| Field | Required | Description |
|---|---|---|
| workspace_template | No | Path to workspace template directory |
| cwd | No | Working directory (mutually exclusive with workspace_template) |
| judge_target | Yes | LLM target for evaluation |
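The table's two constraints (judge_target is required, and cwd cannot be combined with workspace_template) can be checked before a run. The following is only an illustrative sketch, not the harness's own validation: the function name `validate_agent_target` and the error strings are hypothetical, and the target is assumed to be already parsed into a plain dictionary.

```python
def validate_agent_target(target: dict) -> list[str]:
    """Return a list of problems found in one coding-agent target mapping.

    Hypothetical helper: it only mirrors the rules stated in the field table.
    """
    problems = []
    # judge_target is listed as required for coding agent targets.
    if not target.get("judge_target"):
        problems.append("judge_target is required")
    # cwd and workspace_template are documented as mutually exclusive.
    if "cwd" in target and "workspace_template" in target:
        problems.append("cwd and workspace_template are mutually exclusive")
    return problems


ok_target = {
    "name": "claude_agent",
    "provider": "claude",
    "workspace_template": "./workspace-templates/my-project",
    "judge_target": "azure_base",
}
bad_target = {
    "name": "claude_agent",
    "provider": "claude",
    "cwd": ".",
    "workspace_template": "./workspace-templates/my-project",
}

print(validate_agent_target(ok_target))
print(validate_agent_target(bad_target))
```

The same two rules apply to any provider whose table lists both cwd and workspace_template as optional.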
Codex CLI
```yaml
targets:
  - name: codex_target
    provider: codex
    workspace_template: ./workspace-templates/my-project
    judge_target: azure_base
```

| Field | Required | Description |
|---|---|---|
| workspace_template | No | Path to workspace template directory |
| cwd | No | Working directory (mutually exclusive with workspace_template) |
| judge_target | Yes | LLM target for evaluation |
Pi Coding Agent
```yaml
targets:
  - name: pi_target
    provider: pi-coding-agent
    workspace_template: ./workspace-templates/my-project
    judge_target: azure_base
```

| Field | Required | Description |
|---|---|---|
| workspace_template | No | Path to workspace template directory |
| cwd | No | Working directory (mutually exclusive with workspace_template) |
| judge_target | Yes | LLM target for evaluation |
VS Code / Copilot
```yaml
targets:
  - name: vscode_dev
    provider: vscode
    workspace_template: ${{ WORKSPACE_PATH }}
    judge_target: azure_base
```

| Field | Required | Description |
|---|---|---|
| executable | No | Path to VS Code binary. Supports ${{ ENV_VAR }} syntax or literal paths. Defaults to code (or code-insiders for the insiders provider). |
| workspace_template | Yes | Path to workspace template directory |
| judge_target | Yes | LLM target for evaluation |
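To make the ${{ ENV_VAR }} syntax concrete, here is a rough sketch of how such a reference could be expanded against the environment. It is an assumption about the behavior, not the tool's actual resolver: the function name `resolve_env_refs`, the exact pattern accepted, and the choice to raise on a missing variable are all hypothetical.

```python
import os
import re

# Matches ${{ NAME }} with optional whitespace inside the braces (assumed syntax).
_ENV_REF = re.compile(r"\$\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def resolve_env_refs(value: str, env=None) -> str:
    """Replace each ${{ NAME }} in value with env[NAME]; literal paths pass through."""
    env = os.environ if env is None else env

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Assumed policy: fail loudly instead of substituting an empty string.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_REF.sub(_lookup, value)


fake_env = {"VSCODE_CMD": "/usr/local/bin/code"}
print(resolve_env_refs("${{ VSCODE_CMD }}", fake_env))
print(resolve_env_refs("/opt/vscode/bin/code", fake_env))
```

Passing an explicit mapping instead of reading os.environ directly keeps the sketch easy to test.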
Using a custom executable path:
```yaml
targets:
  - name: vscode_dev
    provider: vscode
    executable: ${{ VSCODE_CMD }}
    workspace_template: ${{ WORKSPACE_PATH }}
    judge_target: azure_base
```

VS Code Insiders

```yaml
targets:
  - name: vscode_insiders
    provider: vscode-insiders
    workspace_template: ${{ WORKSPACE_PATH }}
    judge_target: azure_base
```

Same configuration as VS Code.
Custom CLI Agent
Evaluate any command-line agent:

```yaml
targets:
  - name: local_agent
    provider: cli
    command_template: 'python agent.py --prompt {PROMPT}'
    workspace_template: ./workspace-templates/my-project
    judge_target: azure_base
```

| Field | Required | Description |
|---|---|---|
| command_template | Yes | Command to run. {PROMPT} is replaced with the input. |
| workspace_template | No | Path to workspace template directory |
| cwd | No | Working directory (mutually exclusive with workspace_template) |
| judge_target | Yes | LLM target for evaluation |
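The {PROMPT} substitution can be pictured with a short sketch. This is an illustration under stated assumptions rather than the harness's implementation: the function name `render_command` is hypothetical, and shell-quoting the prompt with `shlex.quote` is a design choice made here so that a multi-word prompt reaches the agent as one argument. Whether the real provider quotes the input is not stated in this page.

```python
import shlex


def render_command(command_template: str, prompt: str) -> str:
    """Fill the {PROMPT} placeholder of a command template with a quoted prompt."""
    # Assumed behavior: quote the prompt so spaces and quotes survive the shell.
    return command_template.replace("{PROMPT}", shlex.quote(prompt))


template = "python agent.py --prompt {PROMPT}"
command = render_command(template, "fix the failing test in utils.py")
print(command)

# Splitting the rendered command shows the prompt arrives as a single argument.
print(shlex.split(command))
```

If an agent reads its prompt from a file or from standard input instead, the template would need a different shape, which this sketch does not cover.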
Mock Provider
For testing the evaluation harness without calling real providers:

```yaml
targets:
  - name: mock_target
    provider: mock
```