Start a new evaluation for a completed run
Poll GET /api/dataframer/evaluations/{evaluation_id}/ until the status is COMPLETED or FAILED. The run must be in SUCCEEDED status before an evaluation can be created.
API Key authentication. Format: "Bearer YOUR_API_KEY"
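As a rough sketch of the polling pattern above, assuming the Python requests library, a placeholder base URL, and a response field named status (none of which are confirmed by this reference):

```python
import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.example.com"  # assumed placeholder; substitute your deployment's base URL


def wait_for_evaluation(evaluation_id: str, interval: float = 5.0) -> dict:
    """Poll the evaluation until it reaches a terminal status and return its JSON payload."""
    headers = {"Authorization": f"Bearer {API_KEY}"}
    url = f"{BASE_URL}/api/dataframer/evaluations/{evaluation_id}/"
    while True:
        resp = requests.get(url, headers=headers)
        resp.raise_for_status()
        evaluation = resp.json()
        # "status" is an assumed field name; both COMPLETED and SUCCEEDED are treated as
        # terminal here because this reference uses both names for the finished state.
        if evaluation.get("status") in ("COMPLETED", "SUCCEEDED", "FAILED"):
            return evaluation
        time.sleep(interval)
```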
Request body for creating an evaluation
ID of the completed run to evaluate. Run must be in SUCCEEDED status.
AI model to use for evaluation. Defaults to gemini/gemini-3-pro-preview.
Allowed values: anthropic/claude-opus-4-5, anthropic/claude-opus-4-5-thinking, anthropic/claude-sonnet-4-5, anthropic/claude-sonnet-4-5-thinking, anthropic/claude-haiku-4-5, deepseek-ai/DeepSeek-V3.1, deepseek-ai/DeepSeek-R1-0528-tput, Qwen/Qwen2.5-72B-Instruct-Turbo, moonshotai/Kimi-K2-Instruct, openai/gpt-oss-120b, openai/gpt-4.1, gemini/gemini-2.5-pro, gemini/gemini-3-pro-preview, gemini/gemini-3-pro-preview-thinking
Evaluation started successfully
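A minimal sketch of the request that produces this response, assuming the collection endpoint is POST /api/dataframer/evaluations/ and that the request body fields are named run_id and model (the path and field names are illustrative, not confirmed by this reference):

```python
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.example.com"  # assumed placeholder base URL


def create_evaluation(run_id: str, model: str = "gemini/gemini-3-pro-preview") -> dict:
    """Start an evaluation for a run in SUCCEEDED status and return the created evaluation."""
    resp = requests.post(
        f"{BASE_URL}/api/dataframer/evaluations/",  # assumed collection path
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "run_id": run_id,  # assumed field name: ID of the completed run to evaluate
            "model": model,    # assumed field name: AI model used for evaluation
        },
    )
    resp.raise_for_status()
    return resp.json()
```

The returned payload can then be polled with the GET endpoint shown earlier until it reaches a terminal status.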
Full evaluation details including distribution analysis and sample classifications
Unique identifier for the evaluation
ID of the run being evaluated
Current status of the evaluation
Allowed values: PENDING, PROCESSING, SUCCEEDED, FAILED
Overall conformance score (0-100) measuring how well generated samples match the spec's expected distributions. Null until evaluation completes.
Human-readable explanation of the conformance score and any notable deviations
Per-property comparison of expected vs observed distributions. Null until evaluation completes.
Classification results for each generated sample. Empty until evaluation completes.
When evaluation processing started
When evaluation completed
Error message if evaluation failed
Email of the user who created the evaluation
When the evaluation was created
Time taken to complete the evaluation in seconds
ID of the company that owns this evaluation
Human-readable status display
Description of areas where samples conform well to the spec
Description of areas where samples deviate from the spec
Internal trace information including task_id and evaluation model used
ID of the user who created this evaluation
When the evaluation was last updated
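A hedged sketch of reading the main result fields once the evaluation finishes; every field name used here (status, error_message, conformance_score, conformance_explanation, distributions, sample_classifications) is an illustrative guess for the fields described above, not a confirmed name:

```python
def summarize_evaluation(evaluation: dict) -> None:
    """Print the headline results from a finished evaluation payload."""
    if evaluation.get("status") == "FAILED":
        # Error message if the evaluation failed (assumed field name).
        print("Evaluation failed:", evaluation.get("error_message"))
        return

    # Overall conformance score (0-100); null until the evaluation completes.
    print("Conformance score:", evaluation.get("conformance_score"))
    # Human-readable explanation of the score and notable deviations.
    print("Explanation:", evaluation.get("conformance_explanation"))
    # Per-property comparison of expected vs. observed distributions (assumed to be a mapping).
    for prop, dist in (evaluation.get("distributions") or {}).items():
        print(f"{prop}: {dist}")
    # Classification results for each generated sample (assumed to be a list).
    print("Samples classified:", len(evaluation.get("sample_classifications") or []))
```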