POST /api/dataframer/evaluations/
curl --request POST \
  --url https://df-api.dataframer.ai/api/dataframer/evaluations/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "run_id": "a98715da-921d-4326-bbf8-208f8bcc2956"
}
'
{
  "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "run_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "status": "PENDING",
  "conformance_score": 123,
  "conformance_explanation": "<string>",
  "distribution_analysis": [
    {
      "property_name": "<string>",
      "total_samples": 123,
      "expected_distributions": {
        "positive": 40,
        "negative": 30,
        "neutral": 30
      },
      "observed_distributions": {
        "positive": 45,
        "negative": 28,
        "neutral": 27
      },
      "total_samples_analyzed": 123
    }
  ],
  "sample_classifications": [
    {
      "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
      "evaluation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
      "sample_identifier": "<string>",
      "classifications": {
        "sentiment": "positive",
        "topic": "technology",
        "length": "medium"
      },
      "sub_file_classifications": {
        "patient_id": {
          "Patient A": [
            "notes.txt",
            "labs.txt"
          ]
        },
        "condition": {
          "diabetes": [
            "notes.txt"
          ]
        }
      },
      "created_at": "2023-11-07T05:31:56Z"
    }
  ],
  "started_at": "2023-11-07T05:31:56Z",
  "completed_at": "2023-11-07T05:31:56Z",
  "error_message": "<string>",
  "created_by_email": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "duration_seconds": 123,
  "company_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "status_display": "<string>",
  "conformant_areas": "<string>",
  "non_conformant_areas": "<string>",
  "trace": {},
  "created_by": 123,
  "updated_at": "2023-11-07T05:31:56Z"
}
Async operation: This endpoint returns immediately with an evaluation ID. Poll GET /api/dataframer/evaluations/{evaluation_id}/ until status is SUCCEEDED or FAILED.
The run must be in SUCCEEDED status before an evaluation can be created.
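The polling loop described above can be sketched as follows. This is an illustrative helper, not part of the API client: `poll_evaluation` and `fetch_status` are names chosen here, and `fetch_status` is injected as a callable so the retry logic can be shown without a live request. In practice it would GET `/api/dataframer/evaluations/{evaluation_id}/` and return the `status` field of the JSON response.

```python
import time

# Terminal states per the status enum documented below.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}

def poll_evaluation(fetch_status, interval_seconds=5, max_attempts=60):
    """Call fetch_status() until it returns a terminal status.

    Raises TimeoutError if the evaluation is still PENDING/PROCESSING
    after max_attempts polls.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval_seconds)
    raise TimeoutError("evaluation did not finish within the polling window")
```

Injecting the fetcher also makes the loop easy to unit-test with a canned sequence of statuses before wiring it to real HTTP calls.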

Authorizations

Authorization
string
header
required

API Key authentication. Format: "Bearer YOUR_API_KEY"

Body

application/json

Request body for creating an evaluation

run_id
string<uuid>
required

ID of the completed run to evaluate. Run must be in SUCCEEDED status.

evaluation_model
enum<string>
default:gemini/gemini-3-pro-preview

AI model to use for evaluation. Defaults to gemini/gemini-3-pro-preview.

Available options:
anthropic/claude-opus-4-5,
anthropic/claude-opus-4-5-thinking,
anthropic/claude-sonnet-4-5,
anthropic/claude-sonnet-4-5-thinking,
anthropic/claude-haiku-4-5,
deepseek-ai/DeepSeek-V3.1,
deepseek-ai/DeepSeek-R1-0528-tput,
Qwen/Qwen2.5-72B-Instruct-Turbo,
moonshotai/Kimi-K2-Instruct,
openai/gpt-oss-120b,
openai/gpt-4.1,
gemini/gemini-2.5-pro,
gemini/gemini-3-pro-preview,
gemini/gemini-3-pro-preview-thinking
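A request that overrides the default model can be built as below. This is a minimal sketch using only the stdlib: the URL, header format, and field names come from this page, while `build_evaluation_request` and the token/run_id values are placeholders of our own.

```python
import json
import urllib.request

def build_evaluation_request(token, run_id, evaluation_model=None):
    """Build the POST request for /api/dataframer/evaluations/."""
    payload = {"run_id": run_id}
    if evaluation_model is not None:
        # Omit the field entirely to fall back to the documented default,
        # gemini/gemini-3-pro-preview.
        payload["evaluation_model"] = evaluation_model
    return urllib.request.Request(
        "https://df-api.dataframer.ai/api/dataframer/evaluations/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Send with: urllib.request.urlopen(build_evaluation_request(...))
```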

Response

Evaluation started successfully

Full evaluation details including distribution analysis and sample classifications

id
string<uuid>

Unique identifier for the evaluation

run_id
string<uuid>

ID of the run being evaluated

status
enum<string>

Current status of the evaluation

Available options:
PENDING,
PROCESSING,
SUCCEEDED,
FAILED
conformance_score
number | null

Overall conformance score (0-100) measuring how well generated samples match the spec's expected distributions. Null until evaluation completes.

conformance_explanation
string | null

Human-readable explanation of the conformance score and any notable deviations

distribution_analysis
object[] | null

Per-property comparison of expected vs observed distributions. Null until evaluation completes.
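To make the per-property comparison concrete, a small client-side helper (not part of the API) can diff the two distributions in one `distribution_analysis` entry, shaped like the response example above:

```python
def distribution_deviation(entry):
    """Return observed-minus-expected counts per bucket for one entry."""
    expected = entry["expected_distributions"]
    observed = entry["observed_distributions"]
    return {
        bucket: observed.get(bucket, 0) - expected[bucket]
        for bucket in expected
    }

# Using the values from the response example on this page:
entry = {
    "property_name": "sentiment",
    "expected_distributions": {"positive": 40, "negative": 30, "neutral": 30},
    "observed_distributions": {"positive": 45, "negative": 28, "neutral": 27},
}
# distribution_deviation(entry) -> {"positive": 5, "negative": -2, "neutral": -3}
```
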

sample_classifications
object[]

Classification results for each generated sample. Empty until evaluation completes.

started_at
string<date-time> | null

When evaluation processing started

completed_at
string<date-time> | null

When evaluation completed

error_message
string | null

Error message if evaluation failed

created_by_email
string

Email of the user who created the evaluation

created_at
string<date-time>

When the evaluation was created

duration_seconds
number | null

Time taken to complete the evaluation in seconds

company_id
string<uuid>

ID of the company that owns this evaluation

status_display
string

Human-readable status display

conformant_areas
string | null

Description of areas where samples conform well to the spec

non_conformant_areas
string | null

Description of areas where samples deviate from the spec

trace
object

Internal trace information including task_id and evaluation model used

created_by
integer

ID of the user who created this evaluation

updated_at
string<date-time>

When the evaluation was last updated