POST /api/dataframer/runs
Python
import os
from dataframer import Dataframer

client = Dataframer(
    api_key=os.environ.get("DATAFRAMER_API_KEY"),  # This is the default and can be omitted
)
run = client.dataframer.runs.create(
    number_of_samples=1,
    spec_id="182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
)
print(run.id)
{
  "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a"
}
Async operation: This endpoint returns immediately with a run ID. Poll GET /api/dataframer/runs/{id}/ until status changes from PENDING/IN_PROGRESS to SUCCEEDED or FAILED.
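The polling loop described above can be sketched as follows. This is a minimal sketch, not the SDK's own helper: `wait_for_run` and its `fetch_status` callable are hypothetical names, and the caller is assumed to supply a function that performs GET /api/dataframer/runs/{id}/ and returns the run's status string.

```python
import time
from typing import Callable

# Terminal statuses named in the note above.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def wait_for_run(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status() until it returns a terminal status or the timeout passes.

    fetch_status is a hypothetical callable supplied by the caller; it should
    return the current status string of the run.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status} after {timeout} seconds")
        sleep(interval)
```

The interval and timeout defaults are illustrative values, not documented limits.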

Authorizations

Authorization
string
header
required

API Key authentication. Format: "Bearer YOUR_API_KEY"
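For callers not using the SDK, the header format above might be applied as in the sketch below. It only builds the request and does not send it. `BASE_URL` is a placeholder assumption (the host is not given on this page), and `build_create_run_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute the real API base URL.
BASE_URL = "https://example.invalid"


def build_create_run_request(api_key, spec_id, number_of_samples, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying the Bearer header."""
    body = json.dumps(
        {"spec_id": spec_id, "number_of_samples": number_of_samples}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/dataframer/runs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # Format: "Bearer YOUR_API_KEY"
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response body is expected to contain the run `id`.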

Body

application/json

Request body for creating a generation run

spec_id
string<uuid>
required

ID of the spec to use for generation. Spec must be in SUCCEEDED status.

number_of_samples
integer
required

Number of samples to generate

Required range: 1 <= x <= 20000
spec_version
integer

Version number to use (optional, defaults to latest)

Required range: x >= 1
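The constraints on spec_id, number_of_samples, and spec_version can be checked client-side before submitting. The sketch below is illustrative only; `core_run_fields` is a hypothetical helper, and the server remains the source of truth for validation.

```python
import uuid


def core_run_fields(spec_id, number_of_samples, spec_version=None):
    """Validate the documented ranges and return the matching body fields."""
    uuid.UUID(spec_id)  # raises ValueError if spec_id is not a valid UUID
    if not 1 <= number_of_samples <= 20000:
        raise ValueError("number_of_samples must be between 1 and 20000")
    fields = {"spec_id": spec_id, "number_of_samples": number_of_samples}
    if spec_version is not None:
        if spec_version < 1:
            raise ValueError("spec_version must be >= 1")
        fields["spec_version"] = spec_version
    # Omitting spec_version lets the API default to the latest version.
    return fields
```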
generation_model
enum<string>
default:anthropic/claude-sonnet-4-6

Model for generation. Use -thinking suffix to enable thinking mode. For databricks/ models, you must also provide databricks_client_id, databricks_client_secret, and databricks_api_base.

Available options:
anthropic/claude-opus-4-6,
anthropic/claude-opus-4-6-thinking,
anthropic/claude-sonnet-4-6,
anthropic/claude-sonnet-4-6-thinking,
anthropic/claude-haiku-4-5,
anthropic/claude-haiku-4-5-thinking,
deepseek-ai/DeepSeek-V3.1,
moonshotai/Kimi-K2-Instruct,
openai/gpt-oss-120b,
deepseek-ai/DeepSeek-R1-0528-tput,
Qwen/Qwen2.5-72B-Instruct-Turbo,
gemini/gemini-3-pro-preview,
gemini/gemini-3-pro-preview-thinking,
databricks/databricks-claude-3-7-sonnet,
databricks/databricks-claude-haiku-4-5,
databricks/databricks-claude-opus-4-1,
databricks/databricks-claude-opus-4-5,
databricks/databricks-claude-opus-4-6,
databricks/databricks-claude-sonnet-4,
databricks/databricks-claude-sonnet-4-5,
databricks/databricks-gemini-2-5-flash,
databricks/databricks-gemini-2-5-pro,
databricks/databricks-gemini-3-flash,
databricks/databricks-gemini-3-pro,
databricks/databricks-gpt-5
outline_model
enum<string>
default:anthropic/claude-sonnet-4-6-thinking

Model for outline generation

Available options:
anthropic/claude-opus-4-6,
anthropic/claude-opus-4-6-thinking,
anthropic/claude-sonnet-4-6,
anthropic/claude-sonnet-4-6-thinking,
anthropic/claude-haiku-4-5,
anthropic/claude-haiku-4-5-thinking,
deepseek-ai/DeepSeek-V3.1,
moonshotai/Kimi-K2-Instruct,
openai/gpt-oss-120b,
deepseek-ai/DeepSeek-R1-0528-tput,
Qwen/Qwen2.5-72B-Instruct-Turbo,
gemini/gemini-3-pro-preview,
gemini/gemini-3-pro-preview-thinking,
databricks/databricks-claude-3-7-sonnet,
databricks/databricks-claude-haiku-4-5,
databricks/databricks-claude-opus-4-1,
databricks/databricks-claude-opus-4-5,
databricks/databricks-claude-opus-4-6,
databricks/databricks-claude-sonnet-4,
databricks/databricks-claude-sonnet-4-5,
databricks/databricks-gemini-2-5-flash,
databricks/databricks-gemini-2-5-pro,
databricks/databricks-gemini-3-flash,
databricks/databricks-gemini-3-pro,
databricks/databricks-gpt-5
revision_model
enum<string>
default:anthropic/claude-sonnet-4-6-thinking

Model for revisions and filtering (only used if revision_types or filtering_types is set)

Available options:
anthropic/claude-opus-4-6,
anthropic/claude-opus-4-6-thinking,
anthropic/claude-sonnet-4-6,
anthropic/claude-sonnet-4-6-thinking,
anthropic/claude-haiku-4-5,
anthropic/claude-haiku-4-5-thinking,
deepseek-ai/DeepSeek-V3.1,
moonshotai/Kimi-K2-Instruct,
openai/gpt-oss-120b,
deepseek-ai/DeepSeek-R1-0528-tput,
Qwen/Qwen2.5-72B-Instruct-Turbo,
gemini/gemini-3-pro-preview,
gemini/gemini-3-pro-preview-thinking,
databricks/databricks-claude-3-7-sonnet,
databricks/databricks-claude-haiku-4-5,
databricks/databricks-claude-opus-4-1,
databricks/databricks-claude-opus-4-5,
databricks/databricks-claude-opus-4-6,
databricks/databricks-claude-sonnet-4,
databricks/databricks-claude-sonnet-4-5,
databricks/databricks-gemini-2-5-flash,
databricks/databricks-gemini-2-5-pro,
databricks/databricks-gemini-3-flash,
databricks/databricks-gemini-3-pro,
databricks/databricks-gpt-5
revision_types
enum<string>[]

List of revision types to apply. Valid values: 'coherence_flow' (fix formatting/flow issues), 'consistency' (fix internal contradictions), 'distinguishability' (seeded only — blend with seed style), 'conformance' (verify property compliance). Defaults to empty (no revisions).

Available options:
coherence_flow,
consistency,
distinguishability,
conformance
filtering_types
enum<string>[]

List of filtering quality gates. Valid values: 'structural' (reject severe format issues), 'conformance' (reject property violations). Documents that fail are regenerated. Defaults to empty (no filtering).

Available options:
structural,
conformance
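As a rough illustration of how revision_types and filtering_types combine in a request body, the sketch below rejects values outside the documented options. `quality_fields` is a hypothetical helper name, not part of the SDK.

```python
# Allowed values as listed in the field descriptions above.
REVISION_TYPES = {"coherence_flow", "consistency", "distinguishability", "conformance"}
FILTERING_TYPES = {"structural", "conformance"}


def quality_fields(revision_types=(), filtering_types=()):
    """Return body fields for revisions and filtering, rejecting unknown values."""
    unknown_revisions = set(revision_types) - REVISION_TYPES
    unknown_filters = set(filtering_types) - FILTERING_TYPES
    if unknown_revisions or unknown_filters:
        raise ValueError(
            f"unknown types: {sorted(unknown_revisions | unknown_filters)}"
        )
    # Empty lists match the documented defaults (no revisions, no filtering).
    return {
        "revision_types": list(revision_types),
        "filtering_types": list(filtering_types),
    }
```

Note that 'conformance' is valid in both lists: as a revision type it verifies property compliance, and as a filtering type it rejects documents that violate properties.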
max_revision_cycles
integer
default:1

Maximum number of revision cycles. Use 2-3 for complex documents that require internal consistency, such as financial reports or invoices; increase to 3-5 for the highest quality or when generated data has issues.

Required range: 1 <= x <= 5
seed_shuffling_level
enum<string>
default:none

(advanced) How to shuffle seed examples between samples

Available options:
none,
sample,
field,
prompt
max_examples_in_prompt
integer

(advanced) Maximum number of seed examples to include in prompts. By default, only as many seeds as fit in 10K tokens are used. Use this to override the default.

Required range: x >= 1
unified_multifield
boolean
default:true

(advanced) Use unified multifield generation. This reduces generation cost by processing all fields together rather than one by one.

generation_thinking_budget
integer
default:1024

Token budget for extended thinking during generation. Only applies to models with -thinking suffix.

Required range: x >= 1024
outline_thinking_budget
integer
default:1024

Token budget for extended thinking during outline generation. Only applies to models with -thinking suffix.

Required range: x >= 1024
revision_thinking_budget
integer
default:1024

Token budget for extended thinking during revisions. Only applies to models with -thinking suffix.

Required range: x >= 1024
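Since the three thinking budgets apply only to models with the -thinking suffix, a caller may want to drop budgets that would have no effect. The sketch below shows one way to do that. `thinking_budget_fields` is a hypothetical helper, and it assumes each stage's model is passed explicitly; it does not account for the default models, two of which are -thinking variants.

```python
MIN_THINKING_BUDGET = 1024
STAGES = ("generation", "outline", "revision")


def thinking_budget_fields(models, budgets):
    """Return <stage>_thinking_budget fields only for stages using a -thinking model.

    models and budgets are dicts keyed by stage name ("generation",
    "outline", "revision"); this keying is an assumption of the sketch.
    """
    fields = {}
    for stage in STAGES:
        budget = budgets.get(stage)
        if budget is None:
            continue
        if budget < MIN_THINKING_BUDGET:
            raise ValueError(f"{stage}_thinking_budget must be >= {MIN_THINKING_BUDGET}")
        if models.get(stage, "").endswith("-thinking"):
            fields[f"{stage}_thinking_budget"] = budget
    return fields
```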
databricks_client_id
string

Databricks service principal application (client) ID. Required when using databricks/ models.

databricks_client_secret
string

Databricks service principal secret. Required when using databricks/ models.

databricks_api_base
string

Databricks Model Serving endpoint URL (e.g. https://adb-xxx.azuredatabricks.net/serving-endpoints). Required when using databricks/ models.
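The Databricks requirement can be checked before a request is sent. In the sketch below, `missing_databricks_fields` is a hypothetical helper, and applying the check to all three model fields (not only generation_model) is an assumption.

```python
DATABRICKS_FIELDS = (
    "databricks_client_id",
    "databricks_client_secret",
    "databricks_api_base",
)
# Assumption: the requirement applies to any of the three model fields.
MODEL_FIELDS = ("generation_model", "outline_model", "revision_model")


def missing_databricks_fields(body):
    """Return the Databricks fields still missing from a request body dict."""
    uses_databricks = any(
        str(body.get(field, "")).startswith("databricks/") for field in MODEL_FIELDS
    )
    if not uses_databricks:
        return []
    return [field for field in DATABRICKS_FIELDS if not body.get(field)]
```

An empty list means the body either uses no databricks/ model or already carries all three credentials fields.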

skip_outline
boolean
default:true

Skip outline and part-by-part generation. Generates a document draft directly in a single call. Faster and cheaper but not suitable for very long documents.

tools
string[]

List of tools available to the LLM during generation. Currently supported: 'calculator' (sandboxed Python for numerical verification). Enabling tools roughly doubles cost and time. Defaults to empty.

enable_revisions
boolean
default:false
deprecated

Use revision_types and filtering_types instead. When set to true without those fields, enables all revision types and structural filtering.
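A request body that still sets enable_revisions could be rewritten to the explicit fields along these lines. This is a sketch: `migrate_enable_revisions` is a hypothetical helper, and it assumes that a body setting either revision_types or filtering_types counts as already explicit.

```python
ALL_REVISION_TYPES = ["coherence_flow", "consistency", "distinguishability", "conformance"]


def migrate_enable_revisions(body):
    """Return a copy of body with enable_revisions replaced by explicit fields."""
    migrated = dict(body)
    enabled = migrated.pop("enable_revisions", False)
    # Assumption: setting either explicit field means the caller has opted out
    # of the legacy "enable everything" behaviour.
    explicit = "revision_types" in migrated or "filtering_types" in migrated
    if enabled and not explicit:
        migrated["revision_types"] = list(ALL_REVISION_TYPES)
        migrated["filtering_types"] = ["structural"]
    return migrated
```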

Response

Run created and submitted

id
string<uuid>

Run ID for polling status