POST /api/dataframer/specs
Python
import os
from dataframer import Dataframer

client = Dataframer(
    api_key=os.environ.get("DATAFRAMER_API_KEY"),  # This is the default and can be omitted
)
spec = client.dataframer.specs.create(
    name="Customer Support Conversations Spec",
    dataset_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    extrapolate_axes=False,
    extrapolate_values=True,
    generate_distributions=True,
    spec_generation_model_name="anthropic/claude-sonnet-4-6",
)
print(spec.id)
{
  "id": "550e8400-e29b-41d4-a716-446655440000"
}
Async operation: This endpoint returns immediately with a spec ID. Poll GET /api/dataframer/specs/{id}/ until status is SUCCEEDED or FAILED.
Supports both seeded specs (provide dataset_id to analyze a seed dataset) and seedless specs (provide generation_objectives to describe what data to generate).

Authorizations

Authorization
string
header
required

API Key authentication. Format: "Bearer YOUR_API_KEY"
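For callers not using the Python SDK, the header format above can be sketched with the standard library. This is a minimal illustration, not the official client: the base URL is a placeholder (the page does not state the API host), and the helper name `build_create_spec_request` is hypothetical. The request is only constructed, never sent.

```python
import json
import os
import urllib.request

# Placeholder host: the page does not state the API base URL.
BASE_URL = "https://example.invalid"


def build_create_spec_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request using Bearer authentication."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/dataframer/specs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Format documented above: "Bearer YOUR_API_KEY"
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_spec_request(
    api_key=os.environ.get("DATAFRAMER_API_KEY", "YOUR_API_KEY"),
    payload={
        "name": "Customer Support Conversations Spec",
        "dataset_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    },
)
print(request.get_method(), request.full_url)
```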

Body

application/json
name
string
required

Name for the new spec (must be unique)

generation_objectives
string

Custom objectives or instructions for data generation that directly influence contents of the generated spec. Required for seedless specs (when dataset_id is omitted).

dataset_id
string<uuid>

ID of the seed dataset to generate spec from. Omit for seedless spec creation.
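The seeded and seedless modes differ only in which of `dataset_id` and `generation_objectives` is supplied. A client-side check along these lines could catch a missing field before the request is sent. The helper `build_spec_payload` is hypothetical and only encodes the rule stated above (objectives are required when `dataset_id` is omitted); how the server treats other combinations is not documented here.

```python
from typing import Optional


def build_spec_payload(
    name: str,
    dataset_id: Optional[str] = None,
    generation_objectives: Optional[str] = None,
) -> dict:
    """Assemble a request body for a seeded or seedless spec."""
    # Seedless specs (no dataset_id) must describe what to generate.
    if dataset_id is None and not generation_objectives:
        raise ValueError(
            "generation_objectives is required when dataset_id is omitted"
        )
    payload = {"name": name}
    if dataset_id is not None:
        payload["dataset_id"] = dataset_id
    if generation_objectives:
        payload["generation_objectives"] = generation_objectives
    return payload


# Seeded: analyze an existing dataset.
seeded = build_spec_payload(
    "Customer Support Conversations Spec",
    dataset_id="a1b2c3d4-e5f6-7890-abcd-ef1234567890",
)

# Seedless: describe the data to generate instead.
seedless = build_spec_payload(
    "Seedless Support Spec",
    generation_objectives="Short customer support chats about billing issues",
)
```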

spec_generation_model_name
enum<string>
default:anthropic/claude-sonnet-4-6-thinking

AI model to use for spec generation. For databricks/ models, you must also provide databricks_client_id, databricks_client_secret, and databricks_api_base.

Available options:
anthropic/claude-opus-4-6,
anthropic/claude-opus-4-6-thinking,
anthropic/claude-sonnet-4-6,
anthropic/claude-sonnet-4-6-thinking,
anthropic/claude-haiku-4-5,
anthropic/claude-haiku-4-5-thinking,
deepseek-ai/DeepSeek-V3.1,
moonshotai/Kimi-K2-Instruct,
openai/gpt-oss-120b,
deepseek-ai/DeepSeek-R1-0528-tput,
Qwen/Qwen2.5-72B-Instruct-Turbo,
gemini/gemini-3-pro-preview,
gemini/gemini-3-pro-preview-thinking,
databricks/databricks-claude-3-7-sonnet,
databricks/databricks-claude-haiku-4-5,
databricks/databricks-claude-opus-4-1,
databricks/databricks-claude-opus-4-5,
databricks/databricks-claude-opus-4-6,
databricks/databricks-claude-sonnet-4,
databricks/databricks-claude-sonnet-4-5,
databricks/databricks-gemini-2-5-flash,
databricks/databricks-gemini-2-5-pro,
databricks/databricks-gemini-3-flash,
databricks/databricks-gemini-3-pro,
databricks/databricks-gpt-5
generate_distributions
boolean
default:true

When true, the spec will include generated probability distributions for each property value; when false, each property will have a uniform distribution.

generate_conditional_distributions
boolean
default:false

Generate conditional distributions showing how property values vary based on other properties. Requires generate_distributions to be true.
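The dependency between the two distribution flags can be checked before calling the API. The following is a small sketch with a hypothetical helper name; the defaults mirror the ones documented above.

```python
def distribution_options(
    generate_distributions: bool = True,
    generate_conditional_distributions: bool = False,
) -> dict:
    """Return the two distribution flags, rejecting the unsupported combination."""
    # Conditional distributions require generate_distributions to be true.
    if generate_conditional_distributions and not generate_distributions:
        raise ValueError(
            "generate_conditional_distributions requires generate_distributions=True"
        )
    return {
        "generate_distributions": generate_distributions,
        "generate_conditional_distributions": generate_conditional_distributions,
    }


options = distribution_options(generate_conditional_distributions=True)
```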

extrapolate_values
boolean
default:true

Extrapolate new values beyond existing data ranges. Not applicable for seedless specs.

extrapolate_axes
boolean
default:false

Extrapolate to new axes/dimensions not present in seed data. Not applicable for seedless specs.

description
string

Optional description of the spec's purpose, used for organization only.

databricks_client_id
string

Databricks service principal application (client) ID. Required when using databricks/ models.

databricks_client_secret
string

Databricks service principal secret. Required when using databricks/ models.

databricks_api_base
string

Databricks Model Serving endpoint URL (e.g. https://adb-xxx.azuredatabricks.net/serving-endpoints). Required when using databricks/ models.
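Because the three Databricks fields are only required for `databricks/` models, a pre-flight check can report which ones are missing. This is an illustrative sketch; `missing_databricks_fields` is a hypothetical helper, and the credential values shown are placeholders.

```python
from typing import List

DATABRICKS_FIELDS = (
    "databricks_client_id",
    "databricks_client_secret",
    "databricks_api_base",
)


def missing_databricks_fields(payload: dict) -> List[str]:
    """List required Databricks fields absent from a request body."""
    model = payload.get("spec_generation_model_name", "")
    # Only databricks/ models need the service principal credentials.
    if not model.startswith("databricks/"):
        return []
    return [field for field in DATABRICKS_FIELDS if not payload.get(field)]


payload = {
    "name": "Databricks-backed spec",
    "spec_generation_model_name": "databricks/databricks-claude-sonnet-4-5",
    "databricks_client_id": "placeholder-client-id",
}
print(missing_databricks_fields(payload))
```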

Response

Spec creation started. Poll GET /api/dataframer/specs/{id}/ until status is SUCCEEDED or FAILED.

id
string<uuid>

ID of the newly created spec
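Since creation is asynchronous, callers typically wrap the status check in a polling loop. The sketch below is deliberately transport-agnostic: `fetch_status` is a caller-supplied function that performs GET /api/dataframer/specs/{id}/ and returns the `status` string, since this page does not show the SDK's retrieval method. The helper name, interval, and timeout are assumptions, and the non-terminal status used in the stand-in is hypothetical.

```python
import time
from typing import Callable

# Terminal statuses documented for this endpoint.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def wait_for_spec(
    fetch_status: Callable[[str], str],
    spec_id: str,
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
) -> str:
    """Poll until the spec reaches SUCCEEDED or FAILED, or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status(spec_id)
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"spec {spec_id} still {status!r} after timeout")
        time.sleep(interval_seconds)


# Stand-in for a real GET call; "RUNNING" is a hypothetical intermediate status.
_fake_statuses = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
final_status = wait_for_spec(
    fetch_status=lambda spec_id: next(_fake_statuses),
    spec_id="550e8400-e29b-41d4-a716-446655440000",
    interval_seconds=0.0,
)
print(final_status)
```

In real use, `fetch_status` would issue the authenticated GET request and a `FAILED` result should be handled explicitly rather than treated as success.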