New Features

🔑 Programmatic API Key Rotation Endpoint

Dataframer now exposes a dedicated API key rotation endpoint for programmatic usage, with built-in rate limiting.

Details:
  • Endpoint: POST /api/users/api-key/rotate on the Dataframer API (https://df-api.dataframer.ai)
  • Authentication: requires a valid API key passed as a Bearer token in the Authorization header (not a JWT)
  • Response: returns a freshly generated API key, a masked version of the key, and the new expiration timestamp
  • Safety: rate limited to 5 requests per hour per IP to prevent accidental or abusive rotation storms
Use this when you want to rotate keys directly from infrastructure or CI without going through the UI.

Example request:
curl -X POST https://df-api.dataframer.ai/api/users/api-key/rotate \
  -H "Authorization: Bearer YOUR_CURRENT_API_KEY"
Sample response:
{
  "api_key": "26e5...........................................624e",
  "masked_key": "26e5********************************************************624e",
  "expires_at": "2027-02-20T20:21:59.820057+00:00"
}
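
The same call can be sketched in Python using only the standard library. The URL and Authorization header are taken from the curl example above. The function names `build_rotate_request` and `rotate_api_key` are illustrative and not part of any Dataframer SDK, and treating HTTP 429 as the rate-limit signal is an assumption that the release note does not confirm.

```python
import json
import urllib.error
import urllib.request

ROTATE_URL = "https://df-api.dataframer.ai/api/users/api-key/rotate"


def build_rotate_request(current_key: str) -> urllib.request.Request:
    """Build the POST request; the current API key is sent as a Bearer token."""
    return urllib.request.Request(
        ROTATE_URL,
        method="POST",
        headers={"Authorization": f"Bearer {current_key}"},
    )


def rotate_api_key(current_key: str, timeout: float = 10.0) -> dict:
    """Call the endpoint and return the parsed JSON body.

    Per the sample response, the body should contain the keys
    api_key, masked_key, and expires_at.
    """
    request = build_rotate_request(current_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Assumption: a 429 status means the 5-per-hour limit was hit.
        if err.code == 429:
            raise RuntimeError("Rotation rate limit reached; retry later") from err
        raise
```

In a CI job you would likely call `rotate_api_key` once, write the returned `api_key` to your secret store, and log only `masked_key`.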

📄 PDF Support

Dataframer now supports PDFs as a first-class file type across datasets and generation workflows.

Highlights:
  • Dataset ingestion: .pdf is now accepted anywhere you upload dataset files, alongside .txt, .md, .json, .csv, and .jsonl
  • PDF-aware processing: internal pipelines recognize PDF files and handle them with dedicated logic for storage, conversion, and preview
  • Template prompts for PDFs: you can pass a pdf_template_prompt to control the visual style of generated PDFs (e.g., “Professional corporate style with blue headers”), with validation and tests in place to keep prompts within safe limits
Combined with the existing PDF generation capabilities, this makes it much easier to build fully PDF-native workflows end to end.
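
If you upload files from a script, a small pre-flight check against the accepted extensions listed above can catch unsupported files early. This is a client-side sketch only: the helper names are hypothetical, the case-insensitive match is an assumption, and the server remains the authority on what it accepts.

```python
from pathlib import Path

# Extensions listed in the release note as accepted for dataset uploads.
ACCEPTED_EXTENSIONS = {".pdf", ".txt", ".md", ".json", ".csv", ".jsonl"}


def is_accepted_dataset_file(path: str) -> bool:
    """Return True if the file extension is one of the listed types."""
    # Assumption: extensions are compared case-insensitively.
    return Path(path).suffix.lower() in ACCEPTED_EXTENSIONS


def split_by_acceptance(paths: list[str]) -> tuple[list[str], list[str]]:
    """Partition paths into (accepted, rejected), preserving input order."""
    accepted = [p for p in paths if is_accepted_dataset_file(p)]
    rejected = [p for p in paths if not is_accepted_dataset_file(p)]
    return accepted, rejected
```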

New Features

📝 New Blog Posts & Studies

New research and tutorials on the Dataframer blog:

🧱 Databricks Integration

Full integration with Databricks for data ingestion, generation, and model hosting. See the Databricks integration guide for a full walkthrough.

Capabilities:
  • pydataframer-databricks: a new Python package for working with Dataframer directly from Databricks notebooks. It includes DatabricksConnector for fetching sample data from Unity Catalog tables and loading generated data back into Delta tables via service principal M2M OAuth.
  • Databricks native models: Databricks-hosted models can be used for specs, generation, evaluation, and chat. A Dataframer admin configures service principal credentials once in the Dataframer UI, and any team member can then select databricks/ models without passing credentials in API calls.

💰 Cost & Time Estimates

See estimated cost and generation time before starting a run. Because Dataframer uses an agentic generation workflow with multiple LLM calls per sample, costs were previously difficult to predict. The estimator uses a simulated model of the full generation pipeline to produce forecasts before you commit to a run.

How it works:
  • Estimates update live as you adjust sample count, model, dataset type, and other parameters on the Create Run page
  • Accounts for all stages of generation: outline, content, revision cycles, and evaluation
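
To make the idea of a per-stage forecast concrete, here is a toy estimator that sums expected tokens and latency over the stages named above. It is not Dataframer's estimator: the `StageProfile` structure, the `estimate_run` function, and every number below are hypothetical placeholders chosen only to exercise the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageProfile:
    """Hypothetical per-sample profile for one pipeline stage."""
    name: str
    calls_per_sample: float   # expected LLM calls in this stage
    tokens_per_call: int      # expected tokens per call
    seconds_per_call: float   # expected latency per call


def estimate_run(stages, samples, usd_per_1k_tokens, parallelism=1):
    """Return (estimated_cost_usd, estimated_seconds) for a run."""
    tokens_per_sample = sum(s.calls_per_sample * s.tokens_per_call for s in stages)
    seconds_per_sample = sum(s.calls_per_sample * s.seconds_per_call for s in stages)
    cost = samples * tokens_per_sample / 1000 * usd_per_1k_tokens
    seconds = samples * seconds_per_sample / parallelism
    return cost, seconds


# Placeholder values, not real Dataframer rates or timings.
STAGES = [
    StageProfile("outline", 1, 1000, 2.0),
    StageProfile("content", 1, 4000, 8.0),
    StageProfile("revision cycles", 2, 2000, 4.0),
    StageProfile("evaluation", 1, 1000, 2.0),
]

cost, seconds = estimate_run(STAGES, samples=100, usd_per_1k_tokens=0.01, parallelism=4)
print(f"estimated cost: ${cost:.2f}, estimated time: {seconds / 60:.1f} min")
```

Because every input is a per-sample expectation, changing the sample count or swapping in a model with a different price only rescales the totals, which is consistent with estimates that update live as parameters change.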

🔌 Public API

Stable public REST API for programmatic access to the full Dataframer workflow: datasets, specs, generation, evaluation, and red-teaming. The API went through a major overhaul to reach a stable, consistent interface.

Highlights:
  • Python SDK (pydataframer) with typed methods for every endpoint
  • Thoroughly documented in the API Reference with Python code examples for every endpoint

🤖 MCP Server

Dataframer is now available as an MCP (Model Context Protocol) server, allowing AI assistants like Claude Code, Cursor, and other MCP-compatible clients to interact with the platform directly.

Capabilities:
  • Upload datasets, create specs, generate data, and download results — all through natural conversation with an AI assistant
  • Unlike the raw API, MCP also provides your AI assistant with detailed instructions on how to use Dataframer effectively — so it can guide you through the entire workflow conversationally
  • See API & MCP for setup instructions

📄 PDF Generation

Generate synthetic PDF documents with custom styling. Describe the visual style you want (e.g., “professional corporate style with blue headers”) and Dataframer generates styled PDFs automatically.

Capabilities:
  • Full PDF input/output — use PDF seed examples and generate new PDF documents
  • Custom styling via a natural language prompt that controls headers, fonts, colors, and layout

New Features

🌱 Seedless Generation

Generate high-quality synthetic data without requiring any seed examples. Simply describe what you want and let Dataframer create it from scratch.

How to create a spec (blueprint for the data) without uploading examples:
  1. Select “Seedless” as the specification type in the spec creation wizard
  2. Provide a spec name and generation objectives
  3. Set your target token range (e.g., 2,000-5,000 tokens)

👥 Admin Tools

New internal administration capabilities for managing teams and users.

Features:
  • Role-based access control with Admin and User roles
  • Admins can promote/demote users between Admin and User roles
  • Company-wide user visibility and management from the Profile page

💳 Billing System

Usage-based billing with transparent pricing and detailed invoicing.

How it works:
  • Calendar month billing cycles (1st to last day of each month)
  • Run Details page now shows the cost of your run
  • Failed-task cost exclusion: you’re not charged for failed runs
  • Team and Enterprise plan types
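
For reconciling your own usage logs against invoices, the calendar-month cycle described above can be computed as follows. The function name `billing_cycle` is illustrative, and the sketch works on plain dates. The release note does not say which time zone defines the month boundary, so none is assumed here.

```python
import calendar
from datetime import date


def billing_cycle(day: date) -> tuple[date, date]:
    """Return (first_day, last_day) of the calendar month containing `day`."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day_number = calendar.monthrange(day.year, day.month)[1]
    return day.replace(day=1), day.replace(day=last_day_number)
```

For example, any date in February 2024 maps to the cycle 2024-02-01 through 2024-02-29, since 2024 is a leap year.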

🗃️ ToolBox - SQL Execution Environment

Multi-database SQL validation engine for generating high-quality Text-to-SQL datasets.

Capabilities:
  • Validates both schema DDL and query SQL
  • Parallel testing against 3 databases: PostgreSQL, MySQL, SQLite
  • REST API integration for programmatic access
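
As an illustration of what validating schema DDL and query SQL can involve, the sketch below checks a pair against an in-memory SQLite database using Python's standard `sqlite3` module. It is a simplified stand-in, not the ToolBox implementation: it covers only SQLite, the function name `validate_sqlite` is hypothetical, and it expects `query_sql` to be a single statement.

```python
import sqlite3


def validate_sqlite(schema_ddl: str, query_sql: str) -> tuple[bool, str]:
    """Check that the DDL applies and the query compiles against it.

    Returns (ok, message). Uses a throwaway in-memory database.
    """
    conn = sqlite3.connect(":memory:")
    try:
        try:
            # executescript accepts multiple DDL statements separated by ';'.
            conn.executescript(schema_ddl)
        except sqlite3.Error as err:
            return False, f"schema error: {err}"
        try:
            # EXPLAIN compiles the statement against the schema without
            # executing it, which surfaces syntax and name errors.
            conn.execute(f"EXPLAIN {query_sql}")
        except sqlite3.Error as err:
            return False, f"query error: {err}"
        return True, "ok"
    finally:
        conn.close()
```

Running the same check against PostgreSQL and MySQL would need their own drivers and live servers, which is presumably why a parallel multi-database engine is useful for catching dialect-specific failures.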

🤖 Gemini 3 Pro Support

Full integration of Google’s latest Gemini 3 Pro models across the platform.

Capabilities:
  • Minimal reasoning mode (gemini/gemini-3-pro-preview) and high reasoning mode (gemini/gemini-3-pro-preview-thinking)
  • 1 million token context window
  • Available for spec analysis, generation, evaluation, red-teaming, and chat