Introduction

The Dataframer API is a RESTful HTTP API that enables programmatic access to all Dataframer features. Build synthetic data generation into your workflows, applications, and automation pipelines.

Base URL

All API requests are made to:
https://df-api.dataframer.ai/api/dataframer

Authentication

All requests require Bearer token authentication:
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://df-api.dataframer.ai/api/dataframer/datasets/
See Authentication for details on obtaining your API key.
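The same header can be built once and reused from Python. This is a minimal sketch: `auth_headers` and `endpoint` are hypothetical helper names, not part of an official SDK, and the only facts taken from this page are the base URL and the Bearer scheme.

```python
API_BASE = "https://df-api.dataframer.ai/api/dataframer"  # base URL from this page


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for an API key (hypothetical helper)."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {key}"}


def endpoint(path: str) -> str:
    """Join a resource path such as 'datasets/' onto the base URL."""
    return f"{API_BASE}/{path.lstrip('/')}"
```

Passing `auth_headers("YOUR_API_KEY")` and `endpoint("datasets/")` to any HTTP client should reproduce the curl call above.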

Core Resources

Datasets

Upload and manage seed data files. Endpoints:
  • POST /datasets/create/ - Create dataset with files
  • GET /datasets/ - List datasets
  • GET /datasets/{id}/ - Get dataset details
  • POST /datasets/{id}/add_files/ - Add files to dataset
  • DELETE /datasets/{id}/ - Delete dataset

Specifications

Generate and manage data specifications. Endpoints:
  • POST /analyze/ - Generate specification from dataset
  • GET /analyze/status/{task_id} - Check analysis status
  • GET /specs/ - List specifications
  • GET /specs/{id}/ - Get specification details
  • PATCH /specs/{id}/ - Update specification

Generation

Create synthetic samples. Endpoints:
  • POST /generate/ - Start sample generation
  • GET /generate/status/{task_id} - Check generation status
  • GET /generate/retrieve/{task_id} - Download samples
  • POST /generate/retrieve/{task_id} - Get specific samples

Evaluation

Assess sample quality. Endpoints:
  • POST /evaluations/ - Create evaluation
  • GET /evaluations/status/{task_id} - Check evaluation status
  • GET /evaluations/{id}/ - Get evaluation results
  • GET /evaluations/{id}/samples/ - Get sample-level details
  • POST /evaluations/{id}/chat/ - Chat about evaluation

Request Format

Content Types

JSON requests:
curl -X POST 'https://df-api.dataframer.ai/api/dataframer/analyze/' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"dataset_id": "...", "name": "..."}'
File uploads (multipart/form-data):
curl -X POST 'https://df-api.dataframer.ai/api/dataframer/datasets/create/' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -F 'name=My Dataset' \
  -F 'dataset_type=SINGLE_FILE' \
  -F 'file=@path/to/your-file.csv'
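For the JSON case, a rough Python equivalent using only the standard library might look like the sketch below. `build_json_post` is a hypothetical helper; it constructs the request without sending it, so the headers and body can be inspected first.

```python
import json
import urllib.request

API_BASE = "https://df-api.dataframer.ai/api/dataframer"  # base URL from this page


def build_json_post(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{path.lstrip('/')}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# The analyze call from the curl example, with the same placeholder values.
request = build_json_post("analyze/", {"dataset_id": "...", "name": "..."}, "YOUR_API_KEY")
# urllib.request.urlopen(request) would send it.
```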

Response Format

Responses are JSON unless noted otherwise (for example, GET /generate/retrieve/{task_id} returns a ZIP file).
Success (200 OK):
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Customer Dataset",
  "status": "ready"
}
Error (400 Bad Request):
{
  "error": "Invalid dataset type",
  "details": {
    "dataset_type": ["Must be one of: SINGLE_FILE, MULTI_FILE, MULTI_FOLDER"]
  }
}
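An error body of this shape can be flattened into readable messages. The sketch below assumes every error response follows the example above (an "error" string plus an optional "details" object mapping field names to lists of messages); `describe_error` is a hypothetical helper.

```python
import json


def describe_error(body: str) -> list[str]:
    """Flatten an error response into human-readable lines.

    Assumes the shape shown above: "error" plus optional "details".
    """
    payload = json.loads(body)
    lines = [payload.get("error", "Unknown error")]
    for field, messages in payload.get("details", {}).items():
        for message in messages:
            lines.append(f"{field}: {message}")
    return lines
```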

Status Codes

Code  Meaning                Description
200   OK                     Request succeeded
201   Created                Resource created successfully
400   Bad Request            Invalid request parameters
401   Unauthorized           Missing or invalid API key
403   Forbidden              API key lacks permissions
404   Not Found              Resource doesn't exist
429   Too Many Requests      Rate limit exceeded
500   Internal Server Error  Server error

Asynchronous Operations

Long-running operations (analysis, generation, evaluation) return immediately with a task ID:
1. Start operation:
POST /generate/
→ {"task_id": "gen_abc123", "status": "PENDING"}
2. Poll status:
GET /generate/status/gen_abc123
→ {"status": "RUNNING", "progress": 45}
3. Retrieve results:
GET /generate/retrieve/gen_abc123
→ ZIP file with samples
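The three steps above can be sketched as a polling loop. `wait_for_task` is a hypothetical helper that takes any callable returning the decoded status JSON, so it works with whichever HTTP client you use. Only PENDING and RUNNING appear on this page; treating every other status as terminal is an assumption, as are the default interval and timeout.

```python
import time
from typing import Callable


def wait_for_task(
    get_status: Callable[[], dict],
    poll_interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task leaves PENDING/RUNNING or the timeout expires.

    get_status should return the decoded JSON of
    GET /generate/status/{task_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        # Assumption: anything other than PENDING/RUNNING is a final state.
        if status.get("status") not in ("PENDING", "RUNNING"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(poll_interval)
```

Once the loop returns, inspect the final status before calling the retrieve endpoint, since a terminal state may also indicate failure.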

Rate Limits

Current limits per API key:
  • 100 requests per minute
  • 1000 requests per hour
  • 5 concurrent long-running tasks
Rate limit headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1640995200
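A client can use these headers to decide how long to pause. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this page does not state; `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before sending the next request.

    Returns 0.0 while requests remain in the current window.
    Header names are matched exactly as written on this page; pass a
    case-insensitive mapping if your HTTP client provides one.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = float(headers.get("X-RateLimit-Reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)
```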

Pagination

List endpoints support pagination:
GET /datasets/?page=1&page_size=20
Response:
{
  "count": 150,
  "next": "https://df-api.dataframer.ai/api/dataframer/datasets/?page=2",
  "previous": null,
  "results": [...]
}
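Walking every page means following `next` until it is null. A minimal sketch, assuming all list endpoints share the response shape above; `fetch_page` is a stand-in for your own HTTP call and `iter_results` is a hypothetical helper.

```python
from typing import Callable, Iterator, Optional


def iter_results(fetch_page: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every item from a paginated list endpoint.

    fetch_page takes a URL and returns the decoded JSON page.
    """
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("results", [])
        url = page.get("next")  # null (None) on the last page
```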

Filtering and Sorting

Filter and sort list results:
# Filter by dataset type
GET /datasets/?dataset_type=SINGLE_FILE

# Sort by creation date
GET /datasets/?ordering=-created_at

# Combine filters
GET /datasets/?dataset_type=SINGLE_FILE&ordering=-created_at
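Query strings like these are safer to build with an encoder than by hand. The sketch below uses only the parameter names shown on this page (`dataset_type`, `ordering`, `page`, `page_size`); `datasets_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://df-api.dataframer.ai/api/dataframer"  # base URL from this page


def datasets_url(
    dataset_type: Optional[str] = None,
    ordering: Optional[str] = None,
    page: Optional[int] = None,
    page_size: Optional[int] = None,
) -> str:
    """Build a /datasets/ list URL, skipping parameters left as None."""
    params = {
        "dataset_type": dataset_type,
        "ordering": ordering,
        "page": page,
        "page_size": page_size,
    }
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{API_BASE}/datasets/"
    return f"{url}?{query}" if query else url
```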

Error Handling

Handle errors gracefully:
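import requests

# Placeholder request: substitute your own endpoint, API key, and payload.
url = "https://df-api.dataframer.ai/api/dataframer/generate/"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
data = {"spec_id": "...", "number_of_samples": 10}

response = requests.post(url, headers=headers, json=data, timeout=30)

if response.status_code in (200, 201):
    # 201 Created is also a success (see Status Codes)
    result = response.json()
    print("Success:", result)
elif response.status_code == 400:
    # The body carries "error" and per-field "details"
    errors = response.json()
    print("Validation error:", errors)
elif response.status_code == 401:
    print("Authentication error: Check API key")
elif response.status_code == 429:
    print("Rate limited: Wait before retrying")
else:
    print(f"Error {response.status_code}: {response.text}")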
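For transient failures, one option is to retry with exponential backoff. This is a sketch, not documented behavior: treating 429 and 500 as retryable is an assumption, and `send_with_retry` is a hypothetical helper that wraps any callable returning a `(status_code, body)` pair.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {429, 500}  # assumption: only these statuses are worth retrying


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out.

    The delay doubles after each retryable failure: 1s, 2s, 4s, ...
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Retrying a POST can repeat its side effects, so a retry loop like this pairs best with operations that accept an idempotency key.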

Idempotency

Certain operations support idempotency keys to prevent duplicate processing:
curl -X POST 'https://df-api.dataframer.ai/api/dataframer/generate/' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Idempotency-Key: unique-key-12345' \
  -H 'Content-Type: application/json' \
  -d '{"spec_id": "...", "number_of_samples": 10}'
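In practice the key is generated once per logical operation and reused on every retry of that operation. A minimal sketch, assuming any unique string is accepted as a key (this page does not specify a format); both function names are hypothetical.

```python
import uuid


def new_idempotency_key() -> str:
    """Create a fresh key for one logical operation (a random UUID)."""
    return str(uuid.uuid4())


def generation_headers(api_key: str, idempotency_key: str) -> dict[str, str]:
    """Headers for POST /generate/; pass the same key again when retrying."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Idempotency-Key": idempotency_key,
        "Content-Type": "application/json",
    }
```

Generate the key before the first attempt and store it alongside the request, so a retry after a timeout sends identical headers.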

Webhooks

Webhooks are coming soon. Once available, they will deliver notifications for these events:
  • Generation completed
  • Analysis finished
  • Evaluation ready

API Versioning

The current API version is v1. Its URLs carry no version segment in the path:
https://df-api.dataframer.ai/api/dataframer/...
Future versions will include the version in the URL path:
https://df-api.dataframer.ai/api/v2/dataframer/...

SDKs and Libraries

Official SDKs (coming soon):
  • Python SDK
  • Node.js SDK
  • Go SDK
Use standard HTTP libraries for now.

Next Steps