Create seed dataset

cURL

curl --request POST \
  --url https://df-api.dataframer.ai/api/dataframer/seed-datasets/create/ \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'name=<string>' \
  --form dataset_type=SINGLE_FILE \
  --form 'description=<string>' \
  --form 'files=@example-file'

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Customer Reviews 2025",
  "description": "Product reviews with files",
  "dataset_type": "SINGLE_FILE",
  "file_count": 1,
  "folder_count": 0,
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:30:00Z",
  "created_by_email": "user@example.com",
  "short_sample_compatibility": {
    "is_short_samples_compatible": false,
    "is_long_samples_compatible": true,
    "reason": null
  }
}

POST /api/dataframer/seed-datasets/create/

You can upload one of the following types of datasets:

SINGLE_FILE: exactly one file containing all the samples
MULTI_FILE: multiple files, where each file is a separate sample
MULTI_FOLDER: multiple folders, where each folder is a separate sample and contains only files (no nested folders)

File constraints by dataset type

Type	Allowed formats	Size limit	Count limit
SINGLE_FILE	CSV, JSON, JSONL	50MB	1 file, min 2 rows
MULTI_FILE	TXT, MD, JSON, CSV, JSONL, PDF	1MB/file, 50MB total	2-1000 files
MULTI_FOLDER	TXT, MD, JSON, CSV, JSONL, PDF	1MB/file, 50MB total	2-1000 files, 20/folder, min 2 folders
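
The limits in the table can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function name `check_upload` and the `LIMITS` structure are hypothetical, formats are judged by file extension only, 1 MB is assumed to mean 2**20 bytes (the page does not say), and "20/folder" is read as a maximum of 20 files per folder. The "min 2 rows" rule for SINGLE_FILE is not checked here.

```python
from collections import Counter
from pathlib import PurePath

MB = 1024 * 1024  # assumption: the page does not define MB precisely

MULTI_FORMATS = {".txt", ".md", ".json", ".csv", ".jsonl", ".pdf"}

# Limits transcribed from the table above.
LIMITS = {
    "SINGLE_FILE": {
        "formats": {".csv", ".json", ".jsonl"},
        "per_file": 50 * MB, "total": 50 * MB,
        "min_files": 1, "max_files": 1,
    },
    "MULTI_FILE": {
        "formats": MULTI_FORMATS,
        "per_file": 1 * MB, "total": 50 * MB,
        "min_files": 2, "max_files": 1000,
    },
    "MULTI_FOLDER": {
        "formats": MULTI_FORMATS,
        "per_file": 1 * MB, "total": 50 * MB,
        "min_files": 2, "max_files": 1000,
    },
}


def check_upload(dataset_type, files, folder_names=None):
    """Return a list of problems; an empty list means no limit was violated.

    files is a list of (filename, size_in_bytes) pairs.
    folder_names is only used for MULTI_FOLDER.
    """
    limits = LIMITS[dataset_type]
    problems = []

    if not limits["min_files"] <= len(files) <= limits["max_files"]:
        problems.append(f"{dataset_type}: file count {len(files)} out of range")

    for name, size in files:
        if PurePath(name).suffix.lower() not in limits["formats"]:
            problems.append(f"{name}: format not allowed for {dataset_type}")
        if size > limits["per_file"]:
            problems.append(f"{name}: exceeds the per-file size limit")

    if sum(size for _, size in files) > limits["total"]:
        problems.append("total size exceeds the limit")

    if dataset_type == "MULTI_FOLDER":
        names = folder_names or []
        if len(names) != len(files):
            problems.append("folder_names needs one entry per file")
        if len(set(names)) < 2:
            problems.append("at least 2 distinct folders are required")
        if any(count > 20 for count in Counter(names).values()):
            problems.append("a folder holds more than 20 files")

    return problems
```

An empty result only means the upload passes these local checks; the server remains the authority on what it accepts.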

Authorizations

Authorization

string

header

required

API Key authentication. Format: "Bearer YOUR_API_KEY"

Body

multipart/form-data

name

string

required

Dataset name (must be unique)

dataset_type

enum<string>

required

Available options: SINGLE_FILE, MULTI_FILE, MULTI_FOLDER

files

file[]

required

Files to upload. SINGLE_FILE: exactly 1 file. MULTI_FILE: 2+ files. MULTI_FOLDER: 2+ files with corresponding folder_names.

Minimum array length: 2 (applies to MULTI_FILE and MULTI_FOLDER; SINGLE_FILE takes exactly 1 file)

description

string

Optional dataset description

folder_names

string[]

Folder names for MULTI_FOLDER datasets. This is a parallel array with files: folder_names[i] specifies which folder files[i] belongs to (e.g., if files=['a.txt', 'b.txt'] and folder_names=['doc1', 'doc2'], then a.txt goes in doc1, b.txt goes in doc2). Minimum 2 unique folder names required.

Minimum array length: 2
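
The parallel-array layout for MULTI_FOLDER can be produced from a folder-to-files mapping. This is a small sketch under stated assumptions: `to_parallel_arrays` is a hypothetical helper, the file and folder names are illustrative, and how the two arrays are encoded in the multipart body (for example as repeated form fields) is not specified on this page, so only the flattening step is shown.

```python
def to_parallel_arrays(folders):
    """Flatten {folder_name: [file paths]} into two index-aligned lists.

    After the call, folder_names[i] is the folder that files[i] belongs to.
    """
    files, folder_names = [], []
    for folder, paths in folders.items():
        for path in paths:
            files.append(path)
            folder_names.append(folder)
    return files, folder_names


files, folder_names = to_parallel_arrays(
    {"doc1": ["a.txt"], "doc2": ["b.txt", "c.txt"]}
)
print(files)         # ['a.txt', 'b.txt', 'c.txt']
print(folder_names)  # ['doc1', 'doc2', 'doc2']
```

Building both lists in a single loop keeps them the same length and in the same order, which is what the `folder_names[i]` to `files[i]` pairing relies on.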

Response

Seed dataset created successfully

id

string<uuid>

Unique identifier for the dataset

name

string

Dataset name

description

string | null

Optional description of the dataset contents or purpose

dataset_type

enum<string>

Type of dataset structure. SINGLE_FILE: one CSV/JSON/JSONL file with tabular data. MULTI_FILE: multiple individual text files. MULTI_FOLDER: files organized into folders where each folder represents one sample.

Available options: SINGLE_FILE, MULTI_FILE, MULTI_FOLDER

created_at

string<date-time>

Timestamp when the dataset was created

updated_at

string<date-time>

Timestamp when the dataset was last modified

created_by_email

string

Email address of the user who created the dataset

files

object[]

List of all files in the dataset


folder_count

integer

Total number of folders in the dataset

file_count

integer

Total number of files in the dataset

short_sample_compatibility

object

Information about which generation modes are compatible with this dataset
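
As an illustration of reading the response, the sketch below parses a trimmed copy of the example response and reports which sample lengths the dataset is flagged as compatible with. The field names come from the example response; the helper `compatible_modes` and the labels "short" and "long" are hypothetical and are not names used by the API.

```python
import json

# Trimmed copy of the example response shown at the top of this page.
example_response = """
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Customer Reviews 2025",
  "dataset_type": "SINGLE_FILE",
  "file_count": 1,
  "folder_count": 0,
  "short_sample_compatibility": {
    "is_short_samples_compatible": false,
    "is_long_samples_compatible": true,
    "reason": null
  }
}
"""


def compatible_modes(dataset):
    """Return the labels whose compatibility flag is true in the response."""
    compat = dataset.get("short_sample_compatibility") or {}
    modes = []
    if compat.get("is_short_samples_compatible"):
        modes.append("short")
    if compat.get("is_long_samples_compatible"):
        modes.append("long")
    return modes


dataset = json.loads(example_response)
print(dataset["id"])              # 550e8400-e29b-41d4-a716-446655440000
print(compatible_modes(dataset))  # ['long']
```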
