Create seed dataset from ZIP

import os

from dataframer import Dataframer

client = Dataframer(
    api_key=os.environ.get("DATAFRAMER_API_KEY"),  # This is the default and can be omitted
)

response = client.dataframer.seed_datasets.create_from_zip(
    name="name",
    zip_file=b"Example data",
)
print(response.id)

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Support Ticket Conversations",
  "description": "Customer support chat logs organized by ticket",
  "dataset_type": "MULTI_FOLDER",
  "file_count": 10,
  "folder_count": 2,
  "created_at": "2025-01-15T10:30:00Z",
  "updated_at": "2025-01-15T10:30:00Z",
  "created_by_email": "[email protected]"
}

File constraints by dataset type

Type	Allowed formats	Size limit	Count limit
SINGLE_FILE	CSV, JSON, JSONL	50MB	1 file, min 2 rows
MULTI_FILE	TXT, MD, JSON, CSV, JSONL, PDF	1MB/file, 50MB total	2-1000 files
MULTI_FOLDER	TXT, MD, JSON, CSV, JSONL, PDF	1MB/file, 50MB total	2-1000 files, 20/folder, min 2 folders
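The MULTI_FOLDER limits in the table can be checked locally before uploading. The following is a minimal sketch using only the Python standard library; it is not part of the Dataframer SDK. The function name `check_multi_folder_zip` is hypothetical, and three points are assumptions because the table does not specify them: that 1MB means 2**20 bytes, that limits apply to uncompressed sizes, and that a file's folder is its immediate parent directory in the archive.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath

MB = 1024 * 1024  # assumption: the table does not say whether 1MB is 10**6 or 2**20 bytes
MULTI_EXTS = {".txt", ".md", ".json", ".csv", ".jsonl", ".pdf"}


def check_multi_folder_zip(zip_bytes: bytes) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        files = [info for info in zf.infolist() if not info.is_dir()]

    if not 2 <= len(files) <= 1000:
        problems.append(f"expected 2-1000 files, found {len(files)}")
    # Assumption: the 50MB total is measured on uncompressed sizes.
    if sum(info.file_size for info in files) > 50 * MB:
        problems.append("total size exceeds 50MB")

    per_folder = Counter()
    for info in files:
        path = PurePosixPath(info.filename)
        if path.suffix.lower() not in MULTI_EXTS:
            problems.append(f"{info.filename}: format not allowed")
        if info.file_size > 1 * MB:
            problems.append(f"{info.filename}: larger than 1MB")
        # Assumption: a file's folder is its immediate parent directory.
        per_folder[str(path.parent)] += 1

    if len(per_folder) < 2:
        problems.append("fewer than 2 folders")
    for folder, count in per_folder.items():
        if count > 20:
            problems.append(f"{folder}: more than 20 files")
    return problems
```

The server remains the authority on what is accepted; a check like this only catches obvious violations early.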

Authorizations

Authorization

string

header

required

API Key authentication. Format: "Bearer YOUR_API_KEY"
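For callers who build requests by hand rather than through the SDK, the documented header format can be produced as in the sketch below. The helper name `auth_headers` is hypothetical; the environment variable name is the one used in the SDK example above, and the `"YOUR_API_KEY"` fallback is only a placeholder.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the Authorization header in the documented "Bearer YOUR_API_KEY" format."""
    if not api_key:
        raise ValueError("API key is empty; set DATAFRAMER_API_KEY")
    return {"Authorization": f"Bearer {api_key}"}


# Placeholder fallback so the sketch runs without a real key configured.
headers = auth_headers(os.environ.get("DATAFRAMER_API_KEY", "YOUR_API_KEY"))
```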

Body

multipart/form-data

name

string

required

Dataset name (unique within company)

zip_file

file

required

ZIP file containing dataset files. For MULTI_FOLDER datasets, the order in which files appear in the ZIP determines their order within each folder, so place any file that later files depend on earlier in the archive.

description

string

Optional dataset description
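Because file order inside the ZIP matters for MULTI_FOLDER datasets, it can help to build the archive programmatically instead of relying on whatever order a desktop tool chooses. The sketch below uses only the standard library; `build_ordered_zip`, the folder names, and the file contents are all hypothetical illustrations.

```python
import io
import zipfile


def build_ordered_zip(folders: dict[str, list[tuple[str, bytes]]]) -> bytes:
    """Write files to an in-memory ZIP in exactly the order given."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for folder, files in folders.items():  # dicts preserve insertion order
            for filename, content in files:  # earlier entries are written first
                zf.writestr(f"{folder}/{filename}", content)
    return buf.getvalue()


# Hypothetical sample data: each folder is one sample, and the question
# is written before the reply that depends on it.
zip_bytes = build_ordered_zip({
    "ticket_001": [
        ("01_question.txt", b"My login fails."),
        ("02_reply.txt", b"Try resetting your password."),
    ],
    "ticket_002": [
        ("01_question.txt", b"Where is my invoice?"),
        ("02_reply.txt", b"It is under Billing."),
    ],
})
```

The SDK example at the top of this page passes bytes as `zip_file`, so `zip_bytes` could presumably be supplied the same way.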

Response

Seed dataset created successfully

id

string<uuid>

read-only

Unique identifier for the dataset

name

string

Dataset name

description

string | null

Optional description of the dataset contents or purpose

dataset_type

enum<string>

Type of dataset structure. SINGLE_FILE: one CSV/JSON/JSONL file with tabular data. MULTI_FILE: multiple individual text files. MULTI_FOLDER: files organized into folders where each folder represents one sample.

Available options:

SINGLE_FILE,

MULTI_FILE,

MULTI_FOLDER

created_at

string<date-time>

read-only

Timestamp when the dataset was created

updated_at

string<date-time>

read-only

Timestamp when the dataset was last modified

created_by_email

string

read-only

Email address of the user who created the dataset

files

object[]

read-only

List of all files in the dataset


folder_count

integer

read-only

Total number of folders in the dataset

file_count

integer

read-only

Total number of files in the dataset

sample_count

integer | null

read-only

Number of data samples in the dataset. Only populated for SINGLE_FILE datasets (e.g. number of rows in a CSV).
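When handling the raw JSON response outside the SDK, the fields above can be mapped onto a small typed structure. This is a sketch under assumptions: `SeedDatasetSummary` and `parse_summary` are hypothetical names, not SDK types, and only a subset of the documented fields is modeled. The sample payload is a shortened copy of the example response at the top of this page.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DatasetType(str, Enum):
    SINGLE_FILE = "SINGLE_FILE"
    MULTI_FILE = "MULTI_FILE"
    MULTI_FOLDER = "MULTI_FOLDER"


@dataclass
class SeedDatasetSummary:  # hypothetical name, not an SDK type
    id: str
    name: str
    dataset_type: DatasetType
    file_count: int
    folder_count: int
    description: Optional[str] = None
    sample_count: Optional[int] = None  # documented as populated only for SINGLE_FILE


def parse_summary(raw: str) -> SeedDatasetSummary:
    """Parse a response body into a SeedDatasetSummary; unknown dataset types raise ValueError."""
    data = json.loads(raw)
    return SeedDatasetSummary(
        id=data["id"],
        name=data["name"],
        dataset_type=DatasetType(data["dataset_type"]),
        file_count=data["file_count"],
        folder_count=data["folder_count"],
        description=data.get("description"),
        sample_count=data.get("sample_count"),
    )


example = (
    '{"id": "550e8400-e29b-41d4-a716-446655440000", '
    '"name": "Support Ticket Conversations", '
    '"dataset_type": "MULTI_FOLDER", "file_count": 10, "folder_count": 2}'
)
summary = parse_summary(example)
```

Treating `sample_count` as optional reflects the note that it is only populated for SINGLE_FILE datasets.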
