# Create seed dataset from ZIP

> Create a seed dataset by uploading a ZIP file
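
For callers who are not using an SDK, the upload can be sketched with the Python standard library alone. The URL, the form field names (`name`, `description`, `zip_file`), and the `Bearer` header come from the OpenAPI spec on this page; the function name `build_upload_request` and the `dataset.zip` filename are placeholders chosen for illustration, not part of the API.

```python
import urllib.request
import uuid

# Server URL plus endpoint path, both taken from the OpenAPI spec on this page.
API_URL = "https://df-api.dataframer.ai/api/dataframer/seed-datasets/create-from-zip/"


def build_upload_request(api_key, name, zip_bytes, description=""):
    """Build (but do not send) a multipart/form-data POST for the ZIP upload."""
    boundary = uuid.uuid4().hex
    fields = {"name": name}
    if description:
        fields["description"] = description

    parts = []
    for key, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{key}"\r\n\r\n{value}\r\n'.encode()
        )
    # The file part declares application/zip, matching the spec's encoding section.
    file_header = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="zip_file"; filename="dataset.zip"\r\n'
        "Content-Type: application/zip\r\n\r\n"
    ).encode()
    parts.append(file_header + zip_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(build_upload_request(key, "My dataset", data)) as resp:
#     print(resp.status)  # the spec documents 201 on success
```

Building the request separately from sending it makes it easy to inspect the headers and body before any data leaves the machine.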

The system automatically detects the dataset type from the structure of the ZIP:

* **SINGLE\_FILE**: the ZIP contains exactly one file, which holds all the samples
* **MULTI\_FILE**: the ZIP contains multiple files at the root level; each file is a separate sample
* **MULTI\_FOLDER**: the ZIP contains multiple folders; each folder is a separate sample and contains only files (no nested folders)
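
The three layouts above can be approximated locally before uploading. The sketch below is a client-side guess only: the server's actual detection logic is not documented here, and the helper names `guess_dataset_type` and `make_zip` are hypothetical. It also assumes that a ZIP with exactly one file counts as SINGLE_FILE wherever that file sits in the archive.

```python
import io
import zipfile
from pathlib import PurePosixPath


def guess_dataset_type(zip_bytes):
    """Guess the dataset type from the ZIP layout (local approximation only)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Ignore explicit directory entries; only files determine the layout.
        paths = [PurePosixPath(i.filename) for i in archive.infolist() if not i.is_dir()]

    depths = {len(p.parts) for p in paths}
    if len(paths) == 1:
        return "SINGLE_FILE"
    if len(paths) > 1 and depths == {1}:
        return "MULTI_FILE"  # several files, all at the root level
    folders = {p.parts[0] for p in paths}
    if depths == {2} and len(folders) > 1:
        return "MULTI_FOLDER"  # every file sits exactly one folder deep
    return "UNKNOWN"  # mixed or nested layouts do not match any documented type


def make_zip(names):
    """Build a small in-memory ZIP with placeholder content, for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "example")
    return buffer.getvalue()


print(guess_dataset_type(make_zip(["ticket_1/a.txt", "ticket_2/a.txt"])))  # MULTI_FOLDER
```

A result of `UNKNOWN` suggests the archive mixes root-level files with folders or nests folders, which none of the three documented types describes.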

## File constraints by dataset type

| Type          | Allowed formats                | Size limit           | Count limit                                      |
| ------------- | ------------------------------ | -------------------- | ------------------------------------------------ |
| SINGLE\_FILE  | CSV, JSON, JSONL               | 50MB                 | 1 file, min 2 rows                               |
| MULTI\_FILE   | TXT, MD, JSON, CSV, JSONL, PDF | 1MB/file, 50MB total | 2-1000 files                                     |
| MULTI\_FOLDER | TXT, MD, JSON, CSV, JSONL, PDF | 1MB/file, 50MB total | 2-1000 files, max 20 files/folder, min 2 folders |
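
The MULTI_FILE and MULTI_FOLDER limits in the table can be checked before uploading, which avoids a round trip that ends in a `400`. This is a rough pre-flight sketch, not the server's validator. It assumes that sizes are measured on uncompressed content, that 1MB means 2**20 bytes, and that the per-folder figure is a maximum of 20 files; none of these is stated on this page. The function name `check_multi_zip` is hypothetical.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath

MB = 1024 * 1024  # assumption: the page does not say whether 1MB is 10**6 or 2**20 bytes
ALLOWED_SUFFIXES = {".txt", ".md", ".json", ".csv", ".jsonl", ".pdf"}


def check_multi_zip(zip_bytes):
    """Return a list of problems found in a MULTI_FILE or MULTI_FOLDER ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        files = [i for i in archive.infolist() if not i.is_dir()]

    problems = []
    if not 2 <= len(files) <= 1000:
        problems.append(f"expected 2-1000 files, found {len(files)}")
    if sum(i.file_size for i in files) > 50 * MB:
        problems.append("total uncompressed size exceeds 50MB")
    for info in files:
        if PurePosixPath(info.filename).suffix.lower() not in ALLOWED_SUFFIXES:
            problems.append(f"{info.filename}: format not allowed")
        if info.file_size > 1 * MB:
            problems.append(f"{info.filename}: larger than 1MB")

    # Folder rules apply only when files sit inside folders (MULTI_FOLDER).
    per_folder = Counter(
        PurePosixPath(i.filename).parts[0]
        for i in files
        if len(PurePosixPath(i.filename).parts) > 1
    )
    if per_folder:
        if len(per_folder) < 2:
            problems.append("expected at least 2 folders")
        for folder, count in per_folder.items():
            if count > 20:
                problems.append(f"{folder}: more than 20 files")
    return problems
```

An empty list means no problem was found by these local checks; the server may still reject the archive for reasons this sketch does not cover.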


## OpenAPI

````yaml POST /api/dataframer/seed-datasets/create-from-zip/
openapi: 3.0.0
info:
  title: DataFramer API
  version: 0.1.0
  description: ''
  termsOfService: https://www.aimon.ai/docs/privacy-policy.pdf
  contact:
    name: DataFramer Support
    email: info@dataframer.ai
  license:
    name: Proprietary
  x-logo:
    url: https://dataframer.ai/logo.png
    altText: DataFramer AI
  x-stainless:
    package-name: aimon-dataframer
    namespace:
      - aimon
      - dataframer
servers:
  - url: https://df-api.dataframer.ai
    description: Production server
security:
  - BearerAuth: []
tags:
  - name: Seed Datasets
    description: Manage seed datasets for generation
  - name: Specs
    description: Data specifications for sample generation
  - name: Runs
    description: Generation runs and results
  - name: Evaluations
    description: Evaluate generated sample quality
  - name: Red Teaming
    description: Security testing and adversarial prompts
  - name: Spec Creation
    description: Create specs from datasets or from scratch (seedless)
  - name: Generation
    description: Synthetic data generation
  - name: API Keys
    description: API key management and rotation
  - name: Health
    description: Health check endpoints
  - name: Models
    description: Available AI models
externalDocs:
  description: Complete API Guide
  url: https://docs.dataframer.ai/dataframer
paths:
  /api/dataframer/seed-datasets/create-from-zip/:
    post:
      tags:
        - Seed Datasets
      summary: Create seed dataset from ZIP
      description: >-
        Create a seed dataset by uploading a ZIP file.


        The system automatically detects the dataset type based on ZIP
        structure:

        - **SINGLE_FILE**: ZIP contains exactly one file containing all the
        samples

        - **MULTI_FILE**: ZIP contains multiple files at root level, where each
        file is a separate sample

        - **MULTI_FOLDER**: ZIP contains multiple folders, where each folder is
        a separate sample and only contains files with no nested folders


        ## File constraints by dataset type


        | Type | Allowed formats | Size limit | Count limit |

        |------|----------------|------------|-------------|

        | SINGLE_FILE | CSV, JSON, JSONL | 50MB | 1 file, min 2 rows |

        | MULTI_FILE | TXT, MD, JSON, CSV, JSONL, PDF | 1MB/file, 50MB total |
        2-1000 files |

        | MULTI_FOLDER | TXT, MD, JSON, CSV, JSONL, PDF | 1MB/file, 50MB total |
        2-1000 files, 20/folder, min 2 folders |
      operationId: api_dataframer_datasets_create_from_zip
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              required:
                - name
                - zip_file
              properties:
                name:
                  type: string
                  description: Dataset name. Must be unique within your company.
                description:
                  type: string
                  description: Optional dataset description
                zip_file:
                  type: string
                  format: binary
                  description: >-
                    ZIP file containing dataset files. For MULTI_FOLDER
                    datasets, the order in which files appear in the ZIP
                    determines their order within each folder, so place any
                    file that later files depend on earlier in the archive.
            encoding:
              zip_file:
                contentType: application/zip
      responses:
        '201':
          description: Seed dataset created successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/DataframerDataset'
              examples:
                dataset_from_zip:
                  value:
                    id: 550e8400-e29b-41d4-a716-446655440000
                    name: Support Ticket Conversations
                    description: Customer support chat logs organized by ticket
                    dataset_type: MULTI_FOLDER
                    file_count: 10
                    folder_count: 2
                    created_at: '2025-01-15T10:30:00Z'
                    updated_at: '2025-01-15T10:30:00Z'
                    created_by_email: sarah.chen@acme.com
        '400':
          description: Bad Request - Invalid ZIP structure or file validation errors
        '401':
          description: Unauthorized
      x-codeSamples:
        - lang: JavaScript
          source: >-
            import fs from 'fs';

            import Dataframer from 'dataframer';


            const client = new Dataframer({
              apiKey: process.env['DATAFRAMER_API_KEY'], // This is the default and can be omitted
            });


            const response = await
            client.dataframer.seedDatasets.createFromZip({
              name: 'name',
              zip_file: fs.createReadStream('path/to/file'),
            });


            console.log(response.id);
        - lang: Python
          source: |-
            import os
            from pathlib import Path

            from dataframer import Dataframer

            client = Dataframer(
                api_key=os.environ.get("DATAFRAMER_API_KEY"),  # This is the default and can be omitted
            )
            # Upload the raw bytes of a real ZIP archive read from disk
            response = client.dataframer.seed_datasets.create_from_zip(
                name="name",
                zip_file=Path("path/to/file.zip").read_bytes(),
            )
            print(response.id)
components:
  schemas:
    DataframerDataset:
      type: object
      properties:
        id:
          type: string
          format: uuid
          readOnly: true
          description: Unique identifier for the dataset
        name:
          type: string
          description: Dataset name
        description:
          type: string
          nullable: true
          description: Optional description of the dataset contents or purpose
        dataset_type:
          type: string
          enum:
            - SINGLE_FILE
            - MULTI_FILE
            - MULTI_FOLDER
          description: >-
            Type of dataset structure. SINGLE_FILE: one CSV/JSON/JSONL file with
            tabular data. MULTI_FILE: multiple individual files, each a separate
            sample. MULTI_FOLDER: files organized into folders where each folder
            represents one sample.
        created_at:
          type: string
          format: date-time
          readOnly: true
          description: Timestamp when the dataset was created
        updated_at:
          type: string
          format: date-time
          readOnly: true
          description: Timestamp when the dataset was last modified
        created_by_email:
          type: string
          readOnly: true
          description: Email address of the user who created the dataset
        files:
          type: array
          items:
            $ref: '#/components/schemas/File'
          readOnly: true
          description: List of all files in the dataset
        folder_count:
          type: integer
          readOnly: true
          description: Total number of folders in the dataset
        file_count:
          type: integer
          readOnly: true
          description: Total number of files in the dataset
        sample_count:
          type: integer
          nullable: true
          readOnly: true
          description: >-
            Number of data samples in the dataset. Only populated for
            SINGLE_FILE datasets (e.g. number of rows in a CSV).
    File:
      type: object
      properties:
        id:
          type: string
          format: uuid
          readOnly: true
          description: Unique identifier for the file
        file_type:
          type: string
          enum:
            - json
            - jsonl
            - csv
            - md
            - txt
            - pdf
          description: >-
            File format. json: single JSON object or array. jsonl:
            newline-delimited JSON records. csv: comma-separated values. md:
            Markdown text. txt: plain text. pdf: PDF document.
        size_bytes:
          type: integer
          nullable: true
          description: File size in bytes
        sha256:
          type: string
          nullable: true
          description: SHA-256 hash of the file contents for integrity verification
        path:
          type: string
          readOnly: true
          description: >-
            Full path of the file. For files in folders, includes folder name
            (e.g., 'folder_name/file.txt'). For files at root level, just the
            filename.
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: API Key
      description: 'API Key authentication. Format: "Bearer YOUR_API_KEY"'

````
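
The `zip_file` description in the spec notes that, for MULTI_FOLDER datasets, the order of entries in the ZIP sets the order of files within each folder. Python's `zipfile` module stores entries in the order they are written, so the order can be controlled explicitly. The sketch below is one way to do that; the function name `build_multi_folder_zip` and the sample folder and file names are hypothetical.

```python
import io
import zipfile


def build_multi_folder_zip(samples):
    """Pack samples into an in-memory ZIP, one folder per sample.

    `samples` maps a folder name to a list of (filename, content) pairs.
    Files are written in list order, so earlier pairs come first in the ZIP.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for folder, files in samples.items():
            for filename, content in files:
                archive.writestr(f"{folder}/{filename}", content)
    return buffer.getvalue()


zip_bytes = build_multi_folder_zip({
    "ticket_001": [("request.txt", b"Customer question"), ("reply.txt", b"Agent answer")],
    "ticket_002": [("request.txt", b"Another question"), ("reply.txt", b"Another answer")],
})

# Confirm the stored order before uploading.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    print(archive.namelist())
```

Writing with `writestr` also avoids creating explicit directory entries, so the archive contains only the files themselves.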