
Run this tutorial interactively in Google Colab

Overview

Multi-folder datasets are used when each sample consists of multiple related files organized in folders. This tutorial covers the complete workflow from creating a multi-folder dataset to generating and evaluating complex multi-file samples using the Dataframer Python SDK.

What You’ll Learn

  • When to use multi-folder datasets
  • Setting up the Dataframer SDK
  • Creating multi-folder datasets with files
  • Generating specifications via AI analysis
  • Generating multi-file samples
  • Evaluating and downloading generated folders

Prerequisites

  • Python 3.8 or higher
  • Understanding of dataset types
  • Multiple related files to use as seed data
  • API key for authentication (set as DATAFRAMER_API_KEY environment variable)

Use Cases

Multi-folder datasets are perfect for:

Medical Records (EHR)
patient_case_001/
├── chest_xray_report.txt
└── discharge_summary.md

patient_case_002/
├── blood_work.txt
└── clinical_notes.md

Project Documentation
project_alpha/
├── README.md
├── requirements.txt
└── design_doc.md

project_beta/
├── README.md
├── api_spec.md
└── deployment_guide.txt

Multi-Language Content
article_001/
├── english.txt
├── spanish.txt
└── metadata.md

article_002/
├── english.txt
├── french.txt
└── metadata.md

Step 1: Install and Setup SDK

Install the Dataframer SDK

pip install --upgrade pydataframer dotenv pyyaml

Setup and Initialize Client

from pathlib import Path
from dataframer import Dataframer
import os

# Initialize the Dataframer client
client = Dataframer(
    api_key=os.getenv('DATAFRAMER_API_KEY')
)

print("✓ Dataframer client initialized successfully")
print(f"  Using base URL: {client.base_url}")

# Check SDK version
import dataframer
print(f"Dataframer SDK version: {dataframer.__version__}")

Step 2: Prepare Folder Structure

Folder Structure Requirements

Before creating your dataset, review these requirements carefully.

Do:
  • Create folders at the root level (Dataset → patient_case_001, patient_case_002, etc.)
  • Include at least 2 sample folders
  • Use supported file types: .md, .txt, .json, .csv, .jsonl (see all formats)
  • Keep files under 1MB each
  • Stay under 50MB total dataset size
  • Limit to 20 files per folder, 1000 files total
  • Maintain consistent structure across folders
Don’t:
  • Put files directly in root (must be in folders)
  • Exceed 2 folder levels (Dataset → subfolder → files only)
  • Include empty folders
  • Use unsupported file types (.pdf, .docx, .xlsx, etc.)
  • Mix different structures between folders
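These limits can be sanity-checked locally before uploading. A minimal sketch, with the limits hardcoded from the list above (`validate_dataset` is a local helper, not part of the SDK):

```python
from pathlib import Path

SUPPORTED = {".md", ".txt", ".json", ".csv", ".jsonl"}
MAX_FILE_BYTES = 1 * 1024 * 1024       # 1MB per file
MAX_TOTAL_BYTES = 50 * 1024 * 1024     # 50MB total
MAX_FILES_PER_FOLDER = 20
MAX_TOTAL_FILES = 1000

def validate_dataset(root: Path) -> list:
    """Return a list of problems found; an empty list means the structure looks OK."""
    problems = []
    folders = [p for p in root.iterdir() if p.is_dir()]
    if len(folders) < 2:
        problems.append("need at least 2 sample folders")
    for stray in root.iterdir():
        if stray.is_file():
            problems.append(f"{stray.name}: files must live inside folders, not the root")
    total_files = total_bytes = 0
    for folder in folders:
        entries = list(folder.iterdir())
        if not entries:
            problems.append(f"{folder.name}: empty folder")
        files = [p for p in entries if p.is_file()]
        if len(files) > MAX_FILES_PER_FOLDER:
            problems.append(f"{folder.name}: more than {MAX_FILES_PER_FOLDER} files")
        for p in entries:
            if p.is_dir():
                problems.append(f"{folder.name}/{p.name}: exceeds 2 folder levels")
                continue
            if p.suffix.lower() not in SUPPORTED:
                problems.append(f"{folder.name}/{p.name}: unsupported type {p.suffix}")
            if p.stat().st_size > MAX_FILE_BYTES:
                problems.append(f"{folder.name}/{p.name}: over 1MB")
            total_files += 1
            total_bytes += p.stat().st_size
    if total_files > MAX_TOTAL_FILES:
        problems.append(f"{total_files} files exceeds the {MAX_TOTAL_FILES}-file limit")
    if total_bytes > MAX_TOTAL_BYTES:
        problems.append("total dataset size exceeds 50MB")
    return problems

# e.g. problems = validate_dataset(Path("Dataset"))
```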

Create Organized Folders

Each folder represents one complete sample. For a medical EHR dataset, we’ll create two patient case folders:
from pathlib import Path

# Create patient case folders
Path("Dataset/patient_case_001").mkdir(parents=True, exist_ok=True)
Path("Dataset/patient_case_002").mkdir(parents=True, exist_ok=True)

print("✓ Created patient case folders")

Add Files to Each Folder

from pathlib import Path

# Create directories
Path("Dataset/patient_case_001").mkdir(parents=True, exist_ok=True)
Path("Dataset/patient_case_002").mkdir(parents=True, exist_ok=True)

# Patient Case 001 - Chest X-ray Report
with open("Dataset/patient_case_001/chest_xray_report.txt", "w") as f:
    f.write("Patient: Patient 1\n")
    f.write("Date: January 15, 2024\n")
    f.write("Findings: Clear lung fields, no infiltrates or masses detected.\n")
    f.write("Impression: Normal chest radiograph.\n")

# Patient Case 001 - Discharge Summary
with open("Dataset/patient_case_001/discharge_summary.md", "w") as f:
    f.write("# Discharge Summary\n\n")
    f.write("**Patient:** Patient 1\n")
    f.write("**Date:** January 15, 2024\n\n")
    f.write("## Clinical Course\n")
    f.write("Patient presented for routine checkup. All vitals within normal range.\n\n")
    f.write("## Discharge Diagnosis\n")
    f.write("Healthy status maintained.\n")

# Patient Case 002 - Blood Work
with open("Dataset/patient_case_002/blood_work.txt", "w") as f:
    f.write("Patient: Patient 2\n")
    f.write("Date: January 20, 2024\n")
    f.write("WBC: 7.2 K/uL (Normal)\n")
    f.write("RBC: 4.8 M/uL (Normal)\n")
    f.write("Platelets: 250 K/uL (Normal)\n")

# Patient Case 002 - Clinical Notes
with open("Dataset/patient_case_002/clinical_notes.md", "w") as f:
    f.write("# Clinical Notes\n\n")
    f.write("**Patient:** Patient 2\n")
    f.write("**Date:** January 20, 2024\n\n")
    f.write("## Visit Reason\n")
    f.write("Annual physical examination.\n\n")
    f.write("## Assessment\n")
    f.write("Patient in excellent health. All laboratory values within normal limits.\n")

print("✓ Created 2 patient case folders with files")

Verify Your Folder Structure

from pathlib import Path

dataset_folder_path = Path("Dataset")

if dataset_folder_path.exists():
    print(f"✓ Found Dataset folder at: {dataset_folder_path.absolute()}")
    print(f"\nFolder structure:")
    
    for patient_folder in sorted(dataset_folder_path.iterdir()):
        if patient_folder.is_dir():
            print(f"\n📁 {patient_folder.name}/")
            for file in sorted(patient_folder.iterdir()):
                size_kb = file.stat().st_size / 1024
                print(f"   📄 {file.name} ({size_kb:.1f} KB)")
else:
    print("Dataset folder not found")

Step 3: Create Multi-Folder Dataset

The SDK provides a simple ZIP-based upload method.

Create ZIP and Upload

import zipfile
import io
from pathlib import Path

dataset_name = "EHR_patient_records_demo"
dataset_folder_path = Path("Dataset")

# Create a ZIP file in memory containing the entire Dataset folder
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zip_file:
    for patient_folder in sorted(dataset_folder_path.iterdir()):
        if patient_folder.is_dir():
            for file_path in sorted(patient_folder.iterdir()):
                if file_path.is_file():
                    # Add file to ZIP with folder structure preserved
                    arcname = f"{patient_folder.name}/{file_path.name}"
                    zip_file.write(file_path, arcname)
                    print(f"📦 Added: {arcname}")

zip_buffer.seek(0)

# Upload ZIP file - backend auto-detects structure and validates
print("\n🚀 Uploading dataset...")
dataset_response = client.dataframer.datasets.create_from_zip(
    name=dataset_name,
    description="Electronic Health Records (EHR) dataset with multiple patient cases",
    zip_file=zip_buffer
)

print(f"\n✅ Dataset created: {dataset_name}")
print(f"   ID: {dataset_response.id}")
print(f"   Type: {dataset_response.dataset_type} (auto-detected)")
print(f"   Files: {dataset_response.file_count} | Folders: {dataset_response.folder_count}")

# Store the dataset_id for later use
dataset_id = dataset_response.id

List All Datasets

# List all datasets to verify creation
datasets = client.dataframer.datasets.list()

print("=" * 80)
print(f"Found {len(datasets)} dataset(s)")
print("=" * 80)

for i, dataset in enumerate(datasets, 1):
    print(f"\n📁 Dataset {i}:")
    print(f"  Name: {dataset.name}")
    print(f"  ID: {dataset.id}")
    print(f"  Type: {dataset.dataset_type_display}")
    print(f"  Files: {dataset.file_count} | Folders: {dataset.folder_count}")
    print(f"  Created: {dataset.created_at.strftime('%Y-%m-%d %H:%M:%S')}")

Retrieve Dataset Details

# Get detailed information about the dataset
dataset_info = client.dataframer.datasets.retrieve(dataset_id=dataset_id)

print("📋 Dataset Information:")
print("=" * 80)
print(f"ID: {dataset_info.id}")
print(f"Name: {dataset_info.name}")
print(f"Type: {dataset_info.dataset_type} ({dataset_info.dataset_type_display})")
print(f"Description: {dataset_info.description}")
print(f"Created: {dataset_info.created_at}")
print()
print(f"📁 Contents:")
print(f"  Files: {dataset_info.file_count}")
print(f"  Folders: {dataset_info.folder_count}")
print()
print(f"🔧 Compatibility:")
compat = dataset_info.short_sample_compatibility
print(f"  Short samples: {'✅' if compat.is_short_samples_compatible else '❌'}")
print(f"  Long samples: {'✅' if compat.is_long_samples_compatible else '❌'}")
if compat.reason:
    print(f"  Reason: {compat.reason}")
print("=" * 80)

Step 4: Generate a Spec using the Analysis API

Use an LLM to automatically analyze your dataset and create a spec. The analysis reads every seed file, identifies patterns across them, and generates a spec (a blueprint) that drives synthetic data generation.

Start Analysis

# Create a spec by analyzing the dataset
spec_name = f"Spec for {dataset_info.name}"

analysis_result = client.dataframer.analyze.create(
    dataset_id=dataset_id,
    name=spec_name,

    # Optional: Specify the AI model for analysis
    analysis_model_name="claude-sonnet-4-5",
    # Note: This model can be used to generate evals data but not data to train competing models

    # Optional: Analysis configuration
    extrapolate_values=True,        # Extrapolate new values beyond existing ranges
    generate_distributions=True,    # Generate statistical distributions
)

print(f"✓ Created spec '{spec_name}' via AI analysis")
print(f"  Task ID: {analysis_result.task_id}")
print(f"  Status: {analysis_result.status}")
Recommended AI Models:
  • claude-sonnet-4-5* (recommended for quality)
  • claude-sonnet-4-5-thinking*
  • claude-haiku-4-5* (fast & cheap)
  • deepseek-ai/DeepSeek-V3.1
  • moonshotai/Kimi-K2-Instruct
  • openai/gpt-oss-120b (slow)
  • deepseek-ai/DeepSeek-R1-0528-tput (slow)
  • Qwen/Qwen2.5-72B-Instruct-Turbo
* These models can be used to generate evals data but not data to train competing models.

Check Analysis Status

The analysis is asynchronous and may take 3-10 minutes for multi-folder datasets:
# Poll for analysis completion
task_id = analysis_result.task_id
status = client.dataframer.analyze.get_status(task_id=task_id)

print(f"Analysis status: {status['status']}")
# Status values: PENDING, RUNNING, COMPLETED, FAILED
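The status check above returns a single snapshot; for unattended runs you can poll until a terminal state. A minimal sketch, assuming the status endpoint keeps returning a dict with a "status" key as shown above (`wait_for` is a local helper, not part of the SDK):

```python
import time

def wait_for(get_status, task_id, timeout_s=1800, interval_s=30):
    """Poll a status endpoint until it reports COMPLETED or FAILED."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status(task_id=task_id)["status"]
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(interval_s)
    raise TimeoutError(f"task {task_id} still running after {timeout_s}s")

# e.g. wait_for(client.dataframer.analyze.get_status, task_id)
```

The same helper can poll generation in Step 6 (for example with client.dataframer.generate.retrieve_status), since it reports the same status values.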

List All Specs

# Retrieve all specs to get the spec ID
specs = client.dataframer.specs.list()

print("Available specs:")
for spec in specs:
    print(f"  • {spec.name} (ID: {spec.id})")
    print(f"    Dataset: {spec.dataset_name}")
    print(f"    Latest version: {spec.latest_version}")

# Store the spec_id for generation
spec_id = specs[0].id  # Use your newly created spec
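Relying on list order (`specs[0]`) can pick the wrong spec once you have several. A safer lookup by the name chosen earlier in this step (`find_spec_id` is a local helper, not part of the SDK):

```python
def find_spec_id(specs, name):
    """Return the id of the first spec whose name matches, or raise if absent."""
    matches = [s.id for s in specs if s.name == name]
    if not matches:
        raise LookupError(f"no spec named {name!r}")
    return matches[0]

# spec_id = find_spec_id(specs, spec_name)
```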

Review Generated Specification

The specification captures the structure and patterns from your dataset:
# Get the latest version details
spec = client.dataframer.specs.retrieve(spec_id=spec_id)
versions = client.dataframer.specs.versions.list(spec_id=spec_id)

if len(versions) > 0:
    latest_version = client.dataframer.specs.versions.retrieve(
        spec_id=spec_id,
        version_id=versions[0].id
    )
    
    print(f"Latest version: {latest_version.version}")
    print(f"Config YAML length: {len(latest_version.config_yaml)} chars")
    
    # Parse the config to see data properties
    import yaml
    config = yaml.safe_load(latest_version.config_yaml)
    
    # Access spec data
    spec_data = config.get('spec', config)
    
    print(f"\nData property variations:")
    if 'data_property_variations' in spec_data:
        for prop in spec_data['data_property_variations']:
            print(f"  • {prop['property_name']}: {len(prop['property_values'])} values")

Step 5: Update Spec (Optional)

You can update the spec by modifying the config YAML and creating a new version:

Modify Specification Config

import yaml

# Get the latest version
versions = client.dataframer.specs.versions.list(spec_id=spec_id)
latest_version = client.dataframer.specs.versions.retrieve(
    spec_id=spec_id,
    version_id=versions[0].id
)

# Parse the current config
current_config = yaml.safe_load(latest_version.config_yaml)
spec_data = current_config.get('spec', current_config)

# Example: Add a new data property variation for EHR
if 'data_property_variations' in spec_data:
    new_property = {
        'property_name': 'Case Severity',
        'property_values': ['Critical', 'Severe', 'Moderate', 'Mild'],
        'base_distributions': {
            'Critical': 10,
            'Severe': 25,
            'Moderate': 40,
            'Mild': 25
        },
        'conditional_distributions': {}
    }
    spec_data['data_property_variations'].append(new_property)
    print(f"✓ Added new property: {new_property['property_name']}")
    
    # Update requirements for medical context
    if 'requirements' in spec_data:
        spec_data['requirements'] += (
            "\n\nGenerated patient cases must maintain medical accuracy "
            "and include appropriate clinical correlations between symptoms, "
            "test results, and diagnoses."
        )
        print(f"✓ Updated requirements for medical context")

# Convert back to YAML
new_config_yaml = yaml.dump(current_config, default_flow_style=False, sort_keys=False)

# Update the spec (creates a new version automatically)
updated_spec = client.dataframer.specs.update(
    spec_id=spec_id,
    config_yaml=new_config_yaml,  # Your edits
    results_yaml=latest_version.results_yaml,  # Historical reference
    orig_results_yaml=latest_version.orig_results_yaml,  # Backup
    runtime_params=latest_version.runtime_params  # Metadata
)

print(f"\n✓ Updated spec. New version: {updated_spec.latest_version}")
YAML Fields Explained:
  • config_yaml: Your updated configuration (used for generation) - REQUIRED
  • results_yaml: Original AI analysis from version 1 (never changes)
  • orig_results_yaml: Backup of original analysis
  • runtime_params: Metadata about analysis/generation

Step 6: Generate Multi-Folder Samples

Generate new synthetic patient record folders based on your specification:

Start Generation Run

# Create a generation run
generation_result = client.dataframer.generate.create(
    spec_id=spec_id,
    
    generation_model="claude-sonnet-4-5",
    # Note: This model can be used to generate evals data but not data to train competing models
    number_of_samples=5,
    sample_type="long",  # Multi-folder requires "long" for proper file relationships
    
    ## Advanced configuration for long samples

    outline_model="claude-sonnet-4-5",
    # Note: This model can be used to generate evals data but not data to train competing models
    enable_revisions=True,
    max_revision_cycles=2,
    outline_thinking_budget=2000,
    
    revision_model="claude-sonnet-4-5",
    # Note: This model can be used to generate evals data but not data to train competing models
    revision_thinking_budget=1500,
)

print(f"✓ Started generation run")
print(f"  Task ID: {generation_result.task_id}")
print(f"  Run ID: {generation_result.run_id}")
print(f"  Status: {generation_result.status}")

# Store for later use
task_id = generation_result.task_id
run_id = generation_result.run_id
Available Generation Models:
  • claude-sonnet-4-5* (recommended for quality)
  • claude-sonnet-4-5-thinking*
  • claude-haiku-4-5* (fast & cheap)
  • deepseek-ai/DeepSeek-V3.1
  • moonshotai/Kimi-K2-Instruct
  • openai/gpt-oss-120b (slow)
  • deepseek-ai/DeepSeek-R1-0528-tput (slow)
  • Qwen/Qwen2.5-72B-Instruct-Turbo
* These models can be used to generate evals data but not data to train competing models.
Multi-folder generation requires the "long" sample type for proper file relationships. Generation takes 5-15 minutes per sample folder.

Monitor Generation Status

# Check generation status
status = client.dataframer.generate.retrieve_status(task_id=task_id)
print(f"Generation status: {status['status']}")
# Status values: PENDING, RUNNING, COMPLETED, FAILED

List All Runs

# View all generation runs
runs = client.dataframer.runs.list()

print("All generation runs:")
for run in runs:
    print(f"  • Run {run.id}: {run.status}")
    print(f"    Spec: {run.spec_name}")
    print(f"    Samples: {run.number_of_samples}")

Retrieve Run Status

# Get detailed run status
run_status = client.dataframer.runs.status(run_id=run_id)
print(f"Run status: {run_status['status']}")

Step 7: Evaluate Generated Samples

Once generation is complete, the evaluation step automatically tags the generated samples with appropriate labels, lets you observe how the data is distributed across its attributes, and supports ad hoc queries against the generated dataset.

Create Evaluation

# Create an evaluation for the completed run
# Note: Run must be in 'SUCCEEDED' status with generated files

print(f"Creating evaluation for run: {run_id}")

evaluation = client.dataframer.evaluations.create(
    run_id=run_id,
    evaluation_model="claude-sonnet-4-5"
    # Note: This model can be used to generate evals data but not data to train competing models
)

print(f"\nEvaluation created")
print(f"  Evaluation ID: {evaluation.id}")
print(f"  Status: {evaluation.status}")
print(f"  Created at: {evaluation.created_at}")

# Store the evaluation_id for later use
evaluation_id = evaluation.id

Check Evaluation Status

# Poll this endpoint until status is 'COMPLETED' or 'FAILED'
evaluation_results = client.dataframer.evaluations.retrieve(
    evaluation_id=evaluation_id
)

print(f"Evaluation Status: {evaluation_results.status}")

if evaluation_results.status == 'FAILED':
    print(f"\n❌ Evaluation failed")
    if evaluation_results.error_message:
        print(f"  Error: {evaluation_results.error_message}")
elif evaluation_results.status == 'COMPLETED':
    print(f"\n✅ Evaluation completed successfully")

Evaluation Checks

The evaluation process analyzes:
  • All required files present in each folder
  • File formats are correct
  • Data consistency across files
  • Content quality of each file
  • Conformance to spec requirements
  • Distribution of data property variations

Step 8: Download Generated Folders with Metadata

After evaluation completes, download the generated files with their evaluation metadata.

List Generated Files

# Get generated files for the run
result = client.dataframer.runs.generated_files.list(run_id=run_id)

print("📁 Generated Files:")
print("=" * 80)
print(f"Run ID: {result.run_id}")
print(f"Total files: {len(result.generated_files)}")
print("=" * 80)

for i, file in enumerate(result.generated_files, 1):
    print(f"\n📄 File {i}:")
    print(f"  Name: {file.name}")
    print(f"  ID: {file.id}")
    print(f"  Status: {file.status}")
    print(f"  Size: {file.size} bytes")
    print(f"  Type: {file.type}")
    if file.status_details:
        print(f"  Details: {file.status_details}")
    if file.generation_model:
        print(f"  Model: {file.generation_model}")

Download All Files as ZIP

from pathlib import Path

print(f"📥 Downloading generated files with metadata as ZIP...")

# Download ZIP file from backend
# The ZIP contains:
# - All generated files with folder structure
# - .metadata files with evaluation tags/classifications
# - top_level.metadata with evaluation summary
downloaded_zip = client.dataframer.runs.generated_files.download_all(
    run_id=run_id
)

# Save ZIP file
output_file = Path(f"generated_samples_{run_id}.zip")
output_file.write_bytes(downloaded_zip.read())

print(f"\n✅ Download complete!")
print(f"📦 ZIP file: {output_file.absolute()}")
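To inspect what was downloaded, you can extract the ZIP and locate the evaluation metadata. A minimal local sketch, assuming only the ZIP layout described in the comments above (`extract_and_list_metadata` is a helper of ours, not part of the SDK):

```python
import zipfile
from pathlib import Path

def extract_and_list_metadata(zip_path, out_dir):
    """Extract the downloaded ZIP and return the evaluation .metadata files."""
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    return sorted(out_dir.rglob("*.metadata"))

# e.g.:
# for meta in extract_and_list_metadata(output_file, "generated_samples"):
#     print(meta, meta.read_text()[:200])
```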

Cleanup: Delete Dataset

When you’re done with testing, you can delete the dataset. First, delete all specs that reference it:
## ⚠️ Warning: This action cannot be undone. All files will be permanently deleted.

## Step 1: Get all specs for this dataset
all_specs = client.dataframer.specs.list()
dataset_specs = [spec for spec in all_specs if spec.dataset_name == dataset_name]

print(f"Found {len(dataset_specs)} spec(s) referencing this dataset")

## Step 2: Delete all specs that reference this dataset
for spec in dataset_specs:
    print(f"  Deleting spec: {spec.name} (ID: {spec.id})")
    ## Uncomment to delete the spec
    # client.dataframer.specs.delete(spec_id=spec.id)
    # print(f" ✓ Deleted spec {spec.id}")

## Step 3: Delete the dataset
## Note: Cannot delete a dataset that is referenced by any specs.
## Uncomment to delete the dataset after deleting all specs
# client.dataframer.datasets.delete(dataset_id=dataset_id)
# print(f"✓ Deleted dataset {dataset_id}")

print("\n⚠️ Deletion is commented out for safety")
print("Uncomment the code above to delete when ready")

Common Issues

Problem: Files are being skipped during upload
Causes:
  • Using unsupported file types
Solution:
  • Check the Folder Structure Requirements for supported file types
  • Convert your data to a supported format
  • Verify file extensions before upload
  • Check the skipped files list during upload
Problem: Dataset creation fails with an error
Causes:
  • No valid files to upload (all skipped)
  • Files exceed size limits
  • Too many files
  • Total size too large
Solution:
  • Ensure at least one supported file type exists
  • Review the Folder Structure Requirements for all limits
  • Split large files into smaller chunks
  • Reduce number of files per folder
  • Check total dataset size
Problem: Some folders are missing files
Causes:
  • Unclear specification
  • Model timeout
  • Complex file relationships
Solution:
  • Explicitly list required files in spec
  • Provide clear file format examples
  • Simplify file relationships
  • Try generating fewer samples
  • Use sample_type="long" for multi-folder
Problem: Data doesn’t match across files in a folder
Causes:
  • Specification doesn’t emphasize consistency
  • Files treated independently
Solution:
  • Add consistency requirements to spec
  • Provide examples of consistent folders
  • Use explicit cross-file constraints in requirements
  • Update spec to emphasize relationships between files
Problem: Generation is slower than expected
Expected: 5-15 minutes per folder
If much slower:
  • Multi-folder is inherently slower than single files
  • Large/complex files take longer
  • Check status endpoint for progress
  • Consider reducing sample count
  • Try using faster models like “claude-haiku-4-5” or open source alternatives
Problem: Cannot import dataframer module
Causes:
  • SDK not installed
  • Wrong virtual environment
  • Installation failed
Solution:
pip install --upgrade pydataframer dotenv pyyaml
  • Verify installation: pip show pydataframer
  • Check Python version (3.8+)

Best Practices

  • Check requirements first: Review the Folder Structure Requirements before starting
  • Consistent file naming: Use the same filenames across all folders when possible
  • Clear relationships: Document how files relate in the specification requirements
  • Explicit requirements: List all required files explicitly in the spec
  • Test with small batches: Generate 3-5 folders first to verify quality
  • Validate structure: Check folder structure before uploading
  • Quality seed data: Provide high-quality, consistent examples
  • Monitor file sizes: Keep files under 1MB for optimal performance
  • Use appropriate models: claude-sonnet-4-5* for highest quality, claude-haiku-4-5* for fast & cheap generation, or the open-source alternatives listed in Step 6
  • Long sample type: Always use sample_type="long" for multi-folder generation
* These models can be used to generate evals data but not data to train competing models.

Next Steps