All operations require authentication using Bearer tokens. Make sure you have your API credentials ready.

File and Format Requirements

  • Supported formats: CSV, TSV, JSON, JSONL, PDF, and raw text files
  • Column names:
    • For CSV/TSV: Files must include a header row specifying column names
    • For JSON/JSONL: Each object must use consistent field names
    • For PDF: Text content will be extracted and organized into structured tables
  • Content: Any raw data that needs cleaning, extraction, or organization
  • Size limits:
    • Minimum: 10 rows for meaningful processing
    • Maximum: 1 million rows per data source (free tier: 10,000 rows)
    • File size: Up to 100MB per file, 1GB total per data source

Best Practices

Data Quality

When uploading data that contains missing values, use proper formatting:
  • JSON/JSONL: Use the null keyword (without quotes) for missing values
  • CSV/TSV: Leave cells empty to represent NULL values
  • Mixed data types: Ensure numeric columns contain only numbers (plus NULL values formatted as above) so the AI can process them reliably
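The NULL conventions above can be produced with the standard library; a minimal Python sketch (the column names are just examples):

```python
import csv
import io
import json

# JSON/JSONL: Python's None serializes as the unquoted null keyword.
record = {"product": "Widget", "price": None}
json_line = json.dumps(record)  # {"product": "Widget", "price": null}

# CSV/TSV: an empty cell represents NULL.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["product", "price"])  # header row is required
writer.writerow(["Widget", ""])        # empty cell = NULL
csv_text = buffer.getvalue()
```

Avoid quoted strings like "null" or "N/A" in either format; they arrive as text, not as missing values.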

Date and Time Formatting

For consistent handling of temporal information, use ISO 8601 format:
  • Dates: YYYY-MM-DD (e.g., 2024-03-14)
  • Timestamps: YYYY-MM-DDThh:mm:ssZ (e.g., 2024-03-14T15:30:00Z)
Our data parser will attempt to detect other date formats, but using the ISO standard ensures the most reliable processing and transformation.
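Both ISO 8601 shapes are straightforward to emit from Python's datetime module; a small sketch using the example values above:

```python
from datetime import datetime, timezone

# Date: YYYY-MM-DD
d = datetime(2024, 3, 14).date().isoformat()  # '2024-03-14'

# Timestamp: YYYY-MM-DDThh:mm:ssZ (UTC, with a literal Z suffix)
ts = datetime(2024, 3, 14, 15, 30, 0, tzinfo=timezone.utc)
stamp = ts.strftime("%Y-%m-%dT%H:%M:%SZ")  # '2024-03-14T15:30:00Z'
```

Note that `isoformat()` on an aware datetime emits a `+00:00` offset rather than `Z`, which is why the timestamp uses an explicit `strftime` pattern.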

Include Rich Metadata

When preparing your dataset, include all relevant metadata fields since Claro can efficiently handle datasets with many columns. These additional columns provide valuable context for AI processing and enable richer analysis. Consider including:
  • Timestamps and version information
  • Categories and classifications
  • Source information and identifiers
  • Status fields and flags
  • Ratings and quality scores
  • Any other contextual fields

Flatten Nested Data

If your data contains nested structures or objects (like JSON objects), flatten them into separate columns before uploading to Claro. For example, instead of having a single column containing {"status": "active", "priority": 3}, split it into two separate columns: status with value “active” and priority with value 3. This flat structure allows for better AI processing and analysis.
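A recursive flattener along these lines handles arbitrarily deep objects before upload; the separator and key names here are just one possible convention:

```python
def flatten(obj, parent_key="", sep="_"):
    """Recursively flatten nested dicts into a single-level dict,
    joining keys with sep (e.g. details.status -> details_status)."""
    flat = {}
    for key, value in obj.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

row = {"id": 7, "details": {"status": "active", "priority": 3}}
flat_row = flatten(row)
# {'id': 7, 'details_status': 'active', 'details_priority': 3}
```

Lists inside objects need a separate decision (explode into rows, or index into columns); this sketch only covers nested dicts.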

Upload Methods

API Upload

Use the Claro API to upload raw data sources programmatically:
curl -X POST https://secure-api.getclaro.ai/api/v2/datasources/upload \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "files[]=@your-data.csv"

A successful upload returns a JSON response:
{
  "message": "Data source uploaded successfully",
  "datasourceId": "550e8400-e29b-41d4-a716-446655440000",
  "fileName": "your-data.csv",
  "contentType": "text/csv",
  "fileSize": 245760,
  "status": "queued",
  "estimatedProcessingTime": "2-5 minutes"
}
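From a script, you will typically parse this response and keep the datasourceId for later use. A minimal Python sketch (the body below is the sample response, not a live API call):

```python
import json

# Sample response body of the shape shown above (not a live call).
response_body = """{
  "message": "Data source uploaded successfully",
  "datasourceId": "550e8400-e29b-41d4-a716-446655440000",
  "fileName": "your-data.csv",
  "contentType": "text/csv",
  "fileSize": 245760,
  "status": "queued",
  "estimatedProcessingTime": "2-5 minutes"
}"""

response = json.loads(response_body)
datasource_id = None
if response["status"] == "queued":
    # Keep the ID to track this data source once processing finishes.
    datasource_id = response["datasourceId"]
```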

Platform Dashboard Upload

  1. Visit Platform Dashboard: Go to https://app.prod.getclaro.ai/dashboard/ and sign in
  2. Upload Data Sources: Drag and drop your files or use the file picker to upload CSV, PDF, or other supported formats
  3. Review Processing: Claro automatically extracts, cleans, and organizes your data into structured tables

Data Cleaning Pipeline

Once uploaded, Claro automatically:
  1. Extracts content from PDFs and organizes it into structured tables
  2. Cleans CSV data, handles missing values, and standardizes formats
  3. Validates file structure and detects data patterns
  4. Organizes unstructured data into consistent table formats
  5. Indexes cleaned data for fast retrieval and selection
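To illustrate what step 2 means in practice (this is a toy sketch of the idea, not Claro's actual pipeline), here is how empty CSV cells become NULLs and a non-ISO date gets standardized; the column names and US-style date format are assumptions:

```python
import csv
import io

raw = "product,price,updated\nWidget,9.99,03/14/2024\nGadget,,2024-03-15\n"

def standardize_date(value):
    """Normalize a US-style MM/DD/YYYY date to ISO 8601; pass others through."""
    parts = value.split("/")
    if len(parts) == 3:
        month, day, year = parts
        return f"{year}-{month.zfill(2)}-{day.zfill(2)}"
    return value

cleaned = []
for row in csv.DictReader(io.StringIO(raw)):
    # Empty cell -> NULL (None); otherwise parse as a number.
    row["price"] = float(row["price"]) if row["price"] else None
    row["updated"] = standardize_date(row["updated"])
    cleaned.append(row)
```

Claro performs this kind of normalization automatically, but files that already follow the conventions above need less guessing and process more predictably.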

Next Steps

After uploading your data sources:
  1. Review Cleaned Data: Check the organized tables in your dashboard
  2. Create Dataset: Select from your cleaned data sources and choose a dataset type/template
  3. Configure Dataset: Set up the dataset for specific use cases (product catalog, supplier data, etc.)
  4. Start AI Workflows: Begin classification, enrichment, or extraction tasks on your structured dataset