All operations require authentication using Bearer tokens. Make sure you have
your API credentials ready.
Standard Dataset Creation
Create Dataset from Data Sources
Create a new dataset with specified configuration using existing data sources or file uploads.
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"type": "data_enrichment",
"name": "Product Enrichment Dataset",
"description": "Enrich product data with additional attributes and classifications",
"datasourceId": "'"$DATASOURCE_ID"'"
}'
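A rough Python equivalent of the cURL call above, using only the standard library. The helper name `build_create_dataset_request` and the placeholder values are illustrative and not part of the API. The request is only built here; sending it is left commented out.

```python
import json
import urllib.request

# Base URL taken from the cURL example above.
API_BASE = "https://secure-api.getclaro.ai/api/v2"


def build_create_dataset_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /datasets request. Hypothetical helper."""
    return urllib.request.Request(
        url=f"{API_BASE}/datasets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_dataset_request(
    "YOUR_API_KEY",
    {
        "type": "data_enrichment",
        "name": "Product Enrichment Dataset",
        "description": "Enrich product data with additional attributes and classifications",
        "datasourceId": "your-datasource-id",  # placeholder, replace with a real ID
    },
)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```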
Create Dataset with File Upload
Create a dataset by uploading files directly instead of using existing data sources.
Files uploaded during dataset creation are automatically processed and saved
as data sources, making them available for creating additional datasets in the
future.
# For CSV files (single file)
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/upload" \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "type=data_enrichment" \
-F "name=Product Enrichment Dataset" \
-F "description=Enrich product data with additional attributes" \
-F "file=@products.csv"
# For PDF files (multiple files)
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/upload" \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "type=data_extraction" \
-F "name=Invoice Extraction Dataset" \
-F "description=Extract key fields from invoice documents" \
-F "files[]=@invoice1.pdf" \
-F "files[]=@invoice2.pdf"
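The multi-file upload above could be reproduced in Python roughly as follows. This is a sketch using only the standard library: `encode_multipart` is a hypothetical helper written for this example, and the placeholder bytes stand in for real PDF contents. The form field names (`type`, `name`, `description`, `files[]`) are the ones used in the cURL commands.

```python
import urllib.request
import uuid

# Upload URL taken from the cURL example above.
UPLOAD_URL = "https://secure-api.getclaro.ai/api/v2/datasets/upload"


def encode_multipart(fields: dict, files: list) -> tuple:
    """Encode form fields and files as multipart/form-data.

    fields: {name: value}
    files:  [(field_name, filename, content_bytes, mime_type), ...]
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode("utf-8"),
        ]
    for field_name, filename, content, mime_type in files:
        lines += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"'.encode(),
            f"Content-Type: {mime_type}".encode(),
            b"",
            content,
        ]
    # Closing boundary, followed by a final CRLF.
    lines += [f"--{boundary}--".encode(), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {
        "type": "data_extraction",
        "name": "Invoice Extraction Dataset",
        "description": "Extract key fields from invoice documents",
    },
    [
        # Placeholder bytes; in practice read each file with open(path, "rb").read().
        ("files[]", "invoice1.pdf", b"%PDF-1.4 placeholder", "application/pdf"),
        ("files[]", "invoice2.pdf", b"%PDF-1.4 placeholder", "application/pdf"),
    ],
)

upload_request = urllib.request.Request(
    UPLOAD_URL,
    data=body,
    headers={"Authorization": "Bearer YOUR_API_KEY", "Content-Type": content_type},
    method="POST",
)
```

For a single CSV upload, the same helper would be called with one `("file", "products.csv", ..., "text/csv")` entry instead of the two `files[]` entries.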
Dataset Types and Configuration
Data Enrichment
Enhance existing data with additional attributes and classifications. Requires data source.
{
  "type": "data_enrichment",
  "name": "Product Classification Dataset",
  "description": "Classify products and add missing attributes for e-commerce catalog",
  "datasourceId": "your-datasource-id"
}
Data Extraction
Extract structured data from unstructured sources like PDFs. Requires data sources.
{
  "type": "data_extraction",
  "name": "Invoice Data Extraction",
  "description": "Extract key fields from invoice documents for automated processing",
  "datasourceId": "your-datasource-id"
}
Map Extraction
Extract location-based data within specified geographic boundaries. No data sources required.
{
  "type": "map_extraction",
  "name": "Restaurant Location Data",
  "description": "Find restaurants in downtown area for market analysis",
  "mapDetails": {
    "latitude": 40.7128,
    "longitude": -74.006,
    "radiusMeters": 5000
  }
}
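A small sketch of how the `mapDetails` object might be assembled with client-side range checks. The function name is hypothetical. The 50000 m radius cap comes from the parameter reference on this page, and the latitude/longitude ranges are the usual geographic bounds; the server may apply additional rules not shown here.

```python
def build_map_details(latitude: float, longitude: float, radius_meters: float) -> dict:
    """Return a mapDetails object after basic range checks. Hypothetical helper."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    # Upper bound taken from the documented maximum for radiusMeters.
    if not 0 < radius_meters <= 50_000:
        raise ValueError("radiusMeters must be greater than 0 and at most 50000")
    return {
        "latitude": latitude,
        "longitude": longitude,
        "radiusMeters": radius_meters,
    }


payload = {
    "type": "map_extraction",
    "name": "Restaurant Location Data",
    "description": "Find restaurants in downtown area for market analysis",
    "mapDetails": build_map_details(40.7128, -74.006, 5000),
}
```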
Custom Dataset (Blank Table)
Create a blank structured table with custom column definitions. No data sources required.
{
  "type": "custom_dataset",
  "name": "Customer Survey Responses",
  "description": "Collect and organize customer feedback for satisfaction analysis",
  "columnDefinitions": [
    {
      "name": "customer_id",
      "type": "string",
      "description": "Unique customer identifier"
    },
    {
      "name": "satisfaction_score",
      "type": "number",
      "description": "Rating from 1-10"
    },
    {
      "name": "feedback_text",
      "type": "text",
      "description": "Open-ended feedback"
    },
    {
      "name": "survey_date",
      "type": "date",
      "description": "Date survey was completed"
    }
  ]
}
AI-Powered Dataset Generation
Generate Sample Dataset with AI
Generate a sample dataset using AI with a prompt-based approach. This returns a preview for user confirmation before creating the actual dataset.
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/generate" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Create a dataset of tech startup companies with funding information",
"sampleSize": 10
}'
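The request body for the generate call could be built in Python along these lines. The helper name is hypothetical. The default of 10 and the maximum of 50 for `sampleSize` come from the parameter reference on this page; the lower bound of 1 is an assumption.

```python
import json


def build_generate_payload(prompt: str, sample_size: int = 10) -> dict:
    """Build the JSON body for POST /datasets/generate. Hypothetical helper."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Max of 50 is documented; the minimum of 1 is an assumption.
    if not 1 <= sample_size <= 50:
        raise ValueError("sampleSize must be between 1 and 50")
    return {"prompt": prompt, "sampleSize": sample_size}


generate_body = json.dumps(
    build_generate_payload(
        "Create a dataset of tech startup companies with funding information",
        sample_size=10,
    )
)
```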
Refine AI Dataset Generation (Optional)
After reviewing the sample dataset, you can optionally request corrections or additions before confirming the final dataset.
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/generate/$DATASET_REQUEST_ID/refine" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "Add a column for company valuation and remove the location column"
}'
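Because the dataset request ID is embedded in the URL path, it is worth percent-encoding it when building the refine URL programmatically. A minimal sketch, with a hypothetical helper name and a made-up example ID:

```python
from urllib.parse import quote

# Base URL taken from the cURL examples on this page.
API_BASE = "https://secure-api.getclaro.ai/api/v2"


def refine_url(dataset_request_id: str) -> str:
    """Return the refine endpoint URL for a dataset request ID. Hypothetical helper."""
    if not dataset_request_id:
        raise ValueError("dataset_request_id must not be empty")
    # safe="" also encodes "/" so the ID cannot alter the path structure.
    return f"{API_BASE}/datasets/generate/{quote(dataset_request_id, safe='')}/refine"


url = refine_url("req_123")  # "req_123" is an illustrative ID, not a real one
refine_body = {"prompt": "Add a column for company valuation and remove the location column"}
```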
Confirm AI Dataset Generation
After reviewing the sample dataset (and optionally refining it), confirm creation of the full AI-generated dataset using the dataset request ID.
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/ai-generate-confirm" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"datasetRequestId": "'"$DATASET_REQUEST_ID"'",
"fullSize": 1000
}'
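The three steps (generate, optionally refine, confirm) can be chained as sketched below. The transport is passed in as a `post(path, payload)` callable so the flow can be exercised without network access. The function names are hypothetical, and the response field name `datasetRequestId` is an assumption: this page does not show the response bodies, so check the actual response for the field that carries the request ID.

```python
from typing import Callable, Optional


def generate_and_confirm(
    post: Callable[[str, dict], dict],
    prompt: str,
    refinement: Optional[str] = None,
    sample_size: int = 10,
    full_size: int = 1000,
) -> dict:
    """Run generate -> (optional) refine -> confirm using the given transport."""
    sample = post("/datasets/generate", {"prompt": prompt, "sampleSize": sample_size})
    # ASSUMPTION: the generate response exposes the ID as "datasetRequestId".
    request_id = sample["datasetRequestId"]

    if refinement:
        refined = post(f"/datasets/generate/{request_id}/refine", {"prompt": refinement})
        # Use the ID from the refine response if present, else keep the original.
        request_id = refined.get("datasetRequestId", request_id)

    return post(
        "/datasets/ai-generate-confirm",
        {"datasetRequestId": request_id, "fullSize": full_size},
    )


def stub_post(path: str, payload: dict) -> dict:
    """Stand-in transport that returns a fake request ID instead of calling the API."""
    print("POST", path, payload)
    return {"datasetRequestId": "req_example"}


generate_and_confirm(
    stub_post,
    "Create a dataset of tech startup companies with funding information",
    refinement="Add a column for company valuation and remove the location column",
)
```

In real use, `stub_post` would be replaced by a function that sends an authenticated POST to the API base URL and returns the parsed JSON response.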
Request Parameters
Create Dataset
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Dataset type: data_enrichment, data_extraction, map_extraction, custom_dataset |
| name | string | Yes | Dataset name (max 100 characters) |
| description | string | Yes | Purpose and use case description. Used as prompt for enrichment/extraction, search text for maps |
| datasourceId | string | Conditional | Datasource ID. Required for data_enrichment, data_extraction. Not used for map_extraction |
| mapDetails | object | Conditional | Required for map_extraction type |
| mapDetails.latitude | number | Conditional | Center latitude for map extraction |
| mapDetails.longitude | number | Conditional | Center longitude for map extraction |
| mapDetails.radiusMeters | number | Conditional | Extraction radius in meters (max 50000) |
| columnDefinitions | array | Conditional | Required for custom_dataset type |
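The conditional requirements in the table above can be checked client-side before a request is sent. The following is a sketch only: the function name and error messages are made up for illustration, and the server remains the authority on what is valid.

```python
VALID_TYPES = {"data_enrichment", "data_extraction", "map_extraction", "custom_dataset"}
NEEDS_DATASOURCE = {"data_enrichment", "data_extraction"}


def validate_create_payload(payload: dict) -> list:
    """Return a list of problems found; an empty list means no problems were detected."""
    errors = []
    dataset_type = payload.get("type")
    if dataset_type not in VALID_TYPES:
        errors.append(f"type must be one of {sorted(VALID_TYPES)}")

    name = payload.get("name", "")
    if not name:
        errors.append("name is required")
    elif len(name) > 100:
        errors.append("name must be at most 100 characters")

    if not payload.get("description"):
        errors.append("description is required")

    # Conditional fields, keyed on the dataset type.
    if dataset_type in NEEDS_DATASOURCE and not payload.get("datasourceId"):
        errors.append(f"datasourceId is required for {dataset_type}")
    if dataset_type == "map_extraction" and not payload.get("mapDetails"):
        errors.append("mapDetails is required for map_extraction")
    if dataset_type == "custom_dataset" and not payload.get("columnDefinitions"):
        errors.append("columnDefinitions is required for custom_dataset")
    return errors


print(validate_create_payload({"type": "data_enrichment", "name": "Demo", "description": "Test"}))
# prints ['datasourceId is required for data_enrichment']
```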
Create Dataset with File Upload
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| type | string | Yes | Dataset type (same as above) |
| name | string | Yes | Dataset name |
| description | string | Yes | Purpose and use case description |
| file | file | Conditional | Single CSV file upload |
| files[] | files | Conditional | Multiple PDF files upload |
| mapDetails | object | Conditional | Required for map_extraction type |
| columnDefinitions | array | Conditional | Required for custom_dataset type |
AI Generate Sample Dataset
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | string | Yes | Natural language description of desired dataset |
| sampleSize | number | No | Number of sample rows (default: 10, max: 50). Cannot be changed in refinement or confirmation steps |
Refine AI Dataset Generation
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| id | string | Yes | Dataset request ID from generate response (in URL path) |
| prompt | string | Yes | Natural language description of corrections or additions needed |
Confirm AI Dataset Generation
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| datasetRequestId | string | Yes | Request ID from generate or generate/{id}/refine response |
| fullSize | number | No | Desired full dataset size (default: 1000, max: 100000) |
Column Definition Schema
For custom datasets, define columns with the following structure:
{
  "name": "column_name",
  "type": "string|number|date|boolean|text",
  "description": "Column purpose and content description",
  "required": true,
  "defaultValue": "optional_default"
}
Column Types
| Type | Description | Example Use Cases |
| --- | --- | --- |
| string | Short text values (typically < 255 characters) | Names, IDs, categories, status |
| text | Long text content (unlimited length) | Descriptions, comments, articles |
| number | Numeric values (integers and decimals) | Prices, quantities, scores |
| date | Date and timestamp values | Created dates, deadlines |
| boolean | True/false values | Active status, feature flags |
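Column definitions for a custom dataset could be assembled with a small helper that rejects unknown column types. This is a sketch: the helper name is hypothetical, and treating `required` and `defaultValue` as optional is an assumption based on the custom dataset example earlier on this page, which omits both.

```python
from typing import Any, Optional

# The five column types listed in the table above.
COLUMN_TYPES = ("string", "text", "number", "date", "boolean")


def column(
    name: str,
    col_type: str,
    description: str,
    required: Optional[bool] = None,
    default: Any = None,
) -> dict:
    """Build one column definition. Hypothetical helper."""
    if col_type not in COLUMN_TYPES:
        raise ValueError(f"type must be one of {COLUMN_TYPES}, got {col_type!r}")
    definition = {"name": name, "type": col_type, "description": description}
    # ASSUMPTION: "required" and "defaultValue" may be omitted when not needed.
    if required is not None:
        definition["required"] = required
    if default is not None:
        definition["defaultValue"] = default
    return definition


column_definitions = [
    column("customer_id", "string", "Unique customer identifier", required=True),
    column("satisfaction_score", "number", "Rating from 1-10"),
    column("feedback_text", "text", "Open-ended feedback"),
    column("survey_date", "date", "Date survey was completed"),
]
```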
Error Codes
| Code | Description |
| --- | --- |
| VALIDATION_ERROR | Invalid request parameters |
| DATASOURCE_NOT_FOUND | Referenced datasource doesn’t exist |
| GENERATION_FAILED | AI dataset generation failed |
| REFINEMENT_FAILED | AI dataset refinement failed |
| REQUEST_EXPIRED | Dataset request ID expired |
| QUOTA_EXCEEDED | Dataset creation limit reached |
| INVALID_MAP_BOUNDS | Map extraction coordinates invalid |
| FILE_UPLOAD_ERROR | File upload failed or invalid format |
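One way to surface these codes in client code is a lookup table plus an exception type, sketched below. The class name `DatasetApiError` is hypothetical. The shape of the error response body is not documented on this page, so the sketch takes the code as a plain string and leaves extracting it from the response to the caller.

```python
# Descriptions copied from the error code table above.
ERROR_DESCRIPTIONS = {
    "VALIDATION_ERROR": "Invalid request parameters",
    "DATASOURCE_NOT_FOUND": "Referenced datasource doesn't exist",
    "GENERATION_FAILED": "AI dataset generation failed",
    "REFINEMENT_FAILED": "AI dataset refinement failed",
    "REQUEST_EXPIRED": "Dataset request ID expired",
    "QUOTA_EXCEEDED": "Dataset creation limit reached",
    "INVALID_MAP_BOUNDS": "Map extraction coordinates invalid",
    "FILE_UPLOAD_ERROR": "File upload failed or invalid format",
}


class DatasetApiError(Exception):
    """Exception carrying a documented error code. Hypothetical class."""

    def __init__(self, code: str, detail: str = ""):
        self.code = code
        description = ERROR_DESCRIPTIONS.get(code, "Unknown error code")
        message = f"{code}: {description}"
        if detail:
            message += f" ({detail})"
        super().__init__(message)


try:
    raise DatasetApiError("REQUEST_EXPIRED")
except DatasetApiError as error:
    print(error)  # prints "REQUEST_EXPIRED: Dataset request ID expired"
```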
Next Steps