All operations require authentication using Bearer tokens. Make sure you have your API credentials ready.

List All Datasets

Retrieve a paginated list of all your datasets.
curl -X GET "https://secure-api.getclaro.ai/api/v2/datasets?page=1&limit=20" \
  -H "Authorization: Bearer YOUR_API_KEY"
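
The same call can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it. The helper name `build_request` is illustrative and is not part of any official client; only the base URL, path, and Bearer header come from the curl example above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://secure-api.getclaro.ai/api/v2"


def build_request(method, path, api_key, params=None, body=None):
    """Build (but do not send) an authenticated request.

    Hypothetical helper for illustration, not an official SDK function.
    """
    url = BASE_URL + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies are used by the PATCH, PUT, and POST examples.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Equivalent of the curl command above. Pass the result to
# urllib.request.urlopen() to actually send it.
request = build_request("GET", "/datasets", "YOUR_API_KEY", {"page": 1, "limit": 20})
print(request.get_method(), request.full_url)
```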

Get Dataset Details

Retrieve detailed information about a specific dataset including column metadata and task types.
curl -X GET "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"

Get Dataset Data

Retrieve paginated data from a specific dataset as a 2D array with column information.
curl -X GET "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/data?page=1&limit=50" \
  -H "Authorization: Bearer YOUR_API_KEY"
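
Because the data endpoint is paginated, reading a whole dataset means requesting successive pages. The response's pagination fields are not described here, so this minimal sketch assumes a short or empty page marks the end. `fetch_page` is a hypothetical callable that performs the GET request and returns one page of rows. A fake in-memory version stands in for the real API.

```python
from typing import Callable, Iterator, List


def iter_rows(fetch_page: Callable[[int, int], List[list]], limit: int = 50) -> Iterator[list]:
    """Yield rows page by page until a short (or empty) page is returned.

    Assumption: a page with fewer than `limit` rows is the last one.
    """
    page = 1  # pages are 1-indexed (default page is 1)
    while True:
        rows = fetch_page(page, limit)
        yield from rows
        if len(rows) < limit:
            break
        page += 1


# Stand-in for the real API: 120 fake rows served in pages.
FAKE_ROWS = [[i, f"item-{i}"] for i in range(120)]


def fake_fetch(page: int, limit: int) -> List[list]:
    start = (page - 1) * limit
    return FAKE_ROWS[start:start + limit]


total = sum(1 for _ in iter_rows(fake_fetch, limit=50))
print(total)
```

If the total row count is an exact multiple of `limit`, this approach issues one extra request that returns an empty page.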

Update Cell Value

Update the value of a specific cell using its cell ID.
Cell metadata cannot be updated directly; it is reset whenever the cell value changes.
curl -X PATCH "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/cells/$CELL_ID" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "value": "Updated Widget Name"
  }'

Update Column Metadata

Update column information and metadata without changing the task type.
curl -X PATCH "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/columns/$COLUMN_ID" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "updated_product_name",
    "metadata": {
      "extractionPrompt": "Extract the full product name including brand",
      "confidence": 0.95
    }
  }'

Update Column Task Type

Update the task type of a column.
This operation resets all values in the specified column and cannot be undone.
curl -X PUT "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/columns/$COLUMN_ID/task-type" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "taskType": "ai_enrich",
    "metadata": {
      "prompt": "Generate a detailed product description based on the product name and category"
    }
  }'

Link or Upload Datasource

For extraction datasets, link an existing datasource or upload a new one to a specific cell in a raw data column.

Link an existing datasource:
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/cells/$CELL_ID/link-datasource" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "datasourceId": "existing-datasource-id"
  }'

Upload a new file:
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/cells/$CELL_ID/upload-datasource" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"
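
The `-F "file=@document.pdf"` flag makes curl send the file as a `multipart/form-data` body. For clients without a multipart helper, the sketch below shows how such a body can be assembled by hand with the standard library. The function name `encode_file_upload` is hypothetical, and only the `file` field name is taken from the curl example.

```python
import uuid


def encode_file_upload(field_name, filename, content, content_type="application/octet-stream"):
    """Return (body, content_type_header) for a single-file multipart upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    # The closing delimiter carries two trailing dashes.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


# Placeholder bytes stand in for the contents of document.pdf.
body, header = encode_file_upload("file", "document.pdf", b"%PDF-placeholder", "application/pdf")
print(header)
print(len(body), "bytes")
```

The returned header value belongs in the request's `Content-Type` header, alongside the same `Authorization` header as the other calls.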

Delete Dataset

Permanently delete a dataset and all its associated data.
curl -X DELETE "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID" \
  -H "Authorization: Bearer YOUR_API_KEY"

Export Dataset

Generate a download URL for the dataset in various formats.
curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/export" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "format": "csv",
    "rowCount": 1000,
    "columnIds": ["col_1", "col_3", "col_4"],
    "skip": 0,
    "includeMetadata": true,
    "expiresIn": 3600
  }'

Query Parameters

List Datasets

  • page (integer): Page number (default: 1)
  • limit (integer): Items per page (default: 20, max: 100)
  • type (string): Filter by dataset type (extraction, analysis, classification)
  • status (string): Filter by status (processing, completed, failed)
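
Checking filters on the client before sending can surface mistakes earlier than an `INVALID_PARAMETERS` response would. The sketch below builds the query string from the parameters listed above. The function name and the lower bounds (`page >= 1`, `limit >= 1`) are assumptions; the allowed values and the maximum of 100 come from the list.

```python
from urllib.parse import urlencode

DATASET_TYPES = {"extraction", "analysis", "classification"}
DATASET_STATUSES = {"processing", "completed", "failed"}
MAX_LIST_LIMIT = 100


def list_datasets_query(page=1, limit=20, dataset_type=None, status=None):
    """Return a validated query string for the list-datasets endpoint."""
    if page < 1:  # assumed lower bound
        raise ValueError("page must be >= 1")
    if not 1 <= limit <= MAX_LIST_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIST_LIMIT}")
    params = {"page": page, "limit": limit}
    if dataset_type is not None:
        if dataset_type not in DATASET_TYPES:
            raise ValueError(f"unknown dataset type: {dataset_type!r}")
        params["type"] = dataset_type
    if status is not None:
        if status not in DATASET_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        params["status"] = status
    return urlencode(params)


print(list_datasets_query(dataset_type="extraction", status="completed"))
```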

Get Dataset Data

  • page (integer): Page number (default: 1)
  • limit (integer): Rows per page (default: 50, max: 1000)
  • columns (string): Comma-separated column IDs to include
  • includeMetadata (boolean): Include row/column metadata (default: false)

Export Dataset (sent as fields in the JSON request body)

  • format (string): Export format (csv, json, xlsx)
  • rowCount (integer): Number of rows to export (required, max: 100000)
  • columnIds (array): Array of column IDs to include (optional, includes all if omitted)
  • skip (integer): Number of initial rows to skip (optional, default: 0)
  • includeMetadata (boolean): Include metadata in export
  • expiresIn (integer): URL expiration time in seconds (default: 3600, max: 86400)
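
Since `rowCount` is capped at 100000, exporting a larger dataset presumably takes several export calls that advance `skip`. The sketch below validates an export body against the documented limits and splits a large export into chunks. Both function names and the lower bounds are assumptions, and optional fields are left out unless they are set.

```python
EXPORT_FORMATS = {"csv", "json", "xlsx"}
MAX_ROW_COUNT = 100_000
MAX_EXPIRES_IN = 86_400


def build_export_body(fmt, row_count, skip=0, column_ids=None,
                      include_metadata=None, expires_in=3600):
    """Return a validated JSON-serializable body for the export endpoint."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if not 1 <= row_count <= MAX_ROW_COUNT:
        raise ValueError(f"rowCount must be between 1 and {MAX_ROW_COUNT}")
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if not 1 <= expires_in <= MAX_EXPIRES_IN:
        raise ValueError(f"expiresIn must be between 1 and {MAX_EXPIRES_IN}")
    body = {"format": fmt, "rowCount": row_count, "skip": skip, "expiresIn": expires_in}
    if column_ids:  # omitted means "all columns"
        body["columnIds"] = list(column_ids)
    if include_metadata is not None:
        body["includeMetadata"] = include_metadata
    return body


def export_chunks(total_rows, chunk=MAX_ROW_COUNT):
    """Yield (skip, row_count) pairs that together cover total_rows."""
    for skip in range(0, total_rows, chunk):
        yield skip, min(chunk, total_rows - skip)


for skip, count in export_chunks(250_000):
    print(build_export_body("csv", count, skip=skip))
```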

Task Types

| Task Type | Description | Dataset Requirement | Metadata Requirements |
| --- | --- | --- | --- |
| raw | Original unprocessed data from datasources or user input | Any | None |
| doc_extraction | Extract information from corresponding cell in same row | Extraction only | None |
| web_enrich | Response from web scraping to enrich data | Any | None |
| classification | Yes/no or categorical classification of data in same row | Any | None |
| ai_enrich | AI-generated content based on row data using custom prompts | Any | prompt required |
| places_count | Extract places count from maps using coordinates or nearby area with radius from other cells | Any | prompt required |
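
The requirements in this table can be checked before calling the task-type endpoint, which matters because that call resets the column's values. The sketch below encodes the table as a lookup and builds the request body. The function name is hypothetical, and sending `metadata` only when a prompt is given is an assumption.

```python
# task type -> (extraction datasets only?, prompt required?)
TASK_TYPES = {
    "raw": (False, False),
    "doc_extraction": (True, False),
    "web_enrich": (False, False),
    "classification": (False, False),
    "ai_enrich": (False, True),
    "places_count": (False, True),
}


def build_task_type_body(task_type, dataset_type, prompt=None):
    """Validate against the task type table and return the request body."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    extraction_only, requires_prompt = TASK_TYPES[task_type]
    if extraction_only and dataset_type != "extraction":
        raise ValueError(f"{task_type} is only available on extraction datasets")
    if requires_prompt and not prompt:
        raise ValueError(f"{task_type} requires a prompt in metadata")
    body = {"taskType": task_type}
    if prompt:
        body["metadata"] = {"prompt": prompt}
    return body


print(build_task_type_body("ai_enrich", "analysis", prompt="Summarize the product"))
```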

Error Codes

| Code | Description |
| --- | --- |
| UNAUTHORIZED | Authentication required |
| DATASET_NOT_FOUND | Dataset doesn’t exist |
| CELL_NOT_FOUND | Cell ID doesn’t exist |
| COLUMN_NOT_FOUND | Column ID doesn’t exist |
| ROW_NOT_FOUND | Row ID doesn’t exist |
| INVALID_TASK_TYPE | Task type not supported |
| RAW_COLUMN_REQUIRED | Operation requires raw data column |
| PROCESSING_IN_PROGRESS | Dataset still being processed |
| INVALID_PARAMETERS | Invalid query parameters |
| ACCESS_DENIED | Insufficient permissions |
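
Client code often needs to decide whether an error is worth retrying. The sketch below groups the codes above into broad categories. The grouping and the category names are one interpretation rather than documented behavior, and the way the code string is read from an error response is not covered here.

```python
ERROR_CATEGORIES = {
    "auth": {"UNAUTHORIZED", "ACCESS_DENIED"},
    "not_found": {"DATASET_NOT_FOUND", "CELL_NOT_FOUND", "COLUMN_NOT_FOUND", "ROW_NOT_FOUND"},
    "bad_request": {"INVALID_TASK_TYPE", "RAW_COLUMN_REQUIRED", "INVALID_PARAMETERS"},
    # Assumed to be transient: the dataset may finish processing later.
    "retry": {"PROCESSING_IN_PROGRESS"},
}


def classify_error(code):
    """Map an error code string to a handling category."""
    for category, codes in ERROR_CATEGORIES.items():
        if code in codes:
            return category
    return "unknown"


def should_retry(code):
    """Only errors assumed to be transient are worth retrying."""
    return classify_error(code) == "retry"


for code in ("PROCESSING_IN_PROGRESS", "CELL_NOT_FOUND", "ACCESS_DENIED"):
    print(code, classify_error(code), should_retry(code))
```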
