# Tasks

> Trigger and manage enrichment, extraction, map extraction, and AI generation tasks on your datasets. Process new rows, run bulk operations, and monitor task progress.

<Note>
  All operations require authentication using Bearer tokens. Make sure you have
  your API credentials ready.
</Note>

<Warning>
  A dataset is locked while a task is active and cannot be modified until the
  task completes, fails, or is cancelled.
</Warning>

## Create New Rows Task

Generate and process new rows for your dataset with enrichment or AI generation.

<CodeGroup>
  ```bash cURL theme={null}
  curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/tasks/new-rows" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "rowCount": 100,
      "webhookId": "'"$WEBHOOK_ID"'"
    }'
  ```

  ```python Python theme={null}
  import requests

  headers = {
      "Authorization": "Bearer YOUR_API_KEY",
      "Content-Type": "application/json"
  }

  data = {
      "rowCount": 100,
      "webhookId": "your-webhook-id"  # Optional webhook for completion notification
  }

  dataset_id = "your-dataset-id"  # Replace with your dataset ID

  response = requests.post(
      f"https://secure-api.getclaro.ai/api/v2/datasets/{dataset_id}/tasks/new-rows",
      headers=headers,
      json=data
  )
  ```

  ```javascript JavaScript theme={null}
  const datasetId = "your-dataset-id"; // Replace with your dataset ID
  const taskRequest = {
    rowCount: 100,
    webhookId: "your-webhook-id", // Optional webhook for completion notification
  };

  const response = await fetch(
    `https://secure-api.getclaro.ai/api/v2/datasets/${datasetId}/tasks/new-rows`,
    {
      method: "POST",
      headers: {
        Authorization: "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
      },
      body: JSON.stringify(taskRequest),
    }
  );
  ```

  ```json Success Response theme={null}
  {
    "taskId": "task_550e8400-e29b-41d4-a716-446655440000",
    "datasetId": "550e8400-e29b-41d4-a716-446655440000",
    "type": "new_rows",
    "status": "queued",
    "rowCount": 100,
    "estimatedDuration": "5-10 minutes",
    "createdAt": "2024-03-14T15:30:00Z",
    "webhookId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
  }
  ```

  ```json Dataset Locked Error theme={null}
  {
    "error": "Dataset is locked",
    "code": "DATASET_LOCKED",
    "details": {
      "message": "Dataset is currently being processed by another task",
      "activeTaskId": "task_123e4567-e89b-12d3-a456-426614174000"
    }
  }
  ```
</CodeGroup>

## Create Bulk Processing Task

Process cells in your dataset with flexible targeting options. You can target all empty cells, specific cells, entire columns or rows, or the intersection of selected rows and columns.

<CodeGroup>
  ```bash cURL theme={null}
  # The '"$WEBHOOK_ID"' quoting briefly closes the single-quoted JSON
  # so that the shell expands the variable.

  # Process all empty cells in the dataset
  curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/tasks/bulk" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "processOnlyEmpty": true,
      "webhookId": "'"$WEBHOOK_ID"'"
    }'

  # Process specific cells
  curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/tasks/bulk" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "cellIds": ["cell_1", "cell_2"],
      "webhookId": "'"$WEBHOOK_ID"'"
    }'

  # Process intersection of specific rows and columns
  curl -X POST "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/tasks/bulk" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "rowIds": ["row_1", "row_2"],
      "columnIds": ["col_product_name", "col_category"],
      "processOnlyEmpty": true,
      "webhookId": "'"$WEBHOOK_ID"'"
    }'
  ```

  ```python Python theme={null}
  import requests

  headers = {
      "Authorization": "Bearer YOUR_API_KEY",
      "Content-Type": "application/json"
  }

  # Example 1: Process all empty cells in the dataset
  process_all_empty = {
      "processOnlyEmpty": True,  # Only process empty cells (default: true)
      "webhookId": "your-webhook-id"  # Optional webhook for completion notification
  }

  # Example 2: Process specific cells
  process_specific_cells = {
      "cellIds": ["cell_1", "cell_2"],  # Specific cell IDs to process
      "webhookId": "your-webhook-id"
  }

  # Example 3: Process intersection of rows and columns
  process_intersection = {
      "rowIds": ["row_1", "row_2"],  # Specific rows
      "columnIds": ["col_product_name", "col_category"],  # Specific columns
      "processOnlyEmpty": True,  # Only process empty cells in the intersection
      "webhookId": "your-webhook-id"
  }

  dataset_id = "your-dataset-id"  # Replace with your dataset ID

  response = requests.post(
      f"https://secure-api.getclaro.ai/api/v2/datasets/{dataset_id}/tasks/bulk",
      headers=headers,
      json=process_all_empty  # Use any of the examples above
  )
  ```

  ```javascript JavaScript theme={null}
  const datasetId = "your-dataset-id"; // Replace with your dataset ID

  // Example 1: Process all empty cells in the dataset
  const processAllEmpty = {
    processOnlyEmpty: true, // Process all empty cells (default: true)
    webhookId: "your-webhook-id", // Optional webhook for completion notification
  };

  // Example 2: Process specific cells
  const processSpecificCells = {
    cellIds: ["cell_1", "cell_2"], // Specific cell IDs to process
    webhookId: "your-webhook-id",
  };

  // Example 3: Process intersection of rows and columns
  const processIntersection = {
    rowIds: ["row_1", "row_2"], // Specific rows
    columnIds: ["col_product_name", "col_category"], // Specific columns
    processOnlyEmpty: true, // Only process empty cells in the intersection
    webhookId: "your-webhook-id",
  };

  const response = await fetch(
    `https://secure-api.getclaro.ai/api/v2/datasets/${datasetId}/tasks/bulk`,
    {
      method: "POST",
      headers: {
        Authorization: "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
      },
      body: JSON.stringify(processAllEmpty), // Use any of the examples above
    }
  );
  ```

  ```json Success Response theme={null}
  {
    "taskId": "task_550e8400-e29b-41d4-a716-446655440000",
    "datasetId": "550e8400-e29b-41d4-a716-446655440000",
    "type": "bulk_processing",
    "status": "queued",
    "targetCells": 150,
    "processOnlyEmpty": true,
    "estimatedDuration": "3-7 minutes",
    "createdAt": "2024-03-14T15:30:00Z",
    "webhookId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
  }
  ```

  ```json Invalid Selection Error theme={null}
  {
    "error": "Invalid cell selection",
    "code": "INVALID_SELECTION",
    "details": {
      "message": "Cannot combine cellIds with rowIds or columnIds. Use either cellIds alone, or rowIds/columnIds, or no IDs to process all empty cells"
    }
  }
  ```
</CodeGroup>

## Get Task Status

Retrieve detailed information about a specific task and its progress.

<CodeGroup>
  ```bash cURL theme={null}
  curl -X GET "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/tasks/$TASK_ID" \
    -H "Authorization: Bearer YOUR_API_KEY"
  ```

  ```python Python theme={null}
  import requests

  headers = {"Authorization": "Bearer YOUR_API_KEY"}

  dataset_id = "your-dataset-id"  # Replace with your dataset ID
  task_id = "your-task-id"  # Replace with your task ID

  response = requests.get(
      f"https://secure-api.getclaro.ai/api/v2/datasets/{dataset_id}/tasks/{task_id}",
      headers=headers
  )
  ```

  ```javascript JavaScript theme={null}
  const datasetId = "your-dataset-id"; // Replace with your dataset ID
  const taskId = "your-task-id"; // Replace with your task ID

  const response = await fetch(
    `https://secure-api.getclaro.ai/api/v2/datasets/${datasetId}/tasks/${taskId}`,
    {
      headers: {
        Authorization: "Bearer YOUR_API_KEY",
      },
    }
  );
  ```

  ```json Success Response - In Progress theme={null}
  {
    "taskId": "task_550e8400-e29b-41d4-a716-446655440000",
    "datasetId": "550e8400-e29b-41d4-a716-446655440000",
    "type": "bulk_processing",
    "status": "processing",
    "progress": {
      "completed": 75,
      "total": 150,
      "percentage": 50
    },
    "targetCells": 150,
    "processOnlyEmpty": true,
    "estimatedCompletion": "2024-03-14T15:37:00Z",
    "createdAt": "2024-03-14T15:30:00Z",
    "startedAt": "2024-03-14T15:31:00Z",
    "webhookId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
  }
  ```

  ```json Success Response - Completed theme={null}
  {
    "taskId": "task_550e8400-e29b-41d4-a716-446655440000",
    "datasetId": "550e8400-e29b-41d4-a716-446655440000",
    "type": "new_rows",
    "status": "completed",
    "progress": {
      "completed": 100,
      "total": 100,
      "percentage": 100
    },
    "rowCount": 100,
    "results": {
      "processedCells": 500,
      "successfulCells": 485,
      "failedCells": 15,
      "newRowsCreated": 100
    },
    "createdAt": "2024-03-14T15:30:00Z",
    "startedAt": "2024-03-14T15:31:00Z",
    "completedAt": "2024-03-14T15:38:00Z",
    "webhookId": "6ba7b810-9dad-11d1-80b4-00c04fd430c8"
  }
  ```

  ```json Task Not Found Error theme={null}
  {
    "error": "Task not found",
    "code": "TASK_NOT_FOUND",
    "details": {
      "taskId": "task_invalid-id"
    }
  }
  ```
</CodeGroup>
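
Because tasks run asynchronously, a client will usually poll this endpoint until the task reaches a terminal status. The following is a minimal Python sketch rather than an official client: `wait_for_task`, its parameters, and the `fetch_status` callable are hypothetical names introduced for illustration, and the default interval and attempt limit are arbitrary assumptions.

```python
import time
from typing import Callable

# Statuses after which a task no longer changes (see Task Status Values).
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_task(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task reaches a terminal status or attempts run out.

    fetch_status is a hypothetical callable that performs the GET request
    for one task and returns the decoded JSON body.
    """
    for _ in range(max_attempts):
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        sleep(interval_seconds)
    raise TimeoutError(f"task still running after {max_attempts} polls")
```

In practice, `fetch_status` could wrap the `requests.get` call from the Python tab, for example `lambda: requests.get(url, headers=headers).json()`. Passing `sleep` as a parameter keeps the loop easy to test without real delays.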

## Cancel Task

Cancel a running or queued task. Only tasks in `queued` or `processing` status can be cancelled.

<CodeGroup>
  ```bash cURL theme={null}
  curl -X DELETE "https://secure-api.getclaro.ai/api/v2/datasets/$DATASET_ID/tasks/$TASK_ID" \
    -H "Authorization: Bearer YOUR_API_KEY"
  ```

  ```python Python theme={null}
  import requests

  headers = {"Authorization": "Bearer YOUR_API_KEY"}

  dataset_id = "your-dataset-id"  # Replace with your dataset ID
  task_id = "your-task-id"  # Replace with your task ID

  response = requests.delete(
      f"https://secure-api.getclaro.ai/api/v2/datasets/{dataset_id}/tasks/{task_id}",
      headers=headers
  )
  ```

  ```javascript JavaScript theme={null}
  const datasetId = "your-dataset-id"; // Replace with your dataset ID
  const taskId = "your-task-id"; // Replace with your task ID

  const response = await fetch(
    `https://secure-api.getclaro.ai/api/v2/datasets/${datasetId}/tasks/${taskId}`,
    {
      method: "DELETE",
      headers: {
        Authorization: "Bearer YOUR_API_KEY",
      },
    }
  );
  ```

  ```json Success Response theme={null}
  {
    "message": "Task cancelled successfully",
    "taskId": "task_550e8400-e29b-41d4-a716-446655440000",
    "status": "cancelled",
    "cancelledAt": "2024-03-14T15:35:00Z"
  }
  ```

  ```json Cannot Cancel Error theme={null}
  {
    "error": "Cannot cancel task",
    "code": "TASK_NOT_CANCELLABLE",
    "details": {
      "message": "Task is already completed or failed",
      "currentStatus": "completed"
    }
  }
  ```
</CodeGroup>
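
A client can avoid most `TASK_NOT_CANCELLABLE` responses by checking the last known status before sending the request, and can treat that error as harmless when the task finishes first. The sketch below illustrates this under the assumption that response bodies look like the examples in this section; the helper names are hypothetical.

```python
# Statuses in which a cancel request is accepted, per the rule stated above.
CANCELLABLE_STATUSES = {"queued", "processing"}


def is_cancellable(status: str) -> bool:
    """Return True if a task in this status can still be cancelled."""
    return status in CANCELLABLE_STATUSES


def cancel_succeeded(body: dict) -> bool:
    """Interpret the decoded JSON body of a DELETE response.

    Returns True when the task was cancelled and False when it had already
    finished (TASK_NOT_CANCELLABLE). Any other body is treated as unexpected.
    """
    if body.get("status") == "cancelled":
        return True
    if body.get("code") == "TASK_NOT_CANCELLABLE":
        return False
    raise RuntimeError(f"unexpected cancel response: {body!r}")
```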

## Task Types and Operations

### Data Enrichment Tasks

Enhance existing data with additional attributes and classifications.

* **New Rows**: Generate new rows with enriched data based on dataset patterns
* **Bulk Processing**: Enrich specific cells, columns, or rows with missing attributes

### Data Extraction Tasks

Extract structured data from unstructured sources.

* **New Rows**: Process new documents and extract structured data
* **Bulk Processing**: Re-extract data from specific cells or update extraction results

### Map Extraction Tasks

Extract location-based data within geographic boundaries.

* **New Rows**: Find new locations within specified map boundaries
* **Bulk Processing**: Update location data for specific entries

### AI Generation Tasks

Generate synthetic data using AI models.

* **New Rows**: Create new synthetic rows based on existing dataset patterns
* **Bulk Processing**: Generate content for specific empty cells

## Request Parameters

### Create New Rows Task

| Parameter   | Type   | Required | Description                                 |
| ----------- | ------ | -------- | ------------------------------------------- |
| `rowCount`  | number | Yes      | Number of new rows to generate (max: 10000) |
| `webhookId` | string | No       | Webhook ID for completion notification      |
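
Validating these parameters before sending the request saves a round trip. This is a hedged sketch: `build_new_rows_payload` is a hypothetical helper, and both the lower bound of 1 and the integer requirement are assumptions, since the table only documents `rowCount` as a number with a maximum of 10000.

```python
from typing import Optional

MAX_ROW_COUNT = 10_000  # Documented upper bound for rowCount.


def build_new_rows_payload(row_count: int, webhook_id: Optional[str] = None) -> dict:
    """Build the JSON body for a new-rows task, checking rowCount locally."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(row_count, bool) or not isinstance(row_count, int):
        raise TypeError("row_count must be an integer")
    # Assumption: at least one row must be requested.
    if not 1 <= row_count <= MAX_ROW_COUNT:
        raise ValueError(f"row_count must be between 1 and {MAX_ROW_COUNT}")

    payload = {"rowCount": row_count}
    if webhook_id is not None:  # webhookId is optional, so omit it when unset.
        payload["webhookId"] = webhook_id
    return payload
```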

### Create Bulk Processing Task

| Parameter          | Type    | Required | Description                              |
| ------------------ | ------- | -------- | ---------------------------------------- |
| `cellIds`          | array   | No       | Array of specific cell IDs to process    |
| `columnIds`        | array   | No       | Array of column IDs to process           |
| `rowIds`           | array   | No       | Array of row IDs to process              |
| `processOnlyEmpty` | boolean | No       | Only process empty cells (default: true) |
| `webhookId`        | string  | No       | Webhook ID for completion notification   |

## Bulk Processing Logic

### Target Selection Options

1. **Process All Empty Cells**: Omit all ID parameters to process all empty cells in the dataset
2. **Specific Cells**: Use `cellIds` to target exact cells (cannot combine with row/column IDs)
3. **Entire Columns**: Use `columnIds` to process all cells in specified columns
4. **Entire Rows**: Use `rowIds` to process all cells in specified rows
5. **Row-Column Intersection**: Use both `rowIds` and `columnIds` to process only cells at their intersection

### Examples

```json theme={null}
// Process all empty cells in dataset
{}

// Process specific cells only
{"cellIds": ["cell_1", "cell_2"]}

// Process entire columns
{"columnIds": ["col_name", "col_category"]}

// Process entire rows
{"rowIds": ["row_1", "row_2"]}

// Process intersection: only cells where specified rows and columns meet
{"rowIds": ["row_1", "row_2"], "columnIds": ["col_name", "col_category"]}
```
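
The selection rules above can also be enforced on the client before a request is sent, which avoids an `INVALID_SELECTION` response. The following sketch assumes only the rule documented here (that `cellIds` cannot be combined with `rowIds` or `columnIds`); `build_bulk_payload` is a hypothetical helper name.

```python
from typing import Optional, Sequence


def build_bulk_payload(
    cell_ids: Optional[Sequence[str]] = None,
    row_ids: Optional[Sequence[str]] = None,
    column_ids: Optional[Sequence[str]] = None,
    process_only_empty: Optional[bool] = None,
    webhook_id: Optional[str] = None,
) -> dict:
    """Build a bulk task body, rejecting the documented invalid combination."""
    if cell_ids and (row_ids or column_ids):
        raise ValueError("cellIds cannot be combined with rowIds or columnIds")

    payload: dict = {}
    if cell_ids:
        payload["cellIds"] = list(cell_ids)
    if row_ids:
        payload["rowIds"] = list(row_ids)
    if column_ids:
        payload["columnIds"] = list(column_ids)
    # Leave processOnlyEmpty out unless set, so the server default applies.
    if process_only_empty is not None:
        payload["processOnlyEmpty"] = process_only_empty
    if webhook_id is not None:
        payload["webhookId"] = webhook_id
    return payload
```

Calling `build_bulk_payload()` with no arguments yields `{}`, which corresponds to the "process all empty cells" case.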

## Task Status Values

| Status       | Description                           |
| ------------ | ------------------------------------- |
| `queued`     | Task is waiting to be processed       |
| `processing` | Task is currently being executed      |
| `completed`  | Task finished successfully            |
| `failed`     | Task encountered an error and stopped |
| `cancelled`  | Task was cancelled by user request    |

## Dataset Locking

Datasets are automatically locked when a task is created and remain locked
until the task completes, fails, or is cancelled. During this time:

* No new tasks can be created on the dataset
* Dataset structure cannot be modified
* Data cannot be manually edited
* Other operations may be restricted
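
Since a locked dataset rejects new tasks with `DATASET_LOCKED`, one possible client strategy is to wait and retry. The sketch below is illustrative only: `create_when_unlocked` and the `create_task` callable are hypothetical, and the retry count and delay are arbitrary assumptions rather than recommended values.

```python
import time
from typing import Callable


def create_when_unlocked(
    create_task: Callable[[], dict],
    retries: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call create_task until the response is not a DATASET_LOCKED error.

    create_task is a hypothetical callable that sends the POST request and
    returns the decoded JSON body. Other error bodies are returned as-is.
    """
    for attempt in range(retries + 1):
        body = create_task()
        if body.get("code") != "DATASET_LOCKED":
            return body
        if attempt < retries:  # No need to wait after the final attempt.
            sleep(delay_seconds)
    raise RuntimeError(f"dataset still locked after {retries} retries")
```

The `activeTaskId` field in the error details identifies the task holding the lock, so an alternative is to poll that task's status instead of retrying on a fixed delay.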

## Webhook Notifications

When a `webhookId` is provided, the system sends an HTTP POST request to the configured webhook URL when the task completes. Example payload:

```json theme={null}
{
  "taskId": "task_550e8400-e29b-41d4-a716-446655440000",
  "datasetId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "type": "new_rows",
  "completedAt": "2024-03-14T15:38:00Z",
  "results": {
    "processedCells": 500,
    "successfulCells": 485,
    "failedCells": 15,
    "newRowsCreated": 100
  }
}
```
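
A webhook receiver will typically reduce this payload to a few values worth logging or alerting on. The sketch below assumes the payload has the shape shown above; `summarize_task_webhook` is a hypothetical helper, and the HTTP server that receives the request is out of scope here.

```python
from typing import Optional


def summarize_task_webhook(payload: dict) -> dict:
    """Extract the task outcome and cell success rate from a webhook body."""
    results = payload.get("results") or {}
    processed = results.get("processedCells", 0)
    successful = results.get("successfulCells", 0)
    # Avoid division by zero when no cells were processed.
    success_rate: Optional[float] = successful / processed if processed else None
    return {
        "task_id": payload.get("taskId"),
        "status": payload.get("status"),
        "failed_cells": results.get("failedCells", 0),
        "success_rate": success_rate,
    }
```

For the example payload above, the success rate works out to 485 / 500 = 0.97.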

## Error Codes

| Code                   | Description                               |
| ---------------------- | ----------------------------------------- |
| `DATASET_LOCKED`       | Dataset is locked by another task         |
| `DATASET_NOT_FOUND`    | Dataset doesn't exist                     |
| `TASK_NOT_FOUND`       | Task doesn't exist                        |
| `TASK_NOT_CANCELLABLE` | Task cannot be cancelled in current state |
| `INVALID_SELECTION`    | Invalid cell/column/row selection         |
| `QUOTA_EXCEEDED`       | Task creation limit reached               |
| `WEBHOOK_NOT_FOUND`    | Specified webhook doesn't exist           |
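
One way to surface these codes in client code is to convert error bodies into exceptions. This is a sketch that assumes error responses carry a top-level `code` field and success responses do not, as in the examples on this page; `TaskApiError` and `raise_for_task_error` are hypothetical names.

```python
class TaskApiError(Exception):
    """Hypothetical exception carrying a documented error code."""

    def __init__(self, code: str, message: str, details: dict):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details


def raise_for_task_error(body: dict) -> dict:
    """Return the body unchanged, or raise TaskApiError if it has a code."""
    code = body.get("code")
    if code is None:
        return body
    details = body.get("details") or {}
    # Prefer the detailed message, then the short error string.
    message = details.get("message") or body.get("error") or "unknown error"
    raise TaskApiError(code, message, details)
```

Callers can then branch on `exc.code`, for example retrying on `DATASET_LOCKED` and failing fast on `INVALID_SELECTION`.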

## Next Steps

<CardGroup cols={2}>
  <Card title="Manage Datasets" icon="database" href="/api-reference/manage-dataset">
    View and manage your dataset structure and data
  </Card>

  <Card title="Manage Webhooks" icon="webhook" href="/api-reference/webhooks">
    Set up webhooks for task completion notifications
  </Card>
</CardGroup>
