Inbound for Catalogues
Each catalogue’s Data Source tab is where you connect feeds. Each source has its own attribute mapping, schedule, and conflict policy.
File upload (CSV, XLSX)
- Use for — initial loads, supplier files, periodic dumps from systems without an API.
- Mapping — column-to-attribute mapping is saved on first upload and reused on subsequent uploads of the same shape.
- Limits — large files are split into chunks server-side. For millions of rows, use a database connector or S3 instead.
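One way to picture mapping reuse for files "of the same shape" is to key the saved mapping by a fingerprint of the header row. The sketch below assumes that shape means the ordered, whitespace-trimmed column names. `shape_key`, `mapping_for`, and `saved_mappings` are hypothetical names for illustration, not part of the product.

```python
import csv
import hashlib
import io

def shape_key(csv_text: str) -> str:
    """Fingerprint a CSV by its header row (an assumed definition of 'shape')."""
    header = next(csv.reader(io.StringIO(csv_text)))
    normalized = [name.strip() for name in header]
    return hashlib.sha256("\x1f".join(normalized).encode("utf-8")).hexdigest()

# Hypothetical in-memory store: shape fingerprint -> column-to-attribute mapping.
saved_mappings: dict[str, dict[str, str]] = {}

def mapping_for(csv_text: str, first_time_mapping: dict[str, str]) -> dict[str, str]:
    """Return the saved mapping for this shape, saving it on first upload."""
    return saved_mappings.setdefault(shape_key(csv_text), first_time_mapping)

first_upload = "SKU,Name,Price\nA1,Widget,9.99\n"
second_upload = "SKU, Name, Price\nB2,Gadget,19.50\n"

m1 = mapping_for(first_upload, {"SKU": "sku", "Name": "title", "Price": "price"})
m2 = mapping_for(second_upload, {})  # same header row, so the saved mapping wins
```

Here the second upload has the same columns, so it picks up the mapping saved by the first one without being asked for a new mapping.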
Supplier Portal
- Use for — suppliers without API access who need a self-serve way to send updates.
- Behavior — submissions land as Data Sources with the supplier’s identity attached. They arrive pre-mapped if the supplier has uploaded before.
- Detail — see Onboard → Supplier Portal.
Scheduled scrape
- Use for — public catalog pages, marketplace listings, competitor sources.
- Configuration — URL templates, target schema, cadence, throttling.
- Outputs — scrape runs land as a Data Source and feed any chained operations.
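The four configuration items above can be pictured as a small settings object. This is only an illustrative sketch: the `ScrapeConfig` class, its field names, and the `{placeholder}` template syntax are assumptions, not the product's actual configuration format.

```python
import itertools
from dataclasses import dataclass

@dataclass
class ScrapeConfig:
    url_template: str                       # URL with {placeholders}
    template_values: dict[str, list[str]]   # values to substitute per placeholder
    target_schema: list[str]                # attributes each scraped row should fill
    cadence_hours: int = 24                 # how often a run is scheduled
    min_delay_seconds: float = 1.0          # throttle: minimum pause between requests

    def urls(self) -> list[str]:
        """Expand the template over every combination of placeholder values."""
        names = list(self.template_values)
        combos = itertools.product(*(self.template_values[n] for n in names))
        return [self.url_template.format(**dict(zip(names, c))) for c in combos]

config = ScrapeConfig(
    url_template="https://example.com/{category}?page={page}",
    template_values={"category": ["chairs", "desks"], "page": ["1", "2"]},
    target_schema=["sku", "title", "price"],
    cadence_hours=12,
    min_delay_seconds=2.0,
)
planned = config.urls()
```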
HTTPS pull
- Use for — partner APIs, internal systems with REST endpoints.
- Configuration — endpoint, auth, request schedule, response parsing (JSON / CSV).
- Limits — auth modes supported: bearer, basic, OAuth, signed requests.
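Response parsing for the two listed formats can be sketched with the Python standard library. The function below is a hypothetical helper, not the product's parser. It assumes the endpoint returns either a JSON array of row objects or a CSV document with a header row, and that the format is signalled by the `Content-Type` header.

```python
import csv
import io
import json

def parse_response(body: str, content_type: str) -> list[dict]:
    """Turn a JSON or CSV response body into a list of row dictionaries."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        data = json.loads(body)
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of row objects")
        return data
    if media_type == "text/csv":
        # DictReader uses the first line as column names; values stay strings.
        return list(csv.DictReader(io.StringIO(body)))
    raise ValueError(f"unsupported content type: {content_type}")

json_rows = parse_response('[{"sku": "A1", "price": 9.99}]', "application/json; charset=utf-8")
csv_rows = parse_response("sku,price\nA1,9.99\n", "text/csv")
```

Note that CSV values arrive as strings, so type coercion still has to happen in the attribute mapping.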
Database connectors
| Source | Modes |
|---|---|
| BigQuery | Read tables or query results on a schedule. |
| Postgres | Read tables or query results; per-row CDC where available. |
| Supabase | Read tables and views via the managed REST API. |
Cloud storage
| Source | Modes |
|---|---|
| S3 | Pull files matching a prefix on a schedule. |
| Google Drive | Watch a folder for new files. |
Email-as-source
- Use for — suppliers who only send updates by email.
- Behavior — emails sent to a workspace address are parsed; attachments become Data Source uploads, body content can populate a target schema.
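To make the attachment and body split concrete, here is a minimal sketch using Python's standard `email` package. It illustrates the idea only and says nothing about how the workspace address actually processes mail. `split_email` and the sample addresses are hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

def split_email(raw: bytes) -> tuple[str, list[tuple[str, bytes]]]:
    """Return the plain-text body and a list of (filename, content) attachments."""
    msg = message_from_bytes(raw, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    attachments = [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]
    return body, attachments

# Build a sample supplier email with one CSV attachment.
sample = EmailMessage()
sample["From"] = "supplier@example.com"
sample["To"] = "inbound@workspace.example"
sample["Subject"] = "Price update"
sample.set_content("New prices attached.")
sample.add_attachment(
    b"sku,price\nA1,9.99\n", maintype="text", subtype="csv", filename="prices.csv"
)

body, attachments = split_email(sample.as_bytes())
```

Each attachment could then be treated like a file upload, while `body` is the text a parser would map onto a target schema.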
Inbound for Research Agents
Research Agents accept inputs per agent type. See Research Agents for details.
| Agent | Inputs |
|---|---|
| Find your perfect list | Natural-language brief and seed criteria. |
| Turn documents into structured data | PDFs, scanned docs, datasheets, target schema. |
| Analyze & enrich spreadsheets | CSV / XLSX file, enrichment goal. |
| Scrape data from URLs | List of URLs or base URL plus crawl rules. |
Mapping and conflict resolution
For every inbound source, you configure how it interacts with existing records.
Mapping
Map source columns to catalogue attributes. Mappings include:
- Type coercion — convert strings to numbers, normalize dates and currencies.
- Computed fields — derive an attribute from one or more source columns.
- Constants — fill an attribute with a fixed value (e.g. source = supplier_x).
- Lookups — translate enum-like source values to your canonical enum.
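The four mapping kinds can be illustrated on a single row. This is a sketch under assumptions: the source column names, the day-first date format, and the `UNIT_LOOKUP` table are invented for the example, and a real mapping is configured in the product rather than written as code.

```python
from datetime import datetime
from decimal import Decimal

# Lookup: translate enum-like source values to a canonical enum (assumed values).
UNIT_LOOKUP = {"pcs": "piece", "ea": "piece", "kg": "kilogram"}

def map_row(row: dict[str, str]) -> dict:
    """Map one source row to catalogue attributes."""
    return {
        "sku": row["SKU"].strip(),
        # Type coercion: currency string -> exact decimal number.
        "price": Decimal(row["Price"].replace("$", "").replace(",", "")),
        # Type coercion: day-first date string -> ISO 8601 date.
        "available_from": datetime.strptime(row["Available"], "%d/%m/%Y").date().isoformat(),
        # Computed field: derived from two source columns.
        "title": f'{row["Brand"]} {row["Model"]}',
        # Constant: the same value for every row from this source.
        "source": "supplier_x",
        # Lookup: canonicalize the unit.
        "unit": UNIT_LOOKUP[row["Unit"].lower()],
    }

mapped = map_row({
    "SKU": " A1 ",
    "Price": "$1,299.50",
    "Available": "03/02/2025",
    "Brand": "Acme",
    "Model": "X200",
    "Unit": "PCS",
})
```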
Conflict policy
When an inbound row matches an existing record:
- Overwrite — replace existing values (default for trusted sources).
- Append — for multi-value attributes only.
- Write-if-empty — only fill blanks, never replace.
- Custom rule — most-recent, highest-confidence, or attribute-specific policy.
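The first three policies can be sketched as a per-attribute merge. The function and the policy names as strings are hypothetical. Custom rules such as most-recent or highest-confidence are left out because they need metadata (timestamps, confidence scores) that this example does not model.

```python
def apply_policy(existing: dict, incoming: dict, policies: dict[str, str],
                 default: str = "overwrite") -> dict:
    """Merge an inbound row into an existing record, attribute by attribute."""
    merged = dict(existing)
    for attr, value in incoming.items():
        policy = policies.get(attr, default)
        current = merged.get(attr)
        if policy == "overwrite":
            merged[attr] = value
        elif policy == "write_if_empty":
            # Only fill blanks; never replace a value that is already there.
            if current in (None, "", []):
                merged[attr] = value
        elif policy == "append":
            # Multi-value attributes only: add new values, skip duplicates.
            values = list(current or [])
            new = value if isinstance(value, list) else [value]
            merged[attr] = values + [v for v in new if v not in values]
        else:
            raise ValueError(f"unknown policy: {policy}")
    return merged

existing = {"title": "Widget", "brand": "Acme", "description": "", "tags": ["blue"]}
incoming = {"title": "Widget Pro", "brand": "ACME Corp",
            "description": "A sturdy widget", "tags": ["blue", "metal"]}
policies = {"brand": "write_if_empty", "description": "write_if_empty", "tags": "append"}

merged = apply_policy(existing, incoming, policies)
```

In this run `title` is overwritten, `brand` keeps its existing value, the blank `description` is filled, and `tags` gains only the value it did not already have.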
Identity matching
Every inbound row needs to resolve to a record. Data Source Mapping runs first, using the similarity graph and configured key fields. Unmatched rows are flagged for review before they land in the catalogue.
Limits and best practices
- Test with a small sample first — 50 to 200 rows — before connecting a recurring source.
- Validate the schema upstream when possible (Supplier Portal does this for you).
- Prefer column-level mappings to ad-hoc fixes; mappings are reused, ad-hoc fixes are not.
- For very high volume, prefer database or S3 connectors over file upload.
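For the first practice, a small sample can be cut from a larger CSV before the recurring source is connected. A minimal sketch, assuming the file has a single header row; `sample_rows` is a hypothetical helper.

```python
import csv
import io
import itertools

def sample_rows(csv_text: str, limit: int = 100) -> str:
    """Return the header plus the first `limit` data rows of a CSV."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # limit + 1 because the first line read is the header row.
    writer.writerows(itertools.islice(reader, limit + 1))
    return out.getvalue()

full_file = "sku,price\n" + "".join(f"A{i},{i}.00\n" for i in range(500))
sample = sample_rows(full_file, limit=50)
```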