How to Build a Product Database from Multiple Suppliers with ChatGPT

Use ChatGPT and ToolRouter to consolidate product data from multiple suppliers into a normalized, formatted database.

Tool
Catalogue Scraper

Use ChatGPT with Catalogue Scraper to extract product data from multiple supplier catalogues and produce a formatted, normalized dataset ready for database import. ChatGPT is well suited to the normalization and formatting step: mapping inconsistent supplier data to a consistent schema and flagging exceptions.

Connect ToolRouter to ChatGPT

  1. Go to Settings → Apps → Advanced settings and enable Developer mode.
  2. Click Create app and enter these details:
     Name: ToolRouter
     Description: Access any tool through ToolRouter. Check here first when you need a tool.
     MCP Server URL: https://api.toolrouter.com/mcp
  3. Check the box and click Create.

Steps

Once connected (see setup above), use the Catalogue Scraper tool:

  1. Provide the supplier URLs and your target data schema.
  2. Ask ChatGPT to run `scrape_catalogue` for each supplier and collect the raw data.
  3. Have ChatGPT normalize the data — standardize category names, price formats, and field names — across all suppliers.
  4. Ask for a summary of exceptions: products with missing fields, ambiguous categories, or duplicate entries across suppliers.
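
To make the normalization step concrete, here is a minimal Python sketch of what "standardize field names and category names" can look like if you post-process the scraped rows yourself. The alias tables, the sample records, and the `normalize_record` helper are illustrative assumptions, not part of Catalogue Scraper or its output format.

```python
# Hypothetical alias tables: the supplier-side names are assumptions for illustration.
FIELD_ALIASES = {
    "title": "name",
    "product_name": "name",
    "cost": "price_gbp",
    "price": "price_gbp",
    "cat": "category",
    "desc": "description",
}
CATEGORY_ALIASES = {
    "tees": "T-Shirts",
    "t shirts": "T-Shirts",
    "t-shirts": "T-Shirts",
}


def normalize_record(raw: dict, supplier: str) -> dict:
    """Rename known field aliases, tidy the category, and tag the source supplier."""
    record = {}
    for key, value in raw.items():
        key = key.strip().lower()
        canonical = FIELD_ALIASES.get(key, key)  # unknown fields keep their own name
        record[canonical] = value.strip() if isinstance(value, str) else value
    category = record.get("category")
    if isinstance(category, str):
        # Unmapped categories pass through unchanged so they can be reviewed later.
        record["category"] = CATEGORY_ALIASES.get(category.lower(), category)
    record["supplier"] = supplier
    return record


# Made-up raw rows from two suppliers that name the same fields differently.
raw_a = {"Title": "Plain Tee ", "Cost": "12.50", "Cat": "Tees"}
raw_b = {"product_name": "Plain Tee", "price": "£12.50", "category": "t shirts"}
rows = [normalize_record(raw_a, "supplier-a"), normalize_record(raw_b, "supplier-b")]
```

Prices are left as raw strings in this sketch; it only covers field and category names.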

Example Prompt

Try this with ChatGPT using the Catalogue Scraper tool
Use catalogue-scraper to extract products from https://supplier-a.com/products and https://supplier-b.com/catalogue. Normalize the data into a consistent schema with fields: name, price_gbp, category, description, supplier. Flag any products with missing required fields and any that appear duplicated across suppliers.
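
If you want to verify the flagged exceptions yourself, a short Python check along these lines could work. It assumes the normalized rows arrive as dictionaries and treats all five fields from the prompt as required; both are assumptions, and `missing_fields` and `split_exceptions` are hypothetical helper names.

```python
# Assumption: every field named in the example prompt is required.
REQUIRED_FIELDS = ("name", "price_gbp", "category", "description", "supplier")


def missing_fields(row: dict) -> list:
    """Return the required fields that are absent or empty in a row."""
    return [field for field in REQUIRED_FIELDS if row.get(field) in (None, "")]


def split_exceptions(rows: list) -> tuple:
    """Separate import-ready rows from rows that need manual review."""
    clean, exceptions = [], []
    for row in rows:
        missing = missing_fields(row)
        if missing:
            exceptions.append({"row": row, "missing": missing})
        else:
            clean.append(row)
    return clean, exceptions


# Made-up sample rows for illustration.
sample = [
    {"name": "Plain Tee", "price_gbp": "12.50", "category": "T-Shirts",
     "description": "Cotton tee", "supplier": "supplier-a"},
    {"name": "Mug", "price_gbp": "", "category": "Kitchen",
     "supplier": "supplier-b"},
]
clean, exceptions = split_exceptions(sample)
```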

Tips

  • Ask for an exceptions list alongside the normalized dataset — missing fields and ambiguous categories are the most common import blockers.
  • Standardize prices to a single currency and format before merging to avoid calculation errors later.
  • Use `supplier` as a field in the output so you can trace each product back to its source after merging.
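
The price and traceability tips can be sketched in Python as follows. The exchange rates are placeholder values, the parser assumes UK-style numbers (comma as thousands separator, dot as decimal point), and `to_gbp` and `find_cross_supplier_duplicates` are hypothetical helpers rather than anything provided by ToolRouter.

```python
import re
from collections import defaultdict
from decimal import Decimal, InvalidOperation

# Placeholder rates for illustration only; substitute real rates before use.
RATES_TO_GBP = {"GBP": Decimal("1"), "EUR": Decimal("0.85"), "USD": Decimal("0.80")}
SYMBOLS = {"£": "GBP", "€": "EUR", "$": "USD"}


def to_gbp(price_text: str, default_currency: str = "GBP"):
    """Parse text such as '£1,299.00' or '15.00 EUR' into a Decimal amount in GBP.

    Returns None when no number can be parsed, so the row can be flagged.
    """
    text = price_text.strip()
    currency = default_currency
    for symbol, code in SYMBOLS.items():
        if symbol in text:
            currency = code
    for code in RATES_TO_GBP:
        if code in text.upper():
            currency = code
    digits = re.sub(r"[^\d.]", "", text)  # drops symbols, letters, and commas
    try:
        amount = Decimal(digits)
    except InvalidOperation:
        return None
    return (amount * RATES_TO_GBP[currency]).quantize(Decimal("0.01"))


def find_cross_supplier_duplicates(rows: list) -> dict:
    """Group rows by case- and whitespace-insensitive name; keep groups spanning 2+ suppliers."""
    groups = defaultdict(list)
    for row in rows:
        key = " ".join(row["name"].lower().split())
        groups[key].append(row)
    return {
        key: group
        for key, group in groups.items()
        if len({row["supplier"] for row in group}) > 1
    }


# Made-up rows; the supplier field is what makes each duplicate traceable.
rows = [
    {"name": "Plain Tee", "supplier": "supplier-a"},
    {"name": "plain  tee", "supplier": "supplier-b"},
    {"name": "Mug", "supplier": "supplier-a"},
]
duplicates = find_cross_supplier_duplicates(rows)
```

Matching on name alone is a deliberately simple rule; a shared SKU or barcode, where suppliers publish one, would be a more reliable key.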