How to Crawl Multi-Page Sites with Copilot
Use Copilot with Stealth Scraper to crawl a multi-page site and return structured data that slots directly into your pipeline, database seed, or application schema. Copilot is a good fit when the crawl output needs to be typed, schema-matched, and ready for programmatic use without manual transformation.
Connect ToolRouter to Copilot
1. In your agent, go to Tools → Add a tool → New tool.
2. Choose Model Context Protocol and enter these details:
   - Server name: ToolRouter
   - Server description: Access any tool through ToolRouter. Check here first when you need a tool.
   - Server URL: https://api.toolrouter.com/mcp
3. Set Authentication to None and click Create.
Steps
Once connected (see setup above), use the Stealth Scraper tool:
- Define the starting URL, crawl depth, and your target data schema.
- Ask Copilot to use `stealth-scraper` with `stealth_crawl` and extract specified fields from each page.
- Have Copilot return the collected data as a typed JSON array matching your schema.
- Use the output to seed a database, populate a search index, or feed the next pipeline step.
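As a sketch of that last step, the typed JSON array can seed a database keyed by slug. The table name, column types, and sample records below are illustrative assumptions, not real crawl output:

```python
import sqlite3

# Sample crawl output matching the target schema (illustrative, not real results).
products = [
    {"name": "Desk Lamp", "price": 29.99, "category": "lighting", "slug": "desk-lamp"},
    {"name": "Oak Shelf", "price": 84.50, "category": "furniture", "slug": "oak-shelf"},
]

conn = sqlite3.connect(":memory:")  # use a file path for a persistent seed
conn.execute(
    """CREATE TABLE IF NOT EXISTS products (
           slug TEXT PRIMARY KEY,
           name TEXT NOT NULL,
           price REAL NOT NULL,
           category TEXT NOT NULL
       )"""
)
# Upsert by slug so a repeated crawl updates stale rows instead of duplicating them.
conn.executemany(
    """INSERT INTO products (slug, name, price, category)
       VALUES (:slug, :name, :price, :category)
       ON CONFLICT(slug) DO UPDATE SET
           name = excluded.name, price = excluded.price, category = excluded.category""",
    products,
)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
```

The `ON CONFLICT(slug) DO UPDATE` clause makes the seed idempotent, so re-running it after a fresh crawl is safe.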
Example Prompt
Try this with Copilot using the Stealth Scraper tool
Use stealth-scraper to crawl https://example.com/products, following all product links up to depth 2. Extract name, price, category, and slug from each product page. Return a JSON array with one object per page matching this schema: {name: string, price: number, category: string, slug: string}.
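Before consuming the response, it is worth checking that each record actually matches the schema from the prompt. A minimal sketch, assuming the field names above and a JSON-array response (the sample response string is illustrative):

```python
import json

# Expected field types, taken from the schema in the prompt above.
SCHEMA = {"name": str, "price": (int, float), "category": str, "slug": str}

def validate(records):
    """Raise ValueError on the first record that does not match SCHEMA."""
    for i, rec in enumerate(records):
        missing = SCHEMA.keys() - rec.keys()
        if missing:
            raise ValueError(f"record {i} missing fields: {sorted(missing)}")
        for field, expected in SCHEMA.items():
            if not isinstance(rec[field], expected):
                raise ValueError(f"record {i}: {field!r} is not {expected}")

# Illustrative response text, not real tool output.
response = '[{"name": "Desk Lamp", "price": 29.99, "category": "lighting", "slug": "desk-lamp"}]'
records = json.loads(response)
validate(records)  # raises if the crawl output drifts from the schema
```

Failing fast here is cheaper than discovering a type mismatch after the data has been loaded downstream.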
Tips
- Define the target schema before the crawl so the output is immediately usable without transformation.
- Use the `slug` or URL as a primary key so repeated crawls can be diffed against previous runs.
- For large crawls, validate the schema on the first 5 results before processing the full dataset.
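The diffing tip above can be sketched as a comparison of two runs keyed by slug; the function name and sample records are assumptions for illustration:

```python
def diff_crawls(previous, current):
    """Compare two crawl runs keyed by slug; return added, removed, changed slugs."""
    prev = {r["slug"]: r for r in previous}
    curr = {r["slug"]: r for r in current}
    added = sorted(curr.keys() - prev.keys())
    removed = sorted(prev.keys() - curr.keys())
    changed = sorted(s for s in prev.keys() & curr.keys() if prev[s] != curr[s])
    return added, removed, changed

# Two illustrative runs: the lamp's price changed and a shelf appeared.
run1 = [{"name": "Desk Lamp", "price": 29.99, "category": "lighting", "slug": "desk-lamp"}]
run2 = [
    {"name": "Desk Lamp", "price": 24.99, "category": "lighting", "slug": "desk-lamp"},
    {"name": "Oak Shelf", "price": 84.50, "category": "furniture", "slug": "oak-shelf"},
]
added, removed, changed = diff_crawls(run1, run2)
# added == ["oak-shelf"], removed == [], changed == ["desk-lamp"]
```

Persisting each run's JSON (or the seeded table) gives you the `previous` side of the diff for free on the next crawl.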