How to Extract Data from Bot-Protected Sites with Claude
Extract Data from Bot-Protected Sites with Claude and ToolRouter. Bypass bot detection and Cloudflare challenges to retrieve content from protected pages.
Use Claude with the Stealth Scraper tool to extract content from bot-protected pages and immediately analyze or structure what comes back. Claude handles the extraction request through the stealth browser layer and then applies reasoning to the returned content: parsing the data, identifying gaps, and recommending what to check next.
Connect ToolRouter to Claude
1. Open connector settings in Claude.
2. Add a custom connector with these details:
   - Name: ToolRouter
   - URL: https://api.toolrouter.com/mcp
3. Let Claude set you up.
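Under the hood, a custom connector like this is an MCP endpoint, and MCP clients speak JSON-RPC 2.0. As a rough sketch (assuming the ToolRouter URL behaves like a standard MCP server; the connector UI handles all of this for you), the request a client would send to discover available tools looks like:

```python
import json

# Minimal sketch of an MCP "tools/list" request body. This assumes the
# endpoint follows the standard MCP JSON-RPC 2.0 shape; you never need
# to build this yourself when using the connector through Claude.
def build_tools_list_request(request_id: int = 1) -> str:
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
        "params": {},
    }
    return json.dumps(payload)

print(build_tools_list_request())
```

The response enumerates tools such as the stealth scraper, which Claude can then invoke on your behalf.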
Steps
Once connected (see setup above), use the Stealth Scraper tool:
- Provide the URL of the bot-protected page and specify the data you need.
- Ask Claude to use the `stealth-scraper` connector's `stealth_scrape` tool — the stealth layer handles bot detection automatically.
- Ask Claude to extract the specific fields from the returned content.
- Follow up if the content requires scrolling, pagination, or waiting for specific elements to load.
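When Claude invokes the tool, it sends an MCP `tools/call` request. The following is a hypothetical sketch of that request body; the tool name matches `stealth_scrape` above, but the argument names (`url`) are assumptions for illustration, not the documented schema:

```python
import json

# Hypothetical sketch of the JSON-RPC body an MCP client sends to invoke
# a tool. The "url" argument name is an assumption; consult the tool's
# actual schema (returned by tools/list) for real parameter names.
def build_stealth_scrape_call(url: str, request_id: int = 2) -> str:
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "stealth_scrape",
            "arguments": {"url": url},
        },
    }
    return json.dumps(payload)

print(build_stealth_scrape_call("https://jobs.example.com/listings"))
```

You describe the page in plain language; Claude constructs and sends the call itself.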
Example Prompt
Try this with Claude using the Stealth Scraper tool
Use stealth-scraper to extract the job listings from this page that blocks standard scrapers: https://jobs.example.com/listings. Extract job title, company, location, salary range, and date posted for each listing on the page. Return the data as a JSON array.
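If you plan to feed the JSON array Claude returns into another system, it is worth validating it first. A minimal sketch, assuming field names that mirror the example prompt (adjust them to whatever schema you asked for):

```python
import json

# Field names mirror the example prompt above; they are an assumption
# about your schema, not a fixed contract.
REQUIRED_FIELDS = {"job_title", "company", "location", "salary_range", "date_posted"}

def validate_listings(raw: str) -> list:
    """Parse Claude's JSON output and reject listings missing any field."""
    listings = json.loads(raw)
    for i, job in enumerate(listings):
        missing = REQUIRED_FIELDS - set(job.keys())
        if missing:
            raise ValueError(f"listing {i} missing fields: {sorted(missing)}")
    return listings

sample = ('[{"job_title": "Data Engineer", "company": "Acme", '
          '"location": "Remote", "salary_range": "$120k-$150k", '
          '"date_posted": "2024-05-01"}]')
print(len(validate_listings(sample)))
```

A hard failure on a missing field is usually preferable to silently loading partial records downstream.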
Tips
- If the site requires user interaction to reveal content — clicking tabs, expanding sections — mention this in your prompt so Claude can choose the right approach.
- For paginated sites, scrape the first page first to verify the extraction works before scripting pagination.
- Check that the data returned is the actual rendered content and not a bot-challenge fallback page.
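The last tip can be automated with a simple heuristic. This sketch checks for well-known challenge-page phrases (the marker list is an assumption drawn from common interstitials, not exhaustive) and treats suspiciously short responses as failures:

```python
# Assumed marker phrases from common bot-challenge interstitials;
# extend this list for the sites you actually scrape.
CHALLENGE_MARKERS = (
    "checking your browser",
    "verify you are human",
    "enable javascript and cookies",
)

def looks_like_challenge_page(html: str, min_length: int = 2000) -> bool:
    """Return True if the scrape likely hit a bot challenge rather than
    real rendered content: a marker phrase is present, or the body is
    far shorter than a real page would be."""
    lowered = html.lower()
    if any(marker in lowered for marker in CHALLENGE_MARKERS):
        return True
    return len(html) < min_length
```

Run the returned content through a check like this before asking Claude to extract fields, so you do not parse a challenge page by mistake.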