About Crawl

The 'Crawl' entry maps to Diffbot, a web data extraction platform hosted at diffbot.com. Diffbot uses AI, computer vision, and machine learning to extract structured data from any webpage and provides Crawlbot for spidering entire websites at scale. Its Knowledge Graph covers over 10 billion entities and one trillion structured facts, refreshed every 4-5 days. It is used for competitive intelligence, lead enrichment, market research, and AI training data pipelines.

AI Agent Use Cases

- Crawlbot for full-site web crawling returning structured JSON output

- Automatic article, product, image, and discussion page extraction via AI

- Knowledge Graph with 10B+ entities and 1T+ facts (organizations, people, products)

- Natural language processing for entity, relationship, and sentiment extraction

- Integrations with Excel, Google Sheets, Tableau, Zapier, and REST API

- Credit-based usage (1 credit per page extract, 25 credits per Knowledge Graph entity export)
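
The credit rates in the last bullet lend themselves to a quick cost estimate. The sketch below is illustrative only: the function name `estimate_credits` and both constant names are hypothetical, and it assumes the two listed rates are the only charges that apply.

```python
# Rates taken from the listing above; the constant names are illustrative.
PAGE_EXTRACT_CREDITS = 1        # credits per page extract
KG_ENTITY_EXPORT_CREDITS = 25   # credits per Knowledge Graph entity export


def estimate_credits(pages_extracted, kg_entities_exported=0):
    """Return a rough credit total for a job (hypothetical helper)."""
    if pages_extracted < 0 or kg_entities_exported < 0:
        raise ValueError("counts must be non-negative")
    return (pages_extracted * PAGE_EXTRACT_CREDITS
            + kg_entities_exported * KG_ENTITY_EXPORT_CREDITS)


# 500 extracted pages plus 20 exported entities: 500 * 1 + 20 * 25
print(estimate_credits(500, 20))  # 1000
```

Under these assumptions, exporting 20 entities costs as much as extracting 500 pages, so Knowledge Graph exports are likely to dominate the budget of a mixed job.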

Available Actions

These are the specific actions that AI agents can perform with this tool.

Scrape Website URLs

3 inputs

Scrape all URLs from a given website using the Diffbot Crawl API.

Inputs

name

The descriptive name for the crawl job.

seedUrls

List of starting URLs for the crawler to begin its extraction.

notificationEmail

Email address to receive notification when the crawl is complete.
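
As a rough illustration of how the three inputs fit together, the sketch below assembles them into a single dictionary and applies a few basic checks before an agent would hand them to the action. The helper name `build_crawl_inputs` is hypothetical, the validation rules are assumptions, and treating `notificationEmail` as optional is also an assumption, since the listing does not say which inputs are required. Nothing here calls Diffbot itself.

```python
from urllib.parse import urlparse


def build_crawl_inputs(name, seed_urls, notification_email=None):
    """Assemble the documented inputs into one dict (hypothetical helper)."""
    if not name or not name.strip():
        raise ValueError("name must be a non-empty string")
    if not seed_urls:
        raise ValueError("seedUrls needs at least one starting URL")
    for url in seed_urls:
        parts = urlparse(url)
        # Assumption: seeds should be absolute http(s) URLs.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an absolute http(s) URL: {url!r}")

    inputs = {"name": name.strip(), "seedUrls": list(seed_urls)}
    # Assumption: the notification address may be omitted.
    if notification_email:
        inputs["notificationEmail"] = notification_email
    return inputs


inputs = build_crawl_inputs(
    "docs-crawl",
    ["https://example.com"],
    "ops@example.com",
)
print(inputs)
```

Checking the seeds up front means a malformed URL is reported before a crawl job is created.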
