Crawler

A crawler (also called a web crawler, spider, or bot) is a program that automatically visits web pages, collects their content, and feeds it into a search engine's index.

The best-known crawlers:

Googlebot: Google’s crawler; it also has subtypes such as Googlebot Smartphone (mobile) and Googlebot Image.
Bingbot: Microsoft Bing’s crawler.
YandexBot: Yandex’s crawler.
Baiduspider: Baidu’s (China) crawler.
SEO tools (Ahrefs, Semrush, Moz) and social media platforms (for example, Facebook's facebookexternalhit) also run their own crawlers.

How a crawler works:

It starts from a set of known URLs (sitemaps, previously crawled pages).
It downloads each page and parses the HTML.
It adds the links it finds on the page to its crawl queue.
It visits the queued URLs, skipping any that the site's robots.txt rules disallow.
It sends the content it finds to the search engine's index.
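
The steps above can be sketched as a small breadth-first loop. This is a minimal illustration, not how any particular search engine implements crawling. The names `crawl`, `LinkExtractor`, `ExampleBot`, and `max_pages` are made up for this example, and `fetch` is a hypothetical callable that returns a page's HTML (or `None` on failure), so the sketch can run against an in-memory site instead of the network. It also assumes a single robots.txt for the whole crawl, whereas a real crawler would load one per host.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, robots_txt="", user_agent="ExampleBot", max_pages=50):
    """Breadth-first crawl. Returns {url: html}, standing in for the index.

    `fetch` is a hypothetical callable: fetch(url) -> HTML string or None.
    """
    # Assumption: one robots.txt body covers every URL in this crawl.
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    queue = deque(seed_urls)   # step 1: start from known URLs
    seen = set(seed_urls)      # avoid queueing the same URL twice
    index = {}

    while queue and len(index) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(user_agent, url):
            continue           # step 4: respect robots.txt
        html = fetch(url)      # step 2: download the page
        if html is None:
            continue
        index[url] = html      # step 5: hand the content to the "index"

        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # step 3: resolve relative links, drop #fragments, enqueue new URLs
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index
```

Passing `fetch` in as a parameter keeps the loop itself free of networking details: a dictionary's `get` method is enough for experiments, and a real HTTP client could be swapped in later. The `max_pages` cap is a crude stand-in for the crawl budget a production crawler would manage.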

Tip: Monitor Googlebot visits in your server logs. Which pages are crawled and how often can tell you a lot about your site’s technical health.
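
As a rough starting point for that kind of monitoring, the sketch below counts requests per path for lines whose user agent contains "Googlebot". It assumes access logs in the common "combined" format used by Apache and nginx. The names `LOG_LINE` and `googlebot_hits` are made up for this example. Keep in mind that the user agent string can be spoofed, so a count like this is only an approximation unless the requesting IP addresses are also verified.

```python
import re
from collections import Counter

# Assumption: "combined" log format, e.g.
# 66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /page HTTP/1.1" 200 2326 "-" "agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '   # request line
    r'(?P<status>\d{3}) \S+ '                      # status code and response size
    r'"[^"]*" "(?P<agent>[^"]*)"'                  # referrer and user agent
)


def googlebot_hits(lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        # Substring match also catches variants such as Googlebot-Image.
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits
```

Calling `googlebot_hits(open("access.log"))` (the file name is a placeholder) and then `.most_common(20)` on the result gives a quick view of which pages are crawled most often.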
