SEO & Search Engines | 2,400 searches/mo

robots.txt

Quick Definition

robots.txt is a plain-text file that tells search engine crawlers which pages they may and may not crawl.

What is robots.txt?

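robots.txt is a plain-text file in the root directory of a website that controls how search engine crawlers access the site. It follows the Robots Exclusion Protocol and is built from a small set of directives: User-agent (which bot a group of rules applies to), Disallow (paths the bot should not crawl), Allow (paths that may be crawled, typically exceptions inside a disallowed area), and Sitemap (the URL of an XML sitemap).

Important: robots.txt only prevents crawling, not indexing. A disallowed page can still appear in search results if other pages link to it. To keep a page out of the index, use a noindex meta tag or an X-Robots-Tag HTTP header, and leave the page crawlable, because a crawler that is blocked by robots.txt never sees the noindex instruction. Errors in robots.txt can have severe SEO consequences: a single stray Disallow: / rule blocks crawling of the entire site.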
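To make the directives concrete, here is a minimal sketch that parses a small robots.txt with Python's standard-library urllib.robotparser and asks which URLs a bot may crawl. The file contents, the www.example.com host, and the bot name ExampleBot are assumptions chosen for illustration. Note that Python's parser applies the first matching rule in file order, which is why the Allow exception is listed before the broader Disallow; individual search engines may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Allow: /search/help/
Disallow: /search/
Disallow: /checkout/

Sitemap: https://www.example.com/sitemap.xml
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://www.example.com"

# Ask whether a (hypothetical) bot may crawl each path.
for path in ("/", "/search/results", "/search/help/faq", "/checkout/cart"):
    allowed = parser.can_fetch("ExampleBot", base + path)
    print(f"{path}: {'crawl allowed' if allowed else 'crawl disallowed'}")

# Sitemap lines are exposed separately (Python 3.8+).
print(parser.site_maps())
```

The check only answers the question "may this URL be crawled?". It says nothing about whether the URL can end up in a search index, which is controlled by noindex or X-Robots-Tag as described above.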

Key Points

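Always located at the root of the host, e.g. domain.com/robots.txt; each subdomain needs its own file
Disallow prevents crawling, not indexing
Wildcards (*) and the end-of-URL anchor ($) can be used in path patterns
Crawl-delay is respected only by some bots; Googlebot ignores it
Including a Sitemap reference is recommended
Test changes with Google Search Console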
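The wildcard point can be sketched in a few lines of Python. The helper names pattern_to_regex and rule_matches are hypothetical, and the translation below is a simplified reading of the pattern syntax (* matches any run of characters, a trailing $ anchors the end of the URL, everything else is a literal prefix match), not the matcher of any particular search engine.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    match at the end of the URL. Without '$' the pattern is a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts, then rejoin them with '.*' where '*' stood.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if the path falls under the given pattern."""
    # re.match anchors at the start, mirroring prefix matching.
    return pattern_to_regex(pattern).match(path) is not None


# '/*.pdf$' covers URLs that end in .pdf, but not ones with a query string.
print(rule_matches("/*.pdf$", "/files/report.pdf"))      # True
print(rule_matches("/*.pdf$", "/files/report.pdf?v=2"))  # False
# Without '$', the pattern behaves as a prefix match.
print(rule_matches("/private*/", "/private-area/page"))  # True
```

Python's built-in urllib.robotparser does plain prefix matching and does not interpret these wildcards, which is why the sketch uses a regular expression instead.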

Practical Example

“We blocked the admin area in robots.txt: Disallow: /admin/”
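A quick way to see what this rule does and does not cover is to run it through Python's standard-library urllib.robotparser. This is a minimal sketch: the host www.example.com and the bot name ExampleBot are assumptions, and the result reflects plain prefix matching on the path.

```python
from urllib.robotparser import RobotFileParser

# The rule from the example above, applied to all bots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""

admin_parser = RobotFileParser()
admin_parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(path: str) -> bool:
    """True if the hypothetical ExampleBot may not crawl the path."""
    url = "https://www.example.com" + path
    return not admin_parser.can_fetch("ExampleBot", url)


print(is_blocked("/admin/"))          # True: matches the prefix
print(is_blocked("/admin/users"))     # True: everything below /admin/
print(is_blocked("/admin"))           # False: no trailing slash, no match
print(is_blocked("/administrator/"))  # False: different path
```

Because the rule ends with a slash, /admin itself stays crawlable. The rule is also only a request to well-behaved crawlers and does not restrict access, so the admin area still needs its own authentication.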
