2026-05-14 · 7 min read

robots.txt and Sitemap Basics for SEO

Learn what crawlers read from robots.txt and XML sitemaps, and how to avoid blocking important pages by mistake.

Search engines discover pages by crawling links and reading hints you provide. robots.txt and XML sitemaps are two lightweight files that carry those hints: robots.txt guides what gets crawled, and a sitemap points crawlers to the URLs you want indexed.

What robots.txt does

robots.txt tells well-behaved bots which paths they may request. It does not hide secrets: the file itself is public, and a blocked URL can still be linked from elsewhere and may be indexed without being crawled. Use it to reduce crawl noise on admin or duplicate paths.
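To make the path rules concrete, here is a minimal sketch that checks URLs against a robots.txt the way a polite crawler would, using Python's standard urllib.robotparser module. The file contents, the example.com domain, and the helper name build_parser are assumptions for illustration, not taken from a real site. Note that this parser does simple prefix matching, so real search engines may interpret wildcard rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for example.com; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A path under /admin/ is disallowed for every user agent.
print(parser.can_fetch("*", "https://example.com/admin/login"))

# A path with no matching rule is allowed by default.
print(parser.can_fetch("*", "https://example.com/blog/post"))
```

The first check reports the admin URL as blocked and the second reports the blog URL as allowed. Neither result says anything about whether a page is indexed; it only reflects what a compliant crawler may request.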

What a sitemap does

An XML sitemap lists URLs you want indexed, often with lastmod dates. It helps search engines find new or deep pages faster, especially on large sites.
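As a rough illustration of what such a file contains, the sketch below builds a small sitemap with Python's standard xml.etree.ElementTree module. The URLs, dates, and the function name build_sitemap are placeholders chosen for this example; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[tuple[str, str]]) -> str:
    """Return sitemap XML for (absolute URL, lastmod as YYYY-MM-DD) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages; list only URLs you actually want indexed.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2026-05-14"),
    ("https://example.com/blog/robots-and-sitemaps", "2026-05-10"),
])
print(sitemap_xml)
```

Each url entry carries one loc and, optionally, one lastmod. Keeping lastmod accurate matters more than including it on every entry.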

Common mistakes

  • A stray Disallow: / under User-agent: *, which blocks the entire site
  • Missing sitemap URL in robots.txt
  • Including noindex pages in the sitemap
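The first two mistakes above can be caught with a simple text check before you deploy. The following is a minimal sketch, not a full robots.txt validator: the function name find_robots_mistakes and the sample files are hypothetical, and it only flags a site-wide block for the wildcard user agent and a missing Sitemap line. Detecting noindex pages in a sitemap would require fetching each listed page, which is left out here.

```python
def find_robots_mistakes(robots_txt: str) -> list[str]:
    """Flag a site-wide Disallow for all bots and a missing Sitemap line."""
    problems = []
    agents = []        # user agents named by the current group
    in_rules = False   # True once the current group has started listing rules
    has_sitemap = False

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents = []
                in_rules = False
            agents.append(value)
        elif field == "disallow":
            in_rules = True
            if value == "/" and "*" in agents:
                problems.append("Disallow: / blocks the entire site for all crawlers")
        elif field == "allow":
            in_rules = True
        elif field == "sitemap":
            has_sitemap = True

    if not has_sitemap:
        problems.append("No Sitemap: line found")
    return problems


# Hypothetical files: one with both mistakes, one with neither.
bad_file = "User-agent: *\nDisallow: /\n"
good_file = (
    "User-agent: *\n"
    "Disallow: /admin/\n"
    "Sitemap: https://example.com/sitemap.xml\n"
)
print(find_robots_mistakes(bad_file))
print(find_robots_mistakes(good_file))
```

Running a check like this in a deploy script is one way to keep a staging robots.txt from reaching production unnoticed.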

Try it instantly

Generate starter files with our Robots.txt Generator () and Sitemap XML Generator (/tools/sitemap-xml-generator).

Explore more tools in the Tools Directory.