Robots.txt: Configuring Website Indexing

Александр Михеев

20 March 2025 · 1 min read

The robots.txt file is the first thing a search engine crawler checks when it visits a website. This plain-text file, located at the root of the domain (e.g. https://example.com/robots.txt), tells crawlers which pages they may index and which they may not.

Robots.txt Syntax

The file consists of rule blocks. Each block starts with a User-agent directive, followed by Disallow and Allow rules:

  • User-agent: * — rules for all crawlers
  • User-agent: Googlebot — rules for Google only
  • Disallow: /admin/ — block the /admin/ section from indexing
  • Allow: /admin/public/ — allow a subdirectory (exception to Disallow)
  • Sitemap: https://example.com/sitemap.xml — path to the sitemap
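
Putting these directives together, a complete robots.txt for a hypothetical site might look like this (the /tmp/ block for Googlebot is purely illustrative):

```
User-agent: *
Disallow: /admin/
Allow: /admin/public/

User-agent: Googlebot
Disallow: /tmp/

Sitemap: https://example.com/sitemap.xml
```

Each User-agent block applies only to the crawlers it names; the Sitemap line stands outside any block and applies to the whole file.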

Common Mistakes

  • Blocking CSS/JS files — Google won't be able to render the page
  • Disallow: / — completely blocks the entire website from indexing
  • Missing Sitemap — the crawler may not discover all pages
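
Mistakes like these can be caught before deploying by testing the rules with Python's standard urllib.robotparser module. The rules below are a hypothetical example; note that Python's parser applies the first matching rule, so the Allow exception is listed before the broader Disallow:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; Python's parser uses first-match semantics,
# so the Allow exception comes before the broader Disallow.
rules = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# /admin/ is blocked, but its /public/ subdirectory stays crawlable
print(rp.can_fetch("*", "https://example.com/admin/secret.html"))      # False
print(rp.can_fetch("*", "https://example.com/admin/public/faq.html"))  # True
```

The same object can also fetch a live file with rp.set_url(...) and rp.read(), which is a quick way to verify that a deployed robots.txt does not accidentally contain Disallow: /.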

Conclusion

Create a proper robots.txt with our generator. Add Schema.org markup using the Schema.org Generator.
