The robots.txt file is the first thing a search engine crawler checks when visiting a website. This plain-text file, located at the root of the domain, tells crawlers which URLs they may crawl and which they may not. Note that robots.txt controls crawling, not indexing: a page blocked by robots.txt can still appear in search results if other sites link to it.
Robots.txt Syntax
The file consists of rule blocks. Each block starts with a User-agent directive, followed by Disallow and Allow rules:
- User-agent: * — rules for all crawlers
- User-agent: Googlebot — rules for Google only
- Disallow: /admin/ — block crawlers from the /admin/ section
- Allow: /admin/public/ — allow a subdirectory (exception to Disallow)
- Sitemap: https://example.com/sitemap.xml — path to the sitemap
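Putting the directives above together, a minimal robots.txt (the domain and paths are illustrative) might look like this:

```text
# Rules for all crawlers
User-agent: *
Allow: /admin/public/
Disallow: /admin/

# Stricter rules for Googlebot only
User-agent: Googlebot
Disallow: /search/

Sitemap: https://example.com/sitemap.xml
```

Crawlers use the most specific matching User-agent block, so Googlebot follows its own block here rather than the `*` block.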
Common Mistakes
- Blocking CSS/JS files — Google won't be able to render the page
- Disallow: / — blocks crawling of the entire website
- Missing Sitemap — the crawler may not discover all pages
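Before deploying a robots.txt, it helps to verify that your rules behave as intended. A quick sketch using Python's standard-library `urllib.robotparser` (the file content below is the same illustrative example; note that Python applies the first matching rule, so the Allow exception must precede the broader Disallow):

```python
import urllib.robotparser

# Parse an in-memory robots.txt instead of fetching one over HTTP
rp = urllib.robotparser.RobotFileParser()
rp.parse("""\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
""".splitlines())

# Blocked: /admin/ matches the Disallow rule
print(rp.can_fetch("*", "https://example.com/admin/"))          # False
# Allowed: the Allow exception matches first
print(rp.can_fetch("*", "https://example.com/admin/public/x"))  # True
# Allowed: no rule matches, so crawling is permitted by default
print(rp.can_fetch("*", "https://example.com/blog/"))           # True
```

Keep in mind that Python's parser uses first-match semantics, while Google resolves conflicts by the longest matching path, so ordering the Allow exception first keeps both interpretations consistent.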
Conclusion
Create a proper robots.txt with our generator. Add Schema.org markup using the Schema.org Generator.