Create robots.txt file for site indexing control
robots.txt — a file in the site root that tells search bots which pages can and cannot be crawled.
User-agent — the bot name the rules apply to. The * symbol means "all robots".
Disallow — blocks crawling of the specified path. An empty value means "everything is allowed".
Allow — allows crawling of the path (used as an exception to Disallow).
Crawl-delay — delay between bot requests in seconds. Supported by Yandex and Bing, ignored by Google.
Sitemap — sitemap URL in XML format. Helps bots find all pages.
Robots.txt Generator: Create Crawler Rules for Your Website
This online robots.txt generator helps you quickly create a properly formatted robots.txt file to control search engine crawling. Choose from ready-made presets for WordPress, e-commerce, and Laravel, and configure Allow/Disallow directives and a Sitemap entry for Googlebot, YandexBot, and other crawlers.
What is robots.txt
Robots.txt is a text file placed at the root of your website that instructs search engine crawlers which pages and sections they may or may not crawl. It's the first file a search bot checks when visiting a site.
Important: robots.txt controls crawling, not indexing. A page blocked from crawling can still appear in search results via external links. To fully exclude a page from the index, use the noindex meta tag.
Directive syntax
| Directive | Description | Example |
|---|---|---|
| User-agent | Specifies which crawler the rules apply to | User-agent: Googlebot |
| User-agent: * | Applies to all crawlers | User-agent: * |
| Disallow | Blocks crawling of a URL or section | Disallow: /admin/ |
| Allow | Permits crawling (overrides Disallow) | Allow: /public/ |
| Sitemap | Specifies sitemap URL | Sitemap: https://example.com/sitemap.xml |
| Crawl-delay | Delay between requests (seconds). Not supported by Google. | Crawl-delay: 1 |
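Before deploying a rule set, you can sanity-check how it will be interpreted using Python's standard-library `urllib.robotparser`. The rules and URLs below are hypothetical, combining the directives from the table above:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set using the directives from the table above
rules = """\
User-agent: Googlebot
Disallow: /admin/
Allow: /public/

User-agent: *
Crawl-delay: 1

Sitemap: https://example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/admin/"))       # blocked by Disallow
print(rp.can_fetch("Googlebot", "https://example.com/public/page"))  # permitted by Allow
print(rp.site_maps())       # sitemap URLs declared in the file
print(rp.crawl_delay("*"))  # delay for the wildcard group
```

Note that `site_maps()` and `crawl_delay()` require Python 3.8+ and 3.6+ respectively.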
Common examples
Allow everything
User-agent: *
Disallow:
Sitemap: https://example.com/sitemap.xml
Block admin and utility pages
User-agent: *
Disallow: /admin/
Disallow: /cart/
Disallow: /checkout/
Disallow: /search/
Allow: /
Sitemap: https://example.com/sitemap.xml
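The rule set above can be verified the same way before going live. A quick check with Python's standard-library parser (bot name and URLs are made up for illustration):

```python
from urllib.robotparser import RobotFileParser

# Same rules as the "Block admin and utility pages" example above
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Disallow: /checkout/
Disallow: /search/
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

for path in ("/", "/product/42", "/cart/", "/admin/users"):
    allowed = rp.can_fetch("AnyBot", f"https://example.com{path}")
    print(f"{path}: {'crawlable' if allowed else 'blocked'}")
```

Public pages come back crawlable while the utility sections are blocked, confirming the rules behave as intended.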
Common mistakes
- Blocking the entire site — Disallow: / for all agents. Often happens when a staging robots.txt is deployed to production.
- Blocking CSS/JS — prevents crawlers from rendering pages correctly, hurting quality scores.
- Blocking URLs in your sitemap — a contradiction: you're simultaneously recommending and blocking a page.
- Disallow/Allow conflict — Google gives priority to the more specific rule; Bing uses the last matching rule.
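Google's tie-breaking rule (the most specific, i.e. longest, matching pattern wins, with Allow winning a length tie) can be sketched in a few lines of Python. This is a simplified model for illustration only: it ignores * and $ wildcards, and the rules and paths are hypothetical:

```python
def google_verdict(path, rules):
    """Decide crawlability the way Google documents it: among all matching
    rules, the longest pattern wins; Allow beats Disallow on a length tie."""
    matches = [(len(pattern), directive == "Allow")
               for directive, pattern in rules if path.startswith(pattern)]
    if not matches:
        return True  # no matching rule: crawling is allowed
    _, is_allow = max(matches)  # longest pattern first; Allow wins ties
    return is_allow

rules = [("Disallow", "/shop/"), ("Allow", "/shop/sale/")]
print(google_verdict("/shop/sale/item", rules))  # True: Allow is more specific
print(google_verdict("/shop/cart", rules))       # False: only Disallow matches
```

A crawler following Bing's last-matching-rule behavior would need a different tie-breaker, which is exactly why conflicting Allow/Disallow pairs are worth avoiding.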
FAQ
Can robots.txt fully hide a page?
No. Robots.txt prevents crawling, but Google can still index a URL discovered via external links without visiting the page itself. For complete removal, use <meta name="robots" content="noindex"> or the X-Robots-Tag: noindex HTTP header.
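The X-Robots-Tag header is useful for non-HTML files, which cannot carry a meta tag. As an illustration (this nginx snippet is an assumption for the example, not output of the generator), marking all PDFs as noindex might look like:

```nginx
# Illustrative: send the noindex signal for PDF files via an HTTP header
location ~* \.pdf$ {
    add_header X-Robots-Tag "noindex, nofollow";
}
```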
Does robots.txt affect rankings?
Indirectly. Blocking junk pages (duplicates, filters, thin paginated pages) preserves crawl budget for important content. But blocking important pages directly removes them from the index.
Should I include sitemap in robots.txt?
Not required, but recommended. Google and Bing automatically discover a sitemap listed in robots.txt via the Sitemap directive, which speeds up discovery of new pages.
See also: meta tag generator, Schema.org generator, keyword generator.