Robots.txt is an important tool in the SEO arsenal, defining rules that tell robots which parts of a website should be crawled and which should not. However, when editing a robots.txt file, keep in mind that with great power comes great responsibility: even a small mistake can potentially de-index an entire website from search engines.

Because it is so important to compile a website's robots.txt file properly, I surveyed our professional services team to uncover some common mistakes, so you can make sure search engines and other bots crawl exactly the pages you want them to. So, without further ado, let us begin our guide, Common robots.txt Mistakes That Should Be Avoided.

Not Repeating General User-agent Directives in Specific User-agent Blocks

Search engine bots follow only the most specific matching user-agent block in the robots.txt file; all other user-agent blocks are ignored.

In the following example, Googlebot will follow only the rule in the block written specifically for Googlebot, and the rest will be ignored:

User-agent: *
Disallow: /something1
Disallow: /something2
Disallow: /something3

User-agent: Googlebot
Disallow: /something-else

For this reason, it is important to repeat general user-agent directives inside the blocks of more specific bots they should also apply to, because those rules are not inherited from the general block.
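The block-selection behaviour described above can be checked with Python's standard-library urllib.robotparser. The paths and bot names here are just the hypothetical ones from the example:

```python
import urllib.robotparser

# The example file: a general block and a Googlebot-specific block.
robots_txt = """\
User-agent: *
Disallow: /something1

User-agent: Googlebot
Disallow: /something-else
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# Googlebot matches its own block, so the general Disallow: /something1 is ignored.
print(rp.can_fetch("Googlebot", "https://example.com/something1"))      # True
print(rp.can_fetch("Googlebot", "https://example.com/something-else"))  # False

# Any other bot falls back to the general block and is blocked from /something1.
print(rp.can_fetch("SomeOtherBot", "https://example.com/something1"))   # False
```

Note that urllib.robotparser implements the older first-match semantics within a block rather than Google's longest-match rule, so it is used here only to illustrate how a user-agent block is selected.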

Forgetting That the Longest Matching Rule Wins

When Allow and Disallow rules conflict, a rule only takes effect when it matches more characters of the URL than the competing rule.

For example:

Disallow: /somewords
Allow: /some

In the example above, a URL such as /somewords is disallowed because the Disallow rule matches more characters of the path.

However, you can tip the result in favor of the Allow rule for such URLs by using additional wildcard characters (*):

Disallow: /somewords
Allow: /some*
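The longest-match selection above can be modeled in a short Python sketch. This is a simplified approximation, not Google's actual implementation: it measures specificity as the number of path characters a rule matches, and breaks ties in favor of the least restrictive (Allow) rule, which reproduces both examples above:

```python
import re

def match_len(pattern, path):
    # Translate a robots.txt pattern to a regex: '*' matches any run of
    # characters, and matching is anchored at the start of the path.
    regex = re.escape(pattern).replace(r"\*", ".*")
    m = re.match(regex, path)
    return len(m.group(0)) if m else None

def is_allowed(path, rules):
    # rules: iterable of ("allow" | "disallow", pattern) pairs.
    # The rule matching the most characters of the path wins; on a tie,
    # the least restrictive (allow) rule is preferred.
    best_len, best_kind = -1, "allow"  # no matching rule means the URL is allowed
    for kind, pattern in rules:
        n = match_len(pattern, path)
        if n is None:
            continue
        if n > best_len or (n == best_len and kind == "allow"):
            best_len, best_kind = n, kind
    return best_kind == "allow"

# Disallow matches 10 characters of /somewords, Allow only 5: disallowed.
print(is_allowed("/somewords", [("disallow", "/somewords"), ("allow", "/some")]))   # False
# With the wildcard, Allow also matches all 10 characters and wins the tie.
print(is_allowed("/somewords", [("disallow", "/somewords"), ("allow", "/some*")]))  # True
```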


Adding Wildcards at the End of Rules

Wildcard (*) characters do not need to be included at the end of rules in a robots.txt file: every rule is a prefix match by default, so it already behaves as if it ended in a wildcard. They are only needed when they make a rule the longest matching one. A redundant trailing wildcard is usually not a problem, but it might lose you the respect of your co-workers and family members.

Disallow: /somewords*
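A quick Python sketch illustrates why the trailing wildcard is redundant, under the assumption that rules are anchored prefix matches and * translates to a regex .*:

```python
import re

def to_regex(pattern):
    # Robots.txt rules are implicit prefix matches anchored at the start of
    # the path; '*' matches any run of characters.
    return re.escape(pattern).replace(r"\*", ".*")

# With and without the trailing wildcard, exactly the same paths match.
for path in ("/somewords", "/somewords/page", "/somewords.html", "/other"):
    plain = re.match(to_regex("/somewords"), path) is not None
    starred = re.match(to_regex("/somewords*"), path) is not None
    print(path, plain, starred)  # the two results are always identical
```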

This concludes our guide, Common robots.txt Mistakes That Should Be Avoided. We hope this article was helpful and gave you a clearer understanding of the most common robots.txt mistakes and how to steer clear of them.
