The <meta name="robots"> meta tag is placed in the <head> section of the page. Like the robots.txt file, it controls the indexing of content and the following of links on a page, through the values of its content attribute:
- follow/nofollow – consider or ignore links;
- index/noindex – index or not index the page content (often text);
- all/none – shorthand for "follow, index"/"nofollow, noindex";
- noarchive – prohibit page archiving. This value prevents the page from being opened from the search engine's archive (cached copy).
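As an illustration of how these values are read, here is a minimal Python sketch that extracts the robots meta tag from a page and turns its content attribute into three flags. The names RobotsMetaParser and read_robots_meta are hypothetical helpers written for this example, not part of any search engine's tooling, and real robots may treat edge cases differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the values found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def read_robots_meta(html):
    """Return index/follow/archive flags; everything is allowed by default."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    return {
        "index": not ({"noindex", "none"} & found),
        "follow": not ({"nofollow", "none"} & found),
        "archive": "noarchive" not in found,
    }


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(read_robots_meta(page))
```

A page without the tag yields all three flags set to True, which matches the default behavior described in the text.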
Compared with the robots.txt file, the robots meta tag has a more limited scope: it applies only to the page that contains it, whereas the separate file can disable indexing of entire directories. In other respects, the two methods are equivalent. Search engines recommend using the robots meta tag when the webmaster does not have access to the root folder of the site.
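The directory-level reach of robots.txt can be checked with Python's standard urllib.robotparser module. This is only a sketch: the robots.txt content, the example.com URLs, and the is_crawl_allowed helper are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a single rule closes a whole directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Check a URL against robots.txt rules given as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Every page under /private/ is covered by the one rule.
print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/private/report.html"))
print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/blog/post.html"))
```

The meta tag has no equivalent of this: to get the same effect, each page in the directory would need its own tag.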
When conflicting directives are found, priority is given to the stricter one. For example, if the file allows indexing and the meta tag contains none, the search robot will not index the text.
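The "stricter directive wins" rule can be sketched as a merge in which any restrictive value overrides a permissive one. The resolve function below is a hypothetical helper that models the rule as stated here; it is not the actual algorithm of any search engine.

```python
def resolve(*directive_sets):
    """Merge directives from several sources; a restrictive value always wins."""
    merged = set()
    for directives in directive_sets:
        merged.update(d.strip().lower() for d in directives)
    return {
        "index": not ({"noindex", "none"} & merged),
        "follow": not ({"nofollow", "none"} & merged),
    }


# One source says "index", the other says "none": the stricter "none" applies.
print(resolve(["index"], ["none"]))
```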
Main Use Cases
All of these combinations are used when configuring a site. They allow more precise control over link weight and give additional protection against duplicate-content filters.
Using content="follow, index" or content="all"
Indexing of links and text is allowed by default, so in most cases there is no point in adding these values to the meta tag. The one case sometimes cited as an exception is a page located in a directory that is completely closed to indexing through robots.txt. Keep in mind, however, that a robot that obeys the ban in robots.txt does not download the page and therefore does not read its meta tag, so this approach makes the site's behavior non-obvious and harder to maintain. It is better not to rely on it.
Using content="nofollow, index"
Indexing the content while ignoring the links on the page is used quite often. It makes it possible to:
- avoid leaking link weight to other resources or pages;
- manage PageRank (PR) more precisely;
- create pages with many links, for example lists of useful links for users. Banning the indexing of links shows the search robot that this is not spam or over-optimization.
A meta tag with this content should not be used when exchanging links with another resource, because the partner's links would then pass no weight. This is, at a minimum, unethical and can cause a breakdown in cooperation.
Using content="follow, noindex"
This option is suitable for announcements, the second and subsequent pagination pages, previews, and other elements that duplicate the main content. Link weight is still passed on, and the search robot follows the links to index the pages they point to.
It is also suitable when the page holds non-unique content, for example regulations, legislative acts, and other standard documents whose text should not be altered.
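One way to apply this in a page template is to choose the content value from the page number. The sketch below assumes a hypothetical helper, robots_meta_for_page, and a simple rule taken from the cases above: the first page of unique content is indexed, while later pagination pages and duplicate content are not.

```python
def robots_meta_for_page(page_number, is_duplicate=False):
    """Build the robots meta tag for a listing page (hypothetical helper)."""
    if page_number > 1 or is_duplicate:
        # Keep the text out of the index, but let the robot follow the links.
        content = "noindex, follow"
    else:
        content = "index, follow"
    return f'<meta name="robots" content="{content}">'


print(robots_meta_for_page(1))
print(robots_meta_for_page(2))
```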
Using content="nofollow, noindex" or content="none"
A ban on indexing both text and links is used for pages with confidential information that should not be reachable through search queries. Such measures do not always work, however. This is especially true for the Google search engine, for which the meta tag is advisory in nature. As a result, some pages that are prohibited from indexing may still end up in search results.
There is no such problem in Yandex search. To reliably keep a page away from search robots, protect it with a password or another form of identification that requires manual entry.