🕵️ The Ultimate Guide to Meta Robots Tags

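The meta robots tag gives you page-level control over how search engines index your content and present it to users in search results. It is one of the most important tools in an SEO professional's arsenal for managing index bloat. It does not save crawl budget by itself, because a crawler has to fetch the page before it can read the tag.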

1. What is a Meta Robots Tag?

A meta robots tag is a piece of code placed in the <head> section of a web page. It provides instructions to web crawlers (like Googlebot) regarding whether that specific page should be added to the search engine's index and whether the links on that page should be followed.

2. HTML Code Example

The tag uses the name="robots" attribute to target all crawlers. The content attribute contains the specific instructions (directives) separated by commas.

<!DOCTYPE html>
<html>
<head>
    <title>Internal Search Results</title>
    <!-- Keep this page out of the index, but let crawlers follow its links -->
    <meta name="robots" content="noindex, follow">
</head>
<body>
    <!-- Page content -->
</body>
</html>

Pro Tip: You can target a specific bot by changing the name attribute. For example, <meta name="googlebot" content="noindex"> tells only Google's crawler to keep the page out of its index, while Bing and Yahoo may still index it.
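
As a rough illustration of how a general tag and a bot-specific tag interact, here is a minimal sketch. The page title is hypothetical. Google documents that when rules conflict, the more restrictive one applies, so Googlebot would be expected to combine both tags and treat the page as noindex, nofollow, while other crawlers would act only on nofollow.

```html
<head>
    <title>Seasonal Landing Page</title>
    <!-- Read by every crawler: do not follow the links on this page -->
    <meta name="robots" content="nofollow">
    <!-- Read by Google's crawler only: do not index this page -->
    <meta name="googlebot" content="noindex">
</head>
```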

3. Core Directives (Index vs. Follow)

The most common values used in the meta robots tag control two things: indexing (index or noindex) and link following (follow or nofollow).
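
The four ways of combining the indexing and link-following values can be sketched as follows. These are alternatives shown side by side for comparison. A real page would normally carry only one of them.

```html
<!-- Default behavior (same as omitting the tag): index the page, follow its links -->
<meta name="robots" content="index, follow">

<!-- Keep the page out of the index, but still follow its links -->
<meta name="robots" content="noindex, follow">

<!-- Index the page, but do not follow its links -->
<meta name="robots" content="index, nofollow">

<!-- Neither index the page nor follow its links (shorthand: content="none") -->
<meta name="robots" content="noindex, nofollow">
```

Because index and follow are the defaults, the first variant is rarely needed in practice. It is included here only to make the contrast explicit.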

4. Advanced Directives

Google supports several other directives that control how your snippets appear in search results, such as nosnippet, max-snippet, max-image-preview, and max-video-preview.
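
A few of the snippet-related directives that Google documents are sketched below. The numeric limits are made-up values chosen for illustration, not recommendations, and the tags are alternatives rather than something to place on one page together.

```html
<!-- Show no text snippet or video preview for this page -->
<meta name="robots" content="nosnippet">

<!-- Limit the text snippet to at most 120 characters (illustrative number) -->
<meta name="robots" content="max-snippet:120">

<!-- Allow large image previews (other accepted values: none, standard) -->
<meta name="robots" content="max-image-preview:large">

<!-- Limit video previews to at most 10 seconds (illustrative number) -->
<meta name="robots" content="max-video-preview:10">

<!-- Several directives can share one tag, separated by commas -->
<meta name="robots" content="max-snippet:120, max-image-preview:large">
```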

5. Meta Robots vs. robots.txt

Confusing these two is one of the most common and most damaging mix-ups in technical SEO:

robots.txt is about crawling. Meta Robots is about indexing.

⚠️ Critical Warning: If you add a noindex tag to a page, but then block that URL path in your robots.txt file, Google will never see the noindex tag! The bot is blocked from crawling the page, so it cannot read the <head>. If the page was already indexed, it might remain in the search results as a "URL only" result.

If your goal is to remove a page from Google's index, make sure robots.txt allows the page to be crawled so that Google can read the noindex directive.
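
As a sketch of the safe setup, assume a hypothetical page served at /old-promo/ that should drop out of Google. The path, the title, and the robots.txt rule quoted in the comment are illustrative assumptions. The point is only that a rule of that kind would have to be removed before the tag can take effect.

```html
<head>
    <title>Old Promotion</title>
    <!--
      For the tag below to be read, robots.txt must NOT block this path.
      A rule like the following would hide the tag from crawlers:

          User-agent: *
          Disallow: /old-promo/
    -->
    <meta name="robots" content="noindex">
</head>
```

Once the page has dropped out of the index, blocking the path in robots.txt becomes a separate decision about crawling rather than indexing.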

6. Common Mistakes to Avoid