Link Analysis: Internal, External & Broken Links
Links are the threads that hold the web together. They tell search engines how pages relate to one
another, how authority should flow, and which destinations are worth crawling. A thorough link analysis
reveals broken paths, wasted crawl budget, and missed opportunities to pass equity. This guide covers
internal and external links, the rel attribute, redirects (3XX), broken links (4XX), and a
practical strategy to tie it all together.
1. Why Links Are the Backbone of SEO
Google's original breakthrough, PageRank, was built entirely on links. The core idea is simple: a link from page A to page B is a vote of confidence, and pages that accumulate more (and higher-quality) votes are deemed more authoritative. This concept of link equity (often called "link juice") still underpins how authority flows through the web today.
Links matter for three distinct reasons:
- Link equity / PageRank: Each link passes a portion of a page's authority to its target, helping that target rank higher for relevant queries.
- Crawl discovery: Search engine crawlers follow links to find new and updated pages. A page with no inbound links (an "orphan page") may never be discovered or indexed at all.
- Topical authority: The way you link related content together signals to search engines which topics your site covers in depth, strengthening your relevance for an entire subject area rather than a single keyword.
In short, links are not decoration. They are the infrastructure that determines what gets found, what gets ranked, and how trust spreads across your site and the wider web.
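To make the "votes" idea concrete, here is a minimal sketch of a simplified, textbook-style PageRank computed by power iteration. It is not Google's production algorithm; the site graph, the page names, and the 0.85 damping factor are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline, plus whatever its linkers pass on.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page site used only for illustration.
site = {
    "home": ["guide", "blog"],
    "guide": ["home"],
    "blog": ["guide"],
    "orphan": ["home"],  # links out, but nothing links to it
}
ranks = pagerank(site)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the page nothing links to ends up with the lowest score, which mirrors the orphan-page problem described above.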
2. Internal vs External Links
Every link on a page falls into one of two categories, and each plays a fundamentally different role.
- Internal links point to another page on the same domain. They guide users deeper into your site and create the crawl paths that search engines follow.
- External links point to a page on a different domain. They cite sources, reference tools, and connect your content to the wider web.
Internal links are one of the few ranking levers you control completely. They distribute link equity throughout your site, push authority from strong pages (like your homepage) toward deeper pages that need a boost, and establish a logical hierarchy. Good internal linking ensures that no important page is more than a few clicks from the homepage.
External links are often misunderstood. Linking out to authoritative, relevant sources does not "leak" your rankings away. Instead, it signals to search engines that your content is well-researched and trustworthy. The key is relevance: an external link should add genuine value and point to a reputable destination, not a spammy or unrelated one.
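The internal/external split can be sketched with Python's standard `urllib.parse`. The function name `classify_link` is hypothetical, and two choices here are assumptions rather than rules from this guide: `www.` is ignored when comparing hosts, and other subdomains are treated as external.

```python
from urllib.parse import urljoin, urlparse


def _bare_host(host):
    """Lowercase a hostname and drop a leading 'www.' (an assumption)."""
    host = (host or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_link(page_url, href):
    """Return 'internal', 'external', or 'other' for one href found on page_url."""
    # Resolve relative hrefs such as "/pricing" or "../about" first.
    target = urlparse(urljoin(page_url, href))
    if target.scheme not in ("http", "https"):
        return "other"  # mailto:, tel:, javascript:, and similar
    page_host = _bare_host(urlparse(page_url).hostname)
    if _bare_host(target.hostname) == page_host:
        return "internal"
    return "external"


page = "https://example.com/blog/post"
for href in ["/pricing", "https://www.example.com/about",
             "https://other.org/source", "mailto:hi@example.com"]:
    print(href, "->", classify_link(page, href))
```

Whether a subdomain such as a separate blog host should count as internal depends on how your site is organized, so adjust the comparison to match.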
3. Anchor Text Best Practices
Anchor text is the visible, clickable text of a link. It gives both users and search engines a strong hint about what the destination page is about, so it deserves careful attention.
- Be descriptive: The anchor should describe the target page. A link reading "on-page SEO checklist" is far more useful than one reading "this page".
- Avoid generic phrases: Anchors like "click here", "read more", or "learn more" waste a valuable relevance signal and hurt accessibility for screen-reader users.
- Keep it varied and natural: Use a mix of branded, partial-match, and descriptive anchors. A natural link profile never repeats the same phrasing over and over.
- Avoid over-optimization: Stuffing the exact-match keyword into every internal and external anchor looks manipulative. Over-optimized anchor text is a classic spam signal that can trigger algorithmic suppression.
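The checks in this list can be roughed out in a few lines. This is only a sketch: the generic-phrase list, the 50% repetition threshold, and the minimum sample of four anchors are arbitrary assumptions, not values any search engine publishes.

```python
from collections import Counter

# Assumed list of low-value anchors; extend it for your own site.
GENERIC_ANCHORS = {"click here", "read more", "learn more", "this page", "here"}


def audit_anchors(anchors, max_share=0.5):
    """Flag generic anchors and any phrase used for more than max_share of links.

    anchors is a list of anchor texts that all point at the same target page.
    """
    # Normalize case and whitespace so trivial variants count as one phrase.
    normalized = [" ".join(text.lower().split()) for text in anchors]
    warnings = []
    for text in sorted(set(normalized)):
        if text in GENERIC_ANCHORS:
            warnings.append(f"generic anchor: {text!r}")
    total = len(normalized)
    for text, count in Counter(normalized).most_common():
        # Skip the repetition check on very small samples.
        if total >= 4 and count / total > max_share:
            warnings.append(f"repeated anchor: {text!r} ({count}/{total})")
    return warnings


print(audit_anchors([
    "Click here", "SEO checklist", "seo checklist",
    "SEO  checklist", "on-page SEO checklist",
]))
```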
Pro Tip: Rank-O-Saur audits every link on the page in one pass. It counts your total, external, and nofollow links, flags links carrying URL parameters and duplicate destination targets, and highlights any 3XX redirects and 4XX broken links so you can fix problems before they drain your crawl budget.
4. The rel Attribute: nofollow, sponsored, ugc
The rel attribute lets you tell search engines how a link should be interpreted. By default,
a link with no rel value is an ordinary followed link (informally called "dofollow", although no such attribute value exists) that can pass equity.
The relevant values are:
- nofollow: Tells search engines you don't necessarily vouch for the target and they should not pass equity to it. Originally used for untrusted content.
- sponsored: Marks paid, advertising, or affiliate links. Google's guidelines ask you to qualify any link you were compensated for; sponsored is the preferred value, and nofollow is still accepted.
- ugc: Stands for "user-generated content", such as links in blog comments or forum posts that you did not place yourself.
Here is how each is applied in HTML:
<!-- Standard dofollow link (passes equity) -->
<a href="https://example.com/guide">SEO guide</a>
<!-- Nofollow: do not pass equity -->
<a href="https://untrusted.com" rel="nofollow">source</a>
<!-- Sponsored / paid link -->
<a href="https://partner.com/offer" rel="sponsored">our partner</a>
<!-- User-generated content (e.g. a comment link) -->
<a href="https://user-site.com" rel="ugc">commenter's site</a>
<!-- Multiple values can be combined -->
<a href="https://affiliate.com" rel="sponsored nofollow">affiliate</a>
Google announced in September 2019 that it treats nofollow, sponsored, and ugc as
hints rather than strict directives, and since March 2020 this also applies to crawling and
indexing. In practice, Google may still choose to crawl and index through these
links, even though the attribute discourages passing equity. Use the
most accurate value you can: sponsored for paid links, ugc for user content,
and nofollow when you simply don't want to endorse a destination.
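To audit how links on a page are qualified, one option is a small parser built on Python's standard `html.parser`. The class name `RelAudit` and the "followed" label are hypothetical choices for this sketch.

```python
from html.parser import HTMLParser

QUALIFIERS = {"nofollow", "sponsored", "ugc"}


class RelAudit(HTMLParser):
    """Collect (href, category) pairs for every <a href> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        # rel is a space-separated token list; compare tokens case-insensitively.
        tokens = set((attributes.get("rel") or "").lower().split())
        found = sorted(tokens & QUALIFIERS)
        self.links.append((href, "+".join(found) or "followed"))


def audit_rel(html):
    parser = RelAudit()
    parser.feed(html)
    return parser.links


sample = (
    '<a href="https://example.com/guide">SEO guide</a>'
    '<a href="https://affiliate.com" rel="sponsored nofollow">affiliate</a>'
    '<a href="https://user-site.com" rel="ugc noopener">commenter</a>'
)
for href, category in audit_rel(sample):
    print(category, href)
```

Tokens unrelated to search engines, such as noopener, are ignored so they do not affect the category.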
5. Redirects & 3XX Status Codes
A 3XX status code tells the browser (and crawler) that a further step is needed to complete the request, which in practice usually means the requested URL has moved and the server is pointing somewhere else. The two you will encounter most often are:
- 301 Moved Permanently: The page has moved for good. Use this when content has a permanent new home; it passes the vast majority of link equity to the new URL.
- 302 Found (Temporary): The move is temporary and the original URL will return. Search engines keep the original URL indexed and are more cautious about passing equity.
Redirects are useful when handled correctly, but two anti-patterns cause real damage:
- Redirect chains: When URL A redirects to B, which redirects to C, every hop adds latency, consumes crawl budget, and dilutes a little more equity along the way. Always point redirects directly to the final destination.
- Redirect loops: When A redirects to B and B redirects back to A, neither page ever loads. The crawler gives up and users see an error.
Caution: Internal links should always point to the final, canonical URL, never to a URL that redirects. Linking to a 3XX hop forces every visitor and crawler through an unnecessary extra request, wastes crawl budget on a large site, and slowly erodes the link equity that should have reached your real page intact.
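Chains and loops are easy to detect once you have a map of which URL redirects where. The sketch below assumes such a map has already been collected (for example from a crawl export); the function names, the status labels, and the sample paths are hypothetical.

```python
def trace_redirects(url, redirects, max_hops=10):
    """Follow url through a {source: target} map.

    Returns (path, status) where status is one of
    'direct', 'single', 'chain', 'loop', or 'too_long'.
    """
    path = [url]
    while path[-1] in redirects:
        nxt = redirects[path[-1]]
        if nxt in path:
            return path + [nxt], "loop"
        path.append(nxt)
        if len(path) - 1 > max_hops:
            return path, "too_long"
    hops = len(path) - 1
    if hops == 0:
        return path, "direct"
    return path, "single" if hops == 1 else "chain"


def flatten(redirects):
    """Point every resolvable source straight at its final destination."""
    flat = {}
    for source in redirects:
        path, status = trace_redirects(source, redirects)
        if status in ("single", "chain"):
            flat[source] = path[-1]
    return flat


redirect_map = {"/a": "/b", "/b": "/c", "/old": "/new", "/x": "/y", "/y": "/x"}
print(trace_redirects("/a", redirect_map))  # a two-hop chain
print(trace_redirects("/x", redirect_map))  # a loop
print(flatten(redirect_map))
```

The flattened map is what internal links and redirect rules should ideally use, so that each old URL reaches its destination in one hop. Loops are left out because they have no final destination to point at.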
6. Broken Links & 4XX Errors
A 4XX status code means the client requested a resource the server cannot deliver. The two most relevant to SEO are:
- 404 Not Found: The URL doesn't exist (or no longer exists). The response says nothing about whether the absence is temporary or permanent.
- 410 Gone: The URL has been permanently removed. This is a stronger, more explicit signal that tells search engines to drop the page from the index faster.
Broken links are costly for two reasons. First, they create a poor user experience: visitors who hit a dead end are likely to leave frustrated. Second, they harm crawl efficiency: a crawler that repeatedly follows internal links into dead pages spends its budget on nothing, instead of discovering and refreshing your real content. A broken external link also signals that your content may be stale or poorly maintained.
To find and fix them:
- Scan each page for links that return a 4XX status (Rank-O-Saur flags these inline as you browse).
- For broken internal links, either update the link to the correct URL or redirect the old target with a 301 if the page genuinely moved.
- For broken external links, replace them with a working, equivalent source or remove them entirely.
- For pages that are intentionally gone, return a 410 so search engines de-index them cleanly.
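The triage steps above can be expressed as a small lookup, assuming the status codes have already been gathered by a crawler or audit tool. The function names and action strings are hypothetical, and the handling of other 4XX codes is a judgment call rather than something this guide prescribes.

```python
def triage_link(status, internal):
    """Suggest a fix for one link, given the HTTP status of its target."""
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "point the link at the final URL"
    if status in (404, 410):
        if internal:
            return "fix the href, or 301 the old URL if the page moved"
        return "replace with a working source or remove"
    if 400 <= status < 500:
        # Assumption: codes such as 401, 403, or 429 often mean the checker
        # was refused, not that the page is gone, so a human should look.
        return "review manually"
    return "retry later"  # 5XX or anything unexpected


def broken_link_report(checked):
    """checked: iterable of (href, status, internal) tuples from a crawl."""
    report = {}
    for href, status, internal in checked:
        action = triage_link(status, internal)
        if action != "ok":
            report[href] = action
    return report


crawl = [
    ("/pricing", 200, True),
    ("/old-guide", 404, True),
    ("https://example.org/tool", 410, False),
    ("/promo", 301, True),
]
for href, action in broken_link_report(crawl).items():
    print(href, "->", action)
```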
7. URL Parameters & Duplicate Link Targets
Not every link points to a clean, unique URL. Two common issues quietly inflate the number of URLs search engines have to deal with.
URL parameters are the key-value pairs after a ? in a URL. They serve many
purposes, but tracking and faceted-navigation parameters can create huge numbers of near-duplicate URLs:
<!-- These often resolve to the same content -->
<a href="/shoes">Shoes</a>
<a href="/shoes?utm_source=newsletter">Shoes (tracking)</a>
<a href="/shoes?color=red&size=42">Shoes (faceted filter)</a>
<a href="/shoes?sessionid=abc123">Shoes (session)</a>
- Tracking parameters (like utm_*) don't change the page content but create distinct URLs that can be crawled and indexed separately.
- Faceted navigation (filters and sorting) can multiply a single category into thousands of parameterized combinations, draining crawl budget.
Duplicate link targets occur when several links on the same page point to the exact same
destination, or when different URLs resolve to identical content. While linking to a page more than once
isn't harmful by itself, excessive duplicate targets can dilute the clarity of your internal link
structure and make audits noisier. The deeper risk is duplicate content: when parameterized
variants are all indexable, search engines must guess which version is canonical. Use the
rel="canonical" tag, consistent internal linking, and parameter handling to consolidate
signals onto one preferred URL.
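One way to spot parameter-driven duplicates during an audit is to normalize each href before comparing. The sketch below uses `urllib.parse`; the list of tracking keys is an assumption you would tailor to your own site, and faceted filters are deliberately kept because they may change the content.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking/session parameters; adjust for your own analytics setup.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"sessionid", "gclid", "fbclid"}


def normalize(url):
    """Drop tracking parameters and the fragment, and sort what remains."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    kept.sort()  # so ?a=1&b=2 and ?b=2&a=1 compare equal
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_by_target(hrefs):
    """Map each normalized URL to the raw hrefs that resolve to it."""
    groups = defaultdict(list)
    for href in hrefs:
        groups[normalize(href)].append(href)
    return dict(groups)


hrefs = [
    "/shoes",
    "/shoes?utm_source=newsletter",
    "/shoes?color=red&size=42",
    "/shoes?sessionid=abc123",
]
for target, variants in group_by_target(hrefs).items():
    print(target, "<-", variants)
```

Groups with more than one raw href are candidates for consolidation onto a single preferred URL.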
8. Internal Linking Strategy & Best Practices
A deliberate internal linking strategy turns a pile of pages into a coherent, crawlable, authoritative site. Apply these principles:
- Build topical silos: Group related content into clusters and interlink pages within the same topic. This concentrates topical authority and helps search engines understand your subject expertise.
- Favor contextual links: Links placed within the body of your content (in-context) carry more weight and relevance than links buried in footers or sidebars.
- Keep a flat architecture: Ensure every important page is reachable within three or four clicks from the homepage. The shallower the path, the easier it is to crawl and the more equity it receives.
- Use a reasonable link count: Don't cram hundreds of links onto a single page. Equity is divided among all links, so too many links spread it thin and overwhelm users.
- Link to canonical URLs only: Always point internal links at the final destination, never at a URL that redirects (3XX) or returns an error (4XX).
- Fix broken links promptly: Audit regularly, repair internal 4XX links immediately, and prune dead external links.
- Use descriptive anchor text: Help users and crawlers understand the destination before they click, while keeping anchors varied and natural.
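Two of these principles, flat architecture and no unreachable pages, can be checked with a breadth-first search over the internal link graph. This is a minimal sketch that assumes you already have the graph as a dictionary; the function names, the sample URLs, and the default limit of three clicks are illustrative.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def audit_structure(links, home, max_depth=3):
    """Return (unreachable pages, pages deeper than max_depth clicks)."""
    depths = click_depths(links, home)
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    unreachable = sorted(pages - set(depths))
    too_deep = sorted(page for page, depth in depths.items() if depth > max_depth)
    return unreachable, too_deep


# Hypothetical internal link graph: page -> pages it links to.
graph = {
    "/": ["/guides", "/blog"],
    "/guides": ["/guides/links"],
    "/guides/links": ["/guides/links/rel"],
    "/guides/links/rel": ["/guides/links/rel/ugc"],
    "/old-landing": ["/"],  # nothing links to this page
}
print(audit_structure(graph, "/"))
```

Pages reported as unreachable include true orphans as well as pages that are only linked from other unreachable pages.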
Treat link analysis as an ongoing routine rather than a one-time cleanup. Every new page, redesign, or content migration is an opportunity for links to break, chains to form, and equity to leak. A quick, regular audit keeps your link graph healthy and your rankings from going extinct.