Thursday, July 10, 2008

Calling Matt Cutts To The Bat Phone


This post has been moved to the new link building blog, the Link Spiel. You'll find the entire post here: http://www.linkspiel.com/2008/07/mattcutts-bat-phone/

Dear Matt Cutts:

I hope this finds you well. :)

I'm writing to ask a question about anchor text: when a page has two links to the same target, do you count the anchor text found in the second link? There's been a lot of discussion on this topic recently; it's an important point, and one a lot of people want more information on. Since it's a subject only Google can definitively answer, I thought I'd go straight to the source and ask.

10 comments:

Matt Cutts said...

Your question is short, but the answer is more complex. Typically, if the anchor text on the two links is identical, we would probably drop one of those links.

mvandemar said...

Matt, what about differing anchor texts, though? I (and some others) ran some tests that seem to indicate not only that the first link is the only one that counts, but that if you nofollow that link, none of the others to that page will count either:

Single Source Page Link Test Using Multiple Links With Varying Anchor Text

I mean, it makes sense in a way, but knowing for sure that it works that way (whether or not it is what is intended) would be good, especially when it comes to navigation and PageRank sculpting, etc.

Thanks. :)

Marion said...

I guess it depends on how you define the word... probably/is...

Dudibob said...

Wait wait wait, so Matt, you're saying (simplified) that if the second anchor text is different from the first, Google will pass (some) value?

Matt Cutts said...

Dudibob, no, I confirmed the converse: if the anchor text is the same, we'll typically drop the second link.

This is the sort of thing where people can run experiments to see whether different anchor texts flow in various ways.
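
Matt's description suggests a de-dupe keyed on a link's target plus its anchor text. As a purely illustrative sketch (the keying, the function name, and the example URLs are all assumptions; nothing here is confirmed Google behavior):

    # Hypothetical sketch of "drop the second link when the anchor text
    # is identical" -- keying on (target, anchor) is an assumption.
    def dedupe_links(links):
        """links: list of (target_url, anchor_text) tuples in page order."""
        seen = set()
        kept = []
        for target, anchor in links:
            key = (target, anchor.lower())
            if key in seen:
                continue  # same target, same anchor text: drop the repeat
            seen.add(key)
            kept.append((target, anchor))
        return kept

    page_links = [
        ("http://example.com/a", "widgets"),
        ("http://example.com/a", "widgets"),       # dropped under this sketch
        ("http://example.com/a", "blue widgets"),  # the open question: kept or not?
    ]
    print(dedupe_links(page_links))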

jerry said...

Matt, what if the first link on a page is nofollowed? Are the rest of the links treated as followed links? What if the second link is nofollowed but the first is not?

Neyne said...

I don't think he is going to do the tests for you...

RedCardinal said...

Hmm... I think the question has to be broken down into chunks. It may be that Google counts the second link but doesn't include its anchor text. Then you've got the PageRank element, and you can throw NOFOLLOW in for good measure (a sketch of that first chunk follows below).

Any follow-ups from mvandemar on his testing?
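
To make that first chunk concrete, here's a hypothetical sketch in which every link to a target passes PageRank but only the first anchor text counts for relevancy. The function name, URLs, and the behavior itself are all assumptions for illustration, not anything Google has confirmed.

    # One of RedCardinal's "chunks" as a toy model: repeated links pass
    # PageRank, but only the first anchor text sticks. Pure assumption.
    def split_signals(links):
        """links: list of (target_url, anchor_text) tuples in page order."""
        pagerank_votes = {}  # target -> number of links passing PageRank
        anchors = {}         # target -> the single anchor text that counts
        for target, anchor in links:
            pagerank_votes[target] = pagerank_votes.get(target, 0) + 1
            anchors.setdefault(target, anchor)  # only the first anchor sticks
        return pagerank_votes, anchors

    votes, anchors = split_signals([
        ("http://example.com/a", "widgets"),
        ("http://example.com/a", "blue widgets"),
    ])
    print(votes)    # {'http://example.com/a': 2}         -> both pass PageRank
    print(anchors)  # {'http://example.com/a': 'widgets'} -> only first anchor counts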

mvandemar said...

@RedCardinal, the way my tests came out, it seems as if the first link is the only one that is counted. That means that only the first anchor text passes any relevancy, and if that happens to be nofollowed, then none of the subsequent links to the same page pass value.

I explained what I think is happening here, but basically I think it has to do with efficiency. The routines that spider pages are (I am assuming) quite separate from those that actually assign ranking values. When the page is parsed for links, it probably de-dupes them, since there is no reason to add the same link to the spidering queue twice. It also applies the nofollow tag, since, again, if it's nofollowed on that page there is no reason to add it to the queue (from that page, anyway, though of course there might be a followed link to the same page elsewhere). My guess is that these two functions overlap when there are multiple links to one target and the first is nofollowed.

I am also guessing that this just happened and was a side effect, rather than something they did by design. They could just as easily switch the order of operations: remove the nofollowed links first and then dedupe, in which case if the first link were nofollowed, the second link would become the first. Now that attention has been drawn to it, who knows, they may in fact do that.
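
To make the order-of-operations point concrete, here's a toy sketch of both orderings. Everything in it (function names, URLs, the two-pass structure itself) is assumed purely to illustrate the hypothesis, not Google's actual pipeline.

    # Hypothetical sketch of the two orders of operations described above.
    # Links are (target_url, anchor_text, nofollow) tuples in page order.

    def dedupe_by_target(links):
        """Keep only the first link to each target, in page order."""
        seen, kept = set(), []
        for link in links:
            target = link[0]
            if target not in seen:
                seen.add(target)
                kept.append(link)
        return kept

    def drop_nofollowed(links):
        return [link for link in links if not link[2]]

    page_links = [
        ("http://example.com/a", "widgets", True),        # first link, nofollowed
        ("http://example.com/a", "blue widgets", False),  # second link, followed
    ]

    # Order 1: dedupe first, then apply nofollow -> nothing survives,
    # matching the test result described above.
    print(drop_nofollowed(dedupe_by_target(page_links)))  # []

    # Order 2: strip nofollowed links first, then dedupe -> the followed
    # second link becomes "first" and survives.
    print(dedupe_by_target(drop_nofollowed(page_links)))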

Halfdeck said...

"there is no reason to add the same link to the spidering queue"

Links aren't just pushed onto the crawl queue. Google also needs to keep a record of what links exist on a given page. That's a separate list, and it's what eventually decides how Google constructs its link graph.

Google may also blindly store all the links during the crawl and process them later, post-crawl. That frees up CPU resources and allows Google to use more complicated algorithms to sort out the links. You don't need to filter out dupes on a list to crawl correctly, though Googlebot still needs to obey nofollows (see the sketch after this comment).

The biggest lesson here, though, is that it's harder than people think to set up an SEO test. If there are 10 factors at play but you assume there's only one, you can end up with a whacked conclusion.
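
As a rough illustration of the separation Halfdeck describes, here's a minimal sketch in which the crawl queue and the per-page link record are kept independent, with link-graph decisions deferred to a post-crawl pass. All names, URLs, and behavior are assumed for illustration; nothing here reflects Google's actual design.

    # Hypothetical sketch: the crawl queue and the per-page link record
    # are separate, and the link graph is built later from raw records.
    from collections import deque

    crawl_queue = deque()
    link_records = {}  # page_url -> raw list of (target, anchor, nofollow)

    def record_page(page_url, links):
        # Keep every outlink verbatim; link-graph decisions happen post-crawl.
        link_records[page_url] = list(links)
        for target, anchor, nofollow in links:
            if not nofollow:                # the crawler still obeys nofollow
                crawl_queue.append(target)  # duplicates here are harmless

    record_page("http://example.com/", [
        ("http://example.com/a", "widgets", True),
        ("http://example.com/a", "blue widgets", False),
    ])

    # A separate post-crawl pass can build the link graph from the raw
    # records using whatever (possibly complicated) rules it likes.
    print(list(crawl_queue))                    # what gets fetched
    print(link_records["http://example.com/"])  # what the link graph sees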