December 31, 2003
Jill’s recent post on “linktheft” and the fact that I’ve recently read one of the PageRank papers got me thinking about links. Blog spam by those seeking more PageRank has become a real annoyance. We’ve been slogged by spam as well; some other bloggers have taken rather extreme measures in response. I re-read Jill’s paper Links and Power and also looked at some of the descriptions of PageRank online to make sure I understood the article by Larry Page et al. Here are a few observations…
PageRank models the Web as a series of nodes or vertices (pages) with directed edges (links) between them. The question the algorithm tries to answer is this: ignoring the content of pages entirely, what does the structure of the Web tell us about which pages are most important? In fact, it tells us a lot.
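To make the graph picture concrete, here is a minimal sketch of the basic idea in Python: rank flows along links, and it is recomputed repeatedly until it settles. This is only an illustration under stated assumptions, not Google's production system. The names `pagerank` and `toy_web`, the damping value of 0.85, the fixed iteration count, and the choice to spread a dead-end page's rank evenly over all pages are assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure alone.

    `links` maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its rank equally among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its rank evenly over every page.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A hypothetical four-page Web: "c" is linked to by three pages, "d" by none.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

Nothing in the sketch looks at what the pages say. Page "c" ends up on top purely because the other pages point to it, and "d" ends up at the bottom because nothing points to it.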
PageRank is clever because it doesn’t assume that one page agrees with or endorses another page that it links to. If you link to a page that you’re refuting or mocking, that page must be at least somewhat important, because you’re bothering to point it out (and then refute it or mock it). The only assumption it makes is that links are put there for people to actually use when on the Web. This is why blog spam and link farms and such cause problems for the algorithm; they don’t use links for this reason at all.
Google has not published the most recent algorithms behind their search technology, but the basic system is more or less known (or surmised) by search engine optimisers and manipulators. One striking effect of the PageRank system is link drain.
I have to mention that in my reading of Jill’s paper, it seemed that PageRank was presented as being Google’s secret. It’s not, and no one has to guess or surmise it; it is a publicly documented algorithm presented in numerous academic papers. (I’d be glad to discuss how it works or explain parts of “The PageRank Citation Ranking: Bringing Order to the Web” [PDF], the paper mentioned earlier, if anyone cares.) However, PageRank is only one factor that determines what search results you get. You don’t go to Google and just ask for the most important page on the Web – you type in a search query, and you want your results to be relevant to that query. So there are also information retrieval algorithms that come into play. And then, far down the line, Google certainly does some tweaking of the results, possibly blacklisting, and other stuff in an attempt to defeat link farms and such.
In her paper Jill offers the metaphor of links as currency or cash. This is nice in some ways and it does highlight some important things about links, but it’s as limited a perspective as is the perspective of PageRank itself, and if this becomes your controlling metaphor you’ll fall into the same trap as a PageRank addict. Links are not mainly valuable because they can pump up your PageRank. They’re mainly valuable because they let people click and go from one Web page to another!
When I post a comment on Jill’s blog and enter my URL so that my name is linked to it, I score myself some PageRank. But that’s not really the reason I add my URL. It’s because I often like to know who these people are who are posting comments on blogs, and I like to be able to click on their names and read something about them at their sites. I suppose some people reading my comments might sometimes like to know things about me, too. So, a link.
My own website, and blogs like grandtextauto, are useful in part because of the external links they have. If you removed all of these links (as some 1990s commercial sites thought would be a good idea) these websites would, to put it bluntly, suck. Then, far fewer people would visit them and link to them. These sites would “bleed” less PageRank, sure, but since they would become loser sites that no one would link to, they’d face a much larger decline in PageRank. Google likes blogs – but people like blogs, too! Google likes blogs for all the right reasons.
Instead of being dollars or euros or kroner, links seem a lot more like the major prison currency, cigarettes, in the hands of a heavy smoker who cares more about smoking than about any other prison commodity. First of all, you can do something with them; as an afterthought, you also use them to get more PageRank if you like. In fact, I think the currency metaphor is a bit weak in another way, since links can only be “exchanged” in a very very indirect way for PageRank – maybe they’re more like some super-restrictive frequent flyer miles. But really, I think they’re better seen as the airplanes. Links aren’t just tokens of value ready to be traded; they’re one of the main valuable things about the Web – along with the words and images that we link together.