Introduction to PageRank
Google’s implementation of PageRank was the fundamental link popularity portion of the original Google algorithm and was one of several unique factors Google originally used. From the very first versions, the Google search engine has placed varying levels of emphasis on PageRank as a factor in the ranking of any page.
While it is generally conceded that Google now places much less importance on PageRank as a ranking factor (some estimate that its present weight may be no more than 5-10%), it is still one of the factors to consider in improving your ranking. Through the now-deprecated Google Toolbar, which made a mutated form of PR visible, it has also grown into a factor of considerable economic importance to webmasters.
PageRank was originally designed as a form of a voting system based on the academic paper citation system. A link to a page was considered as a vote for that page, with higher PageRank pages being viewed by Google as being more important thus adding to the ranking score of the pages they linked to. Adding incoming links to your web pages will almost always add PageRank, but not all incoming links provide the same value.
As an example, it can take many PR2 incoming links to raise your targeted page to PR5, or it might take only one link from a PR6 page to give you the same result. Likewise, a PR5 page with only two outgoing links will pass five times more PageRank to each page it links to than a PR5 page with ten outgoing links. To put it another way, each page has only so much voting power based on its PR, and that power is divided among all the pages it links to.
Google originally defined PageRank (in their paper The Anatomy of a Large-Scale Hypertextual Web Search Engine) as:
We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:
PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))
Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages’ PageRanks will be one.
While we are not sure if this formula is still the exact one used by Google today, it is probably at least very similar given that the PageRank patent is held by Stanford University and licensed to Google. It seems logical that if Google had developed a radically different concept for link popularity there would be no need to continue to pay Stanford large sums of money to license their PageRank patent.
For purposes of simplicity the formula can be stated as:
1. The PageRank of any page is .15 plus the sum of the contributions of all the pages linking to it.
2. The PR contribution from any given linking page is equal to (.85 * PR of the linking page) / the number of outgoing links on that page.
It is well to remember that the PR contribution of any given page is divided equally between all of the links on the page. If a page has one outgoing link, the receiving page gets the entire contribution. If there are ten links on the page, the PR contribution is divided ten ways.
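The formula above can be tried out on a tiny link graph. The following is only a minimal sketch: the three-page graph, the function name pagerank, and the fixed iteration count are assumptions made for illustration, and nothing here is meant to describe how Google actually computes its values.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over a small link graph.

    `links` maps each page name to the list of pages it links to.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page T passes on PR(T) split equally
            # between its C(T) outgoing links.
            incoming = sum(
                pr[t] / len(links[t])
                for t in pages
                if page in links[t]
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr


# Hypothetical three-page site: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
for page, value in sorted(ranks.items()):
    print(page, round(value, 3))
```

In this example page B ends up with the lowest value because its only vote comes from A, which splits its contribution two ways. Because every page in the sketch has at least one outgoing link, the values add up to the number of pages, which matches an average PR of one.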
At this point, it may be well to digress a moment to discuss the two different kinds of PR – the True PR numbers which are used by Google in its calculations, and the PR numbers which are displayed on the Google toolbar.
No one outside Google knows the real value of True PR (which I will abbreviate here as PR), but we do know that the average of all PR is one, and thus there must be many pages whose PR value is less than one and other pages that have a PR of more than one, perhaps as high as several million.
Toolbar PR (abbreviated here as TPR), on the other hand, was visible on the toolbar for everyone to see and has a range from 0 to 10. The problem with this visual representation is that True PR, which doubtless takes millions of different values, is grouped by a method unknown to us into ranges represented by boxes numbered one through ten, and this distribution is probably done on some sort of logarithmic scale. All we really know about Toolbar PR is that it is certainly a different number than True PR, that higher numbers are better, and that it seems to take five or six times more PR contribution for a page to move from, say, TPR4 to TPR5 than it did to move from TPR3 to TPR4.
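To make the idea of a logarithmic scale concrete, here is a purely hypothetical sketch. The real mapping is not public, so the base of 5.5 (taken from the "five or six times" guess above), the thresholds, and the function name toolbar_pr are all assumptions.

```python
def toolbar_pr(true_pr, base=5.5, max_value=10):
    """Map a hypothetical True PR value onto a 0-10 toolbar-style scale.

    Assumes every step up the scale needs `base` times more PR than the
    step before it. This is an illustration, not Google's method.
    """
    value = 0
    threshold = base
    while true_pr >= threshold and value < max_value:
        value += 1
        threshold *= base  # the next step is `base` times harder to reach
    return value


for sample in (1, 6, 31, 200, 1_000_000):
    print(sample, "->", toolbar_pr(sample))
```

Under these assumptions, a page needs roughly 5.5 times as much True PR for each additional toolbar point, which is why the last steps on the scale are so much harder to reach than the first.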
Uses and Abuses of PageRank
From a practical point of view, there is not a lot more you need to know about PageRank. The Google toolbar shows the current TPR of the page in your browser, which is generally indicative of:
- How many links there are to that page and some rough indication of the quality of those links.
- If the TPR is zero it is likely either a new site or a site that Google has penalized. If you are considering linking to a PR0 page it is worth the time to investigate carefully as too many links to penalized pages could result in a penalty for your page.
- PR is still a ranking factor in Google, and while PR itself is not enough to rank well it does help out, especially in competitive situations where small differences in your relevancy score may make larger differences in your rankings at Google.
Commercial aspects of PageRank
While in my opinion the commercial aspects of PageRank are more important than the ranking aspects, I am only going to touch briefly on the economic value of PR.
The selling price of a website is influenced by its PR as often people will judge the importance of a website by its PR. PR may, in fact, represent a lot of work by the site owner and thus there may be some justification for this.
When buying links, the price is almost always influenced by the PR of the linking page, even though the PR of the linking page will seldom do more than perhaps increase your PageRank. This may not seem terribly logical, but the marketplace is not always logical and we have to accept this fact as a commercial reality.
Manipulation of PageRank is not often discussed, but it should be a consideration for everyone with a website that they want to rank better. PageRank can be manipulated both by soliciting links that point to pages whose PR you want to increase and by careful consideration of the architecture of your site. It’s easy enough to see how gaining links that point to your pages will increase your PageRank, but it is not so easy to understand the effect of site architecture on PR.
Links internal to a site were not discriminated against in the original Google equation, and that still seems to be the case today. Although PageRank is generated on a page-by-page basis, it is useful to think of each website as having a certain amount of PageRank contributed by each of its pages. If we control how we link within and outside of the site, we can channel PR to those pages which need it most.
Most of us place the most emphasis on the home or index page of our sites and use that page to target our most competitive keywords. Most of our home pages also have the highest PR and the best rankings, largely because most external links point to the site root and most pages in a site link back to the home page for navigation purposes. If we remember that the PR transferred to each page is divided by the total number of links on the linking page, we can devise ways of conserving the PR available to us and channelling it where it is needed most. For a more thorough understanding of how this works, I recommend that you visit and play with Phil Craven’s Google PageRank Calculator and read his more thorough discussion of PageRank.
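The effect of site architecture can be illustrated with the original formula on two made-up four-page sites. This is a rough sketch only: the page names, the two link structures, and the pagerank helper are assumptions, and external links are left out entirely.

```python
def pagerank(links, d=0.85, iterations=100):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T))."""
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        pr = {
            page: (1 - d) + d * sum(
                pr[t] / len(out) for t, out in links.items() if page in out
            )
            for page in links
        }
    return pr


# Hypothetical site 1: every page links to every other page.
mesh = {
    "home": ["a", "b", "c"],
    "a": ["home", "b", "c"],
    "b": ["home", "a", "c"],
    "c": ["home", "a", "b"],
}
# Hypothetical site 2: subpages link only back to the home page.
hub = {"home": ["a", "b", "c"], "a": ["home"], "b": ["home"], "c": ["home"]}

mesh_pr = pagerank(mesh)
hub_pr = pagerank(hub)
print("mesh:", {page: round(value, 2) for page, value in mesh_pr.items()})
print("hub: ", {page: round(value, 2) for page, value in hub_pr.items()})
```

In the fully interlinked site every page settles at the same value, while in the second layout the home page ends up with roughly 1.9 and each subpage with roughly 0.7. The total is the same in both cases; only the distribution between pages changes.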
While many do not like the words manipulation and search engines used in the same paragraph, I see no problem with making the most of the PageRank Google provides us.
PageRank is often deemed important by webmasters, but we feel that this is true mostly for commercial reasons. PageRank by itself is not likely to help your page rank well except in some situations where it may be used to break a ranking tie. Whether you’re primarily interested in PageRank for ranking or economic reasons, making the most of what is available to you makes good sense.
There are many ways to improve the use of PageRank on your site, but they generally fall into two categories: conserving what you have and channelling what you have to where you need it most.
In general, conserving means either not linking externally too much or, if you do link externally, using a link that Google does not recognize in its PR calculations.
Google, in collaboration with Yahoo and MSN, has recently introduced the nofollow attribute. A normal link with the attribute added would look something like this: <a href="http://somesite.com" rel="nofollow">, which will result in the major search engines not spidering that link.
Just be sure that you do this ethically: if you are exchanging links, be sure that you give a crawlable link back.
DHTML menu systems that control the visibility of layers as a means of showing the dropdown pages tend to spider very well.
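One way to check which links on a page carry the attribute is to parse the HTML and sort the links into two groups. The sketch below uses Python's standard html.parser module; the class name LinkAuditor and the sample markup are made up for illustration, and it only looks at rel="nofollow" on individual links, not at other ways a link might be hidden from spiders.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Collect link targets, separating rel="nofollow" links from the rest."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return  # an anchor without a target is not a link
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Hypothetical page fragment used only for this example.
sample_html = """
<a href="http://somesite.com" rel="nofollow">Some site</a>
<a href="/about.html">About us</a>
"""

auditor = LinkAuditor()
auditor.feed(sample_html)
print("followed:", auditor.followed)
print("nofollow:", auditor.nofollowed)
```

Running a check like this over a links page before agreeing to an exchange makes it easy to confirm that the link you give back is a crawlable one.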