BlueVoda Website builder

search engine optimization

Getting an outside company to carry out search engine optimization for you can be an expensive experiment; such companies often use suspect methods that produce only temporary results. The most important thing is the site itself and the HTML behind it, rather than the actual submitting to search engines. First, here's some background on search engines and directories.

search engines

  • The user browses the search engine's database of indexed sites by entering keywords or phrases into a search box
  • "Crawl" web sites using software - often called "spidering"
  • Some use the <meta> tag information
    • e.g. AltaVista and Excite
  • Index the content of the <title> tag and actual page content
  • Site may be found by search engine spidering software
  • Usually need to submit site
  • Users find sites by searching for keywords
  • Can sometimes change your listing
  • e.g. Google


directories

  • Listings where the user has to choose the category / categories to search through
  • Compiled by human editors
  • Each has very specific standards which sites must meet to get listed - very strict
  • Site may be found by people who compile directory
    • Usually need to submit site
  • Can also guarantee someone will look at your site quickly if you pay
    • Doesn't guarantee entry to directory, but more likely
  • Difficult to change your listing once it's there
  • Some list sites in alphabetical order, so this will obviously determine how far down the list your site appears
  • The main disadvantage is that users have to know which category to choose
    • And you have to know which category users may search for your type of site in
  • e.g. Yahoo

other points

  • Many search engines / directories are both - list sites from an indexed database as well as from category listings
    • Directories will often list the results from their directory first, then display results from partnering search engine(s)
    • e.g. Yahoo shows results from its own database first, then uses the Google search engine to display further results
  • Important to submit correctly to the search engines and directories
    • So your site comes up when a user searches for relevant keywords/through relevant categories
  • It is crucial to do this manually to at least the top 20
    • Can use software, such as SubmitWolf, to register with 100s of others, but not the top 20
  • They each have different criteria by which they allow web sites to be listed
    • Failing to comply could result in a bad listing or being permanently rejected
  • This is becoming increasingly important - there's no point having a fantastic web site if no one finds it!

submission tips

html & page design

  1. Keep HTML as simple as possible
  2. Keep the most important textual information at the top of the html
    • For users - so they don't have to scroll
    • For Search Engines - include keyword phrases which are in the meta tags
    • Using tables etc can mean text is further down the html than the user sees on screen
    • Consider using a layers-based design - you can use CSS to position the layers so that textual content is at the top of the HTML even if it appears further down the visible page
    • If keywords are also links - increases their efficacy
    • Use keywords in your <h1 - h6> heading and subheading tags
    • Each page should have lots of keyword rich content for its topic and corresponding meta tags
  3. Include textual hyper links to important areas of the site from the home page
  4. Call files / folders / domains (and subdomains) something descriptive - directories like this, and it acts as another keyword
  5. Include a site map, linked to from the home page
    • Means the search engine will be able to index the whole site quickly and easily
  6. Put all JavaScript & CSS in separate files
    • Include a robots.txt in the root of the web site directory detailing js and css files to be excluded from indexing
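Point 6 can be sketched as a robots.txt in the site root (the /js/ and /css/ directory names here are hypothetical placeholders for wherever your script and stylesheet files actually live):

```text
User-agent: *
Disallow: /js/
Disallow: /css/
```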

meta tags

  1. The meta tags (in the <head> tag of the HTML of each page) for title, keywords and description should be utilized to their full value, and match keywords that appear in the body of the page
  2. Don't use the same html meta tags across the site - each page should have its own
  3. Optimise your meta tag information
    • View source on sites of the same genre as yours that come up highly on search engines and directories and see what keywords they use
    • Check your site stats to see which words and phrases your users are searching for


submitting

  1. Submit manually to each search engine/directory once
    • Usually submit your home page (and any other key pages from your site on some search engines - they each have limits)
    • Be careful (i.e. don't spam) about the ones which use each other's databases:
      • Excite, Magellan, and Webcrawler use the same database
      • AOL Search, AltaVista, AltaVista UK, DogPile, Go2Net, Google, HotBot, Infospace, Lycos and Netscape Search all use the Open Directory
    • If your site hasn't appeared after 4-6 weeks - repeat submission
  2. Submit to the correct category in directories
  3. Some search engines base their listings on link popularity - how many other sites link to yours? - are there any sites which may be willing to link to yours?
    • Referring sites have to be of good quality and/or similar subject area
  4. Use the "final comments" box where they have one - last chance to make an impression
  5. Read all the search engines'/directories' "How to submit your site" sections before submitting your site (most have them)
  6. Consider "buying your way in" - Yahoo, LookSmart etc have the option to pay to have your site reviewed within a few days
    • Doesn't guarantee listing, just that your site will get reviewed quickly

what not to do

  1. Re-direct any page (you want listed) using JavaScript
    • This probably includes the JavaScript some people put in their <body> tag to break out of frames
    • It definitely includes JavaScript re-directs used to detect browsers for plug-ins
  2. Have no real HTML content on the home page
    • This is especially true for graphics-heavy "splash pages" and Flash enabled sites
  3. Include a "hidden" paragraph in a small font size or the same colour as the background - some search engines/directories (e.g. Yahoo) will actually penalise for this - considered spam
    • Especially if this hidden content is also just a list of keywords
  4. Put insufficient text links on the page, so there's nowhere for the search engine to spider
  5. Put all navigational links in layers/javascript - the search engines can't always spider through
    • This can include rollover images as links
  6. Use frames - search engines can't spider them
    • In the search engine results, the site will be separated into the individual frames
    • If you MUST use frames - make sure there are navigational links in each frame used
    • And always use the <noframes> tag
  7. Use spamming techniques - it will get you barred from search engines and directories

improving the listing

  1. Analyse site stats
    • Which keywords people are searching for - change your meta tags and content accordingly
    • Which search engines/directories are/are not putting traffic to the site
    • Click-through rate improves site listing
      • See if / where users are clicking "Back" button - negative effect on listing
    • How long users stay on site / pages - can improve listing
  2. Ensure the page design is not detrimental to search engines spidering your site
  3. Possibly have information rich pages
    • About things users would be interested in
    • Updated frequently - visitors return for updates
    • Possibly newsfeeds?
    • Link these info rich pages from site map

meta tags

  • Meta tags are included in the html, between the <head> and </head> at the top of the document
  • It is crucial to get the correct information in these as this could decide whether or not your web site is listed
  • The meta tags (in the HTML of each page) for title, keywords and description should be utilized to their full value, and match keywords that appear in the body of the page
    • Ensuring these keywords and phrases are in the page headings, body content and links is highly beneficial
    • Do not repeat them more than five times on a page though - search engines can pick this up as spamming
  • Some search engines (not all) use meta tags to obtain the information about your site - directories don't use them
  • Meta tags are an important aspect of search engine optimization

step by step instructions


  1. Think of a short keyword intensive phrase for the page
    • Keep to 90-150 characters or less (incl. spaces, punctuation etc.)
      <title>Meta tags for search engine optimization</title>
  2. Use this phrase as the basis for a short paragraph for the description meta tag
    • Do not split the phrase up in the paragraph - keep it intact
    • It is beneficial to include any other keyword phrases that may be relevant for the page
    • Keep to 250 characters (incl. spaces, punctuation etc.)
      <meta name="description" content="Meta tags for search engine optimization, including page title, keywords and description.">
  3. Use this description paragraph as the opening paragraph in the main content of the page
    • n.b. There is not much point putting keywords in the description meta tag if they are not included in the actual body content anywhere
  4. Think of any other possible keyword/keyword phrases which are relevant to the page
    • Use these and the keywords from the title and description for the keywords meta tag
      <meta name="keywords" content="meta tags search engine optimization keywords description page title">
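Putting the four steps together, the <head> of this example page would contain something like the following (using the illustrative title, description and keywords from the steps above):

```html
<head>
<title>Meta tags for search engine optimization</title>
<meta name="description" content="Meta tags for search engine optimization, including page title, keywords and description.">
<meta name="keywords" content="meta tags search engine optimization keywords description page title">
</head>
```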


The <title> tag is becoming increasingly important to listings.

  • It should be descriptive, but short (about 150 characters)
  • Include keywords
    • Build it around a target phrase for the page
    • It needs to make sense as a sentence, but try to limit the use of irrelevant words (a, the, etc.)
    • Try to make it specific to your web site
    • Never use over-the-top marketing hype ("we are the best site" etc.)
      • It is not descriptive and does not include valuable keywords
      • Can get you kicked off certain search engines/directories!
  • What the user sees at the top of their browser and therefore what is shown when they bookmark a page - make it useful!


The description (<meta name="description" content="This is what the user sees when the search engine displays the site in a list among all other similar sites and can sway which they click on">)

  • This should be informative and not too long (there are recommended limits - around 250 characters) as only a set number of characters will be displayed
  • Should include html keyword phrases
    • Ideally keywords which are repeated in the title and body content of the page
    • Put your most important keywords at the beginning of the description
    • Don't use too many stop words (a, on, the, etc) - search engines don't index them - wastes your description


The keywords (<meta name="keywords" content="keyword keyword keyword">)

  • These are the possible keywords a user may submit when searching for information
  • It is worth putting in variations, acronyms (e.g. employment, vacancies, jobs, careers) and spelling mistakes of words
  • Use plural over singular
  • Don't use commas between keywords:
    • Adds characters and separates words which may be searched for together
    • Makes repeating of words harder (SEs don't like repetition)
    • Experiment - if no commas isn't helping your listing, reassess after a few months
  • 1000 characters (or less) in total - more is just ignored
  • Don't repeat a keyword more than 3 times
    • Be careful over variations of the same word (e.g. cook, cooked, cooking)
  • Match with important keywords in the body content of the page

spamming search engines

  • Search Engines & Directories want to have the best sites listed
    • Increases the user experience
    • Gets them more sponsorship etc
  • They take a harsh line against spamming
    • Spamming: using "tricks" to try and get your site listed


spamming techniques

  1. Use hidden keywords
    • Text the same colour as the background colour
  2. Use "doorway" pages to submit with
    • These are usually pages which contain nothing more than keywords and a link to your home page
  3. Use other company names in your meta tags / content of your site without permission
    • Considered spamming, and could get you sued by said company
  4. Use "popular" phrases in keywords (e.g. sex, porn, MP3, Britney Spears) to try and boost listing
  5. Submit to a search engine / directory too frequently
    • Leave 4-6 weeks between submissions
  6. Include hidden links to other sites to try and improve listing
    • Links to other sites don't help unless the referring site is very popular and preferably of the same genre as yours
    • Besides, you need good quality links to your site to improve your ranking
    • Hidden links will be considered spam anyway


frames

If you really have to use frames (which I would never advocate because of usability issues and search engine listings), here's how to create a framed site which may have a chance on search engines.

People often use frames to provide an easy-to-maintain site. Navigation can be kept in a separate frame (often on the left-hand side) and therefore be updated just once, and it remains visible while the rest of the page scrolls. The same is true for a header or a footer that stays static while the content is scrollable. Some companies like this so that the corporate logo and contact information are always visible and easy to find. Using frames on a Flash-based site is one way to get content indexed by search engines, as at present search engines can't spider the content in Flash.

As an argument against frames, the easy-to-update debate can be quashed by the use of server side includes, which are probably easier to maintain than frames. And while having the company logo and contact information visible is important, this can be achieved by having pages which don't scroll too far down. Frames have many usability issues, mainly the URL being destroyed and pages becoming orphaned.

Frames are bad for search engine optimization because:

  • search engine spiders cannot spider the content of a site because they cannot follow the links within a frameset
  • there is also no body text to index on the main frameset

These two factors can be counteracted by:

  • using the <noframes> tag properly
  • making sure that the content inside your <noframes> tag contains links to the rest of your site

The search engine will spider this information as it would a normal html page.
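As a sketch, a frameset page that uses <noframes> properly might look like this (the file names are hypothetical; the point is that the <noframes> content contains real text and links for the spider to follow):

```html
<html>
<head>
<title>Example framed site</title>
</head>
<frameset cols="20%,80%">
  <frame src="nav.html" name="nav">
  <frame src="content.html" name="content">
  <noframes>
  <body>
    <p>A keyword-rich description of the site for search engines
    and browsers that can't display frames.</p>
    <p><a href="nav.html">Navigation</a>
       <a href="content.html">Content</a>
       <a href="sitemap.html">Site map</a></p>
  </body>
  </noframes>
</frameset>
</html>
```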

other issues

Make sure there are navigational links in the main body content of all your internal framed pages - many sites often put these at the bottom of the page. This is useful for both the search engine to spider and for any user who finds the site on a search engine, as the page will be orphaned from its frameset.

using site traffic stats

  • Which keywords people are searching for
    • Update your meta tag keywords / content keywords and links accordingly
  • Which search engines/directories are/are not putting traffic to the site
  • Click-through rate improves site listing
    • i.e. if your site is clicked on from a search engine listing
    • Also, how far into the site users go - i.e. how many clicks (may be measured)
  • Check site stats to see if/where users are clicking "Back" button - negative effect on listing
    • By seeing how long users stay on your site - search engines can "time" how quickly a visitor returns to their site from yours
    • Also, by seeing which pages are "single access pages" (i.e. the visitor arrives and exits from the same page, without visiting others)
  • Which sites have links to your site ("referring sites")
    • Value of these referring sites - more valuable if they themselves are popular and/or in same category as your site


Google's PageRank Explained

Not long ago, there was just one well-known PageRank Explained paper, to which most interested people referred when trying to understand the way that PageRank works. In fact, I used it myself. But when I was writing the PageRank Calculator, I realized that the original paper was misleading in the way that the calculations were done. It uses its own form of PageRank, which the author calls "mini-rank". Mini-rank changes Google's PageRank equation for no apparent reason, making the results of the calculations very misleading.

Even though the author abandoned mini-rank as a result of this and another paper, the original, unchanged paper is still available on the web. So if you come across a PageRank Explained paper that uses "mini-rank", it has been superseded and is best ignored.

What is Page Rank?

PageRank is a numeric value that represents how important a page is on the web. Google figures that when one page links to another page, it is effectively casting a vote for the other page. The more votes that are cast for a page, the more important the page must be. Also, the importance of the page that is casting the vote determines how important the vote itself is. Google calculates a page's importance from the votes cast for it, and the importance of each vote is taken into account when the page's PageRank is calculated.

PageRank is Google's way of deciding a page's importance. It matters because it is one of the factors that determines a page's ranking in the search results. It isn't the only factor that Google uses to rank pages, but it is an important one. From here on in, we'll occasionally refer to PageRank as "PR".


Not all links are counted by Google. For instance, they filter out links from known link farms. Some links can cause a site to be penalized by Google. They rightly figure that webmasters cannot control which sites link to their sites, but they can control which sites they link out to. For this reason, links into a site cannot harm the site, but links from a site can be harmful if they link to penalized sites. So be careful which sites you link to. If a site has PR0, it is usually a penalty, and it would be unwise to link to it.

How is Page Rank Calculated?

To calculate the PageRank for a page, all of its inbound links are taken into account. These are links from within the site and links from outside the site.

PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn))

That's the equation that calculates a page's PageRank. It's the original one that was published when PageRank was being developed, and it is probable that Google uses a variation of it, but they aren't telling us what it is. It doesn't matter though, as this equation is good enough.

In the equation, 't1 - tn' are pages linking to page A, 'C' is the number of outbound links that a page has, and 'd' is a damping factor, usually set to 0.85.

We can think of it in a simpler way:-

a page's PageRank = 0.15 + 0.85 * (a "share" of the PageRank of every page that links to it)

"share" = the linking page's PageRank divided by the number of outbound links on the page.

A page "votes" an amount of PageRank onto each page that it links to. The amount of PageRank that it has to vote with is a little less than its own PageRank value (its own value * 0.85). This value is shared equally between all the pages that it links to.

From this, we could conclude that a link from a page with PR4 and 5 outbound links is worth more than a link from a page with PR8 and 100 outbound links. The PageRank of a page that links to yours is important but the number of links on that page is also important. The more links there are on a page, the less PageRank value your page will receive from it.

If the PageRank value differences between PR1, PR2,.....PR10 were equal then that conclusion would hold up, but many people believe that the values between PR1 and PR10 (the maximum) are set on a logarithmic scale, and there is very good reason for believing it. Nobody outside Google knows for sure one way or the other, but the chances are high that the scale is logarithmic, or similar.

If so, it means that it takes a lot more additional PageRank for a page to move up to the next PageRank level than it did to move up from the previous PageRank level. The result is that it reverses the previous conclusion, so that a link from a PR8 page that has lots of outbound links is worth more than a link from a PR4 page that has only a few outbound links.

Whichever scale Google uses, we can be sure of one thing. A link from another site increases our site's PageRank. Just remember to avoid links from link farms.

Note that when a page votes its PageRank value to other pages, its own PageRank is not reduced by the value that it is voting. The page doing the voting doesn't give away its PageRank and end up with nothing. It isn't a transfer of PageRank. It is simply a vote according to the page's PageRank value. It's like a shareholders meeting where each shareholder votes according to the number of shares held, but the shares themselves aren't given away. Even so, pages do lose some PageRank indirectly, as we'll see later.

Ok so far? Good. Now we'll look at how the calculations are actually done.

For a page's calculation, its existing PageRank (if it has any) is abandoned completely and a fresh calculation is done where the page relies solely on the PageRank "voted" for it by its current inbound links, which may have changed since the last time the page's PageRank was calculated.

The equation shows clearly how a page's PageRank is arrived at. But what isn't immediately obvious is that it can't work if the calculation is done just once. Suppose we have 2 pages, A and B, which link to each other, and neither have any other links of any kind. This is what happens:-

Step 1: Calculate page A's PageRank from the value of its inbound links

Page A now has a new PageRank value. The calculation used the value of the inbound link from page B. But page B has an inbound link (from page A) and its new PageRank value hasn't been worked out yet, so page A's new PageRank value is based on inaccurate data and can't be accurate.

Step 2: Calculate page B's PageRank from the value of its inbound links

Page B now has a new PageRank value, but it can't be accurate because the calculation used the new PageRank value of the inbound link from page A, which is inaccurate.

It's a Catch 22 situation. We can't work out A's PageRank until we know B's PageRank, and we can't work out B's PageRank until we know A's PageRank.

Now that both pages have newly calculated PageRank values, can't we just run the calculations again to arrive at accurate values? No. We can run the calculations again using the new values and the results will be more accurate, but we will always be using inaccurate values for the calculations, so the results will always be inaccurate.

The problem is overcome by repeating the calculations many times. Each time produces slightly more accurate values. In fact, total accuracy can never be achieved because the calculations are always based on inaccurate values. 40 to 50 iterations are sufficient to reach a point where any further iterations wouldn't produce enough of a change to the values to matter. This is precisely what Google does at each update, and it's the reason why the updates take so long.
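The iterative process is easy to reproduce. The following sketch (my own illustration using the published equation, not Google's actual code) runs the calculation repeatedly over the two-page A/B example; starting from zero, both pages converge towards a PageRank of 1.0:

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1-d) + d*(PR(t1)/C(t1) + ... + PR(tn)/C(tn)).

    links maps each page to the list of pages it links to; every page
    is assumed to have at least one outbound link."""
    pr = {page: 0.0 for page in links}  # arbitrary starting values
    for _ in range(iterations):
        new_pr = {}
        for page in links:
            # fresh calculation: existing PageRank is abandoned each pass
            share = sum(pr[t] / len(links[t])
                        for t in links if page in links[t])
            new_pr[page] = (1 - d) + d * share
        pr = new_pr
    return pr

# Pages A and B link only to each other
ranks = pagerank({"A": ["B"], "B": ["A"]})
```

After 40 to 50 iterations the values stop changing appreciably, which matches the convergence behaviour described above.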

One thing to bear in mind is that the results we get from the calculations are proportions. The figures must then be set against a scale (known only to Google) to arrive at each page's actual PageRank. Even so, we can use the calculations to channel the PageRank within a site around its pages so that certain pages receive a higher proportion of it than others.


You may come across explanations of PageRank where the same equation is stated but the result of each iteration of the calculation is added to the page's existing PageRank. The new value (result + existing PageRank) is then used when sharing PageRank with other pages. These explanations are wrong for the following reasons:-

1. They quote the same, published equation - but then change it from PR(A) = (1-d) + d(......) to PR(A) = PR(A) + (1-d) + d(......)
It isn't correct, and it isn't necessary.

2. We will be looking at how to organize links so that certain pages end up with a larger proportion of the PageRank than others. Adding to the page's existing PageRank through the iterations produces different proportions than when the equation is used as published. Since the addition is not a part of the published equation, the results are wrong and the proportioning isn't accurate.

According to the published equation, the page being calculated starts from scratch at each iteration. It relies solely on its inbound links. The 'add to the existing PageRank' idea doesn't do that, so its results are necessarily wrong.

Internal linking

Fact: A website has a maximum amount of PageRank that is distributed between its pages by internal links.

The maximum PageRank in a site equals the number of pages in the site * 1. The maximum is increased by inbound links from other sites and decreased by outbound links to other sites. We are talking about the overall PageRank in the site and not the PageRank of any individual page. You don't have to take my word for it. You can reach the same conclusion by using a pencil and paper and the equation.

Fact: The maximum amount of PageRank in a site increases as the number of pages in the site increases.

The more pages that a site has, the more PageRank it has. Again, by using a pencil and paper and the equation, you can come to the same conclusion. Bear in mind that the only pages that count are the ones that Google knows about.
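For example, running the iteration over a small fully interlinked site shows the total settling at the number of pages. This is a sketch of the pencil-and-paper check, assuming no links into or out of the site:

```python
d = 0.85
# a three-page site where every page links to the other two
pr = {"A": 0.0, "B": 0.0, "C": 0.0}
for _ in range(50):
    pr = {page: (1 - d) + d * sum(pr[other] / 2 for other in pr if other != page)
          for page in pr}

total = sum(pr.values())  # converges towards 3.0 - the number of pages
```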

Fact: By linking poorly, it is possible to fail to reach the site's maximum PageRank, but it is not possible to exceed it.

Poor internal linkages can cause a site to fall short of its maximum but no kind of internal link structure can cause a site to exceed it. The only way to increase the maximum is to add more inbound links and/or increase the number of pages in the site.

Cautions: Whilst I thoroughly recommend creating and adding new pages to increase a site's total PageRank so that it can be channeled to specific pages, there are certain types of pages that should not be added. These are pages that are all identical or very nearly identical and are known as cookie-cutters. Google considers them to be spam and they can trigger an alarm that causes the pages, and possibly the entire site, to be penalized. Pages full of good content are a must.

Inbound links

Inbound links (links into the site from the outside) are one way to increase a site's total PageRank. The other is to add more pages. Where the links come from doesn't matter. Google recognizes that a webmaster has no control over other sites linking into a site, and so sites are not penalized because of where the links come from. There is an exception to this rule but it is rare and doesn't concern this article. It isn't something that a webmaster can accidentally do.

The linking page's PageRank is important, but so is the number of links going from that page. For instance, if you are the only link from a page that has a lowly PR2, you will receive an injection of 0.15 + 0.85(2/1) = 1.85 into your site, whereas a link from a PR8 page that has another 99 links from it will increase your site's PageRank by 0.15 + 0.85(8/100) = 0.218. Clearly, the PR2 link is much better - or is it?
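A quick check of that arithmetic (my own sketch; the PR8 page's share is its PageRank divided by its 100 outbound links, per the equation):

```python
def injection(linking_pr, links_on_page, d=0.85):
    """PageRank received from a single inbound link, per the equation."""
    return (1 - d) + d * linking_pr / links_on_page

sole_link_from_pr2 = injection(2, 1)     # 0.15 + 0.85(2/1) = 1.85
one_of_100_from_pr8 = injection(8, 100)  # 0.15 + 0.85(8/100) = 0.218
```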

Once the PageRank is injected into your site, the calculations are done again and each page's PageRank is changed. Depending on the internal link structure, some pages' PageRank is increased, some are unchanged but no pages lose any PageRank.

It is beneficial to have the inbound links coming to the pages to which you are channeling your PageRank. A PageRank injection to any other page will be spread around the site through the internal links. The important pages will receive an increase, but not as much of an increase as when they are linked to directly. The page that receives the inbound link makes the biggest gain.

It is easy to think of our site as being a small, self-contained network of pages. When we do the PageRank calculations we are dealing with our small network. If we make a link to another site, we lose some of our network's PageRank, and if we receive a link, our network's PageRank is added to. But it isn't like that. For the PageRank calculations, there is only one network - every page that Google has in its index. Each iteration of the calculation is done on the entire network and not on individual websites.

Because the entire network is interlinked, and every link and every page plays its part in each iteration of the calculations, it is impossible for us to calculate the effect of inbound links to our site with any realistic accuracy.

Outbound links

Outbound links are a drain on a site's total PageRank. They leak PageRank. To counter the drain, try to ensure that the links are reciprocated. Depending on the PageRank of the pages at each end of an external link, and the number of links out from those pages, reciprocal links can gain or lose PageRank. You need to take care when choosing where to exchange links.

When PageRank leaks from a site via a link to another site, all the pages in the internal link structure are affected. (This doesn't always show after just 1 iteration.)

The page that you link out from makes a difference to which pages suffer the most loss. Without a program to perform the calculations on specific link structures, it is difficult to decide on the right page to link out from, but the generalization is to link from the one with the lowest PageRank.

Many websites need to contain some outbound links that are nothing to do with PageRank. Unfortunately, all 'normal' outbound links leak PageRank. But there are 'abnormal' ways of linking to other sites that don't result in leaks. PageRank is leaked when Google recognizes a link to another site. The answer is to use links that Google doesn't recognize or count. These include form actions and links contained in javascript code.

Form actions

A form's 'action' attribute does not need to be the url of a form parsing script. It can point to any html page on any site. Try it.

<form name="myform" action="">
<a href="javascript:document.myform.submit()">Click here</a>
</form>

To be really sneaky, the action attribute could be in some javascript code rather than in the form tag, and the javascript code could be loaded from a 'js' file stored in a directory that is barred to Google's spider by the robots.txt file.

Javascript links

Example: <a href="javascript:goto('wherever')">Click here</a>

Like the form action, it is sneaky to load the javascript code, which contains the urls, from a separate 'js' file, and sneakier still if the file is stored in a directory that is barred to googlebot by the robots.txt file.
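As a sketch of that technique (the file name and the goto() function are my own hypothetical illustration, not a standard API):

```html
<script type="text/javascript" src="/hidden/links.js"></script>
<!-- /hidden/links.js, barred to googlebot by robots.txt, contains:
     function goto(page) { window.location.href = page + '.html'; }
-->
<a href="javascript:goto('wherever')">Click here</a>
```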

The "rel" attribute

As of 18th January 2005, Google, together with other search engines, is recognising a new attribute to the anchor tag. The attribute is "rel", and it is used as follows:-

<a href="" rel="nofollow">link text</a>

The attribute tells Google to ignore the link completely. The link won't help the target page's PageRank, and it won't help its rankings. It is as though the link doesn't exist. With this attribute, there is no longer any need for javascript, forms, or any other method of hiding links from Google.

So how much additional PageRank do we need to move up the toolbar?

First, let me explain in more detail why the values shown in the Google toolbar are not the actual PageRank figures. According to the equation, and to the creators of Google, the billions of pages on the web average out to a PageRank of 1.0 per page. So the total PageRank on the web is equal to the number of pages on the web * 1, which equals a lot of PageRank spread around the web.

The Google toolbar range is from 1 to 10. (They sometimes show 0, but that figure isn't believed to be a PageRank calculation result). What Google does is divide the full range of actual PageRanks on the web into 10 parts - each part is represented by a value as shown in the toolbar. So the toolbar values only show what part of the overall range a page's PageRank is in, and not the actual PageRank itself. The numbers in the toolbar are just labels.

Whether or not the overall range is divided into 10 equal parts is a matter for debate - Google aren't saying. But because it is much harder to move up a toolbar point at the higher end than it is at the lower end, many people (including me) believe that the divisions are based on a logarithmic scale, or something very similar, rather than the equal divisions of a linear scale.

Let's assume that it is a logarithmic, base 10 scale, and that it takes 10 properly linked new pages to move a site's important page up 1 toolbar point. It will take 100 new pages to move it up another point, 1000 new pages to move it up one more, 10,000 to the next, and so on. That's why moving up at the lower end is much easier than at the higher end.
In reality, the base is unlikely to be 10. Some people think it is around the 5 or 6 mark, and maybe even less. Even so, it still gets progressively harder to move up a toolbar point at the higher end of the scale.
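Under that assumption - a logarithmic scale with some base, which is purely illustrative - the number of new pages needed per toolbar point can be sketched as:

```javascript
// Illustrative only: if each toolbar point covers "base" times the
// PageRank of the point below it, the properly linked new pages
// needed to climb to a given point grow by that factor per step.
function pagesToClimb(point, base) {
  return Math.pow(base, point);
}

// With base 10: 10 pages for the first point, 100 for the next,
// 1,000 for the one after that, and so on.
```

With a smaller base, say 5 or 6, the numbers are less extreme, but the progression is still geometric.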

Note that as the number of pages on the web increases, so does the total PageRank on the web, and as the total PageRank increases, the positions of the divisions in the overall scale must change. As a result, some pages drop a toolbar point for no 'apparent' reason. If the page's actual PageRank was only just above a division in the scale, the addition of new pages to the web would cause the division to move up slightly and the page would end up just below the division. Google's index is always increasing and they re-evaluate each of the pages on more or less a monthly basis. It's known as the "Google dance". When the dance is over, some pages will have dropped a toolbar point. A number of new pages might be all that is needed to get the point back after the next dance.

The toolbar value is a good indicator of a page's PageRank but it only indicates that a page is in a certain range of the overall scale. One PR5 page could be just above the PR5 division and another PR5 page could be just below the PR6 division - almost a whole division (toolbar point) between them.


Domain names and filenames
To a spider, www.domain.com, domain.com, www.domain.com/index.html and domain.com/index.html are different urls and, therefore, different pages. Surfers arrive at the site's home page whichever of the urls is used, but spiders see them as individual urls, and that makes a difference when working out the PageRank. It is better to standardize on one url for the site's home page. Otherwise each url can end up with a different PageRank, whereas all of it should have gone to just one url.

If you think about it, how can a spider know the filename of the page that it gets back when requesting www.domain.com/? It can't. The filename could be index.html, index.htm, index.php, default.html, etc. The spider doesn't know. If you link to index.html within the site, the spider could compare the 2 pages but that seems unlikely. So they are 2 urls and each receives PageRank from inbound links. Standardizing the home page's url ensures that the PageRank it is due isn't shared with ghost urls.
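The standardization can be thought of as a small normalizing rule. This sketch (hypothetical domain, and covering only the "www." and index filename variants discussed here) collapses the variant urls into one canonical form:

```javascript
// Collapse the variant urls a spider might see for the same home
// page into one canonical form. Only handles the "www." and
// "index.html"/"index.htm" variants discussed in the text.
function canonicalHome(url) {
  return url
    .replace(/^http:\/\/(?!www\.)/, 'http://www.') // add missing "www."
    .replace(/\/(index\.html?)?$/, '/');           // strip the filename
}
```

Both 'http://domain.com/index.html' and 'http://www.domain.com/' come out as 'http://www.domain.com/'. In practice the same effect is usually achieved with a server-side 301 redirect rather than anything client-side.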

Example: Go to my UK Holidays and UK Holiday Accommodation site - how's that for a nice piece of link text ;). Notice that the url in the browser's address bar contains "www.". If you have the Google Toolbar installed, you will see that the page has PR5. Now remove the "www." part of the url and get the page again. This time it has PR1, and yet they are the same page.

Actually, the PageRank is for the unseen frameset page. When this article was first written, the non-www URL had PR4 due to using different versions of the link URLs within the site. It had the effect of sharing the page's PageRank between the 2 pages (the 2 versions) and, therefore, between the 2 sites. That's not the best way to do it. Since then, I've tidied up the internal linkages and got the non-www version down to PR1 so that the PageRank within the site mostly stays in the "www." version, but there must be a site somewhere that links to it without the "www." that's causing the PR1.

Imagine the page, www.domain.com/index.html. The index page contains links to several relative urls; e.g. products.html and details.html. The spider sees those urls as www.domain.com/products.html and www.domain.com/details.html. Now let's add an absolute url for another page, only this time we'll leave out the "www." part - domain.com/anotherpage.html. This page links back to the index.html page, so the spider sees the index page as domain.com/index.html. Although it's the same index page as the first one, to a spider, it is a different page because it's on a different domain. Now look what happens. Each of the relative urls on the index page is also different because it belongs to the domain.com domain. Consequently, the link structure is wasting the site's potential PageRank by spreading it between ghost pages.
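The way relative urls inherit whichever host the page was fetched from can be checked with the standard URL parser (the domains are hypothetical):

```javascript
// The same relative link, resolved against the two ways a spider
// might have reached index.html - it yields two different urls.
const fromWww = new URL('products.html', 'http://www.domain.com/index.html').href;
const fromBare = new URL('products.html', 'http://domain.com/index.html').href;
// fromWww and fromBare differ only in the "www.", but to a spider
// they are separate pages on separate domains.
```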

Adding new pages
There is a possible negative effect of adding new pages. Take a perfectly normal site. It has some inbound links from other sites and its pages have some PageRank. Then a new page is added to the site and is linked to from one or more of the existing pages. The new page will, of course, acquire PageRank from the site's existing pages.

The effect is that, whilst the total PageRank in the site is increased, one or more of the existing pages will suffer a PageRank loss due to the new page making gains. Up to a point, the more new pages that are added, the greater the loss to the existing pages. With large sites, this effect is unlikely to be noticed but, with smaller ones, it probably would be. So, although adding new pages does increase the total PageRank within the site, some of the site's pages will lose PageRank as a result. The answer is to link new pages in such a way within the site that the important pages don't suffer, or add sufficient new pages to make up for the effect (that can sometimes mean adding a large number of new pages), or better still, get some more inbound links.
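The redistribution can be seen with a toy version of the PageRank calculation (d = 0.85, hypothetical three-page site). Before a new page C is added, A and B each settle at PR 1.0; afterwards B drops to roughly 0.77, even though the site's total rises from 2.0 to 3.0:

```javascript
// Toy PageRank: pr(p) = 0.15 + 0.85 * sum, over pages q linking to
// p, of pr(q) / (number of q's outbound links).
function pagerank(links, iterations) {
  const pages = Object.keys(links);
  let pr = {};
  pages.forEach(p => { pr[p] = 1.0; });
  for (let i = 0; i < iterations; i++) {
    const next = {};
    pages.forEach(p => { next[p] = 0.15; });
    pages.forEach(p => {
      links[p].forEach(t => { next[t] += 0.85 * pr[p] / links[p].length; });
    });
    pr = next;
  }
  return pr;
}

// Two pages linking to each other: both hold PR 1.0.
const before = pagerank({ A: ['B'], B: ['A'] }, 50);

// Add page C, linked from A and linking back: A's vote is now split
// between B and C, so B loses PageRank while the site total grows.
const after = pagerank({ A: ['B', 'C'], B: ['A'], C: ['A'] }, 50);
```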


The Google toolbar
If you have the Google toolbar installed in your browser, you will be used to seeing each page's PageRank as you browse the web. But all isn't always as it seems. Many pages that Google displays the PageRank for haven't been indexed in Google and certainly don't have any PageRank in their own right. What is happening is that one or more pages on the site have been indexed and a PageRank has been calculated. The PageRank figure for the site's pages that haven't been indexed is allocated on the fly - just for your toolbar. The PageRank itself doesn't exist.

It's important to know this so that you can avoid exchanging links with pages that really don't have any PageRank of their own. Before making exchanges, search for the page on Google to make sure that it is indexed.

Some people believe that Google drops a page's PageRank by a value of 1 for each sub-directory level below the root directory. E.g. if the value of pages in the root directory is generally around 4, then pages in the next directory level down will be generally around 3, and so on down the levels. Other people (including me) don't accept that at all. Either way, because some spiders tend to avoid deep sub-directories, it is generally considered to be beneficial to keep directory structures shallow (directories one or two levels below the root).

ODP and Yahoo!
It used to be thought that Google gave a PageRank boost to sites that are listed in the Yahoo! and ODP (a.k.a. DMOZ) directories, but these days general opinion is that they don't. There is certainly a PageRank gain for sites that are listed in those directories, but the reason for it is now thought to be this:-

Google spiders the directories just like any other site and their pages have decent PageRank and so they are good inbound links to have. In the case of the ODP, Google's directory is a copy of the ODP directory. Each time that sites are added and dropped from the ODP, they are added and dropped from Google's directory when they next update it. The entry in Google's directory is yet another good, PageRank boosting, inbound link. Also, the ODP data is used for searches on a myriad of websites - more inbound links!

Listings in the ODP are free but, because sites are reviewed by hand, it can take quite a long time to get in. The sooner a working site is submitted, the better.


