I found this interesting link through a post that Frank Arrigo made recently. Basically, it traverses the HTML of a page and turns it into a pretty little graph, colouring each node according to the type of HTML element it represents (link, table structure, etc). At first glance I dismissed it as totally useless, but then I actually started analysing what it showed.
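To make the idea concrete, here is a rough sketch of how such a tool might walk a page and colour its nodes. It is not the tool's actual code: the names `build_graph`, `DomGraphBuilder` and `colour_for` are made up for illustration, and only the red (table) and blue (link) colours come from the graphs discussed here. Everything else is lumped into grey as an assumption.

```python
from html.parser import HTMLParser

# Colours taken from the post: red for table markup, blue for links.
# Lumping every other element into "grey" is an assumption for this sketch.
TABLE_TAGS = {"table", "thead", "tbody", "tfoot", "tr", "th", "td"}

# Elements that never have children or a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def colour_for(tag):
    """Map a tag name to the colour its node would get in the graph."""
    if tag in TABLE_TAGS:
        return "red"
    if tag == "a":
        return "blue"
    return "grey"


class DomGraphBuilder(HTMLParser):
    """Collects one node per element and one edge per parent/child pair."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # (node_id, tag, colour)
        self.edges = []   # (parent_id, child_id)
        self._stack = []  # ids of the elements that are currently open

    def handle_starttag(self, tag, attrs):
        node_id = len(self.nodes)
        self.nodes.append((node_id, tag, colour_for(tag)))
        if self._stack:
            self.edges.append((self._stack[-1], node_id))
        if tag not in VOID_TAGS:
            self._stack.append(node_id)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, if there is one.
        # Unmatched closing tags are ignored.
        for i in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[i]][1] == tag:
                del self._stack[i:]
                break


def build_graph(html):
    """Return (nodes, edges) for the element tree of an HTML string."""
    builder = DomGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges
```

Calling `build_graph(page_source)` gives you a list of coloured nodes and a list of edges that could be handed to any graph layout library. The standard library parser is forgiving rather than strictly correct, so on messy real-world markup the tree will only approximate what a browser builds.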
If you look at the graph for SSW's homepage you can see that it is dominated by red nodes. These nodes represent table markup (TABLE, TR, TD, etc) and reflect the site's heavy reliance on tables for layout. There are also a significant number of blue nodes representing links. The graph is a pretty good indication that Google will be sifting through a lot of redundant markup (the red nodes) to find any content, and that any PageRank earned will be diluted across a lot of pages (the blue nodes). The density of blue nodes also indicates that there are a lot of possible navigation paths from the homepage, which to me suggests that it's most likely overcomplicated and doesn't provide a clear banana for users.
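If you'd rather have numbers than eyeball the colours, the same reading can be roughed out by counting elements. The sketch below is only an illustration: `markup_summary` and `TagCounter` are hypothetical names, and the share of table elements is a crude stand-in for "how red is the graph", not a measure that any search engine is known to use.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that would show up as red (table) nodes in the graph.
TABLE_TAGS = {"table", "thead", "tbody", "tfoot", "tr", "th", "td"}


class TagCounter(HTMLParser):
    """Counts how many times each element appears in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <br/> also end up here, once each.
        self.counts[tag] += 1


def markup_summary(html):
    """Summarise table-heavy markup and link density for an HTML string."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    total = sum(counter.counts.values())
    table_nodes = sum(counter.counts[tag] for tag in TABLE_TAGS)
    return {
        "total": total,
        "table_nodes": table_nodes,      # the red nodes
        "links": counter.counts["a"],    # the blue nodes
        "table_share": table_nodes / total if total else 0.0,
    }
```

Running `markup_summary` over the source of two homepages gives a quick side-by-side of table share and link count, which is roughly what the red and blue clusters show visually.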
In contrast, if you look at the graph for Unwired's homepage it's instantly obvious that their underlying markup is much cleaner. Granted, they don't have as much content as SSW, but for a homepage I think that's almost better. The lack of any red nodes (tables) indicates that a search engine will be focussing on more relevant markup and content. The small number of blue nodes (links) indicates that there is a small set of clearly defined navigation paths from the homepage.
The graph of Fuel Advance's homepage shows very few nodes at all, even though it has more content than the Unwired homepage. This can be attributed partly to a simpler layout (it's fairly bland at the moment, although that is in the process of changing) but more so to the fact that we eliminated a lot of redundant markup during the development phase. (Take a look at the source code of the page and see for yourself.) This is one of the main reasons why our PageRank is so high for such a small site that few people link to.
You can see a whole gallery of generated DOM graphs by looking at the websitesasgraphs tag on Flickr.
Update (9th June 2006): Added inline screenshots for quick reference (but you miss out on the cool animations this way). Added a link to the gallery of graphs.
Update (12th June 2006): Added inline screenshots of the actual websites instead of just the graphs.
technorati tags: websitesasgraphs, CSS, webdev