Visualizing the Internet
Tuesday, July 01, 2008   

A few months ago I wrote about the Internet as a high-stakes game played by millions of people.

The Internet, you see, is an enormous, three-dimensional gaming surface. This gameboard consists of billions of nodes (web pages) joined by trillions of connectors (hyperlinks), fluctuating and evolving in real-time. It helps if you visualize it as a physical thing, and to do that, we'll need to put on our magic decoder glasses.

In the comments to that post, I said I was working on an Internet visualization tool or agent. Today I'd like to introduce that tool, code-named Inviz, short for "Internet Visualizer". It's the piece of software responsible for the images you're about to see. They were generated entirely inside this tool, in a matter of a few minutes.

Get those magic decoder glasses ready. Our first Internet snapshot is in black and white:

That's a living, breathing picture of a small galaxy of web sites as they exist in the real world. Each block in the above diagram is a single web page. Each line is a single hyperlink (or other association between two pages). Everything you see can be manipulated, dragged and dropped, resized, and queried for its properties and content. If you double-click a node, the tool traces the node's hyperlinks and opens up another galaxy of nodes, and so forth. As you can see, after just a few minutes of working with the tool, the relationships can get complex.

That's because the Internet itself is complex, and all Inviz really does is take a (somewhat) intelligent picture of that complexity.
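For the curious, here is a rough idea of what "tracing a node's hyperlinks" might look like. This is a minimal sketch in Python, not Inviz's actual code (which is C# and isn't shown in this post). The names `LinkCollector` and `expand_node` are made up for illustration, and the page content is a hard-coded string rather than a real HTTP response.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> in one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def expand_node(graph, url, html):
    """Record url's outgoing links in graph (a dict: url -> set of linked urls)."""
    parser = LinkCollector(url)
    parser.feed(html)
    graph.setdefault(url, set()).update(parser.links)
    for target in parser.links:
        # Newly discovered pages become nodes with no links until they are expanded.
        graph.setdefault(target, set())
    return graph


# Hypothetical page content standing in for a fetched response.
sample_html = '<a href="/about">About</a> <a href="http://example.org/#top">Elsewhere</a>'
graph = expand_node({}, "http://example.com/", sample_html)
```

Each call to `expand_node` plays the role of one double-click: the node's links are added to the graph, and every linked page shows up as a new node waiting to be expanded in turn.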

Let's zoom out for the bird's-eye view, and tell Inviz to color-code sites according to the domain they belong to:

It's a battleground, our Internet. Can you guess which galaxies belong to Digg.com? Microsoft? Coding the Wheel? Or Jeff Atwood over at Coding Horror?

Give up?

Let's zoom in (Inviz supports zooming by any arbitrary scale factor) and take a look. Here's the diagram at 500%.

So, at least in this particular view, Inviz has chosen a red-white gradient fade for Digg.com.

The galaxy of red blips you saw in the 2nd picture? The Digg.com home page, and (some of) the pages it links to. Light blue nodes? Various Microsoft sub-domains. Dark blue nodes? Good old Jeff Atwood over at CH. Black nodes? Sites we don't care about.
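Color-coding by domain is simple in principle. The sketch below is an illustration in Python, not Inviz's implementation: `site_of`, `color_for`, and the `DOMAIN_COLORS` table are hypothetical names, and the "last two labels of the host name" rule is a naive assumption that breaks on suffixes like `.co.uk`.

```python
from urllib.parse import urlparse

# Hypothetical color table mirroring the description above.
DOMAIN_COLORS = {
    "digg.com": "red",
    "microsoft.com": "lightblue",
    "codinghorror.com": "darkblue",
}


def site_of(url):
    """Return the last two labels of the host name, e.g. msdn.microsoft.com -> microsoft.com.

    Naive on purpose: multi-part suffixes such as .co.uk are not handled.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def color_for(url, domain_colors=DOMAIN_COLORS, default="black"):
    """Pick a node color by domain; sites we don't care about fall back to black."""
    return domain_colors.get(site_of(url), default)
```

With this table, `color_for("http://msdn.microsoft.com/en-us/")` gives `"lightblue"`, and any URL outside the three listed domains comes back `"black"`.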

Of course, colors, shapes, and connectors can all be customized by the user. In fact, we can do away with the graphical view entirely, and boil the entire web down to a simple tree control.

This link hierarchy is the raw material behind Google's PageRank, which scores a page largely by the links pointing to it. And although not shown here, Inviz has an option allowing you to tweak the size, shape, color, or other aspects of a given element based on its incoming links, resulting in a sort of visual PageRank. Now if I could only get access to the Google master index...
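To make the "visual PageRank" idea concrete, here is a minimal Python sketch that sizes each node by its count of incoming links. It is an assumption-laden illustration, not Inviz's code and not real PageRank (which also weighs the rank of the linking pages). The function names, the size range, and the tiny sample graph are all invented for the example.

```python
from collections import Counter


def incoming_link_counts(graph):
    """Count incoming links per node. graph maps each url to the set of urls it links to."""
    counts = Counter({url: 0 for url in graph})
    for source, targets in graph.items():
        for target in targets:
            if target != source:  # ignore pages that link to themselves
                counts[target] += 1
    return counts


def node_sizes(graph, min_size=10.0, max_size=40.0):
    """Scale node size linearly between min_size and max_size by incoming links."""
    counts = incoming_link_counts(graph)
    top = max(counts.values(), default=0)
    if top == 0:
        return {url: min_size for url in counts}
    span = max_size - min_size
    return {url: min_size + span * n / top for url, n in counts.items()}


# Hypothetical three-page web: a links to b and c, b links to c.
sample_graph = {"a": {"b", "c"}, "b": {"c"}, "c": set()}
sizes = node_sizes(sample_graph)
```

In this toy graph, page `c` has the most incoming links and so gets the largest size, while `a` (no incoming links) stays at the minimum.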

Anyway, this diversion has been brought to you by...

Inviz was implemented using C# and .NET 2.0+ exclusively. It is a work in progress. Inviz is not currently available for download or purchase. That may change if there's any interest. If you'd like to see a picture of your own website as seen through Inviz's eyes, leave a comment. It's capable of some very strange visualizations, and even has some useful HTTP functionality under the hood.

Until next time.


Posted by James Devlin   22 comment(s)

COMMENTS

Here's a couple of similar pictures of some major porn sites.

Anonymous on 7/1/2008 5:20:45 PM (98 days ago)

I'm developing a similar tool as part of an exam. Yours looks much better, of course... I'd really like to take a look at your code...
Which graph drawing algorithms does Inviz use?

slacker on 7/1/2008 5:51:22 PM (98 days ago)

Amazing pictures. You say you can do full drag and drop with this tool? How does that hold up using C#?

Keith on 7/1/2008 11:00:11 PM (98 days ago)

Pretty.. but what does this have to do with building a poker bot? ;) Or can the poker bot surf the Internet too? :)

We Want Bots on 7/1/2008 11:08:32 PM (98 days ago)

James, I'm curious. Obviously in order to generate these pictures of the Internet, the "bot" (that's what Inviz is, right?) has to send out HTTP requests and parse the returned responses. In other words, this is essentially a spider bot such as that used by the search engines. My question is what user agent string you're using for it? "Inviz v1.0"? Or does it use an IE user-agent string and so forth? Or are you perhaps using a third-party spider?

Edmund Kasze on 7/1/2008 11:33:12 PM (98 days ago)

I did give it its own UA string, more on a whim than for any other reason. But in practice you get better results using a well-known UA string. There's a dropdown (not shown) which allows you to set that and other request/response options.

James Devlin on 7/1/2008 11:59:59 PM (98 days ago)

If C#/WinForms is capable of generating that UI with drag and drop I'll eat my shirt. What're you using for the GUI? Be honest now... DirectX? Open GL? Maybe an SVG library of some sort?

If you can package the display code as a component I would gladly pay money for it. Assuming it's interactive as you say it is, which I doubt because C# is so slow. Post a working demo so we can have a chance to use this tool. Or just admit you built everything in Photoshop. ;)

Coding the Wheel Skeptic on 7/2/2008 1:11:23 AM (98 days ago)

like slacker, i am a student and like slacker, i also would like to hear more about the drawing algorithms. are you using a graph layout algorithm or just winging it? does it spider nodes recursively and if so, how does it store them? and can you send me a demo i'd like to take a few prints of some of my websites.

Evan Blake on 7/2/2008 1:15:29 AM (98 days ago)

The above images remind me of another blogger, I forget his name but the link is here:

http://www.flight404.com

Look at his work on visualizations.. performed in the open-source Processing language.

Anonymous on 7/3/2008 1:11:52 AM (97 days ago)

I'd like to see some more of these. You should do a big collage and post to Flickr.

Anonymous on 7/3/2008 5:19:50 AM (96 days ago)

Hi James,

I wrote you a private message about a link exchange. I'm going to scrap the idea of a link exchange and if you mention our site somewhere in this blog we'll post your link up on the front page. The anchor text can be "How to make a poker bot" and seeing as our site is "www.pokerisrigged.com"... I think you'll be seeing a lot of interest. We are currently getting ~70 UVs a day.

Look forward to hearing back,
Nick.

Poker League on 7/4/2008 9:27:21 AM (95 days ago)

JazzyBets has extensive coverage on the Independent Chip Modeling Pokerbot Calculator, ICM Bot.

JazzyBets.com on 7/4/2008 3:29:13 PM (95 days ago)

Nick: Sorry, I am (way) behind on my emails. Thanks for the note and I'll be in touch.

James Devlin on 7/5/2008 8:26:58 AM (94 days ago)

I find this kind of work really intriguing.. I think you need to make some enhancements when the nodes are shown at high zoom- surely you can do something better with that screen real estate than to just show the site URL? but I'm guessing you're aware of that. Visually the colors and layouts are quite striking, but how much functionality is under the hood?

Arthur E. on 7/6/2008 4:56:32 AM (93 days ago)

And: where's the demo? This tool loses a lot of its value if people aren't allowed to play with it. ;) Just my .02.

Arthur E. on 7/6/2008 4:57:24 AM (93 days ago)

These diagrams are fascinating. If Google would hook something like this into their crawlers maybe we could have a map over the entire Internet (as reachable by links, that is).

One interesting question you could answer with such a map is "What is the least number of clicks you need to get from page X to page Y?"

J on 7/6/2008 7:26:30 AM (93 days ago)

Is it possible to hook a windows application using Java?

Josh on 7/8/2008 10:17:11 AM (91 days ago)

this is awesome. distribute this!

buddhahands on 7/9/2008 10:02:58 AM (90 days ago)

You didn't answer to me :(

However, if someone is interested I've found an interesting tool at http://whatsonweb.diei.unipg.it/

slacker on 7/10/2008 12:38:49 PM (89 days ago)

>You didn't answer to me

Hi slacker- the layout code is rather a mess. Pending refactoring. I'd like to say I'm using a force-based or spectral layout, etc., and have looked at some of these approaches, but right now it's just a typical cluster grouping with a small amount of randomness applied. Also, various layout commands are made available to the user (gather/scatter by domain/connection/etc).

I'll post a follow-up to this at some point.

James Devlin on 7/11/2008 10:15:55 AM (88 days ago)

You should package this as a Mozilla add-in.

Anonymous on 7/20/2008 3:42:30 AM (79 days ago)

Or a standalone tool like Fiddler.

Anonymous on 7/20/2008 3:42:54 AM (79 days ago)
