The author is alive and well and living somewhere west of the Greenwich meridian.
 
This work is licensed under a Creative Commons License

The Long Dark Tea-Time of the Soul
Miscellaneous ramblings written as my soul endures a long dark tea-time
 
Thursday, April 17, 2003  

Crawl Grub, crawl!

Wired News featured a story on a new web search engine concept, Grub. It relies on a distributed web crawling engine that anyone can download and run on their computer, and that can also run as a screen saver, much like the famous SETI@home project. As the screen saver runs, it throws a node map of recently crawled URLs and crawl statistics onto your screen.

The Grub concept really makes a lot of sense to me. As the amount of material on the web increases, the amount of work needed to crawl and index it increases too. Relying on a small percentage of the machines already out there to do that work for you is much more practical than trying to build a single huge centralized collection of indexing machines.
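To make the idea concrete, here is a minimal sketch of how crawl work could be split among volunteer machines. It is not Grub's actual scheme, which the Wired story does not spell out; the function names `assign_client` and `partition` and the choice to hash by hostname are my own assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def assign_client(url: str, num_clients: int) -> int:
    """Map a URL to a client index by hashing its hostname.

    Hypothetical scheme: hashing the host (not the full URL) keeps every
    page of one site on the same client, so that client alone can pace
    its requests to that site.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_clients


def partition(urls, num_clients):
    """Group URLs into one work list per client index."""
    work = defaultdict(list)
    for url in urls:
        work[assign_client(url, num_clients)].append(url)
    return dict(work)
```

Calling `partition(urls, 4)` returns a dictionary from client index to that client's list of URLs. A real system would also have to cope with clients that come and go, which a plain modulo split like this one handles poorly.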

Grub isn't actually providing its own search engine form yet (if ever); it's making its crawling results available to third parties for doing searches. The first example is the WiseNut search engine, and others will be coming soon. Crawl data is also available to users directly via an XML interface.

If all goes according to plan, Grub will accumulate enough indexing clients to crawl the entire web every day. Compare this to Google, which typically expects to take two to four weeks to crawl the web. I expect it's only a matter of time before Google launches its own Google screen saver for distributed web crawling. But much as I love Google, I'd much rather support the young upstart Grub.
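For a rough feel of what "enough clients" means, here is a back-of-the-envelope sketch. The function `clients_needed` is hypothetical, and the figures in the usage note are placeholders I picked for illustration, not numbers from Grub, Google, or the Wired story.

```python
import math


def clients_needed(total_pages: int, pages_per_client_per_day: int,
                   days: float = 1.0) -> int:
    """Clients required to fetch total_pages within the given number of days.

    Assumes every client crawls at the same steady rate and no page is
    fetched twice, which is a simplification.
    """
    if pages_per_client_per_day <= 0 or days <= 0:
        raise ValueError("rate and days must be positive")
    return math.ceil(total_pages / (pages_per_client_per_day * days))
```

With a made-up web of 3,000,000,000 pages and a made-up 10,000 pages per client per day, `clients_needed(3_000_000_000, 10_000)` gives 300000 for a daily crawl, while `clients_needed(3_000_000_000, 10_000, days=14)` gives 21429 for a two-week crawl. The point is only that shortening the crawl cycle scales the number of machines required in direct proportion.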

Crawl Grub, crawl!

4/17/2003 12:07:52 PM