Someone is unFurling a Solution to One of My Search Problems

Dream about an application, and someone is already building it!

Back in November I wrote,

It’s true that linkrot is a serious problem. It’s also true that archive.org is only a partial solution, since it doesn’t get everything and some big content providers, like the Washington Post, block it.

Is the only solution to make (copyright busting?) offline copies of everything? If so, where’s the tool that will automate that for me, and — more importantly — index all that content on my drive, disk, or tape?

Maximillian Dornseif wrote in the comments section that,

I have built such a beast. Basically it snatches your browser's history and downloads the pages you have visited. It's running on a server because my notebook hasn't enough hard disk space for such experiments. Searching in this archive is possible, although at the moment only via the command line.

I share that installation with a few friends and we are looking at it as a research project. We would love to make it available to others, but on the other hand we have no desire to go through an evaluation of the restrictions imposed upon us by the various laws governing immaterial goods.

See http://blogs.23.nu/disLEXia/stories/1412/ and http://blogs.23.nu/c0re/stories/1928/
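For the curious, the general idea in that comment (fetch each visited page, keep a copy, index the text so it can be searched) can be sketched in a few lines of Python. This is only my rough illustration, not Maximillian's code: the names `PageArchive`, `fetch`, and `html_to_text` are made up for the example, the list of URLs is assumed to come from somewhere else (reading a real browser's history file is left out), and everything lives in memory rather than on a server's disk.

```python
import re
import urllib.request
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Hypothetical helper: reduce an HTML page to its plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def fetch(url):
    """Hypothetical default fetcher: download a page over HTTP."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


class PageArchive:
    """Keeps a full copy of each saved page plus a word -> URLs index."""

    def __init__(self, fetcher=fetch):
        self.fetcher = fetcher          # swappable, so the sketch can run offline
        self.pages = {}                 # url -> saved HTML (the anti-linkrot copy)
        self.index = defaultdict(set)   # word -> set of URLs containing it

    def save(self, url):
        html = self.fetcher(url)
        self.pages[url] = html
        # Note: re-saving a changed page does not remove its old index entries.
        for word in set(re.findall(r"[a-z0-9]+", html_to_text(html).lower())):
            self.index[word].add(url)

    def search(self, query):
        """Return the saved URLs whose text contains every word in the query."""
        words = re.findall(r"[a-z0-9]+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self.index.get(w, set()) for w in words))
        return sorted(hits)
```

Feeding it a list of visited URLs is then just a loop over `archive.save(url)`, and `archive.search("linkrot washington")` returns the saved pages that mention both words. A real tool would need persistent storage, a proper text index, and some answer to the copyright question raised above.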

That project looked a little experimental for me…but now it seems that someone else is trying to make a commercial version of a web memory/personal history full text search tool, and he calls it Furl:

John Battelle's Searchblog, Grokking Furl: Storage, Search, And The Personalweb: Mike [Giles] started Furl about a year ago to solve a problem he – and a lot of us – had with bookmarks. Namely, bookmarking is a lame, half-assed, unsearchable, flat, linkrotten approach to recalling that which you've seen and care to recall on the web. Now, a lot of folks have made stabs at solving this particular problem, but Mike's got a lot of very cool features built into his beta, and more on the way.

And from my conversation with him, he's got one more thing that others might be missing: a clear sense of what Furl could do if it were part of a massively scaled platform like AOL, Yahoo, Google, or MSN. If I'm reading him right, he's smart enough to realize that what he's built will probably be a feature set on every one of those platforms before the end of 2005, and he's also smart enough to know that by launching Furl, he's forced all of them to consider him as the person to watch in the space.

So what is it about Furl that made me write that past paragraph? After all, it's just a web page-saving application. Right? Well, yes and no. Furl does a good job of helping you manage your web browsing. It adds several features that others don't have – full text search on your saved pages, for example. But Furl saves the entire web page you've “furled”, not just the URL, which prevents link rot, on the one hand, and creates what I'll call a “PersonalWeb,” on the other.

Now, having your own PersonalWeb is a very cool thing. Every page you care about is now saved forever, and is searchable. How I wish I had Furl while I was researching my book for the past year. This application was inconceivable before the cost of storage and bandwidth began to fall toward zero.

But wait…there's more. You can share your PersonalWeb with others. And Mike just added a recommendation engine, so you can see links the service thinks will be interesting to you, based on what you've already Furl'd. Now, let's play this out. Imagine Furl on, oh, Yahoo, for example. Or Google. You now have a massively scaled application where millions of people are creating their own personal versions of the web, and then sharing them with each other, driving massively statistically significant recommendations, and…some pretty damn useful metadata that can be fed into search engine algorithms, resulting in…yup, far better search (and…far better SFO (Search Find Obtain) opportunities).
