Blog Search = Opportunity
Some time ago, Robert Scoble pointed out the different results he got from various blog tracking/search engines. He offered up an opinion on those results, which started a fairly fun food fight between him and reporter David Berlind from ZDNet. Doc Searls suggested that there be formal testing to get apples to apples comparisons. You can dive into the fun, via Scoble, here or you can start with David’s whack-a-Scoble article here.
[Random side note: If the folks at Microsoft have “MSN Search instead of Google It”, “no ipods”, “no podcastings”, I wonder if they can say apples to apples? ;-)]
Anyway, I started thinking about this from a VC perspective, i.e., making money off this problem. First thing I noticed is that, even today, all the basic search engines (Google/MSN/Yahoo) will give generally the same first page results on major issues, but beyond that they diverge based on the secret sauce the particular engine uses. So, the general rule of thumb of checking more than one engine on almost any search certainly applies. No surprises.
In a recent posting about rude Q&A, I put a line in about Linux and laptops running that operating system. The original post is here. That post was about 24 hours ago, give or take. Using the specific phrase from that post, I tried these 4 services. All totally free I might add.
The Feedster hit returned my post with some interesting side data: how long ago the post went up (1 day 5 hours), how long it was (1000 words), and, on the left, a little “powered by TypePad” emblem which I assume (hope) they are getting paid for.
The Blogpulse hit returned my post with no real additional data but additional services like tracking the conversation, trending the entry, RSS feed for the search, a nice graph, etc.
IceRocket got my hit as well as another blog that used the exact phrase I did (in his title). The only issue with that second one is that the entry went to a broken link. IceRocket offered some very interesting features/services. On my result, you were offered the ability to get all the links to my blog, see all my posts, exclude my blog from the search results, and subscribe to my blog feed. In addition, it showed you the exact time my entry was posted and who the author was. The number of posts shown was wrong, but the concept and services offered seem very interesting.
Technorati got my hit twice, so it’s a dupe, but it also shows the two hits as being by two different Rick Segal types, so I’m not sure what that problem is. The entries are a dupe for sure, and Technorati didn’t get the other blogger who used the exact same phrase as me.
The Google search resulted in 9 hits, none of which were blog entries.
The MSN Search resulted in a dialog box saying “Linux on a Laptop? What, are you nuts??” Just kidding. Actually, it produced 8 hits including my blog entry. (So all the rumors about RSS Search? It’s happening, folks.)
With respect to David Berlind, I am not a reporter and do not practice the art of journalism. Likewise, with respect to Robert Scoble, ain’t no evangelism being practiced here either. So, the results (if you can call em that) are simple: some interesting data points that tell me a couple of things.
The Long Tail of Conversations
(Attn: Chris Anderson. Sorry I made the snarky comments about it being an overworked term, I use it. A lot. Sorry..)
I have a running dialog (upgraded from debate) with many people on the approach of “ready fire aim” when it comes to blogging. I’m of the opinion that thinking about it, even for an hour or so, can result in better quality, higher signal to noise ratio, and just more civil dialog. Others disagree and think that when the conversation is happening, it’s happening baby, dive in or get left behind.
We are going to have a “long tail” of conversations that, over time, is going to be stocked full of valuable data. The tools/services that the blog search engines are trying out represent attempts to get real value out of all the conversations, not just the tier one or immediate entries.
Well thought out prose is going to be found because the tools will get better and “google juice” will matter less.
Raw search, been there, done that. What to do with the data and how to make it useful to me/my business? That’s the exciting and opportunity rich stuff you should be watching.
There clearly is no winner here yet. So, in my view, I’d avoid declaring a winner, or saying this one is better vs. that one, for at least a bit longer. Methinks the really good stuff is coming.

Actually, I consider this to be a good test. Counting the links forces you to dive into the methodologies of each service to see if their objective is to unearth every link possible (regardless of whether we agree with it or not). You also end up having to vet the accuracy of the search results. How many of them appear to be dupes? Are they pure dupes, or turning up because of things like link blogs and other RSS feeds? How many of them are actually correct? What about these bots that people have set up that do nothing but watch specific blogs for anything "CRM" related (just an example) and turn around and spit it out? In my experience, all the services appear to discover new citations. So, I like your test Rick, because it's pretty easy to check the results. I have a new post. How long does it take each service to discover it and any citations of it? With this test, you don't have to double check every citation of every historical post. Just the most recent one, which is probably pretty representative of how these search services discover all new posts. Then, you can examine each result, see how accurate it is, and report on any cool "condiments."
If I had to come up with a more scientific methodology, I think I'd find ten volunteers and start like this (feel free to improve on it). I'd write three blogs and then ask each of the volunteers to randomly (a) cite those blogs in the text of one of their blogs, (b) cite them in a list of links that appears on the side, (c) link to my whole blog as a part of their blog roll, and (d) make use of some folksonomy tags that I provide to them to see how that impacts the situation. I'd also (e) ask some to ping (automatically, or manually) the blog search services and others not to ping at all. I'd also want them to let me know when that (a) citation scrolls off their home page so I can see if and how that impacts each search engine's results. Then, I'd like them to remove the other links as well, warning me just before it happens and then telling me exactly when it happens, so that I can monitor the results for any changes. The folks at the different blog search services might have some other suggestions based on what the objectives of their engines are.
David Berlind
Executive Editor
ZDNet
Posted by: David Berlind | July 24, 2005 at 14:01