news/blog search

First, more rules for vertical entrepreneurs.
Rule 9: find your inner killer app.
Understand the core problems and build tools to solve those problems. Make it sticky.
Rule 10: let’s get vertical.
Community is a natural for verticals.


Steve Gillmor of ZDNet moderating

Tantek Çelik, Technorati
Indexes more than 12 million weblogs in real time, and uses that data for other things like top movies, etc.
Also trying to create microstandards for verticals.
Scott Rafer: people will move to a subscription model, and there is a business model for that.
Jim Pitkow, Moreover
Enterprise-focused, and provides to big guys like MSN.
Chris Tolles
Categorizes news.

Q: Scott, can you connect threads across the sessions?
Scott: I wasn’t in all of them, I had investor things to do. But one thing I see is how people see vertical search. Vertical search is vertical market. But shopping search is functional in a certain way, and our work is more about a new standard rising. We’re connected by RSS, and know it screws up PageRank. Here is a place where there is an inefficiency between the gold standard of Google and what some users want. So let’s get that together, put a great UI around it, and see what happens.
Jim: People read news online every single day, and it’s tough: tens of thousands of sources, updating all the time. And end users want to know about it as it happens. Search can be slower. And news is old, profitable, and has standards around it, so it’s an interesting space.
Chris: In the shopping panel, a good shopping product isn’t necessarily a search engine. With news, a big factor in relevancy is timeliness. If you want stuff from the last five minutes, the last hour, that’s what ties together this vertical. Freshness.
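One way to picture freshness as a relevancy factor is a score that decays with a story's age. The sketch below is only an illustration under assumptions: the function name, the exponential decay, and the one-hour half-life are made up for the example and are not something any panelist described.

```python
def freshness_score(relevance: float, age_minutes: float,
                    half_life_minutes: float = 60.0) -> float:
    """Scale a base relevance score by an exponential freshness decay.

    The half-life is an assumed tuning knob: after that many minutes
    a story keeps half of its original score.
    """
    if age_minutes < 0:
        raise ValueError("age_minutes must be non-negative")
    return relevance * 0.5 ** (age_minutes / half_life_minutes)


# (title, base relevance, age in minutes) -- hypothetical sample data
stories = [
    ("wire report", 0.9, 300.0),
    ("blog post", 0.6, 5.0),
]

# Rank by decayed score: the fresher item can beat the "better" older one.
ranked = sorted(stories, key=lambda s: freshness_score(s[1], s[2]), reverse=True)
```

With these sample numbers the five-minute-old post outranks the five-hour-old report, even though its base relevance is lower.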

Q: (Om Malik of Business 2.0 steps in as moderator) How do you add context to the same news story over and over?
Chris: One thing we do is cluster stories, so we can cluster all the versions of the same AP story. So the data structure is an event, not a story, and we don’t have 70 copies of the same story on top of each other. Then we rank, and we have a story picker on the front page, and you can categorize by locality. We create a formula to act as editor.
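The "event, not a story" idea can be sketched as grouping near-identical stories into one cluster. This is a rough illustration only: the word-overlap (Jaccard) measure, the 0.6 threshold, and all function names are assumptions for the example, not the clustering method Chris's company actually uses.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set for a story headline or body."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words two stories have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_into_events(stories: list[str], threshold: float = 0.6) -> list[list[str]]:
    """Greedy grouping: a story joins the first event whose first story it resembles."""
    events: list[list[str]] = []
    for story in stories:
        story_tokens = tokens(story)
        for event in events:
            if jaccard(story_tokens, tokens(event[0])) >= threshold:
                event.append(story)
                break
        else:
            # No existing event is similar enough, so start a new one.
            events.append([story])
    return events


stories = [
    "Tsunami strikes coast, thousands displaced",
    "Tsunami strikes the coast, thousands displaced",
    "Local team wins the championship game",
]
events = cluster_into_events(stories)
```

Here the two tsunami headlines land in one event and the sports story in another, so a front page would show two items rather than three.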
Jim: Relevancy is an interesting point: relevant to whom, when? We see it as a matter of metadata, so users can drill down. Provide context, so the different San Joses make sense based on hierarchy. Also, add authority, so you can create a feed that has locality, topic and authority. Data isn’t just data, we have metadata on top. You have to be flexible and agile: we use tech, we add human editors, and we allow users to customize.
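A feed built from locality, topic and authority metadata might look something like the sketch below. Everything here is hypothetical: the `Article` fields, the 0-to-1 authority scale, and `build_feed` are invented for illustration and do not describe Moreover's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    title: str
    locality: tuple[str, ...]  # place hierarchy, broad to narrow (assumed shape)
    topic: str
    authority: float           # assumed scale: 0.0 (unknown) to 1.0 (trusted)


def build_feed(articles, locality_prefix, topic, min_authority=0.0):
    """Keep articles under a locality branch, on a topic, above an authority floor."""
    prefix = tuple(locality_prefix)
    return [
        a for a in articles
        if a.locality[:len(prefix)] == prefix
        and a.topic == topic
        and a.authority >= min_authority
    ]


articles = [
    Article("Council vote", ("US", "California", "San Jose"), "politics", 0.8),
    Article("Election recap", ("Costa Rica", "San José"), "politics", 0.9),
    Article("Rumor roundup", ("US", "California", "San Jose"), "politics", 0.2),
]

# The hierarchy disambiguates the two San Joses; the floor drops the rumor post.
feed = build_feed(articles, ("US", "California"), "politics", min_authority=0.5)
```

Letting the caller pass the locality branch and the authority floor matches the point about users drilling down and customizing for themselves.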
Scott: Dedupe is important, but a lot of what we touch is two or three lines different, and those two or three lines make the difference. We have to respect that and not pull out the information provided. These are not dispassionate users, the energy is incredible. This is heartfelt work, and you can’t just dedupe that. We’ve tried dialing it up and down, and you have to decide what feeds are most important and go with that. PageRank takes too much time to build, so the person provides the authority: that person has been blogging on this topic for a year and a half, so make them a micropublication. It’s tricky. Deduping can lose publishers and searchers.
Tantek: Relevance is hard: freshness, authority, SN? What makes it relevant? You can look at Yahoo and see what’s emailed and that tells you something, but someone who has blogged it has gone to the trouble of providing comments; looking at the hypermesh of bloggers can tell you what is relevant.
We’ve seen some stories that pop first in the blogosphere before mainstream media, such as the tsunami. They can match ads to level of profanity, they can match to kind of consumer (e.g. you don’t want to advertise Nokia on a Sony fan site). There were problems with Democratic ads on Republican sites. Advertisers didn’t mind as much as site owners. (Tantek… accidentally, right?)
Scott: Rotten Tomatoes has figured out whether the overall review is negative or positive. It can be done technologically.
Tantek: It can be hairy also… we’ve been working on an open standard for publishing reviews. The problem is there are many dimensions, so it’s a challenge to do. But I want to return to Om’s question. With Technorati you can see more recent, or least, which allows for story breaking. On authority, we humans determine it in many ways, political bias, etc. But we (Technorati) do it abstractly, by (something that sounds like PageRank).

Q: How do you deal with relevancy against the reblogging of things, how do you get the most valuable to rise to the top?
Chris: Most people don’t want everything. Maybe when egosurfing. But people offer a different kind of relevance, an editorial touch. If many people are writing about a story, such as the tsunami, we can put the scoop at the top if it works… but sometimes the scoop isn’t important. The story is a commodity: it doesn’t matter who did it first, you want the best story or someone’s take; brand preference is bigger than scoop. If you are a Fox viewer, you want the Fox view.
Tantek: Do we want to encourage that kind of siloing of viewpoint?
Chris: You want to expose people to as many points of view as possible. It’s not as important if it’s the NYT or a blogger.
Jim: Some people care about the source, some want the color commentary. We can ask what is authority, or we can ask: authority to whom? Can we give users tools so they can choose who is authoritative? These are pieces of metadata and we can use them. We can offer tools.
Scott: (mentions “who broke the story”) It’s hard to tell who did. Sentiment, editorial bias… are important to CPC advertisers also.

Q: You all have an editorial slant. What kind of user interface suits that?
Scott: It’s religious. We believe in RSS, in which there is no ultimate interface. We are a web service. 99% of data leaves in XML. I don’t know all the interfaces in which we’re being used. I probably wouldn’t like most of them. We keep having to struggle with TOS, because we want people to remix, but we want to get paid. There is no ultimate UI, there is a long tail of user interfaces. And most of the value is in the one I’ll never see?
Tantek: What is the one resource I am short of? Time. I have five minutes to hear something that is relevant to me. The UIs that will be best for me will be the most valuable.
Scott: The interface has to account for other things: people returning to your site, and optimizing for people to return to your site.
Scott: There will be many interfaces with the same core data in the future (lost track, got unplugged)
Jim: For us the interface is the API. That’s the beauty of the data layer. You have to have good data, good metadata, and let the user choose (I wonder if he means the user or the provider)
Scott: If it bleeds, it leads (Tantek: you just said the reason I turned off my TV)… we want to see what’s interesting. We have a point of view. You build for your users. If you are building for someone whining about an RSS standard, that’s for you.
Scott: The interface of Feedster is fine most of the time, we’ve got the white page, the box… but if it’s Red Sox, it belongs on the Boston Globe and that’s the right interface.
Q: Not what he meant… can you send me data in the way I’m used to, can you get me the information I need? That’s what I mean by UI.
Chris: We’re an aggregator, so we can’t reproduce the NYT way… depends what you mean.
Tantek: Describe your ideal news reading experience, Om.
Om: There is an ordering of stories, most important to least, but on the others it doesn’t work that way. How do you come up with the UI? I see stories, I want to break my head against the wall, I can’t find what I want to see.
Scott: There isn’t that much news out there. Look at the % of stories in the Chron that are from the wire. Some days there is no news.
Scott: There are only so many MCI executives to convict.
The Technorati guy does these beautiful graphs of story strength, peaks and troughs of attention. I have things that only go bright blue (popular?) every few weeks.
Tantek: We talked about different ways of filtering. On NYT that’s editorial filtering, humans choosing.
Scott: 12 or 15 hours ago
Tantek: Or you can show what is most important to the blogosphere. But what it sounds like you want is a persistent search, so that if you are into a specific topic, you can follow it.
Om: A newspaper chooses the story of the moment, so why don’t you do that?
Chris: You can do that with a top page, like Google’s.
Q: How well is tagging going?
Tantek: We’ve seen amazing results. By allowing people to tag their blog posts, and bringing in Flickr and Furl, you can search across them. We’ve seen some spam, but it sticks out like a sore thumb. Mostly it provides a lot of value. If you have shown interest in one tag, and there is new stuff with that tag, we bump relevance.
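The tag-based relevance bump Tantek mentions could be sketched as a small additive boost per matching tag. The function name, the additive form, and the 0.25 bump size are assumptions made for this example; the notes do not say how Technorati actually computes it.

```python
def bump_for_tags(base_score: float, post_tags: set[str],
                  followed_tags: set[str], bump: float = 0.25) -> float:
    """Raise a post's score for each tag the reader has shown interest in.

    Tags are compared case-insensitively; `bump` is an assumed constant.
    """
    matches = {t.lower() for t in post_tags} & {t.lower() for t in followed_tags}
    return base_score + bump * len(matches)


# A reader who follows "flickr" sees a Flickr-tagged post rise in the ranking.
score = bump_for_tags(1.0, {"Flickr", "search"}, {"flickr"})
```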
Scott: I think tagging your own stuff leads to spam. With standalone tagging, we’re trying to get less sophisticated people to tag, and that is beyond most people, even those who have blogs.
Tantek: Everyone is learning from everyone else, we’ve seen so much interest.

Q: Micropayments?
Scott: We think Google is already a massive micropayment system, it’s called AdSense. Micropayments to read an article are going to be limited.
Chris: Everyone has tried it and it never works. There is no success story from micropayments. People don’t buy that stuff in volume.
Tantek: I have to agree. There is an explosion in content; the question is not how can I pay for good stuff, it’s how do I find the good free stuff.
Scott: Of course I was completely wrong in regards to iTunes.
Om: I can tell you for a fact, no one will get rich from AdSense.
Chris: People are making nice money from AdSense.
<<dissolves into madness>>
