Search engines
Analysis of search technology, products, services, and vendors.
Windows Live search is rather different from MSN
Until the middle of this year, I got negligible search engine traffic from either MSN or Yahoo, or indeed any other search engine except Google. We’re literally talking a 90-95% share for Google, on each of my three main blogs, most months.
But in November, the Windows Live share was 19% on DBMS2, 29% on Text Technologies, and 41% on the Monash Report. And those aren’t blips; in each case there was steady August-November monthly growth. But on the other hand, early December month-to-date figures are all back down. Weird. Read more
Danny Sullivan thinks blended vertical search is a game-changer
Danny Sullivan thinks blended vertical search — which he’s calling Search 3.0 — is a game changer. (In this context, “vertical” search denotes alternate result types such as video, image, map coordinates, or product listings.) In saying that, he’s focused on search marketers, who now have a lot more ways to try to get their messages onto Google searchers’ top result pages. But I presume what he’s really saying is that there will be a feedback effect — if Google tells all web searchers about videos and product listings, then internet marketers will be more motivated to post videos and product listings, and hence there will be more interesting choices of videos and product listings — which Google will naturally wind up featuring more prominently in its search results. And so on.
Given the YouTube explosion, I find it hard to argue with his claim.
Categories: Google, Search engine optimization (SEO), Search engines, Specialized search, Structured search | Leave a Comment |
FeedBlitz search is totally fried
If you take our integrated feed — and you should* — and you happen to pick the email option, that’s delivered via FeedBlitz. I subscribe myself, of course, and today I happened to check the option “Search Monash Information Services” (Monash Information Services is the name of the feed). That goes to this search page.
*That’s what this link is for. Or this one.
Curious to see how results compared to those from our own cross-site search, I tried a search on a company I write a lot about, namely “Netezza.” Nothing came up. Then I tried “Attensity.” Ditto. And “text mining”. Still nothing. In fact, there aren’t even any results on “Monash”.
I think some repairs may be in order …
Categories: Blogosphere, Search engines, Social software and online media | 2 Comments |
The case for Inxight Awareness Server
I’ve been pretty skeptical about Inxight’s Awareness Server. My theory is that ordinary enterprise search engines can index remotely anyway, and they offer much better search functionality. Inxight’s Ian Hersey was kind enough to write in and offer two counter-arguments.
First, Ian points out that there are circumstances when, due to security and permissions, you can’t really index everything via one search engine. Specifically, he offers the government as an example. OK, I can see that in the government, with its classified and/or regulated silos. However, I have trouble thinking of many more examples. While there certainly are plenty of instances where a variety of organizations share information on a somewhat arms-length basis, it’s tough to think of such cases where federated text search would come into play.
Second, Ian in essence disputes my claim of inferior functionality. While implicitly conceding (as well he should!) that Inxight’s Awareness Server doesn’t do some things full-featured search engines do, he points out analytic features that may not be found in conventional search engine offerings. The big one he calls out is faceted search, which of course was the core of Intelliseek, the acquisition Awareness Server came from. Hmm. Faceted search has a checkered history, with Excite and Northern Light being perhaps the most visible among many failures. On the other hand, it’s a great idea that keeps being tried, and some versions (notably Endeca’s) have turned out well.
I guess I’ll have to reserve judgment on that part until I look at Inxight’s product and see what they do and don’t actually have.
Categories: BI integration, Business Objects and Inxight, Endeca, Enterprise search, Search engines | 1 Comment |
Event stream processors active in text filtering
OK. I secured permission to actually quote the details on something I’d previously dropped a small hint about — stream processing for text messages. Traditionally, that’s been the province of enterprise search companies. A decade ago, Verity had a kernel group of 6-7 engineers under Phil Nelson. They managed to produce not only a decent search engine, but a search engine “turned on its side” as well. I.e., instead of running one query against a corpus, they could run many queries each against documents as they arrived, one document at a time. Subsequently, the same idea has been implemented by most enterprise search providers, at least those that are serious about the intelligence market.
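To make the “turned on its side” idea concrete, here is a minimal sketch of my own, not Verity’s or anybody else’s actual design. It assumes a stored query is just a set of terms that must all appear in a document; the names `StandingQueries` and `filter_stream` are hypothetical, and real products support far richer query languages.

```python
import re
from typing import Dict, Iterable, Iterator, List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class StandingQueries:
    """Hypothetical registry of stored queries.

    Assumption: each query is a set of terms, and a document matches
    when it contains every one of them.
    """

    def __init__(self) -> None:
        self._queries: Dict[str, Set[str]] = {}

    def register(self, name: str, terms: Iterable[str]) -> None:
        """Store a query under a name so it runs against every later document."""
        self._queries[name] = {t.lower() for t in terms}

    def match(self, document: str) -> List[str]:
        """Return the sorted names of all stored queries this one document satisfies."""
        tokens = tokenize(document)
        return sorted(
            name for name, terms in self._queries.items() if terms <= tokens
        )


def filter_stream(
    queries: StandingQueries, documents: Iterable[str]
) -> Iterator[Tuple[str, List[str]]]:
    """Process documents one at a time, yielding only those that hit a query."""
    for doc in documents:
        hits = queries.match(doc)
        if hits:
            yield doc, hits
```

The inversion is that the queries are the stored, long-lived data while each document is seen once and then discarded, so nothing resembling a corpus index is ever built.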
Well, the event-processing guys are active in that market too. At least StreamBase is. Read more
Categories: Autonomy, Business Objects and Inxight, Enterprise search, Search engines, Text mining | 2 Comments |
Text analytics marketplace trends
It was tough to judge user demand at the recent Text Analytics Summit because, well, very few users showed up. And frankly, I wasn’t as aggressive at pumping vendors for trends as I am some other times. That said, I have talked with most text analytics vendors recently,* and here are my impressions of what’s going on. Any contrary (or confirming!) opinions would be most welcome.
*Factiva is the most significant exception. Hint, hint.
If you think about it, text analytics is a “secret ingredient” in search, antispam, and data cleaning,* and this dominates all other uses of the technology. A significant minority of the research effort at companies that do any kind of text filtering is (duh) text analytics. Cold comfort for specialist text analytics vendors, to be sure, but that’s the way it is.
*I.e., part of the “T” in “ETL” (Extract/Transform/Load).
Text-analytics-enhanced custom publishing will surely at some point become a must-have for business and technical publishers. However, it appears that we’re not quite there yet, as large publishers make do with simple-minded search and the like. In what I suspect is a telling market commentary, there’s no headlong rush among vendors to dump text mining for custom publishing, notwithstanding the examples of nStein and (sort of) ClearForest. I don’t want to be overly negative – either my friends at Mark Logic are doing just fine or else they’re putting up a mighty brave front – but I don’t think the nonspecialist publishing market is there yet. Read more
BOBJ Inxight insights
When a company announces an acquisition, it usually does a round of limited-content briefings, in no small part because the antitrust lawyers won’t let them do anything else. Once the deal closes, antitrust restrictions are lifted, and they do another round of briefings. These, typically, are vague and platitudinous.
Business Objects/Inxight have now reached that point. Even so, my briefing yesterday had some aspects worth writing up. Read more
Categories: BI integration, Business Objects and Inxight, Enterprise search, Search engines | 2 Comments |
(A little) more on Business Objects/Inxight
After missing what seems to have been an uninformative press conference anyway, I hooked up later with the Business Objects folks on the phone. I say that it was probably uninformative because in the short call, it was pointed out to me that they really weren’t at liberty to say much anyway. Here are a couple of tidbits I picked up even so.
- Business Objects’ text mining partnerships have been more demo/sales-cycle than actual sales up until now. That said, they have a few deals each with Attensity and Inxight (but not with ClearForest, which pulled in its horns prior to being acquired by Reuters). I still think they’re the leading BI vendor in integrating with text mining, SAS perhaps aside (who if nothing else have a lot of fun using text mining for data cleaning). The working Inxight partnership, by the way, was all about the specific app of email compliance, with the demo being based on the publicly available Enron corpus.
- Inxight’s visualization technology is in the form of an SDK anyway. So integrating it into BOBJ’s product line should be straightforward. Note: Through the Xcelsius acquisition, BOBJ has been trying to gain competitive advantage in the cool-visualization area.
- Inxight’s “federation” capability for search is pretty primitive (my term and opinion of course, not theirs). It takes in search result sets from various sources, then clusters and/or refilters them. What it does NOT do is the much harder task of taking actual relevancy rankings from various engines and somehow arbitrating between them. Nor, I’m guessing, does it even assign higher or lower weights to various corpuses or anything like that. Thus, it does not sound terribly competitive with the distributed search capabilities built into any state-of-the-art enterprise search engine.
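For the curious, the result-set style of federation described in the last bullet can be sketched roughly as follows. This is my own illustration under loud assumptions, not Inxight’s code: the `Hit` record, the `federate` function, the title-substring refilter, and the grouping by URL host are all hypothetical stand-ins for whatever clustering the product really does.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set
from urllib.parse import urlparse


@dataclass(frozen=True)
class Hit:
    """One search result; the fields are assumptions for illustration only."""
    source: str  # label of the engine that returned the hit
    url: str
    title: str


def federate(
    result_sets: Iterable[List[Hit]], must_contain: str = ""
) -> Dict[str, List[Hit]]:
    """Merge result sets, drop duplicate URLs, refilter, and group by host.

    Note what is absent: the engines' relevancy scores are never compared,
    and no source is weighted above another.
    """
    seen: Set[str] = set()
    groups: Dict[str, List[Hit]] = defaultdict(list)
    needle = must_contain.lower()
    for results in result_sets:
        for hit in results:
            if hit.url in seen:
                continue  # keep only the first engine's copy of a URL
            seen.add(hit.url)
            if needle and needle not in hit.title.lower():
                continue  # crude refilter on the title text
            # Crude stand-in for clustering: bucket hits by URL host.
            groups[urlparse(hit.url).netloc].append(hit)
    return dict(groups)
```

Everything here is set manipulation on results that have already come back. Reconciling rankings from engines that score on incompatible scales is the hard part, and it is exactly what this sketch leaves out.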
Categories: Attensity, Business Objects and Inxight, ClearForest/Reuters, Enterprise search, SAS, Search engines, Text mining | 5 Comments |
Huge e-commerce gains claimed by everybody
The folks at Progress claim huge conversion rate benefits from EasyAsk, although unfortunately so far I’ve been unable to drill down and see what those numbers really mean. (Flagship customer = Lands’ End.) Baynote makes more modest but still large claims. (Flagship customer = no big names that I’m aware of.) Endeca is clearly the market leader. (Flagship customers = Wal-Mart, Home Depot.) Mercado and InQuira are important players, at least in certain verticals.
I think it’s safe to say that e-commerce site navigation aids constitute a really important product category.
Categories: Baynote, Endeca, InQuira, Mercado, Progress and EasyAsk, Search engines, Structured search | 1 Comment |
Wise Crowds of Long-Tailed Ants, or something like that
Baynote sells a recommendation engine whose motto appears to be “popularity implies accuracy.” While that leads to some interesting technological ideas (below), Baynote carries that principle to an unfortunate extreme in its marketing, which is jam-packed with inaccurate buzzspeak. While most of that is focused on a few trendy meme-oriented books, the low point of my briefing today was probably the insistence against pushback that “95%” of Google’s results depend on “PageRank.” (I think what Baynote really meant is “all off-page factors combined,” but anyhow I sure didn’t get the sense that accuracy was an important metric for them in setting their briefing strategy. And by the way, one reason I repeat the company’s name rather than referring to Baynote by a pronoun is that on-page factors DO matter in search engine rankings.)
That said, here’s the essence of Baynote’s story, as best I could figure it out. Read more