Search engines
Analysis of search technology, products, services, and vendors.
The biggest text analytics company you probably never heard of
I caught up with Expert System S.p.A. last week. They turn out to be doing $10 million in text technology annual revenue. That alone is surprising (sadly), but what’s really remarkable is that they did it almost entirely in the Italian market. As you might guess, that figure includes a little bit of everything, from search engines to Italian language filters for Microsoft Office to text mining. But only $3 ½ million of Expert System’s revenue is from the government (and I think that includes civilian agencies), and under 30% is professional services, so on the whole it seems like a pretty real accomplishment. Oh yes – Expert System says it’s entirely self-funded.
As of last year, Expert System also has English-language products, and a couple of minor OEM sales in the US (for mobile search and semantic web applications). German- and Arabic-language products are in beta test. The company says that its market focus going forward is national security – surely the reason for the Arabic – and competitive intelligence. It envisions selling through partners such as system integrators, although I think that makes more sense for the government market than it does vis-a-vis civilian companies. In February the company is introducing a market intelligence product focused on sentiment analysis.
Expert System is a bit of a throwback, in that it talks lovingly of the semantic network that informs its products.
Categories: Application areas, Competitive intelligence, Enterprise search, Expert System S.p.A., Ontologies, Search engines, Text mining
Google is putting more emphasis on phrases
I don’t know how pronounced this trend is, but Google web search seems to be putting more emphasis on phrases than it used to.
For starters, Google doesn’t always ignore stopwords. Searches for The Fly and for Fly produce different results. Beyond that, “or” is sometimes assumed to be a word you’re searching on, not an operator. For an example, search for live free or die and see the line of text that comes back under the search box. (I’m not sure whether this ever works for “and” as well; even Sanford and Son returns the usual harangue that “the AND operator is unnecessary”.) This is all a pretty clear indicator that Google is looking at phrases. Bill Slawski’s patent-analysis-heavy SEO blog has a lot more to say on that subject, specifically on an indexing scheme that addresses the problems that indexing stopwords might otherwise cause.
Also, there’s a direct series of patents on “Phrase-Based Indexing.”
Finally, although I don’t recall a link, there seems to be a belief that:
- Google is using or moving to Latent Semantic Indexing (LSI)
- Word-based LSI is patented by somebody else.
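To make the stopword observation above concrete, here is a minimal Python sketch. It is purely illustrative and is not a description of how Google works: the stopword list, the two sample documents, and the function names are all assumptions made up for the example. It shows why a word-based matcher that drops stopwords cannot tell The Fly from Fly, while a phrase-aware matcher can.

```python
# Hypothetical stopword list and documents, chosen only for illustration.
STOPWORDS = {"the", "a", "an", "of", "or", "and"}

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def bag_match(query, doc):
    """Word-based match: every non-stopword query term appears somewhere in doc."""
    terms = [t for t in tokenize(query) if t not in STOPWORDS]
    doc_terms = set(tokenize(doc))
    return all(t in doc_terms for t in terms)

def phrase_match(query, doc):
    """Phrase match: the query tokens, stopwords included, appear contiguously."""
    q, d = tokenize(query), tokenize(doc)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))

docs = {
    "movie": "the fly is a 1986 horror film",
    "fishing": "how to tie a fly for trout fishing",
}

# Dropping stopwords makes the two queries indistinguishable.
word_hits_the_fly = [k for k, v in docs.items() if bag_match("The Fly", v)]
word_hits_fly = [k for k, v in docs.items() if bag_match("Fly", v)]

# Keeping the stopword and requiring adjacency separates them.
phrase_hits_the_fly = [k for k, v in docs.items() if phrase_match("The Fly", v)]

print(word_hits_the_fly, word_hits_fly, phrase_hits_the_fly)
```

The cost of the phrase-aware version is that stopwords have to be kept in the index along with their positions, which is exactly the kind of overhead a specialized indexing scheme would need to address.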
Categories: Google, Search engines | 3 Comments |
Lynda Moulton on enterprise search
Lynda Moulton and I see enterprise search quite similarly, as I discovered when she called me yesterday to praise my post on the many differences between enterprise and web search, and followed up with this one of her own. One of Lynda’s big themes is that large enterprises, much as they use multiple database management systems, use multiple search engines too.
An interesting Matt Cutts interview from December
Stephen Spencer has a great interview with Matt Cutts of Google, from last month’s Pubcon. Almost all of it is SEO-related. But it also contains a few tidbits that may be interesting even if one doesn’t care about SEO, such as:
- Google now indexes up to 1/2 a megabyte per page, up from the old 101K limit.
- Google needs to do a fair amount of image recognition, but they’re going fairly plain-vanilla. For Flash they use an Adobe-supplied SDK. For detecting hidden text (e.g., white-on-white) they use what Matt characterizes as pretty simple heuristics.
- As I noted recently, Google seems to have a lot of heuristics for identifying particular types of pages. In this interview, the example was that a page that would otherwise seem spammy because it consisted only of links would be fine if it were serving as a true site map or archive.
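For a sense of what a “pretty simple heuristic” for white-on-white text could look like, here is a hedged Python sketch. It is an assumption made for illustration, not anything Matt described: the function names, the hex-color input format, and the threshold of 30 are all hypothetical. It flags text whose color is nearly identical to its background color.

```python
def parse_hex(color):
    """Turn '#rrggbb' or '#rgb' into an (r, g, b) tuple of ints."""
    c = color.lstrip("#")
    if len(c) == 3:
        # Expand shorthand such as 'fff' to 'ffffff'.
        c = "".join(ch * 2 for ch in c)
    return tuple(int(c[i:i + 2], 16) for i in (0, 2, 4))

def looks_hidden(text_color, background_color, threshold=30):
    """Flag text whose color is almost the same as its background.

    The distance is the sum of absolute per-channel differences (0 to 765);
    the threshold is an arbitrary value picked for this example.
    """
    fg, bg = parse_hex(text_color), parse_hex(background_color)
    distance = sum(abs(a - b) for a, b in zip(fg, bg))
    return distance < threshold

print(looks_hidden("#ffffff", "#fff"))     # identical colors
print(looks_hidden("#000000", "#ffffff"))  # black on white
```

A real crawler would also have to resolve the effective colors from stylesheets and handle text hidden by positioning or tiny font sizes, none of which this sketch attempts.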
SEO highlights included: Read more
Categories: Google, Search engine optimization (SEO), Search engines
19 bullet points about the difference between enterprise and web search
Eric Lai wrote in this week’s Computerworld about “Why is enterprise search harder than Google Web search?” Highlights included: Read more
Categories: Attivio, Enterprise search, FAST, Google, Search engines | 16 Comments |
A claim that Google is doing pretty detailed extraction
A blog post focusing on SEO for local search makes some interesting claims, including:
- Google knows what a review is. (This seems to be “everybody knows it” conventional wisdom.)
- Google knows how many stars a review got. (Ditto.)
- Google tracks who the reviewer is and how many other reviews s/he wrote (that’s the big insight of the post and related conversation).
Pretty interesting. Text mining companies are paying a lot of attention to Voice-of-the-Market these days; even so, I question whether they can do the same things out of the box.
Categories: Competitive intelligence, Google, Search engines | 1 Comment |
How Google’s technology took flight
For those who missed the original publication in April, 2002.
Categories: Fun stuff, Google, Humor, Search engines | 2 Comments |
More on Microsoft in enterprise search
Following up on my prior posts about Microsoft’s impending acquisition of FAST, they’ve now had the conference call. By custom and indeed antitrust law, such calls are very light on content. But here are a few tidbits and takeaways, all from Jeff Raikes of Microsoft:
- Jeff talked solely about FAST as adding to enterprise search, and rightly contrasted that with web search.
- However, he deflected questions about web search with “We aren’t talking about that much detail right now” rather than with a firm “Well, we aren’t allowed to use FAST that way.”
- Specifically, enterprise search is all about integration with SharePoint (portal).
- Jeff said Microsoft’s current search could handle millions or maybe tens of millions of documents, but thought there was demand for FAST’s ability to handle billions.
- He positioned FAST as an application development platform, giving an example of structured search (the actual word was “pivot”) in consumer electronics. … Well, at least he’s looking in the right direction.
Categories: Enterprise search, FAST, Microsoft, Search engines, Structured search | 1 Comment |
Microsoft in enterprise search
Microsoft has certainly had a number of false starts in search. At the 1997 Verity user conference, a Microsoft employee told me of his confidence Microsoft would surpass Verity in enterprise search the next year. Yeah, right.
In September, 2003, a nice woman wrote me to tell me she had joined Microsoft and would personally write the ranking engine for MSN search. That worked out great too.
Now Microsoft has a multi-faceted enterprise search strategy. Guy Creese seems mightily impressed. Should we, for once, be impressed too?
Frankly, yes. So far as I can tell, most traditional text search products have atrophied, including Verity before it was bought by Autonomy. And I’m skeptical about Autonomy’s Bayesian-everything approach. Oracle and Google, in different ways, consistently fail to round out their products. So if FAST’s technology can ever be fleshed out and stabilized, it indeed could be a market leader or even dominator.
Categories: Enterprise search, FAST, Microsoft, Search engines
Microsoft is buying FAST; what about FAST’s contractual prohibition?
As you’ve probably heard by now, Microsoft is buying enterprise search vendor FAST (Fast Search & Transfer). FAST wasn’t always focused on enterprise search; in fact, FAST built alltheweb.com. And when FAST sold alltheweb.com to Inktomi, it agreed not to reenter the web search business itself. Inktomi was subsequently bought by Yahoo, a company not much inclined to do Microsoft any favors in the web search arena.
I look forward to hearing why this won’t be a problem.
Categories: Enterprise search, FAST, Microsoft, Search engines, Yahoo | 4 Comments |