July 27, 2006

More on Attensity

I had a long and far-ranging talk today with Attensity. Key takeaways included:

They want to be positioned in the BI space, e.g. as “ETL for text/unstructured data.” They place a lot of value on their partnership with Business Objects and Teradata. And, as a key part of the exhaustive extraction/FRN story, they think that BI/data warehouse information roll-up tools are an excellent (if imperfect) substitute for hardcore semantic extraction.
Attensity 4 is coming out soon, and it’s going to feature extremely multimodal text analysis.

July 26, 2006

Pioneers moving on

Ramana Rao is leaving Inxight, or has by now. Today I also discovered that Todd Wakefield is leaving Attensity. Such things happen in all industries, of course.

Categories: Attensity, Business Objects and Inxight, Text mining

Megaputer on the text mining market

Sergei Ananyan is president of Megaputer, which is not one of the easier companies to get information about. They’re an essentially Russian firm based in Bloomington, Indiana. Their website is, to put it kindly, not up to date. And I wound up speaking with Sergei while he was at his rural vacation house, located somewhere between the Black and Aral Seas.

However, Sergei followed up by email with his views of the marketplace, and I think they’re interesting enough to share below. I really like his focus on analytic business processes, something that generally doesn’t get enough consideration.

Categories: Megaputer, Text mining

1 Comment

July 23, 2006

Introduction to ClearForest

I had a fascinating talk with Jay Henderson of ClearForest Friday. While I have more research to do before I know what I really think, there already is plenty to post about.

ClearForest is one of the two companies whose name comes up for fact extraction applications, probably even a little ahead of Attensity. Their flagship account is the GM deal they did with IBM, kicking off the whole warranty report mining boom. Procter & Gamble is no slouch of a customer either. They’re involved enough in anti-terrorism that, when I asked Jay if he knew who Cogito was, he said “Of course.” And apparently one of their techie founders is the guy who coined the term “text mining” in the first place.

Categories: ClearForest/Reuters, Text mining

1 Comment

July 23, 2006

Autonomy on text mining

I asked Mike Lynch (Autonomy CEO) about text mining. He responded with an example:

A very well-known company “mines” its incoming emails for signs of trouble, not via any linguistics-driven approach, but just by clustering them. If a cluster changes size anomalously over time, it bears close investigation.

Categories: About this blog, Autonomy, Search engines, Text mining

1 Comment

July 23, 2006

Update: Autonomy/Verity merger

I had a couple of very interesting calls with Autonomy last week. One message I got was that they do not want to be pigeonholed in search, which they think on the whole is a primitive way of dealing with “unstructured information.” Nonetheless, my first post based on those calls will indeed focus on text indexing and search. You see, I wrote quite skeptically about the Autonomy/Verity merger when it was announced, and I’d like to amend that with an updated opinion. Autonomy’s claims can be summarized in part by the following: Read more

Categories: About this blog, Autonomy, Enterprise search, Search engines

Lead UIMA architect Dave Ferrucci speaks about adoption

Dave Ferrucci, lead architect for UIMA, shared some detailed views with me about UIMA adoption. WIth his permission, they are reproduced below. UIMA is still not getting a lot of attention from commercial text analytics vendors, but ultimately I think it will prevail, if just because nobody cares enough to start a war of dueling alternative standards.* So it’s something you should educate yourself about as it progresses.

*And IBM plans to convince me ASAP that even that assessment is too negative, which it well may be. Stay tuned.

So to sum up — 1. We seem to have fair amount of traction with the UIMA framework by communities that are very interested in plug-n-play with components from other providers. This includes the government, life sciences and research communities. 2. The UIMA standard, as opposed to the specific Java Framework implementation, developed under an SDO will broaden the opportunity and strengthen the case of adoption of UIMA as a standard for text and multi-modal analytics that allows interoperability across different frameworks and applications. It would of course be the case that the Java UIMA Framework would comply to the standard.

The complete email follows.
Read more

Categories: About this blog, IBM and UIMA, Open source text analytics

2 Comments

July 11, 2006

Google’s internal text-based project/knowledge management

Slashdot turned up an amazing article in Baseline on Google’s infrastructure. There’s lots of gee-whiz stuff in there about server farms, petabytes of disk packed into a standard shipping container so as to allow the setup of more server farms around the globe, and so on. But even more interesting to me was another point, about Google’s internal use of its own technology. In at least one case – a hybrid of project and knowledge management – Google really seems to be doing what other firms only dream about as futures. Here’s the relevant excerpt:

Categories: About this blog, Enterprise search, Google, Search engines, Specialized search

2 Comments

June 24, 2006

Attensity, extractive exhaustion, and the FRN

Two of the clearest and most charismatic speakers in the text mining business are Attensity cofounders Todd Wakefield and David Bean. Last year, Todd’s Text Mining Summit speech gave an excellent overview of the various application areas in which text mining was being adopted; vestiges of that material may be found in a blog post I made at the time, and on Attensity’s web site. This time, David’s Text Analytics Summit speech was basically a pitch for Attensity’s latest product release – and it was a pitch well worth hearing.
Read more

Categories: Attensity, BI integration, Comprehensive or exhaustive extraction, Text Analytics Summit, Text mining

10 Comments

June 24, 2006

Procter & Gamble on text mining projects

Terry McFadden of Procter & Gamble made a number of interesting points in his Text Analytics Summit talk, in the area of how to build and “amass” (his word) lexicons. Above all, I’m thrilled that he recognized the necessity of amassing lexicography that can be reused from one app to the next. Beyond that, specific comments and tips included: Read more

Categories: About this blog, ClearForest/Reuters, Companies and products, Ontologies, Text Analytics Summit, Text mining

2 Comments

← Previous Page — Next Page →

Search our blogs and white papers

Monash Research blogs

DBMS 2 covers database management, analytics, and related technologies.
Text Technologies covers text mining, search, and social software.
Strategic Messaging analyzes marketing and messaging strategy.
The Monash Report examines technology and public policy issues.
Software Memories recounts the history of the software industry.

User consulting

Building a short list? Refining your strategic plan? We can help.

Vendor advisory

We tell vendors what's happening -- and, more important, what they should do about it.

Monash Research highlights

Learn about white papers, webcasts, and blog highlights, by RSS or email.

Links
- Monash Research
- White Papers
Admin
- Log in

More on Attensity

Pioneers moving on

Megaputer on the text mining market

Introduction to ClearForest

Autonomy on text mining

Update: Autonomy/Verity merger

Lead UIMA architect Dave Ferrucci speaks about adoption

Google’s internal text-based project/knowledge management

Attensity, extractive exhaustion, and the FRN

Procter & Gamble on text mining projects

Search our blogs and white papers

Monash Research blogs

User consulting

Vendor advisory

Monash Research highlights

Recent posts

Categories

Date archives

Admin