December 1, 2010

The state of the art in text analytics applications

Text analytics application areas typically fall into one or more of three broad, often overlapping domains:

For several years, I’ve been distressed at the lack of progress in text analytics or, as it used to be called, text mining. Yes, the rise of sentiment analysis has been impressive, and higher volumes of text data are being processed than were before. But otherwise, there’s been a lot of the same old, same old. Most actual deployed applications of text analytics or text mining go something like this:

Often, it seems desirable to integrate text analytics with business intelligence and/or predictive analytics tools that operate on tabular data. Even so, such integration is most commonly weak or nonexistent. Apart from the usual reasons for silos of automation, I blame this lack on a mismatch in precision, among other reasons. A 500% increase in mentions of a subject could be simple coincidence, or the result of a single identifiable press article. In comparison, a 5% increase in a conventional business metric might be much more important.
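To make the precision mismatch concrete, here is a minimal sketch with made-up numbers. The function names, the source labels, and the weekly counts are all hypothetical. It shows how a 500% jump in mentions can trace back almost entirely to one article, which is the kind of check one would want before feeding such a number into a BI dashboard.

```python
from collections import Counter

def percent_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) * 100.0 / before

def top_source_share(mention_sources):
    """Return (source, share) for the most frequent source among the mentions."""
    if not mention_sources:
        return None, 0.0
    source, count = Counter(mention_sources).most_common(1)[0]
    return source, count / len(mention_sources)

# Hypothetical data: 10 mentions last week, 60 this week,
# 48 of which quote or link to the same press article.
last_week_count = 10
this_week = ["article_A"] * 48 + ["blog_B"] * 7 + ["forum_C"] * 5

print(percent_change(last_week_count, len(this_week)))  # 500.0
print(top_source_share(this_week))                      # ('article_A', 0.8)
```

In this invented case, 80% of the mentions come from one source, so the headline increase says little about broad opinion.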

But in fairness, the text analytics innovation picture hasn’t been quite as bleak as what I’ve been painting so far. While standalone, passively-reported text analytics is indeed the baseline, there are some interesting exceptions. For example:

*When it comes to text analytics, “long” means “at least for the past several years.”

In more recent examples:

Finally, there are some applications that, while fitting the standard template, strike me as reaching unusually sophisticated levels of analysis. For example, Vertica told me of another Vertica/Hadoop case where VotM document analysis is carried out to the level of observing which order brand names appear in, and adjusting for whether or not that order was just an alphabetical list.
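I don't know how that adjustment was actually implemented. As a rough sketch of the idea only, assuming the brand names have already been extracted from a document in order of appearance, one could discard the ordering whenever it matches a plain alphabetical sort. The function names and brand names below are hypothetical.

```python
def is_alphabetical(brands):
    """True if the brands appear in case-insensitive alphabetical order."""
    keys = [b.casefold() for b in brands]
    return keys == sorted(keys)

def order_signal(brands):
    """Return the order of appearance, or None when the order is uninformative.

    The order is treated as uninformative when there are fewer than two
    brands or when the list is simply alphabetical.
    """
    if len(brands) < 2 or is_alphabetical(brands):
        return None
    return list(brands)

print(order_signal(["Globex", "Acme", "Initech"]))  # ['Globex', 'Acme', 'Initech']
print(order_signal(["Acme", "Globex", "Initech"]))  # None
```

A real system would presumably need something softer than an exact match, since a short list can land in alphabetical order by chance.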

I suspect text analytics is about to become more interesting again.

Comments

12 Responses to “The state of the art in text analytics applications”

  1. The six useful things you can do with analytic technology | DBMS 2 : DataBase Management System Services on January 3rd, 2011 8:12 am

    […] technologies as applied to non-tabular data types such as text or […]

  2. Mega-trends driving data warehousing and business intelligence | DBMS 2 : DataBase Management System Services on January 22nd, 2011 3:07 pm

    […] Human/nontabular, e. g. what is best handled via text analytics. […]

  3. Upcoming webinar on investigative analytics | DBMS 2 : DataBase Management System Services on February 12th, 2011 7:32 am

    […] technologies as applied to non-tabular data types such as text or […]

  4. Greg on June 9th, 2011 12:51 am

    I’d like to see a journal that could cipher sentiment analysis and guide the journalist/individual to the motivational aspirations and offer inspirational content to help them enrich their perspective and chance of moving to a greater state of well-being.

    I have a model to help just need the programmer to design the site in a collaborative and altruistic endeavor.

  5. Text data management, Part 1: Confusion | DBMS 2 : DataBase Management System Services on October 10th, 2011 8:58 pm

    […] analytic technology vendors ignore what the text analytic vendors actually have accomplished, and reinvent inferior wheels rather than OEM the state of the […]

  6. Historical notes on the departmental adoption of analytics | Software Memories on January 17th, 2012 3:05 am

    […] as having to do with artificial intelligence – e.g. expert systems, predictive analytics* and text analytics — have wound up with applications being concentrated in the same few […]

  7. Applications of an analytic kind : DBMS 2 : DataBase Management System Services on February 11th, 2012 8:32 pm

    […] a lot of application potential, general-purpose text analytics technology has floundered. But when text analytics technology is […]

  8. Cool analytic stories | DBMS 2 : DataBase Management System Services on May 21st, 2012 2:25 am

    […] are in telecommunications, specifically in churn prevention. My favorite may be the case of multilingual text analytic integration in Switzerland: I once confirmed that SPSS customer Cablecom‘s statistical models for churn and the like […]

  9. Data as an asset | DBMS 2 : DataBase Management System Services on September 21st, 2014 10:49 pm

    […] data. On the whole I’ve been disappointed by the progress in text analytics. Still — and this overlaps with some previous points — there’s a lot of […]

  10. Where the innovation is | DBMS 2 : DataBase Management System Services on January 19th, 2015 3:28 am

    […] long been disappointed in the progress in text analytics. But sentiment analysis is doing fairly well, many more languages are analyzed than before, and I […]

  11. The three principal kinds of analytic business benefit | DBMS 2 : DataBase Management System Services on April 4th, 2015 12:14 am

    […] Flashing forward to 2009, I unearthed a list of specific marketing uses for analytics, originally compiled by Mike Ferguson. That same post starts with a Teradata-supplied list of cases in which you’d want the benefits of your analytics to be delivered near-real-time. And finally, a few months ago, I opined that text analytics application areas typically fall into one or more of three broad, often overlapping d…: […]

  12. Notes, links and comments, May 2, 2015 | DBMS 2 : DataBase Management System Services on May 2nd, 2015 9:36 am

    […] In 2010 I wrote that the use of textual news information in investment algorithms had become “more common”. It’s become a bigger deal since. For […]
