Application processes in text mining – finding warning signs
Sergei Ananyan’s claim that analytic business processes involving text are still very primitive is absolutely correct. Indeed, analytic business processes have a lot of maturing to do overall. Still, there’s one area where the industry has devoted a lot of thought over the past few years, and some notion of process has emerged: finding warning signs.
Note: Hat tip to Attensity for focusing on this in our talk today, and even more in the part of the slide deck we didn’t actually go over. That said, they’re far from the only vendor to be thinking along these lines.
If we look at the major application areas for text mining, most of them fit more or less neatly into the “warning signs” bucket. In particular, that’s true of:
- Vehicle safety
- Other manufacturing/warranty analysis apps
- Reputation management (for the most part)
- Other customer sentiment apps (some, perhaps most)
- Anti-terrorism
- Sarbanes-Oxley compliance
- Antifraud
- Stopping money laundering
- Clinical applications (some)
- Early insurance risk management apps
- Early experimental hedge fund apps
And you can probably think of more examples yet.
So what are some processes used to deal with these apps?
1. In some cases, one has ongoing trouble, and is trying to diagnose it so as to prevent more occurrences. Sometimes there even are regular write-ups of known bad situations, such as warranty claims (technician or customer reports), insurance claims, Suspicious Activity Reports (for money laundering), etc. Then one can mine those write-ups to extract any facts that seem to be prevalent in those situations. This kicks off a standard data mining process: form and test some hunches, test some more, build an appropriate rule set, and get the model into operational production. Depending on the case, production means either real-time (or real-enough-time) decisioning, or else a place of honor on dashboards and other performance monitors.
2. When the write-ups aren’t so regular, one can do the same thing anyway. An example might be correspondence from customers who later canceled their accounts.
3. In other cases, one is looking for trouble even before one has found some. Compliance often falls into this category, as does web-crawling reputation management. One process, favored by Autonomy, is simply to monitor document flow for all important themes, and hope that the trouble signs jump out at you. Alternatively, one can monitor documents for known bad event flags – vehicle malfunctions, drug side effects, angry customers, whatever. If there are only a few documents with such flags, one can read them directly. If there are too many for humans to just read and digest in a timely manner – well, then you’ve transitioned into Case 1 or Case 2!
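To make the hand-off from Case 3 to Case 1 or 2 concrete, here is a minimal Python sketch. It is not any vendor's method. The flag list, the `HUMAN_READ_LIMIT` cutoff, the function names, and the simple "lift" score (how much more often a term shows up in bad write-ups than in ordinary documents) are all assumptions made up for illustration; a real system would use proper linguistic extraction rather than bare word matching.

```python
import re
from collections import Counter

# Hypothetical flag terms; a real deployment would use a curated lexicon.
BAD_EVENT_FLAGS = {"stalled", "fire", "rash", "refund", "furious"}
# Hypothetical cutoff for how many flagged documents people can just read.
HUMAN_READ_LIMIT = 3


def tokenize(text):
    """Lowercase a document and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def flag_documents(documents, flags=BAD_EVENT_FLAGS):
    """Case 3: keep the documents that mention any known bad-event flag."""
    return [doc for doc in documents if tokenize(doc) & flags]


def prevalent_terms(bad_docs, baseline_docs, min_lift=2.0):
    """Case 1/2: find terms far more common in bad write-ups than in baseline.

    Lift is the share of bad documents containing the term divided by a
    smoothed share of baseline documents containing it.
    """
    bad_counts = Counter(t for d in bad_docs for t in tokenize(d))
    base_counts = Counter(t for d in baseline_docs for t in tokenize(d))
    results = {}
    for term, n_bad in bad_counts.items():
        bad_rate = n_bad / len(bad_docs)
        # Add-one smoothing so terms absent from the baseline do not divide by zero.
        base_rate = (base_counts[term] + 1) / (len(baseline_docs) + 1)
        lift = bad_rate / base_rate
        if lift >= min_lift:
            results[term] = lift
    return results


def triage(documents, baseline_docs):
    """Read flagged documents directly if there are few; otherwise mine them."""
    flagged = flag_documents(documents)
    if len(flagged) <= HUMAN_READ_LIMIT:
        return ("read", flagged)
    return ("mine", prevalent_terms(flagged, baseline_docs))
```

Calling `triage(incoming_docs, ordinary_docs)` returns either `("read", [...])` with the handful of flagged documents, or `("mine", {...})` with candidate terms and their lift scores. Those candidate terms are only hunches to test further before anything becomes a rule set.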