Data capture for the sake of text mining
One of the major factors driving successful use of advanced analytic tools is direct initiatives to procure more data. The single best example I can think of is the gaming industry’s use of otherwise-contrived loyalty cards; improved marketing based on that data at chains like Harrah’s seems to produce upwards of 100% of total profits.
So can we apply the same approach to text mining? One place would be surveys. Rather than those annoying, contrived forms demanding we fill in a lot of choices as if we were taking the SATs all over again, maybe users would be more revealing if they could just write whatever they wanted? The obvious firm to ask is SPSS, which is big both in surveys and text mining, not to mention the intersection of the two markets. So I emailed Olivier Jouve, and he shot back an answer from an airport.
Specifically, I asked:
It occurs to me that we could expand the text mining market a lot if we could prove the following claim:
“Customers hate filling out structured surveys. Just let them write what they want, and the response rate will be a lot higher. Then, text mine it, and you’ll discover what they really think.”
A. Is that true???
B. If it is, do you have proof points?
And he responded (lightly edited):
A) True
B) We have 1100 unique organizations using our product called “Text Analysis for Surveys” (launched 15 months ago). You know how strong SPSS is in the market research stuff … This product is a killer for extracting opinions, etc. We have been focusing on this technology for years … You need sophisticated analysis (linguistic dependencies, etc.) to extract good results, and to categorize on existing code frames (we propose some clustering algorithms to help build those code frames as well).
However, not all those customers are at the level you describe. They still mix open-ended and structured questions. But we observe more and more open-ended.
Thanks, Olivier! But, uh — what’s a code frame? Something to do with “coding” survey results?
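For what it's worth, in market-research usage a "code frame" generally means the list of categories (codes) that open-ended answers get assigned to, so "coding" a survey is the act of tagging each free-text response with one or more of those categories. Here is a minimal sketch of that idea in Python. Everything in it is an assumption for illustration: the categories, the keyword lists, and the function names `code_response` and `tally` are hypothetical, and plain keyword matching is far cruder than the linguistic analysis Olivier describes. It is not how SPSS's product works.

```python
import re
from collections import Counter

# Hypothetical code frame: category label -> keywords that signal it.
# A real code frame would be designed (or clustered) from actual responses.
CODE_FRAME = {
    "price": {"price", "expensive", "cheap", "cost"},
    "service": {"staff", "service", "rude", "helpful"},
    "wait_time": {"wait", "queue", "slow"},
}


def code_response(text, frame=CODE_FRAME):
    """Return the sorted codes whose keywords appear in one free-text answer."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    codes = sorted(code for code, keywords in frame.items() if words & keywords)
    # Answers that match no category are kept, not silently dropped.
    return codes or ["uncoded"]


def tally(responses, frame=CODE_FRAME):
    """Count how many responses were assigned to each code."""
    counts = Counter()
    for response in responses:
        counts.update(code_response(response, frame))
    return counts


responses = [
    "The staff were helpful but the wait was far too long.",
    "Too expensive for what you get.",
    "Loved the decor!",
]

for response in responses:
    print(code_response(response), "<-", response)
print(dict(tally(responses)))
```

The "uncoded" bucket is the interesting part: a large pile of uncoded answers is a hint that the code frame is missing categories, which is presumably where clustering would help.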
Comments
3 Responses to “Data capture for the sake of text mining”
[…] I’m a huge fan of the idea that companies should deliberately capture as much information as possible for analysis. In the case of text, since I personally hate structured survey forms, I believe that free-form surveys have the potential to capture a lot more information than traditionally Procrustean abominations do. SPSS indicated that there’s indeed some activity in this regard. […]
[…] Particularly interesting, I think, are some examples in the area of text data and analytics. […]
[…] Harrah’s successes in targeting gambling customers, even though I’m having trouble validating something I thought I’d read, namely that over 100% of Harrah’s profits came from analytics on loyalty card data. […]