Procter & Gamble on text mining projects
Terry McFadden of Procter & Gamble made a number of interesting points in his Text Analytics Summit talk about how to build and “amass” (his word) lexicons. Above all, I’m thrilled that he recognized the necessity of amassing lexical resources that can be reused from one app to the next. Beyond that, specific comments and tips included:
- Multiple people work on lexicons, in no small part because P&G is active in so many countries, with so many languages.
- Thus, they need great collaboration tools. Your specific tools for building taxonomies/ontologies/lexicons have to play well with your favorite collaboration tools (e.g., for desktop sharing).
- There’s only one computational linguist in the whole company (of 120,000 people). Maybe next year there will be two.
- Semi-seriously, they talk about whether they need poets to “think outside the box” about how language is used, to help with the lexicon building.
Elsewhere in his talk, McFadden held forth in generalities about how to gain senior management’s support for new technology. There was too much of that at the Summit, although he wasn’t the worst offender. And he completely endeared himself to me when he explicitly made the Groucho Marx citation.
Comments
2 Responses to “Procter & Gamble on text mining projects”
[…] ClearForest is one of the two companies whose name comes up for fact extraction applications, probably even a little ahead of Attensity. Their flagship account is the GM deal they did with IBM, kicking off the whole warranty report mining boom. Procter & Gamble is no slouch of a customer either. They’re involved enough in anti-terrorism that, when I asked Jay if he knew who Cogito was, he said “Of course.” And apparently one of their techie founders is the guy who coined the term “text mining” in the first place. […]
[…] of known computational linguists working for end-user organizations, worldwide, was precisely 1, at Procter & Gamble. (Intelligence agencies excepted, of course.) I’d guess it’s higher now, but I probably […]