The text technologies market 4: Requirements for an industry-altering ontology management system
In previous posts I argued that what’s holding the text technology industry back is the lack of a viable ontology management system. The obvious objection to such a suggestion is: Who would use it? There is no business process for ontology management, even less than there is for “knowledge management,” and for that matter less than there was for “knowledge engineering” during the expert systems bubble of the 1980s. Enterprises do not have anything like a “chief ontologist.” Indeed, that job title sounds like a joke — a touchy-feely liberal-artsy nonstarter.
The only way a successful product category of ontology management systems can emerge is if the products are usable by ordinary IT personnel. Vendor-supplied product training can be required, of course. Some day there can be certifications, and maybe a single class in a computer science curriculum. But almost nobody is going to buy a product whose use requires a master’s degree in library science or “ontology management.”
So here are some very high-level requirements I think an ontology management system needs to meet.
1. Basic knowledge representation has to be flexible. It has to accommodate semantic net kinds of relationships (is_an_instance_of, is_a_subcategory_of). It also has to accommodate machine learning/statistical kinds of evidence (both positive and negative evidence).
2. There has to be strong layering/versioning. Pieces of the ontology will come from the vendor. Pieces will come from frequently-updated machine-learning exercises against an enterprise’s own corpus(es). Pieces will be added by hand, through a collaboration between IT and (at first) power users. It will have to be possible to reverse any of those pieces out, to apply different pieces for different specific applications, and so on.
3. There need to be standard, open ways for different kinds of applications to use the ontologies. UIMA could be a starting point.
4. The product needs to be industrial-strength – reliable, scalable, secure, sufficiently easy to administer, available on a sufficient range of platforms, and compliant with general standards (not just the text-specific ones).
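To make requirements 1 and 2 a little more concrete, here is a minimal Python sketch of one way the data model could look. It is an illustration under assumptions, not a description of any existing product: the names `Relation`, `Evidence`, `Layer`, and `LayeredOntology`, the layer names, and the example terms and weights are all hypothetical. Each layer holds both semantic-net relationships and signed statistical evidence, and layers can be applied, reversed out, or selected per application.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set


@dataclass(frozen=True)
class Relation:
    """A semantic-net style fact, e.g. (sedan, is_a_subcategory_of, car)."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Evidence:
    """Statistical evidence linking a term to a concept.

    A positive weight supports the link; a negative weight counts against it.
    """
    term: str
    concept: str
    weight: float


@dataclass
class Layer:
    """One independently versioned piece of the ontology (hypothetical model)."""
    name: str       # e.g. "vendor", "learned", "manual"
    version: int
    relations: Set[Relation] = field(default_factory=set)
    evidence: List[Evidence] = field(default_factory=list)


class LayeredOntology:
    def __init__(self) -> None:
        self._layers: Dict[str, Layer] = {}

    def apply(self, layer: Layer) -> None:
        # Applying a layer with an existing name replaces the old version.
        self._layers[layer.name] = layer

    def reverse_out(self, name: str) -> Optional[Layer]:
        # Remove a layer without touching the others; returns it, or None.
        return self._layers.pop(name, None)

    def _select(self, only: Optional[Iterable[str]]) -> List[Layer]:
        if only is None:
            return list(self._layers.values())
        wanted = set(only)
        return [layer for layer in self._layers.values() if layer.name in wanted]

    def relations(self, only: Optional[Iterable[str]] = None) -> Set[Relation]:
        # Union of the relations in the selected layers.
        merged: Set[Relation] = set()
        for layer in self._select(only):
            merged |= layer.relations
        return merged

    def score(self, term: str, concept: str,
              only: Optional[Iterable[str]] = None) -> float:
        # Net evidence: positive and negative weights simply add up.
        return sum(e.weight
                   for layer in self._select(only)
                   for e in layer.evidence
                   if e.term == term and e.concept == concept)


# Example data below is invented purely for illustration.
onto = LayeredOntology()
onto.apply(Layer("vendor", 1, relations={
    Relation("sedan", "is_a_subcategory_of", "car")}))
onto.apply(Layer("learned", 7, evidence=[
    Evidence("jaguar", "car", 0.5),
    Evidence("jaguar", "car", -0.25)]))
onto.apply(Layer("manual", 2,
                 relations={Relation("XJ6", "is_an_instance_of", "sedan")},
                 evidence=[Evidence("jaguar", "car", 0.25)]))

full_score = onto.score("jaguar", "car")
without_learned = onto.score("jaguar", "car", only=["vendor", "manual"])
```

Here the `only` argument stands in for "different pieces for different specific applications," and `reverse_out` stands in for backing a piece out. A real system would also need persistence, conflict resolution between layers, and access control, none of which this sketch attempts.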
Obviously, these requirements are nontrivial to achieve. But if some vendor does do a good job on them, the payoff could be huge. Dominance of the enterprise text technologies market – which would be a greatly expanded market – is at stake.
Comments
4 Responses to “The text technologies market 4: Requirements for an industry-altering ontology management system”
[…] Nobody is doing anything about the platform advances I think are necessary. However, when prodded, they admit that something like that is needed, and the technology really isn’t finished or a commodity after all. But some other company should do it, because they aren’t going to. Arggh. […]
[…] Terry McFadden of Procter & Gamble made a number of interesting points in his Text Analytics Summit talk, in the area of how to build and “amass” (his word) lexicons. Above all, I’m thrilled that he recognized the necessity of amassing lexicography that can be reused from one app to the next. Beyond that, specific comments and tips included: […]
[…] I’ve argued previously that enterprises need serious ontologies, and that this lack is holding back growth in multiple areas of text technology – search, text mining and knowledge extraction, various forms of speech recognition, and so on. The core point was: The ideal ontology would consist mainly of four aspects: […]
[…] Data cleaning/quality versatility. Informatica acquired the Similarity product some months ago, which they assert is more modern than some competitors, and hence better suited to handle data beyond names/addresses. A key example would be product hierarchies/taxonomies. I suggested they explore whether this could be leveraged for enterprises’ text technology architectures, specifically in the area of ontology management. […]