December 29, 2008
Where “semantic” technology is or isn’t important
At Lynda Moulton’s behest, I spoke a couple of times recently on the subject of where “semantic” technology is or isn’t likely to be important. One of those talks was at the Gilbane conference in early December. The slides were based on the deck I previously posted for a June talk that gave an overview of the text analytics market. The actual Gilbane slides may be found here.
My opinions about the applicability of semantic technology include:
- The big bucks in web search are for “transactional” web search, and semantics isn’t the issue there. (Slides 3-4)
- When UIs finally go beyond the simple search box — e.g. to clusters/facets or to voice — semantics should have a role to play. (Slide 5)
- Public-facing site search depends — more than any other area of text analytics — on hand-tagging. (Slide 7)
- “Enterprise” search that searches specialized external databases could benefit from semantic technologies. (Slide 8)
- True enterprise search could benefit from semantic technologies in multiple ways, but has other problems as well. (Slides 10-11)
- Semantics — specifically extraction — is central to custom publishing. (Slide 12 — upon review I regret using the word “sophisticated”)
- Semantics is central to text mining. (Slide 18)
- Semantics could play a big role in all sorts of exciting future developments. (Slide 19)
So what would your list be like?
Categories: Enterprise search, Ontologies, Search engines, Specialized search, Structured search
Comments
5 Responses to “Where “semantic” technology is or isn’t important”
[…] his views of the text analytics market through his blog and a slide presentation that he’s made available online. The presentation is refreshingly hype-free, and I recommend you take a […]
My list would be very short:
– small, trivial applications with a few hundred static documents: semantic technology not required
– everything else: benefits from semantic technology
The degree of benefit grows with query complexity, data volume, the rate at which content changes, and the volume of queries against that content. Public-facing sites in particular do not necessarily require manual tagging. Entity and relationship extraction provides an adequate means of handling the “standard” cases much more efficiently than manual effort (especially at high data volumes); of course, manual tagging may be added on top of that. Why? Well, human tagging tends to be biased, and it does not cover all the entities that could be tagged. Two years ago, somebody might have regarded “cloud computing” as a marginal topic and not bothered to tag it; today it is clear that it needs a tag. But who will now re-tag all the old content in light of new and changing topics of interest? This is the clear case for automatic extraction, clustering and classification technologies to provide a neutral basis for search and navigation, possibly augmented by manual efforts as needed.
Best regards,
–Jürgen
Tom Tague of Thomson Reuters spoke during the workshop at DataServices World in San Jose. He mentioned some interesting use cases for Calais (semantic metadata) technology, including event processing and knowledge discovery.
The slides, podcast and video (duration 17:30) are at:
http://www.DataServicesWorld.com/People/TTague.htm
It would help to be a bit more specific about what you mean by “semantic”. It’s a relative term, used to compare two levels of abstraction: whichever one is higher is the “semantic” one.
Do you specifically mean the W3C “semantic web”? Do you mean any kind of metadata that purports to convey “here is what the data is about”? Do you include (rudimentary or more) natural language processing of the data?
Dan,
I’m deliberately not giving a precise definition.
What people mean by “semantic technology” is usually “Something more or less like what Tim Berners-Lee has been going on about for a long time now”, but one of my main points is that the devil is in the details.