Where “semantic” technology is or isn’t important
At Lynda Moulton’s behest, I spoke a couple of times recently on the subject of where “semantic” technology is or isn’t likely to be important. One was at the Gilbane conference in early December. The slides were based on my previously posted deck for a June talk I gave on a text analytics market overview. The actual Gilbane slides may be found here.
My opinions about the applicability of semantic technology include:
- The big bucks in web search are for “transactional” web search, and semantics isn’t the issue there. (Slides 3-4)
- When UIs finally go beyond the simple search box — e.g. to clusters/facets or to voice — semantics should have a role to play. (Slide 5)
- Public-facing site search depends — more than any other area of text analytics — on hand-tagging. (Slide 7)
- “Enterprise” search that searches specialized external databases could benefit from semantic technologies. (Slide 8)
- True enterprise search could benefit from semantic technologies in multiple ways, but has other problems as well. (Slides 10-11)
- Semantics — specifically extraction — is central to custom publishing. (Slide 12 — upon review I regret using the word “sophisticated”)
- Semantics is central to text mining. (Slide 18)
- Semantics could play a big role in all sorts of exciting future developments. (Slide 19)
So what would your list be like?
Categories: Enterprise search, Ontologies, Search engines, Specialized search, Structured search | 5 Comments |
Google is reported to be cutting back
Google seems to be cutting back its workforce, or at least radically scaling back its growth plans. It’s tough to quickly assess details just based on the blogosphere, given all the Google hate out there. But WebGuild Silicon Valley offers a post claiming that Google’s 20,000 actual employees are paired with 10,000 more contractors, and the latter are being pared way back. Various other posts linked in the comment thread say similar things.
Before you get too excited about hiring opportunities, however, it’s not obvious how many victims are in the core search business in any capacity, and it’s certainly not clear whether anybody is being let go in areas like search algorithm research.
Categories: Google, Search engines | Leave a Comment |
More website weirdness
Here’s something longer-lasting and weirder than Vertica’s “We sell turkeys” theme: Mark Logic, whose product is used primarily to help enterprises make their content more accessible, doesn’t have a search engine on its own website.* Read more
Categories: ClearForest/Reuters, Custom publishing, Mark Logic, Search engines | 7 Comments |
The silly fuss over Obama’s use of YouTube
President-Elect Barack Obama is posting videos on YouTube. Clearly, his use of relatively cutting-edge communications technology is a Good Thing. It’s also unsurprising, given the sophistication and importance of video in the recent presidential campaign.
However, various commentators — even ones as smart as Dan Farber — see something wrong with the use of YouTube for this purpose. I think that’s silly. Read more
Categories: Google, Social software and online media | Leave a Comment |
Are denial-of-insight attacks a threat to search logs and/or VOTC/VOTM apps?
TechTaxi points out that it’s at least theoretically possible to pollute somebody’s web-wide information gathering by polluting the Web. (Hat tip to Daniel Tunkelang.) They further assert this is a relatively near-term threat.
The theory can’t be denied. What’s more, bad actors have other motives to pollute the Web. For example, if they plant favorable automated comments about their own products or unfavorable ones about the competition’s, Voice of the Customer/Market applications will naturally be confused. And if automated reputation-checkers get more prominent, there will be a major incentive to game them, just as there has been for Google’s PageRank. So VOTC/VOTM market research tools could be polluted as a side effect.
Similarly, if somebody wants to test your e-commerce site by throwing a ton of searches at it, your search logs will lose value.
But disinformation of competitors for the sake of disinformation? Or, as the article suggests, vandalism/extortion? Off the top of my head, I’m not thinking of a serious near-term threat scenario.
Categories: Competitive intelligence, Search engines, Spam and antispam, Voice of the Customer | 2 Comments |
The Google flu search story is pretty interesting
Google reports that it is tracking flu outbreaks via search. Actually, that’s a misnomer. Google is not tracking articles written about flu; HealthMap et al. do that. Rather, this Google project is tracking search queries about flu-related subjects. They have graphs suggesting a strong correlation between flu-related searches and actual cases of flu, notwithstanding that many searches on “flu” would be for, say, “flu shot.” The key point is that Google tracks where searches come from, and hence detects which geographical areas are suffering flu outbreaks. And it does this 1-2 weeks faster than the alternative method, which is physicians reporting to the Centers for Disease Control (CDC).* Read more
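To make the idea concrete, here is a rough Python sketch of what counting flu-related queries per region and comparing them with reported cases might look like. It is not Google’s actual method, which isn’t described here. The log format, the term list, the region codes, and every number below are invented for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical term list; the real query set is not given in the post.
FLU_TERMS = ("flu", "influenza")


def weekly_flu_query_counts(log, region):
    """Count flu-related queries per week for one region.

    `log` is an iterable of (week, region, query) tuples, a made-up format.
    Note that a query like "flu shot" is counted too, as the post points out.
    """
    counts = Counter()
    for week, reg, query in log:
        if reg == region and any(term in query.lower() for term in FLU_TERMS):
            counts[week] += 1
    return counts


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented toy data, for illustration only.
toy_log = [
    (1, "TX", "flu symptoms"), (1, "TX", "pizza near me"),
    (2, "TX", "flu shot"), (2, "TX", "influenza fever"),
    (3, "TX", "flu remedies"), (3, "TX", "flu symptoms"),
    (3, "TX", "flu in kids"), (3, "NY", "flu shot"),
]
toy_reported_cases = {1: 10, 2: 20, 3: 30}  # invented, not CDC figures

counts = weekly_flu_query_counts(toy_log, "TX")
weeks = sorted(toy_reported_cases)
query_series = [counts[w] for w in weeks]
case_series = [toy_reported_cases[w] for w in weeks]
r = pearson(query_series, case_series)
```

The appeal of the query series is that it is available immediately, whereas physician reports arrive later. A real system would presumably need far more care in choosing terms and normalizing for overall search volume.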
Categories: Google, Search engines | 2 Comments |
Lukewarm review of Yahoo mobile search
Stephen Shankland reviewed Yahoo’s mobile voice search, which works by taking voice input and returning results onscreen (in his case on his BlackBerry Pearl). He found:
- There are plenty of times when voice is a more convenient form of input than typing.
- Voice recognition was good but far from perfect.
- Editing search strings was annoyingly difficult.
- Search results themselves aren’t 100% perfect.
No big surprises there. 😀
Categories: Language recognition, Search engines, Specialized search, Speech recognition, Yahoo | Leave a Comment |
Google and the Authors Guild establish an ASCAP for books
Most of the coverage of the Google/Authors Guild settlement today seems to focus on Google’s side of things. But I think the authors’ side is much more important. This deal paves the way for traditional publishers to become quaint and useless — and not a moment too soon.
Below are some quotes — fair use!! 🙂 — from the Authors Guild official statement on the deal (emphasis mine): Read more
Categories: Google, Search engines, Social software and online media, Specialized search | Leave a Comment |
Maybe text mining SHOULD be playing a bigger role in data warehousing
When I chatted last week with David Bean of Attensity, I commented to him on a paradox:
Many people think text information is important to analyze, but even so data warehouses don’t seem to wind up holding very much of it.
Categories: Attensity, Comprehensive or exhaustive extraction, Sentiment analysis, Text mining | 5 Comments |
Attensity update
I had a brief chat with the Attensity guys at their Teradata Partners Conference booth – mainly CTO David Bean, although he did buck one question to sales chief Jeff Johnson. The business trends story remained the same as it was in June: The sweet spot for new sales remains Voice of the Customer/Voice of the Market, while on-premise/SaaS new-name accounts are split around 50-50 (by number, not revenue).
David’s thoughts as to why the SaaS share isn’t even higher – as it seems to be for Clarabridge* – centered on the point that some customers want to blend internal and external data, and may not want to ship the internal part out to a SaaS provider. Besides, if it’s tabular data, I suspect Attensity isn’t the right place to ship it anyway.
*Speaking of Clarabridge, CEO Sid Banerjee recently posted a thoughtful company update in this comment thread.
When I challenged him on ease of use, David said that Attensity is readying a MicroStrategy-based offering, which is obviously meant to compete with Clarabridge and any of its perceived advantages head-on.