ClearForest, Reuters, Factiva, Dow Jones, and possible futures
ClearForest is being acquired by Reuters. That ClearForest is being bought is unsurprising. The company recently pulled in its marketing horns dramatically, a common sign of putting oneself up for sale. The Reuters move, meanwhile, can be seen as a sequel to the divestiture of its half of Factiva to former 50-50 partner Dow Jones.
If the two main parts of the text mining market are custom publishing and finding warning signs, then both could actually be a good fit with Reuters. The custom publishing part is obvious. As for early warning – well, maybe ClearForest will lose its competitive edge in consumer product warranty analysis or something, but a significant fraction of the early warning market is tied to news articles, web postings, and other things that are a good fit for Reuters.
But the really interesting (at least to me) possibilities arise in the core Reuters and Dow Jones business of supporting investment decisions. Read more
Categories: ClearForest/Reuters, Factiva/Dow Jones, Text mining | Leave a Comment |
For search, extreme network neutrality must not be compromised
In a recent post on the Monash Report, I drew a distinction between two aspects of the Internet: Jeffersonet and Edisonet. Jeffersonet deals in thoughts and ideas and research and scholarship and news and politics, and in commerce too. It’s what makes people so passionate about the Internet’s democracy-enhancing nature. It’s what needs to be protected by extreme network neutrality. And it’s modest enough in its bandwidth requirements that net neutrality is completely workable. (Edisonet, by way of contrast, comprises advanced applications in entertainment, teleconferencing, etc. that probably do require new capital investment and tiered pricing schemes.)
And if there’s one application that’s at the core of Jeffersonet, it’s search. No matter how much scary posturing telecom CEOs do – and no matter how profitable or monopolistic Google becomes – telecom carriers must never be allowed to show any preference among search engines! At least, that’s the case for text-centric search engines such as those Google, Yahoo, and Microsoft run today. The reason is simple: The democratic part of the Internet only works so long as things can be found. And search will long be a huge part of how to find them. So search engine vendors must never be able to succeed based on a combination of good-enough results plus superior marketing and business development. They always have to be kept afraid of competition from engines that provide better actual search engine results. Read more
Comment freely!
Comments are working again, and I’m easier to reach by email too.
Categories: About this blog | 2 Comments |
Now is not a good time to post a comment on Text Technologies
As with DBMS2, I am moving Text Technologies to another hosting provider this weekend. Until the name server change has propagated, there’s no guarantee a comment will really land in the right place. By Monday this should be a non-issue.
Categories: About this blog | Comments Off on Now is not a good time to post a comment on Text Technologies |
TEMIS, part 2 – application areas
CEO Eric Bregand clearly described TEMIS as being in three markets – life sciences, publishing, and “industrial.” However, based on his descriptions, I’d characterize industrial as itself having three components – competitive intelligence, adverse impact detection, and customer satisfaction. Legal is somewhere in the mix too.
The common theme among these markets seems to be an emphasis on applications where complex semantic analysis is important. Actually, I think it would be expedient for TEMIS to use the marketing hook of saying the subjects it does analysis about are complex. Nobody likes to be told their software is complex, but they don’t mind being told they’re experts in a complex discipline themselves. Read more
Categories: TEMIS, Text mining | 2 Comments |
TEMIS, part 1 – overview
Due to various transatlantic communication glitches, I’d never had a serious briefing with text mining vendor TEMIS until yesterday, when I finally connected with CEO Eric Bregand. So here’s a quick TEMIS overview; I’ll discuss what they actually do in a separate post.
- TEMIS has 50 people; 3 main businesses and a couple of secondary ones; two larger offices in France; and smaller offices in Germany and the US. As would be expected, TEMIS’ customer base is concentrated in Continental Europe. The US exceptions seem concentrated in the life sciences vertical (not coincidentally, the US office is outside Philadelphia).
- Like Inxight, TEMIS is at least partly a spin-off from Xerox’s text analytics efforts. Indeed, its Grenoble office was acquired from Xerox. Unlike Inxight, TEMIS doesn’t seriously pursue OEM business, but a couple of exceptions have occurred (Eric mentioned Convera and Documentum). Read more
Categories: Business Objects and Inxight, IBM and UIMA, TEMIS, Text mining | 2 Comments |
Orlowski is back to his old tricks
Andrew Orlowski thinks he’s figured out the Apple/Google/Oracle partnership. But he has it all wrong.
Categories: Enterprise search, Google, Humor, Search engines | Leave a Comment |
So THAT’S why Andrew Orlowski still has a job (Part 2)
Andrew Orlowski is an over-the-top jerk, and a pretty sloppy reporter and analyst to boot. But he occasionally makes a good point even so. In the most recent instance, he confronted Tim Berners-Lee. As the article makes clear, Berners-Lee reacted badly to Orlowski, reflecting an attitude that is probably shared by 99% of the people who encounter the guy, and in the future will probably be adopted by sentient computers as well. Even so, Orlowski’s underlying point is valid: If the Semantic Web is going to be any more spam-free than the current Web, nobody has adequately explained why.
Categories: Ontologies, Spam and antispam | 2 Comments |
Uncyclopedia
If you haven’t seen it yet, Uncyclopedia is an occasionally hilarious parody of Wikipedia. Definitely worth checking out.
Categories: Humor, Social software and online media | Leave a Comment |
Clarabridge takes on Attensity
Text mining newbie Clarabridge gave me the all-too-customary “Please let us brief you, but then don’t write about it for a while” routine. Now that it’s OK to post, what I’m up for offering is a few salient points in bullet form.
- The closest analogy to what Clarabridge does is Attensity’s new(ish) strategy – extract “facts” from documents and dump them into a relational database management system. In particular, Clarabridge and Attensity alike make the case “Our categorization is more flexible because it’s applied only after the extraction happens.”
- Clarabridge’s sweet spot is extracting user opinions from short documents. E.g., the customer use cases they talk about are customer feedback forms, public blog postings, etc. about A. hotels and B. consumer software products.
- Clarabridge has a strong business intelligence mentality, describing the product as “ETL for unstructured data.” But then, it’s spun out of a BI consultancy that itself was founded by Microstrategy veterans.
- Clarabridge uses a different database schema than Attensity. Attensity’s fact-relationship network (FRN) is basically just two thin, long tables. Clarabridge, however, uses a Microstrategy-like star schema, in which different kinds of things that you can tokenize correspond to different dimensions.
Frankly, if somebody wants an alternative to the Attensity/Teradata/Business Objects partnership they could do worse than talk with Clarabridge.
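To make the schema contrast in the last bullet a bit more concrete, here is a rough sketch in Python with SQLite. It is only an illustration under loud assumptions: neither vendor’s actual schema is described above, so every table and column name (facts, relationships, dim_product, dim_sentiment, mention_fact) is hypothetical, as is the toy hotel data. The point is just the shape: two thin, long tables on one side, a fact table surrounded by dimensions on the other.

```python
import sqlite3


def build_demo_db():
    """Load the same toy opinions into both layouts (all names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE facts (
            fact_id INTEGER PRIMARY KEY, doc_id INTEGER,
            fact_type TEXT, value TEXT
        );
        CREATE TABLE relationships (
            from_fact INTEGER, to_fact INTEGER, rel_type TEXT
        );
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_sentiment (sentiment_id INTEGER PRIMARY KEY, label TEXT);
        CREATE TABLE mention_fact (
            doc_id INTEGER,
            product_id INTEGER REFERENCES dim_product(product_id),
            sentiment_id INTEGER REFERENCES dim_sentiment(sentiment_id)
        );
        """
    )
    opinions = [
        (1, "Hotel A", "negative"),
        (2, "Hotel A", "positive"),
        (3, "Hotel B", "negative"),
    ]
    products, sentiments = {}, {}
    for doc_id, product, sentiment in opinions:
        # Thin-long layout: one row per extracted item, plus one linking row.
        p_id = conn.execute(
            "INSERT INTO facts (doc_id, fact_type, value) VALUES (?, 'product', ?)",
            (doc_id, product),
        ).lastrowid
        s_id = conn.execute(
            "INSERT INTO facts (doc_id, fact_type, value) VALUES (?, 'sentiment', ?)",
            (doc_id, sentiment),
        ).lastrowid
        conn.execute(
            "INSERT INTO relationships VALUES (?, ?, 'has_opinion')", (p_id, s_id)
        )
        # Star layout: each kind of extracted thing gets its own dimension table.
        if product not in products:
            products[product] = conn.execute(
                "INSERT INTO dim_product (name) VALUES (?)", (product,)
            ).lastrowid
        if sentiment not in sentiments:
            sentiments[sentiment] = conn.execute(
                "INSERT INTO dim_sentiment (label) VALUES (?)", (sentiment,)
            ).lastrowid
        conn.execute(
            "INSERT INTO mention_fact VALUES (?, ?, ?)",
            (doc_id, products[product], sentiments[sentiment]),
        )
    return conn


def negatives_thin_long(conn):
    """Negative mentions per product: self-join the facts table via relationships."""
    return conn.execute(
        """
        SELECT p.value, COUNT(*) FROM relationships r
        JOIN facts p ON p.fact_id = r.from_fact AND p.fact_type = 'product'
        JOIN facts s ON s.fact_id = r.to_fact AND s.fact_type = 'sentiment'
        WHERE s.value = 'negative' GROUP BY p.value ORDER BY p.value
        """
    ).fetchall()


def negatives_star(conn):
    """Same question, asked BI-style: fact table joined to its dimensions."""
    return conn.execute(
        """
        SELECT d.name, COUNT(*) FROM mention_fact f
        JOIN dim_product d ON d.product_id = f.product_id
        JOIN dim_sentiment s ON s.sentiment_id = f.sentiment_id
        WHERE s.label = 'negative' GROUP BY d.name ORDER BY d.name
        """
    ).fetchall()
```

Both queries answer the same question. The star version reads like an ordinary BI query against named dimensions, while the thin-long version leaves the meaning of each row to the fact_type and rel_type values, which is roughly the tradeoff the two approaches seem to be making.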