April 25, 2008

Drive-by Google de-listing

As previously noted, we got hit with some hidden text, probably by SQL injection, and that led to a Google de-listing. Of the three blogs affected by the attack, I got a de-indexing notice for one (DBMS2); another was de-listed without a notice (Text Technologies); and a third seems to have waltzed through still indexed (Software Memories). I also received a de-indexing notice for another site I have nothing to do with and indeed had never heard of before. Go figure …

We’ve now upgraded to WordPress 2.5, which should close the vulnerability. (Thank you Melissa Bradshaw!) Fearing our old, buggy theme would degrade further, we upgraded to a new one, Biru, designed by Bob. There are some teething-pain stability issues, but if they don’t cause a reversion in the next day, I’ll apply to Google for re-inclusion. (Uh, does anybody have some boundaries around how long that’s likely to take?)

All these hours of aggravation because some criminal wanted a bit of SEO advantage …

April 7, 2008

Yahoo indeed seems to want an all cash deal

The Microsoft/Yahoo negotiation is in a very public phase right now. In its latest letter, the Yahoo board makes two references to “certainty,” in one case spelling out that this encompasses “certainty of value” and “certainty of closing.”

It’s hard to imagine what the former could mean other than “Please make an all-cash offer (or, better yet, go away).” But as I previously noted, Microsoft can indeed afford to buy Yahoo entirely for cash.

The latter part is a reference to the antitrust boogeyman, obviously a non-trivial concern whenever Microsoft is involved.

Please subscribe to our feed!

March 15, 2008

MuseGlobal – ETL for text, sort of

Lynda Moulton introduced me to MuseGlobal, and specifically CEO Kate Noerr, last month. MuseGlobal sort of does ETL (Extract/Transform/Load) for text, although they prefer to call it Gather/Transform/Deliver. In any case, each of the three parts of the process is rather different for text than it is for traditional data warehousing. To wit: Read more

March 5, 2008

Google could dominate single-site search

Google has begun to introduce a feature whereby, if your search obviously leads you to a single site (e.g., you searched on a company name), you get a second search box to search only within that site. More details at Google and Search Engine Land. Basically, this is Google Site Search made a lot easier to use.

I think this could be a really big deal. Read more

March 4, 2008

Over 80 percent of blog posts are probably spam

Doug Caverly highlights a Matt Mullenweg quote indicating that about 1/4 of all the blogs ever on WordPress.com were spam (aka splogs). Now, that’s probably a higher fraction than for the blogoverse overall, because:

But there’s one more factor. Splogs have much higher posting frequency than real ones. 10-20+ posts per day is not uncommon, and 50-100+ is not unheard of. That’s 5-10X the post frequency of even the more active human-written blogs. So let’s assume:

In that case, over 80% (and indeed probably over 90%) of all blog posts are made by machines rather than by human beings.
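
The arithmetic behind an estimate like this can be sketched in a few lines of Python. The function name and the sample inputs are illustrative assumptions, not measured figures: the share of blogs that are splogs and the ratio of splog posting frequency to human posting frequency go in, and the spam share of all posts comes out.

```python
def spam_post_share(splog_fraction: float, post_ratio: float) -> float:
    """Estimate the share of all blog posts that come from splogs.

    splog_fraction: fraction of blogs that are splogs (0 to 1).
    post_ratio: posts made by a typical splog for every one post
                made by a typical human-written blog (hypothetical input).
    """
    if not 0.0 <= splog_fraction <= 1.0:
        raise ValueError("splog_fraction must be between 0 and 1")
    if post_ratio < 0:
        raise ValueError("post_ratio must be non-negative")

    # Weight each group of blogs by how often it posts.
    spam_posts = splog_fraction * post_ratio
    human_posts = (1.0 - splog_fraction) * 1.0
    total = spam_posts + human_posts
    return spam_posts / total if total else 0.0


# Hypothetical inputs: a quarter of blogs are splogs, at several posting ratios.
for ratio in (10, 20, 50):
    share = spam_post_share(0.25, ratio)
    print(f"ratio {ratio}X -> {share:.1%} of posts are spam")
```

With a quarter of blogs assumed to be splogs, a 10X posting ratio gives a spam share of about 77%, and a 20X ratio gives about 87%. The result is sensitive mainly to the posting ratio, so comparing splogs against the average human blog rather than the most active ones pushes the share up quickly.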

February 28, 2008

Code search options

Questions come up here from time to time about code search engines, a subject I have not researched. Well, here’s a quick link listing some leading code search engines, both Web (guess who?) and internal. Most interesting may be that the list is so short.

February 17, 2008

A computational linguistics filksong

The Grammar and the Sentence

Truth be told, it’s not nearly as good as God’s Programming Language, but it might be worth a few chuckles even so.

February 15, 2008

Six blind men and the Twitter elephant

I got a long email today from a Very Smart Person who asked, in effect, “What is Twitter for? I don’t get it.” Coincidentally, Rex Hammock posted a good answer yesterday, albeit with a bad title that I won’t repeat. The essence was:

… the most amazing thing about Twitter is this: everyone uses it differently.

It’s a little like trying to explain the telephone by describing what people talk about on the phone. “Telephones are devices that teenagers use to spread gossip.” “Telephones are the devices people use to contact police when bad things happen.” “Telephones are the devices you use to call the 7-11 to ask if they have Prince Albert in a can.”

Like the Internet itself, Twitter is hard to explain because it doesn’t really have a point. And it has too many points. Here’s what I mean: All it does is provide a common-place to relay short messages to a group of people who agree to receive your messages. Here’s the second part of what i mean: When you stop thinking those short messages aren’t limited to “I’m about to get on the elevator” but can be eye-witness accounts of breaking news stories or bursts of business-critical intelligence, or warnings that a gun-man is loose on campus, or shared conversations about political debates you and your friends are watching on TV, the possibilities of what can be done using Twitter becomes amazingly confusing — I think in a good way.

I’ve recently put up two posts on Twitter use cases. For yet another — well, as Shakespeare didn’t quite say, a 140 character limit is the soul of wit. Here’s my (ever-changing) list of Twitter “favorites”. The humor ranges from the sophomoric to the erudite; there are some serious aphorisms as well.

February 14, 2008

Yahoo wants to follow AOL into the dead pool

Yahoo CEO Jerry Yang has put out a shareholder letter in which he commits Yahoo to pursuing the strategies that have already devastated AOL. To wit:

This is exactly what AOL tried in the late 1990s, except that they also had the best dial-up connectivity in the world. I know; Linda and I were strategic consultants to AOL then.* And we told them that while the rest of their strategy was excellent, it would be to no avail unless their tools matched the quality of what people could get in the office or elsewhere online. Because if AOL’s technology didn’t catch and keep up, people would just laugh and go elsewhere. (Even my parents, who still use AOL mail, go outside AOL for their web surfing. AOL is getting very little revenue from them, and they’re about as captive as AOL users get.)

*Please note — AOL was a great client, but the people we dealt with are (for the most part) long gone, and our NDAs ran out years ago.

That’s brain-dead. Just consider how far technology has taken Google, how fast gaming technology advances, or how fickle internet users are about switching to the latest and greatest online services. What’s worse, Yahoo seems to mean it, given how many serious technology leader types are out on the street in connection with the recent layoffs.

Pretty much the only remaining hope for the Yahoo brand(s) and services is for the Microsoft acquisition to go through, and for Microsoft/Yahoo to unlock the deal’s huge potential synergies — which, while far from being certain, is at least realistically possible.

February 13, 2008

More Twitter use cases

Monday, I posted about four Enterprise Twitter use cases. Episteme responds that that’s all well and good, but what’s really important is that Enterprise Twitter would lead senior management to communicate in a human way with the team. I agree completely, and think this is one of the big reasons Enterprise Twitter could be an improvement over email for many uses.

That post also illustrates a use of public Twitter. Read more
