The comprehensive guide to upgrading – or replacing – Twitter
Twitter is a rather new communications service, wildly popular in the technology blogging and podcasting communities. There are close to a million registered accounts, but I’d guess the active users number in the low to mid five figures. Even at that low usage, Twitter is overloaded, plagued with outages and data loss.
Scaling Twitter is a huge challenge. Doing so will involve changing just about every aspect of what Twitter is. A number of commentators have suggested lesser fixes, but none that I’ve seen is apt to work. (Generally, they forget that UI options will need to change as usage grows.) However, I think I’ve come up with an approach that would indeed work, for:
- Arbitrarily high levels of public Twitter use.
- Twitter integration with other communication tools such as instant messaging or IRC-style chat.
- Enterprise or integrated personal/enterprise use of Twitter.
The sections below cover:
- Future metadata needed by Twitter “tweets” (i.e., posts)
- Filtering enhancements Twitter will need as usage scales (and could greatly use already today)
- Present and future Twitter use cases
- Twitter CEP and database architecture (almost everybody else I see writing about Twitter gets this wrong)
- Enterprise Twitter
- Twitter’s competitive vulnerabilities
If you’re not familiar with Twitter, you probably should be. Crunchbase gives a decent overview, and the link above is a live look.
Twitter posts need more metadata
Twitter’s limit of 140 characters per message is cute, and maybe even sustainable for actual text. But that doesn’t allow for much metadata, @ replies and # tags notwithstanding. And the reliance on TinyURL is a kludge. The minimum metadata Twitter posts (aka tweets) need going forward is:
- URL being linked to
- Target individual(s)
- Target group(s), selectable by readers and writers alike
- Level of urgency
- Level of protection (e.g., totally open, target group only, friends only, etc.)
- Subject tags (this could be combined with group tags as a temporary hack)
- What’s already there (date/time, author, etc.)
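To make the list above concrete, here is a minimal sketch of what such a tweet record might look like. Every field name, the `Protection` enum, and the integer urgency scale are my own illustrative assumptions, not anything Twitter actually exposes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Protection(Enum):
    """Hypothetical protection levels, taken from the examples in the list."""
    OPEN = "open"
    TARGET_GROUP_ONLY = "target_group_only"
    FRIENDS_ONLY = "friends_only"


@dataclass(frozen=True)
class Tweet:
    # What's already there today.
    author: str
    text: str
    created_at: datetime
    # Proposed metadata, carried outside the 140-character text.
    url: Optional[str] = None                 # URL being linked to
    target_users: frozenset = frozenset()     # target individual(s)
    target_groups: frozenset = frozenset()    # target group(s)
    urgency: int = 0                          # assumed scale: higher is more urgent
    protection: Protection = Protection.OPEN  # level of protection
    tags: frozenset = frozenset()             # subject tags

    def __post_init__(self):
        # The limit applies to the text only, not to the metadata.
        if len(self.text) > 140:
            raise ValueError("text exceeds 140 characters")


example = Tweet(
    author="alice",
    text="Menu for tonight's meetup",
    created_at=datetime(2008, 3, 22, 18, 0, tzinfo=timezone.utc),
    url="http://example.com/menu",
    target_groups=frozenset({"meetup"}),
    tags=frozenset({"food"}),
)
```

Keeping the URL in its own field is what would remove the need for a URL shortener: the link no longer competes with the text for the 140 characters.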
Twitter needs many more tweet filters
Even today, Twitter writers and readers would benefit from more ability to filter tweets. If the number of users went up 10X or 100X, better filtering would become an absolute need. Even absent such growth, if users join who are less technosocial than the early adopters – or if current users tire of the distraction Twitter now causes — filtering will be a need for them too.
Examples of filters that I think Twitter should develop or support include:
- User groups (both ways — targets selected by the writer or authors selected by the reader)
- Subject (whether by explicit tag or content analysis)
- Taboo words (foul language, and perhaps a lot more than that for enterprise use)
User-group filters are crucial, because the current model of listening to a whole “stream” doesn’t scale. Right now, Twitterers only fit into two groups – those you listen to and those you don’t. But as usage grows, we’ll need to be able to deploy filters such as:
- The group I’m discussing tonight’s meetup with, archived back through the six hours I’ve been traveling.
- My usual high-priority groups, because otherwise I’m too busy to tweet today.
- Business-oriented groups only, plus my immediate family, because I’m at work.
- Fellow political enthusiasts, because there’s a big primary election tonight.
- NOT sports, because March Madness has started and I don’t really care about college hoops.
The need goes even further than that. Already today, some people tweet publicly that they want to read Dave Winer’s views on technology but not on politics, or Robert Scoble’s actual tweets but not his automated notifications of podcasts. What’s more, we may prefer different filter sets for real-time streams on our phones, real-time streams on our PCs, and occasional archival lookups.
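One way to picture filters like these is as small predicates that can be combined freely. The sketch below is only an illustration under that assumption; the `Tweet` shape, the helper names, and the sample groups and tags are all hypothetical.

```python
from typing import Callable, NamedTuple


class Tweet(NamedTuple):
    author: str
    text: str
    groups: frozenset   # groups the writer targeted
    tags: frozenset     # subject tags


Filter = Callable[[Tweet], bool]


def by_authors(authors) -> Filter:
    allowed = frozenset(authors)
    return lambda t: t.author in allowed


def in_groups(groups) -> Filter:
    wanted = frozenset(groups)
    return lambda t: bool(t.groups & wanted)


def has_tag(tag) -> Filter:
    return lambda t: tag in t.tags


def negate(f: Filter) -> Filter:
    return lambda t: not f(t)


def all_of(*filters: Filter) -> Filter:
    return lambda t: all(f(t) for f in filters)


def any_of(*filters: Filter) -> Filter:
    return lambda t: any(f(t) for f in filters)


def select(f: Filter, stream):
    """Return the tweets in `stream` that pass filter `f`."""
    return [t for t in stream if f(t)]


# "Business-oriented groups only, plus my immediate family", and NOT sports.
at_work = all_of(
    any_of(in_groups({"business"}), by_authors({"mom"})),
    negate(has_tag("sports")),
)

stream = [
    Tweet("boss", "Budget review at 3", frozenset({"business"}), frozenset({"meeting"})),
    Tweet("mom", "Dinner Sunday?", frozenset({"family"}), frozenset()),
    Tweet("coworker", "Great game last night", frozenset({"business"}), frozenset({"sports"})),
    Tweet("stranger", "hello world", frozenset(), frozenset()),
]
visible = [t.author for t in select(at_work, stream)]
```

Because each filter is just a function, a user could keep one combination for the phone, another for the PC, and a third for archival lookups.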
Twitter needs to expand its use cases
Right now, there are two main ways to use Twitter – like high-tech CB radio, broadcasting to all who listen, or in “private update” mode, communicating only with your friends. As I’ve suggested above, there needs to be a lot more variety than that, with user groups and subjects freely filtered in and out. If that functionality is added, Twitter could have a number of major uses, including:
- General socializing (arguably one of Twitter’s two core uses today)
- General issues discussion (arguably the other one)
- General advice (Twitter is a great way to get immediate tech help)
- Meeting planning (another major Twitter use)
- General workgroup collaboration
- Narrowcast news dissemination – local snow days, daily enterprise news, breaking fantasy sports alerts, and many more
In addition, Twitter should be integrated with instant messaging. Right now, many people use Twitter through AIM or GoogleTalk. The tighter that integration gets, the better. Seamless switching between mass Twitter and reciprocal IM would be a nice improvement. (Just remember not to broadcast intimate love notes to your entire Twitter following.)
And it’s not just IM integration. For example, a group of Twitterites tweeting just at each other would be a whole lot like an IRC or AOL chat room, if filtering functionality worked that way. However – and this is a big advantage – it would be easy to be “in” multiple rooms at once.
Twitter needs a different architecture (CEP/database)
The essence of Twitter is accepting and distributing messages in real time. As I’ve already pointed out, this should be done via complex event/stream processing (CEP), not by writing everything first to a database. The need for much more complex filters just makes the case for CEP overwhelming. Of course, there also has to be a persistent message store, but database writes should happen only after real-time needs have been met.
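The ordering argued for here (deliver first, persist afterward) can be shown with a toy dispatcher. This is a sketch of the ordering only, not a CEP engine; the `Dispatcher` class, its method names, and the plain list standing in for the database are all assumptions made for illustration.

```python
import queue
import threading


class Dispatcher:
    def __init__(self, store):
        self._subscribers = {}          # user -> (predicate, inbox)
        self._pending = queue.Queue()   # messages waiting to be persisted
        self._store = store             # stand-in for the persistent message store
        writer = threading.Thread(target=self._persist_loop, daemon=True)
        writer.start()

    def subscribe(self, user, predicate):
        """Register a per-user filter and return that user's in-memory inbox."""
        inbox = []
        self._subscribers[user] = (predicate, inbox)
        return inbox

    def publish(self, message):
        # 1. Real-time path: filter and deliver entirely in memory.
        for predicate, inbox in self._subscribers.values():
            if predicate(message):
                inbox.append(message)
        # 2. Only then hand the message to the background writer.
        self._pending.put(message)

    def _persist_loop(self):
        while True:
            message = self._pending.get()
            self._store.append(message)  # a real system would write to a database here
            self._pending.task_done()

    def wait_until_persisted(self):
        """Block until every published message has reached the store."""
        self._pending.join()


store = []
dispatcher = Dispatcher(store)
bob_inbox = dispatcher.subscribe("bob", lambda m: m["author"] == "alice")
dispatcher.publish({"author": "alice", "text": "hi"})
dispatcher.publish({"author": "carol", "text": "yo"})
dispatcher.wait_until_persisted()
```

The point of the queue is that a slow store delays only the writer thread; `publish` returns as soon as the in-memory delivery is done.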
This could scale nicely. Suppose there were 1,000,000 users online in any given hour. Suppose for each of those users the system maintained a cache of 500 16-byte message IDs. We’d only be talking about 8 gigabytes of RAM for that portion, no matter how many followers the most popular Twitterers each have.
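The arithmetic behind that figure, spelled out with the numbers from the paragraph above (the variable names are mine):

```python
users_online = 1_000_000   # users online in a given hour
ids_per_user = 500         # cached message IDs per user
bytes_per_id = 16          # size of one message ID

total_bytes = users_online * ids_per_user * bytes_per_id
total_gb = total_bytes / 10**9    # decimal gigabytes
total_gib = total_bytes / 2**30   # binary gibibytes, about 7.45

print(f"{total_bytes:,} bytes = {total_gb:.1f} GB")
```

The number of followers never enters the calculation, because the cache holds a fixed number of IDs per reader rather than a copy of each tweet per follower.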
So far, I’ve dodged the question of whether
- Each user would get a personal representation of her full Twitter stream on disk, or
- Her Twitter stream would be recreated by a full database query each time she logged on or drilled back in her archives.
What I suggest is a hybrid. When a user is online, whichever tweets she sees should eventually be persisted out to disk, in batches (at least their message IDs). When she first signs on (assuming she’s a frequent user), there should be a cache of tweets waiting for her in memory. But if she ever wants to do an archival search beyond those two groups of tweets, a slowish database lookup will have to do. That said, if it turns out to be a useful performance speedup hack to persist complete Twitter streams for the most active users, I won’t be at all astonished.
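Here is a rough sketch of that hybrid, with three tiers: an in-memory cache of recent message IDs, a per-user log written in batches, and a slow archival lookup as the fallback. The `UserStream` class, its parameters, and the list standing in for disk are illustrative assumptions.

```python
from collections import deque


class UserStream:
    def __init__(self, cache_size=500, batch_size=100):
        self.cache = deque(maxlen=cache_size)  # most recent message IDs, in RAM
        self.disk = []                         # stand-in for the per-user on-disk log
        self._unsaved = []                     # seen but not yet persisted
        self._batch_size = batch_size

    def deliver(self, message_id):
        """Record a tweet the user has seen; persist in batches, not one by one."""
        self.cache.append(message_id)
        self._unsaved.append(message_id)
        if len(self._unsaved) >= self._batch_size:
            self.flush()

    def flush(self):
        self.disk.extend(self._unsaved)
        self._unsaved.clear()

    def read(self, count, archive_lookup):
        """Return the last `count` message IDs from the cheapest tier that has them."""
        if count <= len(self.cache):
            return list(self.cache)[-count:]
        self.flush()
        if count <= len(self.disk):
            return self.disk[-count:]
        return archive_lookup(count)           # slowish database query


def slow_archive_lookup(count):
    # Placeholder for a full database query; returns a marker for the demo.
    return ["from-archive"]


stream = UserStream(cache_size=3, batch_size=2)
for message_id in range(1, 6):
    stream.deliver(message_id)

recent = stream.read(2, slow_archive_lookup)    # served from memory
deeper = stream.read(4, slow_archive_lookup)    # served from the batched log
deepest = stream.read(10, slow_archive_lookup)  # falls through to the archive
```

Persisting complete streams for the most active users would amount to raising `cache_size` or keeping a longer on-disk log for those users only.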
Sometimes there would indeed be a complex query to fetch all or part of somebody’s Twitter stream. It would start with a set of rules that generated a list of tweet authors, perhaps executed against a persistent list of all the authors that user ever follows (or against some other kind of cache). Then it would look for all messages, in an appropriate time period (key point for performance optimizations), on the desired subjects. And last it would apply any negative filters (e.g., strong language). But if this were done against a real data warehouse DBMS, I don’t see why it would be a terribly big deal at all.
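The steps of such a query can be sketched against SQLite, which stands in here for a real data warehouse DBMS. The schema, table names, and sample rows are all invented for illustration, and the taboo-word check is a crude substring match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (reader TEXT, author TEXT);
    CREATE TABLE taboo_words (word TEXT);
    CREATE TABLE tweets (
        id INTEGER PRIMARY KEY, author TEXT, posted_at TEXT,
        subject TEXT, body TEXT);
    CREATE INDEX tweets_author_time ON tweets (author, posted_at);
""")
conn.executemany("INSERT INTO follows VALUES (?, ?)",
                 [("curt", "dave"), ("curt", "robert")])
conn.execute("INSERT INTO taboo_words VALUES ('damn')")
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?, ?)", [
    (1, "dave", "2008-03-22T10:00", "technology", "New RSS idea"),
    (2, "dave", "2008-03-22T11:00", "politics", "Primary tonight"),
    (3, "robert", "2008-03-22T12:00", "technology", "This damn outage"),
    (4, "eve", "2008-03-22T12:30", "technology", "Hello"),
    (5, "dave", "2008-03-20T09:00", "technology", "Old post"),
])


def fetch_stream(conn, reader, start, end, subjects):
    """Rebuild part of a reader's stream: authors, time window, subjects, taboo words."""
    marks = ", ".join("?" for _ in subjects)
    sql = f"""
        SELECT t.id, t.author, t.body
        FROM tweets AS t
        JOIN follows AS f ON f.author = t.author      -- 1. list of authors
        WHERE f.reader = ?
          AND t.posted_at >= ? AND t.posted_at < ?    -- 2. time period
          AND t.subject IN ({marks})                  -- 3. desired subjects
          AND NOT EXISTS (                            -- 4. negative filters
              SELECT 1 FROM taboo_words AS w
              WHERE instr(lower(t.body), w.word) > 0)
        ORDER BY t.posted_at
    """
    return conn.execute(sql, [reader, start, end, *subjects]).fetchall()


result = fetch_stream(conn, "curt", "2008-03-22T00:00", "2008-03-23T00:00",
                      ["technology"])
```

The index on `(author, posted_at)` reflects the point about time periods: restricting the window is what keeps the scan small.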
Twitter needs an enterprise version
I think Twitter could be a valuable enterprise tool. In particular, much of what email is used for would work better on a sufficiently spruced-up Twitter, namely quick notifications, often with an associated URL. (There should be fewer emails with file attachments in the world anyway, as those should be replaced by URLs. This is especially true at enterprises where good downloading connectivity can be assumed.)
Obviously, enterprise Twitter would need better archiving and integrations than the public version. I think it would actually need better filtering too. On the other hand, scalability would be much less of a challenge.
Voila! We have a monetization model for Twitter. However, we also have a huge reason for Microsoft to competitively blow Twitter out of the water. Make that “another huge reason” — the first one lies in the potential for Twitter to be a major enhancement to IM.
Twitter is very vulnerable to competitors
As popular as Twitter is, it doesn’t have a lot of built-in loyalty. Tweets are ephemeral; walking away from one’s archive of them would not be a terrible loss. Rebuilding the network of people one follows is a bit of a pain, but we’ve all done that multiple times before. And a new improved version could build a user base quickly by being more proactive about invites than Twitter is.
Above all, there’s rampant dissatisfaction with Twitter’s system robustness. As I’ve noted above, there’s also a lot of room for feature improvement.
Twitter is very vulnerable to being blown away.
Comments
15 Responses to “The comprehensive guide to upgrading – or replacing – Twitter”
Well, I think it would be a good idea to build on XMPP and its pubsub extensions, but I don’t think that will happen — there’s not enough wheel reinventing to be done there.
Sergey,
I think scaling discussions have to start with functionality and data query infrastructure.
I don’t think those problems are apt to be solved by any kind of global message bus in which every message whizzes by and you only pick out the ones you want. Rather, I think there has to be a central server (suitably replicated, etc., but logically central) that only sends you the messages you actually asked for, or at most a SMALL superset of those.
CAM
I think you’ve turned Twitter into a whole other application, into a blog almost. 140 characters isn’t supposed to be “cute,” it’s the whole freakin paradigm!
My suggestion is that if you want those features, try WordPress. Twitter is something else. Applications can’t be everything to everyone. Ask Microsoft.
Community-oriented software often doesn’t scale in pleasant usability as usage grows. I’m trying to figure out how Twitter can be an exception.
This is related to the questions about how to make it scale technically, of course.
[…] long discussion Saturday of how to evolve (or replace) Twitter included a short discussion of what might be called Enterprise Twitter. Dennis Howlett just alerted […]
Curt, maybe it’s my misunderstanding, but XMPP is more about routing messages than a global bus. With Publish-Subscribe Extensions it already has the core functionality of Twitter. It’s also proven to be scalable. One can pretty much reimplement Twitter on top of existing infrastructure without any problem. I have to admit that I don’t use Twitter myself, so maybe I’m missing some feature that doesn’t project itself too well onto existing Jabber network?
Sergey,
If you send things to a group of 30 people, or receive them from a group of 50, and you change who those groups are from hour to hour or message to message, where is that filtering going to be enforced? The UI can work on your Blackberry or iPhone, but will the logic work there too? I don’t think so; you have to go to a server. Could that server be your personal choice of “parent” server for your clients? I guess so. I also must confess that there’s an inevitable element of distribution once enterprises start messaging behind their firewalls, yet wanting to connect to the outside world, and I haven’t really thought that aspect through.
But I still think the idea of writing messages to disk before sending them onward is just braindead. Send them first, THEN persist them quickly, but without allowing that persistence to be a bottleneck. XMPP doesn’t obviate the need for CEP, any more than CEP would replace XMPP.
CAM
[…] mars 2008 Twitter commonly has the problem of duplicate tweets. That is, if you post a message, it shows up twice. […]
[…] Twitter commonly has the problem of duplicate tweets. That is, if you post a message, it shows up twice. After a little while, the dupe disappears, but if you delete the dupe manually, the original is gone too. […]
[…] Twitter’s case, a mass-successful form will necessarily look utterly different from what exists today. Techie early-adopters are not going to recruit a critical mass of users into a system that […]
[…] advocated recently for increased use both of simple instant messaging and filtered microblogging. The main reason I like short text messages so much is that, at least in theory, they improve on […]
[…] Text Technologies has an even more comprehensive guide of changes that they would like to see from Twitter which include, […]
[…] I’ve been suggesting all along that Twitter needs radical user experience enhancements. But when has Google ever made user experience enhancements to a service? Its core search […]
[…] needs to be integrated with other forms of communication. What’s more, Twitter’s functionality needs to be drastically extended. Google Wave is the best hope I know of to meet those needs. Enterprise Twitter is just a special […]
[…] early experiments with Twitter in the Enterprise and reflects on some uses, and Curt Monash shares what improvements he would make and how Twitter could be useful in the Enterprise. His follow-up Enterprise Twitter includes a good index of other posts until […]