
Scaling Twitter for the masses (from a technical perspective)

Building applications that scale is the key to SaaS and cloud computing. Twitter's single-threaded application needs some rethinking in order to meet the demands of its users.

Dave Rosenberg Co-founder, MuleSource
Dave Rosenberg has more than 15 years of technology and marketing experience that spans from Bell Labs to startup IPOs to open-source and cloud software companies. He is CEO and founder of Nodeable, co-founder of MuleSoft, and managing director for Hardy Way. He is an adviser to DataStax, IT Database, and Puppet Labs.

Twitter is still an early-adopter application, and if the system is already running into scale issues, it's unlikely that it will be able to keep up when mainstream adoption occurs.

Twitter appears to have a fundamental design flaw that's not easily dealt with. It was designed to be a stand-alone system functioning in a multiparty/multiprotocol world. In the current architecture, Twitter is an application, whereas it really needs to be a distributed system.


Maybe Twitter needs enterprise service bus (ESB) functionality that runs in enough distributed locations (Yahoo, Google, Amazon.com, the desktop) to ensure that messages are reliably delivered. This could be achieved in a wide variety of ways without having to maintain a massive infrastructure like the carriers do for SMS. It would also reinforce pervasiveness and adoption.
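To make the "enough distributed locations" idea concrete, here is a minimal sketch of failover delivery, written in Python purely for illustration. The `Relay` class, the `deliver_reliably` function, and the relay names are all hypothetical; nothing here reflects how Twitter or any real ESB product works. It only shows the basic shape: hand the message to the first location that accepts it, and move on when one is down.

```python
class Relay:
    """Hypothetical relay location; `available` simulates an outage."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.delivered = []

    def deliver(self, message):
        if not self.available:
            raise ConnectionError(f"{self.name} is unreachable")
        self.delivered.append(message)


def deliver_reliably(message, relays):
    """Try each relay in order; return the name of the first that accepts."""
    errors = []
    for relay in relays:
        try:
            relay.deliver(message)
            return relay.name
        except ConnectionError as exc:
            errors.append(str(exc))  # remember the failure, try the next relay
    raise RuntimeError("all relays failed: " + "; ".join(errors))


# Assumed setup: the first location is down, the other two are up.
relays = [Relay("yahoo", available=False), Relay("google"), Relay("amazon")]
accepted_by = deliver_reliably("hello world", relays)
```

In this run the message is accepted by the second relay and the third is never contacted. A real bus would also need acknowledgments, retries, and deduplication, which this sketch leaves out.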

The fact that Twitter is based on Ruby on Rails is probably only part of the real issue, though Ruby does require a fair amount of tweaking to run reliably. Scale issues are less likely to happen with PHP or Java, but Ruby apps are generally easier to build.

I came up with a few analogous systems that might help to explain some of the technical ways Twitter-scale could be achieved:

1. File-sharing systems like Limewire and BitTorrent that store pieces across a wide variety of machines
2. Distributed systems such as DNS (Marc Canter nailed that one)
3. Something like XMPP that has presence without a physical location definition
4. MOM (message-oriented middleware), which takes a message and does something with it
5. MQ Series (message queueing), which essentially moves a message from one place to another
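The message-queueing items in the list above can be sketched in a few lines of Python. This is an assumption-laden toy, not a description of Twitter's design or of MQ Series: `TinyBroker` and its methods are made-up names, and everything runs in one process. The point is only that accepting a message (cheap) is separated from fanning it out to followers (expensive), so the second step can be moved to other workers or machines.

```python
from collections import defaultdict, deque


class TinyBroker:
    """Hypothetical in-memory broker: publish enqueues, drain fans out."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> set of followers
        self.queue = deque()                # pending (author, text) messages
        self.timelines = defaultdict(list)  # follower -> received messages

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def publish(self, author, text):
        # Accepting a message is cheap: enqueue it and return immediately.
        self.queue.append((author, text))

    def drain(self):
        # A worker does the fan-out later; returns the number of deliveries.
        delivered = 0
        while self.queue:
            author, text = self.queue.popleft()
            for follower in sorted(self.followers[author]):
                self.timelines[follower].append((author, text))
                delivered += 1
        return delivered


broker = TinyBroker()
broker.follow("alice", "dave")
broker.follow("bob", "dave")
broker.publish("dave", "scaling is hard")
count = broker.drain()
```

Here one published message results in two deliveries, one per follower, and the queue is empty afterward. In a distributed version the queue would be durable and `drain` would run on many workers at once.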

Currently, all of these ideas sort of work, and all of them also fall apart because of either cost or the interests of those involved. For the user, Twitter has become a utility. For Twitter the company, it's their business, and we are all just messing with it.

The best-case scenario is that the company figures out how to scale the application and maintains control before someone else figures it all out.

As a side note, I finally joined Twitter a few weeks ago and I can't get into it. I blame Sarah Lacy--she's been trying to tell me it's cool, and I am just not that hip or obsessed with what other people are doing. On the other hand, I can see how social people are into it and why there is all this hubbub.

Twitter guys--I don't know you but am happy to help. I deal with large-scale distributed systems all day long.

Links:
Twitter Can Be Liberated - Here's How
How to build the Open Mesh