Today a friend linked me to this article in the NY Times about networks, the US presidential inauguration, and Twitter. Here’s the key quote, emphasis his/mine:
Biz Stone, co-founder of Twitter, said the company was hoping to sidestep network hiccups. He is not expecting the same traffic spikes as during the election, when the site was flooded with as many as 10 messages a second, but says the service “will nevertheless be doubling our through-put capacity before Tuesday.”
Because of all the hype Twitter gets, I couldn’t believe the figure was so low, so I checked elsewhere. “Twitter had by one measure over 3 million accounts and, by another, well over 5 million visitors in September 2008.” Simple math says there are about 2.6 million seconds in a month, so 5 million visits translates to roughly one request every half a second. Presuming each query pulls a few pictures as well as text, 10 messages a second sounds about right based on published data. Let’s further assume each message is the size of an average Twitter page; mine came in at 34,100 bytes just now, which at 10 messages a second works out to 341,000 bytes a second.
Bandwidth wise, that’s 2.728 Mbit/s, or roughly the bandwidth of 2 T1s. My home DSL line can push 700 kbit/s. With 5-6 of them bonded together, and the appropriate back end servers, I could run Twitter out of my basement.
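For anyone who wants to check the back-of-the-envelope numbers, here is a minimal Python sketch of the arithmetic. The message rate, page size and DSL upstream speed are the figures quoted above; the T1 rate of 1.544 Mbit/s is the standard line rate, and the function and constant names are made up for illustration.

```python
import math

# Figures taken from the post; treat them as rough estimates.
MESSAGES_PER_SECOND = 10
BYTES_PER_MESSAGE = 34_100        # size of one Twitter page, as measured above
DSL_BITS_PER_SECOND = 700_000     # upstream of the home DSL line

# Standard T1 line rate (not from the post).
T1_BITS_PER_SECOND = 1_544_000


def bandwidth_estimate(messages_per_second, bytes_per_message):
    """Return the raw bandwidth needed, ignoring protocol overhead."""
    bytes_per_second = messages_per_second * bytes_per_message
    bits_per_second = bytes_per_second * 8
    return {
        "bytes_per_second": bytes_per_second,
        "mbit_per_second": bits_per_second / 1_000_000,
        # Round up: you cannot lease a fraction of a line.
        "t1_lines": math.ceil(bits_per_second / T1_BITS_PER_SECOND),
        "dsl_lines": math.ceil(bits_per_second / DSL_BITS_PER_SECOND),
    }


estimate = bandwidth_estimate(MESSAGES_PER_SECOND, BYTES_PER_MESSAGE)
for name, value in estimate.items():
    print(f"{name}: {value}")
```

Under these assumptions the bare minimum is 4 DSL lines, so 5-6 leaves some headroom for overhead and bursts.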
It also isn’t very much if you compare it with other semi-synchronous messaging technologies like IRC, Jabber and IM servers, which have been capable of pushing more data per second for well over 15 years. I’m sure mainframes were doing similar amounts of data I/O 30 years ago.
The snarky nerd in me wants to smear Ruby on Rails, the technology platform on which Twitter relies, but others did that 2 years ago already (and yes, that link defends the technology, and makes the ridiculous assumption that you can’t build in scalability.) I’m convinced it’s the incorrect application of a specific technology to solve a problem for which it is ill-suited. Perhaps the Twitter infrastructure never planned to expand so greatly, but I find it laughable that we’re in 2009 and that “important” services like Twitter can’t survive a “flood” of 10 messages a second. My friend agrees: “no i’m sure facebook is laughing at the 10 messages/a second ‘flood’ too.”
I’m also quite surprised that such a “popular” site, one that gets so much hype and marketing, really doesn’t get that much use. For comparison, here are the figures for the Top 10 sites. Being generous and assuming those 5 million hits for Twitter are all unique visitors, that means the largest sites see more than 25 times the traffic it does. Facebook sees at least 10 times the number of unique visitors, and certainly will push more content, what with all the pictures and rich media it has vs. Twitter’s limited use of graphics (small avatars only). Of course, none of this even gets into what AWS/S3 and content accelerators push from a pure bandwidth standpoint.
Increasingly, I’m convinced microblogging sites are hiveminds for particular flavours of individual. Disingenuously: StumbleUpon/Digg are “OMG LOOK AT THIS LINK!” Twitter feels like “marketing marketing SEO yadda yadda bend to my will.” Plurk is “cheerleader YAY happy happy dancing banana.” BrightKite is “mommy! mommy! look at me now!” And yes, IRC is probably the Wild Wild West. Others I know have made similar comparisons between LiveJournal, mySpace, Facebook and Friendster. I’m not sure what predestines a technology for a specific sort of person, but the link is there. This might make a good research paper. ;)
wow, 10 messages/sec is a LOT lower than i was expecting.
on the other hand, from what i’ve read, i guess most of the system load is not the rate of incoming messages, but the rate of outgoing messages. when you have users that have almost 50,000 followers, that’s a lot of pushing (or a lot of people pulling).
still, you are right about IRC and such. these services really could be federated like jabber, for example, so anyone could run their own node and the load would be distributed.
and i definitely agree about the hivemind. i find that that applies to tech blogs in general, but especially so for twitter (compared to the amount of media attention and hype).
Someone said, there are programmers that can write Fortran in any language.
I guess, some programmers can write Bourne shell CGI scripts in any language/infrastructure, too.
You had me at, “snarky nerd,” but you lost me at, “2.728 Mbit/s, or roughly the bandwidth of 2 T1s”
misuse of the term “flood” ? I dunno.
RIP: Mr Hooper