jm + scalability   19

Scale Something: How Draw Something rode its rocket ship of growth
Membase is the surprise answer. In general it sounds like they had a pretty crazy time -- rebuilding the plane in flight even more than usual. "This had us on our toes and working 24 hours a day. I think at one point we were up for around 60-plus hours straight, never leaving the computer. We had to scale out web servers using DNS load balancing, we had to get multiple HAProxies, break tables off MySQL to their own databases, transparently shard tables, and more. This was all being done on demand, live, and usually in the middle of the night. We were very lucky that most of our layers were scalable with little or no major modifications needed. Helping us along the way was our very detailed custom server monitoring tools which allowed us to keep a very close eye on load, memory, and even provided real time usage stats on the game which helped with capacity planning. We eventually ended up with easy to launch 'clusters' of our app that included NGINX, HAProxy, and Goliath servers all of which independent of everything else and when launched, increased our capacity by a constant. At this point our drawings per second were in the thousands, and traffic that looked huge a week ago was just a small bump on the current graphs."
scale  scalability  draw-something  games  haproxy  mysql  membase  couchbase 
5 weeks ago by jm
Storage Infrastructure Behind Facebook Messages
HBase and Haystack; all data LZO-compressed; very interesting approach to testing -- they 'shadow the real production workload into the test cluster to test before going into production'. This catches a 'high percentage' of issues before production. nice
testing  shadowing  haystack  hbase  facebook  scalability  lzo  messaging  sms  via:james-hamilton 
october 2011 by jm
Storm
'The past decade has seen a revolution in data processing. MapReduce, Hadoop, and related technologies have made it possible to store and process data at scales previously unthinkable. Unfortunately, these data processing technologies are not realtime systems, nor are they meant to be. There's no hack that will turn Hadoop into a realtime system; realtime data processing has a fundamentally different set of requirements than batch processing.

However, realtime data processing at massive scale is becoming more and more of a requirement for businesses. The lack of a "Hadoop of realtime" has become the biggest hole in the data processing ecosystem. Storm fills that hole.'
data  scaling  twitter  realtime  scalability  storm  queueing 
september 2011 by jm
good taxonomy of memcached use cases
via Jeff Barr's announcement of the ElastiCache launch. from 2008, but a better taxonomy than I've seen elsewhere
memcached  caching  mysql  performance  scalability  via:jeffbarr 
august 2011 by jm
The Secrets of Building Realtime Big Data Systems
great slides, via HN. recommends a canonical Hadoop long-term store and a quick, realtime, separate datastore for "not yet processed by Hadoop" data
hadoop  big-data  data  scalability  datamining  realtime  slides  presentations 
may 2011 by jm
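A rough sketch of the split those slides recommend, as I read it: queries combine a precomputed view from the long-term batch store with a small view over data the batch job has not reached yet. Everything here is an assumption for illustration (the function names, the `page` field, plain in-memory lists standing in for Hadoop and the realtime datastore); it is not code from the slides.

```python
from collections import Counter


def batch_view(processed_events):
    """Counts over events the batch job has already processed.

    Stand-in for a view precomputed from the long-term (Hadoop) store.
    """
    return Counter(event["page"] for event in processed_events)


def realtime_view(recent_events):
    """Counts over events that arrived after the last batch run.

    Stand-in for the separate, quick datastore holding
    'not yet processed by Hadoop' data.
    """
    return Counter(event["page"] for event in recent_events)


def query(page, batch, realtime):
    """Answer a query by adding the batch count and the realtime count."""
    return batch[page] + realtime[page]


# Hypothetical data: three events already batch-processed, one still pending.
processed = [{"page": "/a"}, {"page": "/a"}, {"page": "/b"}]
recent = [{"page": "/a"}]

batch = batch_view(processed)
realtime = realtime_view(recent)
total_for_a = query("/a", batch, realtime)
```

Once a batch run absorbs the recent events, the realtime view can be emptied, so it only ever needs to hold a short window of data.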
Facebook's New Realtime Analytics System: HBase to Process 20 Billion Events Per Day
Scribe logs events, "ptail" (parallel tail presumably) tails logs from Scribe stores, Puma batch-aggregates, writes to HBase.  Java and Thrift on the backend, PHP in front
facebook  hbase  scalability  performance  hadoop  scribe  events  analytics  architecture  tail  append  from delicious
march 2011 by jm
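To make the tail, batch-aggregate, write shape of that pipeline concrete, here is a minimal sketch. It assumes one event name per log line and uses a plain dict in place of HBase; `aggregate_in_batches` and `write_batches` are hypothetical names, not anything from Scribe, ptail, or Puma.

```python
from collections import Counter


def aggregate_in_batches(lines, batch_size):
    """Yield one Counter of event counts per batch of log lines.

    `lines` is any iterable of strings, e.g. the output of a log tailer.
    Assumes each line holds a single event name.
    """
    batch = Counter()
    seen = 0
    for line in lines:
        batch[line.strip()] += 1
        seen += 1
        if seen == batch_size:
            yield batch
            batch = Counter()
            seen = 0
    if seen:
        # Flush the final, partially filled batch.
        yield batch


def write_batches(batches, store):
    """Apply each aggregated batch to the store as counter increments.

    `store` is a dict standing in for the real backing store.
    """
    for batch in batches:
        for key, count in batch.items():
            store[key] = store.get(key, 0) + count
    return store


# Hypothetical tailed log lines.
log_lines = ["like\n", "share\n", "like\n", "like\n", "comment\n"]
batches = list(aggregate_in_batches(log_lines, batch_size=2))
store = write_batches(batches, {})
```

Aggregating before writing means the store sees one increment per distinct key per batch rather than one write per event.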
Akka
'platform for event-driven, scalable, and fault-tolerant architectures on the JVM' .. Actor-based, 'let-it-crash', Apache-licensed, Java and Scala APIs, remote Actors, transactional memory -- looks quite nice
scala  java  concurrency  scalability  apache  akka  actors  erlang  fault-tolerance  events  from delicious
march 2011 by jm
Thousands of Threads and Blocking I/O [PDF]
classic presentation from Paul Tyma of Mailinator regarding the java.nio (event-driven, non-threaded) vs java.io (threaded) model of server concurrency, backing up the scalability of threads on modern JVMs
java  async  io  jvm  linux  performance  scalability  threading  threads  server  nio  paul-tyma  mailinator  from delicious
july 2010 by jm
How do we kick our synchronous addiction?
great post on the hazards of programming in an async framework, and how damn hard it is. good comments thread too (via jzawodny)
via:jzawodny  coding  python  javascript  scalability  ruby  concurrency  erlang  async  node.js  twisted  from delicious
february 2010 by jm
What Second Life can teach your datacenter about scaling Web apps
good scaling advice from Linden Lab's Ian Wilkes (who doesn't seem to have a blog, sadly)
linden  ian-wilkes  scaling  datacenters  scalability  deployment  ops  services  from delicious
february 2010 by jm
Google employees now discouraged from using Python for new projects
'You have to balance Python's strengths with its weaknesses: your engineers may be more productive using Python, but if they have to work around more platform-level performance/scaling limitations as volume increases, do you come out ahead? etc.'
google  performance  scalability  python  unladen-swallow  languages  via:preddit  from delicious
november 2009 by jm
The technology behind Tornado, FriendFeed's web server
more on the new async HTTP server from FriendFeed/Facebook, in Python. looks lovely
async  http  epoll  python  comet  long-poll  facebook  scaling  scalability  web  friendfeed  tornado  opensource  from delicious
september 2009 by jm
Tornado Web Server
'an open source version of the scalable, non-blocking web server and tools that power FriendFeed. The FriendFeed application is written using a web framework that looks a bit like web.py or Google's webapp, but with additional tools and optimizations to take advantage of the underlying non-blocking (epoll) infrastructure.'
epoll  open-source  python  http  scalability  facebook  scaling  web  from delicious
september 2009 by jm
