jm + load-balancing

HN thread on the new Network Load Balancer AWS product
Looks like @colmmacc works on it. Lots and lots of good details here.
nlb  aws  load-balancing  ops  architecture  lbs  tcp  ip 
9 days ago by jm
consistent hashing with bounded loads
'an algorithm that combined consistent hashing with an upper limit on any one server’s load, relative to the average load of the whole pool.'

Lovely post on Vimeo's eng blog about a new variation on consistent hashing -- incorporating a concept of overload-avoidance -- adding it to HAProxy, and using it in production at Vimeo. All sounds pretty nifty! (via Toby DiPasquale)
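Roughly: walk the hash ring as usual, but skip any server that's already above c times the current average load. A toy Go sketch of that rule follows -- one hash point per server, FNV hashing, and c=1.25 are my choices, not the paper's or the HAProxy patch's:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Ring is a consistent-hash ring that also tracks per-server in-flight
// load, so Pick can enforce the "bounded load" rule.
type Ring struct {
	points  []uint32       // sorted hash points on the ring
	servers map[uint32]int // hash point -> server index
	loads   []int          // in-flight requests per server
	total   int            // total in-flight requests
	c       float64        // bound factor, e.g. 1.25
}

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func NewRing(n int, c float64) *Ring {
	r := &Ring{servers: map[uint32]int{}, loads: make([]int, n), c: c}
	for i := 0; i < n; i++ {
		p := hash32(fmt.Sprintf("server-%d", i))
		r.points = append(r.points, p)
		r.servers[p] = i
	}
	sort.Slice(r.points, func(a, b int) bool { return r.points[a] < r.points[b] })
	return r
}

// Pick walks clockwise from the key's position and returns the first server
// whose load would stay within c times the average -- the bounded-load rule.
func (r *Ring) Pick(key string) int {
	avg := float64(r.total+1) / float64(len(r.loads))
	start := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= hash32(key) })
	for i := 0; i < len(r.points); i++ {
		s := r.servers[r.points[(start+i)%len(r.points)]]
		if float64(r.loads[s]+1) <= r.c*avg {
			r.loads[s]++
			r.total++
			return s
		}
	}
	return -1 // unreachable for c >= 1: some server is always at or below average
}

// Done should be called when a request finishes, to release its load.
func (r *Ring) Done(s int) { r.loads[s]--; r.total-- }

func main() {
	r := NewRing(4, 1.25)
	for _, k := range []string{"a", "b", "c", "d", "e", "f"} {
		fmt.Println(k, "->", r.Pick(k))
	}
}
```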
via:codeslinger  algorithms  networking  performance  haproxy  consistent-hashing  load-balancing  lbs  vimeo  overload  load 
5 weeks ago by jm
Service discovery at Stripe
Writeup of their Consul-based service discovery system, a bit similar to SmartStack. Good description of the production problems that they saw with Consul too, and how they figured out that strong consistency isn't actually what you want in a service discovery system ;)

The HN comments are good too.
consul  api  microservices  service-discovery  dns  load-balancing  l7  tcp  distcomp  smartstack  stripe  cap-theorem  scalability 
november 2016 by jm
Maglev: A Fast and Reliable Software Network Load Balancer
Maglev is Google’s network load balancer. It is a large distributed software system that runs on commodity Linux servers. Unlike traditional hardware network load balancers, it does not require a specialized physical rack deployment, and its capacity can be easily adjusted by adding or removing servers. Network routers distribute packets evenly to the Maglev machines via Equal Cost Multipath (ECMP); each Maglev machine then matches the packets to their corresponding services and spreads them evenly to the service endpoints. To accommodate high and ever-increasing traffic, Maglev is specifically optimized for packet processing performance. A single Maglev machine is able to saturate a 10Gbps link with small packets. Maglev is also equipped with consistent hashing and connection tracking features, to minimize the negative impact of unexpected faults and failures on connection-oriented protocols. Maglev has been serving Google's traffic since 2008. It has sustained the rapid global growth of Google services, and it also provides network load balancing for Google Cloud Platform.

Something we argued for quite a lot in Amazon, back in the day....
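The consistent-hashing piece ("Maglev hashing") fills a fixed-size lookup table from per-backend preference permutations, so each backend claims a nearly equal share of slots and a backend change only disturbs a few entries. A toy Go version of the paper's table-population step -- the tiny table size and FNV hashes are my simplifications, the real thing uses a large prime M:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const M = 13 // lookup table size; must be prime so each skip cycles all slots

func h(s string, seed byte) uint64 {
	f := fnv.New64a()
	f.Write([]byte{seed})
	f.Write([]byte(s))
	return f.Sum64()
}

// populate builds the Maglev lookup table: backends take turns claiming
// their next preferred slot until the table is full.
func populate(backends []string) []int {
	n := len(backends)
	offset := make([]uint64, n)
	skip := make([]uint64, n)
	next := make([]uint64, n)
	for i, b := range backends {
		offset[i] = h(b, 0) % M
		skip[i] = h(b, 1)%(M-1) + 1
	}
	table := make([]int, M)
	for j := range table {
		table[j] = -1
	}
	for filled := 0; ; {
		for i := 0; i < n; i++ {
			// advance to this backend's next unclaimed preferred slot
			c := (offset[i] + next[i]*skip[i]) % M
			for table[c] != -1 {
				next[i]++
				c = (offset[i] + next[i]*skip[i]) % M
			}
			table[c] = i
			next[i]++
			filled++
			if filled == M {
				return table
			}
		}
	}
}

func main() {
	table := populate([]string{"10.0.0.1", "10.0.0.2", "10.0.0.3"})
	fmt.Println(table) // a packet's 5-tuple hash mod M indexes into this
}
```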
google  paper  scale  ecmp  load-balancing  via:conall  maglev  lbs 
february 2016 by jm
Neutrino Software Load Balancer
eBay's software LB: supports URL matching, comparable to haproxy, built using Netty and Scala. Used in their QA infrastructure, it seems.
netty  scala  ebay  load-balancing  load-balancers  url  http  architecture 
february 2016 by jm
Seesaw: scalable and robust load balancing from Google
After evaluating a number of platforms, including existing open source projects, we were unable to find one that met all of our needs and decided to set about developing a robust and scalable load balancing platform. The requirements were not exactly complex - we needed the ability to handle traffic for unicast and anycast VIPs, perform load balancing with NAT and DSR (also known as DR), and perform adequate health checks against the backends. Above all we wanted a platform that allowed for ease of management, including automated deployment of configuration changes.

One of the two existing platforms was built upon Linux LVS, which provided the necessary load balancing at the network level. This was known to work successfully and we opted to retain this for the new platform. Several design decisions were made early on in the project — the first of these was to use the Go programming language, since it provided an incredibly powerful way to implement concurrency (goroutines and channels), along with easy interprocess communication (net/rpc). The second was to implement a modular multi-process architecture. The third was to simply abort and terminate a process if we ended up in an unknown state, which would ideally allow for failover and/or self-recovery.
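For flavour, Go's net/rpc (which the post credits for easy interprocess communication) really is about this terse. The component and method names below are invented for illustration, not Seesaw's own:

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
)

// HAStatus is the sort of state components might exchange; hypothetical.
type HAStatus struct{ Healthy bool }

type Engine struct{}

// Status follows net/rpc's required shape: exported method taking
// (args, *reply) and returning error.
func (e *Engine) Status(component string, reply *HAStatus) error {
	reply.Healthy = component != "" // placeholder health logic
	return nil
}

func main() {
	if err := rpc.Register(new(Engine)); err != nil {
		log.Fatal(err)
	}
	ln, err := net.Listen("tcp", "127.0.0.1:4567")
	if err != nil {
		log.Fatal(err)
	}
	go rpc.Accept(ln)

	client, err := rpc.Dial("tcp", "127.0.0.1:4567")
	if err != nil {
		log.Fatal(err)
	}
	var st HAStatus
	// "abort and terminate if we end up in an unknown state": fail hard on
	// any RPC error rather than limping along.
	if err := client.Call("Engine.Status", "ha", &st); err != nil {
		log.Fatal(err)
	}
	fmt.Println("healthy:", st.Healthy)
}
```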
seesaw  load-balancers  google  load-balancing  vips  anycast  nat  lbs  go  ops  networking 
january 2016 by jm
ELS: latency based load balancer, part 1
ELS measures the following things:

Success latency and success rate of each machine;
Number of outstanding requests between the load balancer and each machine. These are the requests that have been sent out but we haven’t yet received a reply;
Fast failures are better than slow failures, so we also measure failure latency for each machine.

Since users care a lot about latency, we prefer machines that are expected to answer quicker. ELS therefore converts all the measured metrics into expected latency from the client’s perspective.[...]

In short, the formula ensures that slower machines get less traffic and failing machines get much less traffic. Slower and failing machines still get some traffic, because we need to be able to detect when they come back up again.
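The post doesn't give the exact formula here, so this Go sketch is just one plausible shape for it: failures cost retries, outstanding requests cost queueing delay, and traffic is weighted by inverse expected latency. All names and constants are mine, not Spotify's:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Machine holds the per-backend measurements the post lists.
type Machine struct {
	SuccessLatency float64 // seconds, e.g. an EWMA of recent successes
	FailureLatency float64 // seconds; fast failures beat slow failures
	SuccessRate    float64 // 0..1 (assumed > 0 in this sketch)
	Outstanding    int     // requests in flight to this machine
}

// expectedLatency estimates client-perceived latency: with success rate p,
// a client retries ~1/p times, and outstanding requests queue ahead of it.
func expectedLatency(m Machine) float64 {
	attempts := 1 / m.SuccessRate
	perTry := m.SuccessRate*m.SuccessLatency + (1-m.SuccessRate)*m.FailureLatency
	queue := float64(m.Outstanding) * m.SuccessLatency
	return attempts*perTry + queue
}

// pick weights each machine inversely to its expected latency, so slow or
// failing machines still get a trickle of traffic and can be seen recovering.
func pick(ms []Machine, r *rand.Rand) int {
	weights := make([]float64, len(ms))
	var sum float64
	for i, m := range ms {
		weights[i] = 1 / expectedLatency(m)
		sum += weights[i]
	}
	x := r.Float64() * sum
	for i, w := range weights {
		x -= w
		if x <= 0 {
			return i
		}
	}
	return len(ms) - 1
}

func main() {
	r := rand.New(rand.NewSource(1))
	ms := []Machine{
		{0.010, 0.002, 0.999, 2}, // fast, healthy
		{0.050, 0.002, 0.999, 2}, // slow
		{0.010, 0.100, 0.50, 2},  // failing half the time, slowly
	}
	counts := make([]int, len(ms))
	for i := 0; i < 10000; i++ {
		counts[pick(ms, r)]++
	}
	fmt.Println(counts) // most traffic to the fast healthy machine
}
```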
latency  spotify  proxies  load-balancing  els  algorithms  c3  round-robin  load-balancers  routing 
december 2015 by jm
Google Cloud Platform HTTP/HTTPS Load Balancing
GCE's LB product is pretty nice -- HTTP/2 support, and a built-in URL mapping feature (presumably based on how Google approaches that problem internally). I'm hoping AWS are taking notes for the next generation of ELB, if that ever happens.
elb  gce  google  load-balancing  http  https  spdy  http2  urls  request-routing  ops  architecture  cloud 
october 2015 by jm
fabio: fast, modern, zero-conf load balancing HTTP(S) router managed by consul; serves 15k reqs/sec, in Go, from eBay
load-balancing  consul  http  https  routing  ebay  go  open-source  fabio 
october 2015 by jm
Librato's service discovery library using Zookeeper (so strongly consistent, but with the ZK downside that an AZ outage can stall service discovery updates region-wide)
zookeeper  service-discovery  librato  java  open-source  load-balancing 
october 2015 by jm
Baker Street
client-side 'service discovery and routing system for microservices' -- another Smartstack, then
python  router  smartstack  baker-street  microservices  service-discovery  routing  load-balancing  http 
october 2015 by jm
Patrick Shuff - Building A Billion User Load Balancer - SCALE 13x - YouTube
'Want to learn how Facebook scales their load balancing infrastructure to support more than 1.3 billion users? We will be revealing the technologies and methods we use to route and balance Facebook's traffic. The Traffic team at Facebook has built several systems for managing and balancing our site traffic, including both a DNS load balancer and a software load balancer capable of handling several protocols. This talk will focus on these technologies and how they have helped improve user performance, manage capacity, and increase reliability.'

Can't find the standalone slides, unfortunately.
facebook  video  talks  lbs  load-balancing  http  https  scalability  scale  linux 
june 2015 by jm
_Blade: a Data Center Garbage Collector_
Essentially, add a central GC scheduler to improve tail latencies in a cluster, by taking instances out of the pool to perform slow GC activity instead of letting them impact live operations. I've been toying with this idea for a while; nice to see a solid paper about it.
gc  latency  tail-latencies  papers  blade  go  java  scheduling  clustering  load-balancing  low-latency  performance 
april 2015 by jm
'Join-Idle-Queue: A Novel Load Balancing Algorithm for Dynamically Scalable Web Services' [paper]
We proposed the JIQ algorithms for web server farms that are dynamically scalable. The JIQ algorithms significantly outperform the state-of-the-art SQ(d) algorithm in terms of response time at the servers, while incurring no communication overhead on the critical path. The overall complexity of JIQ is no greater than that of SQ(d).

The extension of the JIQ algorithms proves to be useful at very high load. It will be interesting to acquire a better understanding of the algorithm with a varying reporting threshold. We would also like to understand better the relationship of the reporting frequency to response times, as well as an algorithm to further reduce the complexity of the JIQ-SQ(2) algorithm while maintaining its superior performance.
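A toy Go sketch of the basic JIQ-Random variant with a single dispatcher (the paper's JIQ-SQ(d) variant and multi-dispatcher setup are elided). The key property: servers report themselves when they go idle, so no load-probing sits on the job-arrival critical path:

```go
package main

import (
	"fmt"
	"math/rand"
)

// Dispatcher holds an I-queue of servers that have reported themselves idle.
type Dispatcher struct {
	idle []int // server IDs, FIFO
}

// ReportIdle is called by a server off the critical path, when it drains.
func (d *Dispatcher) ReportIdle(server int) { d.idle = append(d.idle, server) }

// Assign pops a known-idle server if there is one, else picks at random.
func (d *Dispatcher) Assign(nServers int, r *rand.Rand) int {
	if len(d.idle) > 0 {
		s := d.idle[0]
		d.idle = d.idle[1:]
		return s
	}
	return r.Intn(nServers)
}

func main() {
	r := rand.New(rand.NewSource(1))
	d := &Dispatcher{}
	load := make([]int, 4)

	// crude simulation: one arrival and one (random) completion per tick
	for i := 0; i < 12; i++ {
		s := d.Assign(len(load), r)
		load[s]++
		done := r.Intn(len(load))
		if load[done] > 0 {
			load[done]--
			if load[done] == 0 {
				d.ReportIdle(done) // server joins the idle queue
			}
		}
	}
	fmt.Println("final load:", load)
}
```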
join-idle-queue  algorithms  scheduling  load-balancing  via:norman-maurer  jiq  microsoft  load-balancers  performance 
august 2014 by jm
Ribbon: a client-side IPC library that is battle-tested in cloud. It provides the following features:

Load balancing;
Fault tolerance;
Multiple protocol (HTTP, TCP, UDP) support in an asynchronous and reactive model;
Caching and batching.

I like the integration of Eureka and Hystrix in particular, although I would really like to read more about Eureka's approach to availability during network partitions and CAP; there's some interesting discussion of the topic elsewhere. It actually sounds like the Eureka approach is more correct than using ZK: 'Eureka is available. ZooKeeper, while tolerant against single node failures, doesn't react well to long partitioning events. For us, it's vastly more important that we maintain an available registry than a necessarily consistent registry. If us-east-1d sees 23 nodes, and us-east-1c sees 22 nodes for a little bit, that's OK with us.'

See also this experiment, which corroborates it:

I went into one of the instances and quickly did an iptables DROP on all packets coming from the other two instances. This would simulate an availability zone continuing to function, but that zone losing network connectivity to the other availability zones. What I saw was that the two other instances noticed that the first server “going away”, but they continued to function as they still saw a majority (66%). More interestingly the first instance noticed the other two servers “going away” dropping the ensemble availability to 33%. This caused the first server to stop serving requests to clients (not only writes, but also reads). [...]

To me this seems like a concern, as network partitions should be considered an event that should be survived. In this case (with this specific configuration of zookeeper) no new clients in that availability zone would be able to register themselves with consumers within the same availability zone. Adding more zookeeper instances to the ensemble wouldn’t help considering a balanced deployment as in this case the availability would always be majority (66%) and non-majority (33%).
netflix  ribbon  availability  libraries  java  hystrix  eureka  aws  ec2  load-balancing  networking  http  tcp  architecture  clients  ipc 
july 2014 by jm
Shutterbits replacing hardware load balancers with local BGP daemons and anycast
Interesting approach. Potentially risky, though -- heavy use of anycast on a large-scale datacenter network could increase the scale of the OSPF graph, which scales exponentially. This can have major side effects on OSPF reconvergence time, which creates an interesting class of network outage in the event of OSPF flapping.

Having said that, an active/passive failover LB pair will already announce a single anycast virtual IP anyway, so, assuming there are a similar number of anycast IPs in the end, it may not have any negative side effects.

There's also the inherent limitation noted in the second-to-last paragraph; 'It comes down to what your hardware router can handle for ECMP. I know a Juniper MX240 can handle 16 next-hops, and have heard rumors that a software update will bump this to 64, but again this is something to keep in mind'. Taking a leaf from the LB design, and using BGP to load-balance across a smaller set of haproxy instances, would seem like a good approach to scale up.
scalability  networking  performance  load-balancing  bgp  exabgp  ospf  anycast  routing  datacenters  scaling  vips  juniper  haproxy  shutterstock 
may 2014 by jm
SmartStack vs. Consul
One of the SmartStack developers at Airbnb responds to comments comparing the two. FWIW, we use SmartStack in Swrve and it works pretty well...
smartstack  airbnb  ops  consul  serf  load-balancing  availability  resiliency  network-partitions  outages 
may 2014 by jm
"H" in cron syntax
This is something Jenkins came up with to randomize and distribute load, in order to avoid the "thundering herd" bug. Good call.
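The idea is a stable hash of the job name into the field's range, so every job gets a consistent but well-spread slot instead of everything firing at :00. A toy Go version -- Jenkins hashes differently; FNV here is my stand-in:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashField mimics the "H" cron token: derive a stable, per-job value
// within a field's range, so "H * * * *" spreads jobs across the hour.
func hashField(jobName string, rangeSize uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(jobName))
	return h.Sum32() % rangeSize
}

func main() {
	for _, job := range []string{"nightly-build", "log-rotate", "backup"} {
		fmt.Printf("%s: minute %d\n", job, hashField(job, 60))
	}
}
```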
jenkins  randomization  load-balancing  load  thundering-herd  ops  capacity  sleep 
april 2014 by jm
Shuffle Sharding
Colm MacCarthaigh writes about a simple sharding/load-balancing algorithm which uses randomized instance selection and optional additional compartmentalization. See also: consistent hashing.
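A toy Go sketch of the core trick -- seed a PRNG from the customer identity, shuffle the fleet, and give each customer a small deterministic subset (instance names and shard size are mine):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

// shuffleShard returns a deterministic subset of shardSize instances for a
// customer. Two customers rarely share their whole shard, so one noisy
// customer's blast radius is contained to a few instances.
func shuffleShard(instances []string, customer string, shardSize int) []string {
	h := fnv.New64a()
	h.Write([]byte(customer))
	r := rand.New(rand.NewSource(int64(h.Sum64())))

	shuffled := append([]string(nil), instances...)
	r.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})
	return shuffled[:shardSize]
}

func main() {
	pool := []string{"i-0", "i-1", "i-2", "i-3", "i-4", "i-5", "i-6", "i-7"}
	fmt.Println("alice:", shuffleShard(pool, "alice", 2))
	fmt.Println("bob:  ", shuffleShard(pool, "bob", 2))
}
```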
hashing  load-balancing  sharding  partitions  dist-sys  distcomp  architecture  coding 
april 2014 by jm
Building a Balanced Universe - EVE Community
Good blog post about EVE's algorithm to load-balance a 3D map of star systems
eve  eve-online  algorithms  3d  space  load-balancing  sharding  games 
december 2013 by jm
Airbnb's Smartstack
Service discovery a la Airbnb -- Nerve and Synapse: two external daemons that run on each node; Nerve manages registration in Zookeeper, and Synapse generates a haproxy configuration file from that, allowing each host to connect to all the others.
haproxy  services  ops  load-balancing  service-discovery  nerve  synapse  airbnb 
october 2013 by jm
Announcing Zuul: Edge Service in the Cloud
Netflix's library to implement "edge services" -- i.e. a front end to their API, web servers, and streaming servers. Some interesting features: dynamic filtering using Groovy scripts; Hystrix for software load balancing, fault tolerance, and error handling for originated HTTP requests; fine-grained service metrics; Archaius for configuration; and canary requests to detect overload risks. Pretty complex, though.
edge-services  api  netflix  zuul  archaius  canary-requests  http  groovy  hystrix  load-balancing  fault-tolerance  error-handling  configuration 
june 2013 by jm
Marc Brooker's "two-randoms" load balancing approach
Marc Brooker on this interesting load-balancing algorithm, including simulation results:
Using stale data for load balancing leads to a herd behavior, where requests will herd toward a previously quiet host for much longer than it takes to make that host very busy indeed. The next refresh of the cached load data will put the server high up the load list, and it will become quiet again. Then busy again as the next herd sees that it's quiet. Busy. Quiet. Busy. Quiet. And so on. One possible solution would be to give up on load balancing entirely, and just pick a host at random. Depending on the load factor, that can be a good approach. With many typical loads, though, picking a random host degrades latency and reduces throughput by wasting resources on servers which end up unlucky and quiet.

The approach taken by the studies surveyed by Mitzenmacher is to try two hosts, and pick the one with the least load. This can be done directly (by querying the hosts) but also works surprisingly well on cached load data. [...] Best of 2 is good because it combines the best of both worlds: it uses real information about load to pick a host (unlike random), but rejects herd behavior much more strongly than the other two approaches.

Having seen what Marc has worked on, and written, inside Amazon, I'd take this very seriously... cool to see he is blogging externally too.
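The core choice is tiny; here's a toy Go sketch (the feedback loop in main is my simulation, not Marc's):

```go
package main

import (
	"fmt"
	"math/rand"
)

// bestOfTwo implements the "two-randoms" choice: sample two distinct hosts
// at random and take the less loaded one. Even with stale load data this
// damps the herd behaviour described above, because a stale-but-popular
// host only wins half its pairings.
func bestOfTwo(loads []int, r *rand.Rand) int {
	a := r.Intn(len(loads))
	b := r.Intn(len(loads) - 1)
	if b >= a { // shift to guarantee b != a
		b++
	}
	if loads[b] < loads[a] {
		return b
	}
	return a
}

func main() {
	r := rand.New(rand.NewSource(1))
	loads := make([]int, 10)
	for i := 0; i < 1000; i++ {
		loads[bestOfTwo(loads, r)]++
	}
	fmt.Println(loads) // stays close to uniform
}
```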
algorithm  load-balancing  distcomp  distributed  two-randoms  marc-brooker  least-conns 
february 2013 by jm
Timelike 2: everything fails all the time
Fantastic post on large-scale distributed load balancing strategies from @aphyr. Random and least-conns routing come out on top in his simulation (although he hasn't yet tried Marc Brooker's two-randoms routing strategy)
via:hn  routing  distributed  least-conns  load-balancing  round-robin  distcomp  networking  scaling 
february 2013 by jm
Heroku finds out that distributed queueing is hard
Stage 3 of the Rap Genius/Heroku blog drama. Summary (as far as I can tell): Heroku gave up on a fully-synchronised load-balancing setup ("intelligent routing"), since it didn't scale, in favour of randomised queue selection; they didn't sufficiently inform their customers, and metrics and docs were not updated to make this change public; the pessimal case became pretty damn pessimal; a customer eventually noticed and complained publicly, creating a public shit-storm.

Comments: 1. this is why you monitor real HTTP request latency (scroll down for crazy graphs!). 2. include 90/99 percentiles to catch the "tail" of poorly-performing requests. 3. load balancers are hard; there's more info out there on the intricacies of distributed load balancing -- worth a read.
heroku  rap-genius  via:hn  networking  distcomp  distributed  load-balancing  ip  queueing  percentiles  monitoring 
february 2013 by jm
