jm + backpressure   7

Tuning Spark Back Pressure by Simulation
Interesting, Spark uses a PID controller algorithm to manage backpressure:
Spark back pressure, which can be enabled by setting spark.streaming.backpressure.enabled=true, will dynamically resize batches so as to avoid queue build up. It is implemented using a Proportional Integral Derivative (PID) algorithm. This algorithm has some interesting properties, including the lack of guarantee of a stable fixed point. This can manifest itself not just in transient overshoot, but in a batch size oscillating around a (potentially optimal) constant throughput. The overshoot incurs latency; the undershoot costs throughput. Catastrophic overshoot leading to OOM is possible in degenerate circumstances (you need to choose the parameters quite deviously to cause this to happen). Having witnessed undershoot and slow recovery in production streaming jobs, I decided to investigate further by testing the algorithm with a simulator.
backpressure  streaming  queueing  pid-controllers  algorithms  congestion-control 
4 weeks ago by jm
The Importance of Tuning Your Thread Pools
Excellent blog post on thread pools, backpressure, Little's Law, and other Hystrix-related topics (PS: use Hystrix)
hystrix  threadpools  concurrency  java  jvm  backpressure  littles-law  capacity 
january 2016 by jm
Adrian Colyer reviews the Twitter Heron paper
ouch, really sounds like Storm didn't cut the muster. 'It’s hard to imagine something more damaging to Apache Storm than this. Having read it through, I’m left with the impression that the paper might as well have been titled “Why Storm Sucks”, which coming from Twitter themselves is quite a statement.'

If I was to summarise the lessons learned, it sounds like: backpressure is required; and multi-tenant architectures suck.

Update: response from Storm dev ptgoetz here: http://blog.acolyer.org/2015/06/15/twitter-heron-stream-processing-at-scale/#comment-1738
storm  twitter  heron  big-data  streaming  realtime  backpressure 
june 2015 by jm
Twitter ditches Storm
in favour of a proprietary ground-up rewrite called Heron. Reading between the lines it sounds like Storm had problems with latency, reliability, data loss, and supporting back pressure.
analytics  architecture  twitter  storm  heron  backpressure  streaming  realtime  queueing 
june 2015 by jm
outbrain/gruffalo
an asynchronous Netty based graphite proxy. It protects Graphite from the herds of clients by minimizing context switches and interrupts; by batching and aggregating metrics. Gruffalo also allows you to replicate metrics between Graphite installations for DR scenarios, for example.

Gruffalo can easily handle a massive amount of traffic, and thus increase your metrics delivery system availability. At Outbrain, we currently handle over 1700 concurrent connections, and over 2M metrics per minute per instance.
graphite  backpressure  metrics  outbrain  netty  proxies  gruffalo  ops 
april 2015 by jm
Reactive Programming for a demanding world
"building event-driven and responsive applications with RxJava", slides by Mario Fusco. Good info on practical Rx usage in Java
rxjava  rx  reactive  coding  backpressure  streams  observables 
march 2015 by jm
Notes on Distributed Systems for Young Bloods
'Below is a list of some lessons I’ve learned as a distributed systems engineer that are worth being told to a new engineer. Some are subtle, and some are surprising, but none are controversial. This list is for the new distributed systems engineer to guide their thinking about the field they are taking on. It’s not comprehensive, but it’s a good beginning.' This is a pretty nice list, a little over-stated, but that's the format. I particularly like the following: 'Exploit data-locality'; 'Learn to estimate your capacity'; 'Metrics are the only way to get your job done'; 'Use percentiles, not averages'; 'Extract services'.
systems  distributed  distcomp  cap  metrics  coding  guidelines  architecture  backpressure  design  twitter 
january 2013 by jm

Copy this bookmark:



description:


tags: