jm + measurement (24 bookmarks)

Java Flame Graphs Introduction: Fire For Everyone!
lots of good detail on flame graph usage in Java, and the Honest Profiler (honest because it's safepoint-free)
profiling  java  safepoints  jvm  flame-graphs  perf  measurement  benchmarking  testing 
10 weeks ago by jm
SQS performance and latency
Some decent benchmark data on SQS (see the recording sketch after the list):
We were looking at four values in the tests:
total number of messages sent per second (by all nodes)
total number of messages received per second
95th percentile of message send latency (how fast a message send call completes)
95th percentile of message processing latency (how long it takes between sending and receiving a message)
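
A minimal sketch of how the send-latency percentile could be recorded, assuming the AWS SDK for Java v1 plus HdrHistogram; the queue URL, message count, and histogram bounds are placeholder assumptions:

    import com.amazonaws.services.sqs.AmazonSQS;
    import com.amazonaws.services.sqs.AmazonSQSClientBuilder;
    import com.amazonaws.services.sqs.model.SendMessageRequest;
    import org.HdrHistogram.Histogram;

    public class SqsSendLatency {
        public static void main(String[] args) {
            AmazonSQS sqs = AmazonSQSClientBuilder.defaultClient();
            String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/test-queue"; // hypothetical
            Histogram sendLatencies = new Histogram(60_000_000_000L, 3); // track up to 60s, 3 sig. digits

            for (int i = 0; i < 10_000; i++) {
                long start = System.nanoTime();
                sqs.sendMessage(new SendMessageRequest(queueUrl, "msg-" + i)); // blocking send call
                sendLatencies.recordValue(System.nanoTime() - start);
            }
            // Report the percentile the benchmark used, not an average.
            System.out.printf("p95 send latency: %.2f ms%n",
                    sendLatencies.getValueAtPercentile(95.0) / 1_000_000.0);
        }
    }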
sqs  benchmarking  measurement  aws  latency 
july 2017 by jm
usl4j And You | codahale.com
Coda Hale wrote a handy java library implementing a USL solver
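
A sketch of what fitting with it looks like, going from the announcement post; the builder and collector names (Measurement.ofConcurrency, Model.toModel) are recalled from that post and may differ from the current API, and the data points are invented:

    import com.codahale.usl4j.Measurement;
    import com.codahale.usl4j.Model;
    import java.util.Arrays;

    public class UslFit {
        public static void main(String[] args) {
            // {concurrency, throughput} pairs from a hypothetical load test
            double[][] points = {{1, 955}, {2, 1878}, {4, 3548}, {8, 6231}, {16, 9897}};
            Model model = Arrays.stream(points)
                    .map(Measurement.ofConcurrency()::andThroughput) // each point is {N, X(N)}
                    .collect(Model.toModel());
            // The fitted model predicts throughput at concurrencies you didn't test.
            System.out.println("predicted X(32): " + model.throughputAtConcurrency(32));
        }
    }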
usl  scalability  java  performance  optimization  benchmarking  measurement  ops  coda-hale 
june 2017 by jm
How to Quantify Scalability
good page on the Universal Scalability Law and how to apply it
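
For reference, the law itself in its relative-capacity form, where N is concurrency, σ the contention (serialization) penalty, and κ the coherency (crosstalk) penalty:

    \[ C(N) = \frac{N}{1 + \sigma (N - 1) + \kappa N (N - 1)} \]

The σ term alone gives Amdahl's law; a non-zero κ is what makes throughput peak and then fall as N keeps growing.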
usl  performance  scalability  concurrency  capacity  measurement  excel  equations  metrics 
september 2016 by jm
raboof/nethogs: Linux 'net top' tool
NetHogs is a small 'net top' tool. Instead of breaking the traffic down per protocol or per subnet, like most tools do, it groups bandwidth by process.
nethogs  cli  networking  performance  measurement  ops  linux  top 
may 2016 by jm
Gil Tene on benchmarking
'I would strongly encourage you to avoid repeating the mistakes of testing methodologies that focus entirely on max achievable throughput and then report some (usually bogus) latency stats at those max throughput modes. The techempower numbers are a classic example of this in play, and while they do provide some basis for comparing a small aspect of behavior (what I call the "how fast can this thing drive off a cliff" comparison, or "pedal to the metal" testing), those results are not very useful for comparing load carrying capacities for anything that actually needs to maintain some form of responsiveness SLA or latency spectrum requirements.'

Some excellent advice here on how to measure and represent stack performance.

Also: 'DON'T use or report standard deviation for latency. Ever. Except if you mean it as a joke.'
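
In practice that means publishing the full percentile spectrum instead. A minimal HdrHistogram sketch; the measurement function is a placeholder:

    import org.HdrHistogram.Histogram;

    public class LatencyReport {
        public static void main(String[] args) {
            Histogram histogram = new Histogram(3_600_000_000_000L, 3); // up to 1 hour, in ns
            for (int i = 0; i < 1_000_000; i++) {
                histogram.recordValue(measureOneRequestNanos());
            }
            // Prints the whole percentile distribution (p50, p90, p99, p99.9, ...)
            // scaled from nanoseconds to milliseconds -- no mean, no standard deviation.
            histogram.outputPercentileDistribution(System.out, 1_000_000.0);
        }

        static long measureOneRequestNanos() { return 1_000_000L; } // placeholder
    }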
performance  benchmarking  testing  speed  gil-tene  latency  measurement  hdrhistogram  load-testing  load 
april 2016 by jm
Amazon EC2 2015 Benchmark: Testing Speeds Between AWS EC2 and S3 Regions
Here we are again, a year later, and still no bloody percentiles! Just amateurish averaging. This is not how you measure anything, ffs. Still, better than nothing I suppose
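
Computing a percentile from raw samples is a few lines, so there's little excuse; a minimal nearest-rank sketch with made-up numbers:

    import java.util.Arrays;

    public class Percentile {
        // Nearest-rank percentile over raw samples.
        static long percentile(long[] samples, double p) {
            long[] sorted = samples.clone();
            Arrays.sort(sorted);
            int rank = (int) Math.ceil(p / 100.0 * sorted.length);
            return sorted[Math.max(0, rank - 1)];
        }

        public static void main(String[] args) {
            long[] transferMs = {120, 150, 110, 2500, 140, 130, 160, 9000, 120, 140};
            // The mean (~1257ms) hides the tail; the p95 exposes it.
            System.out.println("p95 = " + percentile(transferMs, 95.0) + " ms");
        }
    }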
fail  latency  measurement  aws  ec2  percentiles  s3 
august 2015 by jm
SolarCapture Packet Capture Software
Interesting product line -- I didn't know this existed, but it makes good sense as a "network flight recorder". Big in finance.
SolarCapture is a powerful packet capture product family that can transform every server into a precision network monitoring device, increasing network visibility, network instrumentation, and performance analysis. SolarCapture products optimize network monitoring and security, while eliminating the need for specialized appliances, expensive adapters relying on exotic protocols, proprietary hardware, and dedicated networking equipment.

See also Corvil (based in Dublin!): 'I'm using a Corvil at the moment and it's awesome -- nanosecond-precision latency measurements on the wire.'

(via mechanical sympathy list)
corvil  timing  metrics  measurement  latency  network  solarcapture  packet-capture  financial  performance  security  network-monitoring 
may 2015 by jm
Glowroot
"Open source APM for Java" -- profiling in production, with a demo benchmark showing about a 2% performance impact. Wonder about effects on memory/GC, though
apm  java  metrics  measurement  new-relic  profiling  glowroot 
march 2015 by jm
Good advice on running large-scale database stress tests
I've been bitten by poor key distribution in tests in the past, so this is spot on: 'I'd run it with Zipfian, Pareto, and Dirac delta distributions, and I'd choose read-modify-write transactions.'

And of course, a dataset bigger than all combined RAM.

Also: http://smalldatum.blogspot.ie/2014/04/biebermarks.html -- the "Biebermark", where just a single row out of the entire db is contended on in a read/modify/write transaction: "the inspiration for this is maintaining counts for [highly contended] popular entities like Justin Bieber and One Direction."
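
A minimal sketch of generating a Zipfian key stream for such a test, assuming Apache commons-math3; the key-space size and exponent are arbitrary choices:

    import org.apache.commons.math3.distribution.ZipfDistribution;

    public class SkewedKeys {
        public static void main(String[] args) {
            // Zipfian over 1M keys; an exponent near 1 gives the classic
            // "few very hot keys" shape that breaks naive uniform-key tests.
            ZipfDistribution zipf = new ZipfDistribution(1_000_000, 0.99);
            for (int i = 0; i < 10; i++) {
                int key = zipf.sample(); // low-numbered keys dominate
                System.out.println("read-modify-write on key " + key);
            }
        }
    }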
biebermark  benchmarks  testing  performance  stress-tests  databases  storage  mongodb  innodb  foundationdb  aphyr  measurement  distributions  keys  zipfian 
december 2014 by jm
wrk2
'A constant throughput, correct latency-recording variant of wrk.' This is a must-have when measuring network service latency -- it corrects for the Coordinated Omission error:
wrk's model, which is similar to the model found in many current load generators, computes the latency for a given request as the time from the sending of the first byte of the request to the time the complete response was received. While this model correctly measures the actual completion time of individual requests, it exhibits a strong Coordinated Omission effect, through which most of the high latency artifacts exhibited by the measured server will be ignored. Since each connection will only begin to send a request after receiving a response, high latency responses result in the load generator coordinating with the server to avoid measurement during high latency periods.
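
The fix can be sketched in a few lines: issue requests on a fixed schedule and measure each latency from its intended start time, so a slow server is charged for the requests that should have happened during its stall. Everything here is an assumed skeleton; doRequest() and record() are placeholders:

    public class ConstantRateLoad {
        public static void main(String[] args) {
            double targetRps = 1000.0;
            long intervalNanos = (long) (1_000_000_000L / targetRps);
            long next = System.nanoTime();

            for (int i = 0; i < 100_000; i++) {
                long intendedStart = next;
                while (System.nanoTime() < intendedStart) { } // wait for the schedule
                doRequest();
                // Measure from the *intended* start, not the actual send time --
                // closed-loop generators like stock wrk measure from the send,
                // which silently skips the stalled period.
                record(System.nanoTime() - intendedStart);
                next += intervalNanos; // schedule is independent of response times
            }
        }

        static void doRequest() { /* placeholder for the real call */ }
        static void record(long nanos) { /* e.g. into an HdrHistogram */ }
    }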
wrk  latency  measurement  tools  cli  http  load-testing  testing  load-generation  coordinated-omission  gil-tene 
november 2014 by jm
testing latency measurements using CTRL-Z
An excellent tip from Gil "HDRHistogram" Tene:
Good example of why I always "calibrate" latency tools with ^Z tests. If ^Z results don't make sense, don't use [the] tool. ^Z test math examples: If you ^Z for half the time, Max is obvious. [90th percentile] should be 80% of the ^Z stall time.
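
To make the arithmetic concrete (my numbers, not Gil's): take a constant-rate, coordinated-omission-corrected run of 60 seconds with a 30-second ^Z stall. The stall-affected requests form the top 50% of all samples, with latencies spread uniformly from 0 up to 30s; the overall 90th percentile therefore sits 80% of the way up that range, at 0.8 × 30s = 24s. A tool that reports anything much different is mishandling its percentiles.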
control-z  suspend  unix  testing  latencies  latency  measurement  percentiles  tips 
november 2014 by jm
FelixGV/tehuti
Felix says:

'Like I said, I'd like to move it to a more general / non-personal repo in the future, but haven't had the time yet. Anyway, you can still browse the code there for now. It is not a big code base so not that hard to wrap one's mind around it.

It is Apache licensed and both Kafka and Voldemort are using it so I would say it is pretty self-contained (although Kafka has not moved to Tehuti proper, it is essentially the same code they're using, minus a few small fixes that we added).

Tehuti is a bit lower level than CodaHale (i.e.: you need to choose exactly which stats you want to measure and the boundaries of your histograms), but this is the type of stuff you would build a wrapper for and then re-use within your code base. For example: the Voldemort RequestCounter class.'
asl2  apache  open-source  tehuti  metrics  percentiles  quantiles  statistics  measurement  latency  kafka  voldemort  linkedin 
october 2014 by jm
Tehuti
An embryonic metrics library for Java/Scala from Felix GV at LinkedIn, extracted from Kafka's metric implementation and in the new Voldemort release. It fixes the major known problems with the Meter/Timer implementations in Coda-Hale/Dropwizard/Yammer Metrics.

'Regarding Tehuti: it has been extracted from Kafka's metric implementation. The code was originally written by Jay Kreps, and then maintained and improved by some Kafka and Voldemort devs, so it definitely is not the work of just one person. It is in my repo at the moment but I'd like to put it in a more generally available (git and maven) repo in the future. I just haven't had the time yet...

As for comparing with CodaHale/Yammer, there were a few concerns with it, but the main one was that we didn't like the exponentially decaying histogram implementation. While that implementation is very appealing in terms of (low) memory usage, it has several misleading characteristics (a lack of incoming data points makes old measurements linger longer than they should, and there's also a fairly high possibility of losing interesting outlier data points). This makes the exp decaying implementation robust in high-throughput, fairly constant workloads, but unreliable in sparse or spiky workloads. The Tehuti implementation provides semantics that we find easier to reason with, and a small code footprint (which we consider a plus in terms of maintainability). Of course, it is still a fairly young project, so it could be improved further.'

More background at the kafka-dev thread: http://mail-archives.apache.org/mod_mbox/kafka-dev/201402.mbox/%3C131A7649-ED57-45CB-B4D6-F34063267664@linkedin.com%3E
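
The lingering-measurements complaint is easy to reproduce against the Dropwizard/Coda Hale reservoir directly; a sketch with exaggerated timings:

    import com.codahale.metrics.ExponentiallyDecayingReservoir;
    import com.codahale.metrics.Histogram;

    public class DecayDemo {
        public static void main(String[] args) throws InterruptedException {
            Histogram h = new Histogram(new ExponentiallyDecayingReservoir());
            for (int i = 0; i < 1000; i++) h.update(5000); // a burst of slow responses (ms)
            Thread.sleep(60_000); // then a minute of total silence: no new data points
            // With nothing new arriving, the old samples still dominate the snapshot,
            // so this "current" p99 really describes traffic from a minute ago.
            System.out.println("p99 after a quiet minute: " + h.getSnapshot().get99thPercentile());
        }
    }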
kafka  metrics  dropwizard  java  scala  jvm  timers  ewma  statistics  measurement  latency  sampling  tehuti  voldemort  linkedin  jay-kreps 
october 2014 by jm
Alexey Shipilev on Java's System.nanoTime()
System.nanoTime is as bad as String.intern now: you can use it, but use it wisely. The latency, granularity, and scalability effects introduced by timers may and will affect your measurements if done without proper rigor. This is one of the many reasons why System.nanoTime should be abstracted from the users by benchmarking frameworks, monitoring tools, profilers, and other tools written by people who have time to track if the underlying platform is capable of doing what we want it to do.

In some cases, there is no good solution to the problem at hand. Some things are not directly measurable. Some things are measurable with unpractical overheads. Internalize that fact, weep a little, and move on to building the indirect experiments. This is not the Wonderland, Alice. Understanding how the Universe works often needs side routes to explore.

In all seriousness, we should be happy our $1000 hardware can measure 30 nanosecond intervals pretty reliably. This is roughly the time needed for the Internet packets originating from my home router to leave my apartment. What else do you want, you spoiled brats?
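
The point is easy to spot-check on your own box; a minimal probe of nanoTime's observable granularity and per-call cost (a toy, not a benchmark -- Shipilev would say leave this to JMH):

    public class NanoTimeProbe {
        public static void main(String[] args) {
            // Granularity: smallest nonzero gap between back-to-back readings.
            long minDelta = Long.MAX_VALUE;
            for (int i = 0; i < 1_000_000; i++) {
                long t1 = System.nanoTime();
                long t2 = System.nanoTime();
                if (t2 > t1 && t2 - t1 < minDelta) minDelta = t2 - t1;
            }
            // Cost: rough average per call over a tight loop.
            int calls = 10_000_000;
            long start = System.nanoTime();
            for (int i = 0; i < calls; i++) System.nanoTime();
            long perCall = (System.nanoTime() - start) / calls;
            System.out.println("~granularity " + minDelta + " ns, ~cost/call " + perCall + " ns");
        }
    }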
benchmarking  jdk  java  measurement  nanoseconds  nsecs  nanotime  jvm  alexey-shipilev  jmh 
may 2014 by jm
LatencyUtils by giltene
The LatencyUtils package includes useful utilities for tracking latencies. Especially in common in-process recording scenarios, which can exhibit significant coordinated omission sensitivity without proper handling.
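
A minimal sketch of the core API as I read the README -- a LatencyStats instance with its default pause detector, handing back pause-compensated interval histograms; doWork() is a placeholder:

    import org.HdrHistogram.Histogram;
    import org.LatencyUtils.LatencyStats;

    public class TrackedLatency {
        public static void main(String[] args) {
            LatencyStats stats = new LatencyStats(); // assumes the default pause detector
            long start = System.nanoTime();
            doWork();
            stats.recordLatency(System.nanoTime() - start);
            // Interval histograms come back compensated for detected pauses --
            // the coordinated-omission handling mentioned above.
            Histogram interval = stats.getIntervalHistogram();
            System.out.println("max: " + interval.getMaxValue() + " ns");
        }

        static void doWork() { /* placeholder for the operation being timed */ }
    }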
gil-tene  metrics  java  measurement  coordinated-omission  latency  speed  service-metrics  open-source 
november 2013 by jm
Benchmarking Redis on AWS ElastiCache
good data points, but could do with latency percentiles
latency  redis  measurement  benchmarks  ec2  elasticache  aws  storage  tests 
september 2013 by jm
Low Overhead Method Profiling with Java Mission Control now enabled in the most recent HotSpot JVM release
Built into the HotSpot JVM [in JDK version 7u40] is something called the Java Flight Recorder. It records a lot of information about/from the JVM runtime, and can be thought of as similar to the Data Flight Recorders you find in modern airplanes. You normally use the Flight Recorder to find out what was happening in your JVM when something went wrong, but it is also a pretty awesome tool for production time profiling. Since Mission Control (using the default templates) normally doesn't cause more than a per cent overhead, you can use it on your production server.

I'm intrigued by the idea of always-on profiling in production. This could be cool.
performance  java  measurement  profiling  jvm  jdk  hotspot  mission-control  instrumentation  telemetry  metrics 
september 2013 by jm
Coordinated Omission
Gil Tene raises an extremely good point about load testing, high-percentile response-time measurement, and behaviour when testing a system under load:

I've been harping for a while now about a common measurement technique problem I call "Coordinated Omission", which can often render percentile data useless. [...] I believe that this problem occurs extremely frequently in test results, but it's usually hard to deduce its existence purely from the final data reported. But every once in a while, I see test results where the data provided is enough to demonstrate the huge percentile-misreporting effect of Coordinated Omission based purely on the summary report.

I ran into just such a case in Attila's cool posting about log4j2's truly amazing performance, so I decided to avoid polluting his thread with an elongated discussion of how to compute 99.9%'ile data, and started this topic here. That thread should really be about how cool log4j2 is, and I'm certain that it really is cool, even after you correct the measurements. [...] Basically, I think that the 99.99% observation computation is wrong, and demonstrably (using the data in the graph posted) exhibits the classic "coordinated omission" measurement problem I've been preaching about. This test is not alone in exhibiting this, and there is nothing to be ashamed of when you find yourself making this mistake. I only figured it out after doing it myself many many times, and then I noticed that everyone else seems to also be doing it but most of them haven't yet figured it out. In fact, I run into this issue so often in percentile reporting and load testing that I'm starting to wonder if coordinated omission is there in 99.9% of latency tests ;-)
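
HdrHistogram, Gil's own library, ships a direct mitigation: record each sample along with the expected interval between samples, and it back-fills the values a stall would have swallowed. A minimal sketch:

    import org.HdrHistogram.Histogram;

    public class CorrectedRecording {
        public static void main(String[] args) {
            Histogram histogram = new Histogram(3_600_000_000_000L, 3);
            long expectedIntervalNanos = 1_000_000L; // testing at ~1000 requests/sec

            // A 500ms response observed in a 1ms-interval test: plain recordValue()
            // would store one sample; this variant also synthesizes the progressively
            // smaller latencies the stalled requests would have seen, un-skewing the tail.
            histogram.recordValueWithExpectedInterval(500_000_000L, expectedIntervalNanos);

            System.out.println("p99.9: " + histogram.getValueAtPercentile(99.9) + " ns");
        }
    }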
measurement  testing  latency  load-testing  gil-tene  coordinated-omission  validity  log4j  percentiles 
august 2013 by jm
Reality, Reactivity, Relevance and Repeatability in Java Application Profiling
this product from JInspired appears to support runtime profiling of Java apps with < 5% performance impact
profiling  performance  java  coding  measurement 
april 2013 by jm
Determining response times with tcprstat
'Tcprstat is a free, open-source TCP analysis tool that watches network traffic and computes the delay between requests and responses. From this it derives response-time statistics and prints them out.' Computes percentiles, too
tcp  tcprstat  tcp-ip  networking  measurement  statistics  performance  instrumentation  linux  unix  tools  cli 
november 2011 by jm
