jm + tehuti   2

Felix says:

'Like I said, I'd like to move it to a more general / non-personal repo in the future, but haven't had the time yet. Anyway, you can still browse the code there for now. It is not a big code base so not that hard to wrap one's mind around it.

It is Apache licensed and both Kafka and Voldemort are using it so I would say it is pretty self-contained (although Kafka has not moved to Tehuti proper, it is essentially the same code they're using, minus a few small fixes missing that we added).

Tehuti is a bit lower level than CodaHale (i.e.: you need to choose exactly which stats you want to measure and the boundaries of your histograms), but this is the type of stuff you would build a wrapper for and then re-use within your code base. For example: the Voldemort RequestCounter class.'
asl2  apache  open-source  tehuti  metrics  percentiles  quantiles  statistics  measurement  latency  kafka  voldemort  linkedin 
october 2014 by jm
An embryonic metrics library for Java/Scala from Felix GV at LinkedIn, extracted from Kafka's metric implementation and in the new Voldemort release. It fixes the major known problems with the Meter/Timer implementations in Coda-Hale/Dropwizard/Yammer Metrics.

'Regarding Tehuti: it has been extracted from Kafka's metric implementation. The code was originally written by Jay Kreps, and then maintained improved by some Kafka and Voldemort devs, so it definitely is not the work of just one person. It is in my repo at the moment but I'd like to put it in a more generally available (git and maven) repo in the future. I just haven't had the time yet...

As for comparing with CodaHale/Yammer, there were a few concerns with it, but the main one was that we didn't like the exponentially decaying histogram implementation. While that implementation is very appealing in terms of (low) memory usage, it has several misleading characteristics (a lack of incoming data points makes old measurements linger longer than they should, and there's also a fairly high possiblity of losing interesting outlier data points). This makes the exp decaying implementation robust in high throughput fairly constant workloads, but unreliable in sparse or spiky workloads. The Tehuti implementation provides semantics that we find easier to reason with and with a small code footprint (which we consider a plus in terms of maintainability). Of course, it is still a fairly young project, so it could be improved further.'

More background at the kafka-dev thread:
kafka  metrics  dropwizard  java  scala  jvm  timers  ewma  statistics  measurement  latency  sampling  tehuti  voldemort  linkedin  jay-kreps 
october 2014 by jm

Copy this bookmark: