jm + analysis   18

AdoptOpenJDK/jitwatch
Log analyser and visualiser for the HotSpot JIT compiler. Inspect inlining decisions, hot methods, bytecode, and assembly. View results in the JavaFX user interface.
analysis  java  jvm  performance  tools  debugging  optimization  jit 
august 2017 by jm
Tad
'A Desktop Viewer App for Tabular Data' -- pivot CSV data easily; works well with large files; free, from Antony Courtney
dataviz  osx  csv  data  pivot-tables  analysis  desktop 
april 2017 by jm
Reproducible research: Stripe’s approach to data science
This is intriguing -- using Jupyter notebooks to embody data analysis work and ensure it's reproducible, which brings better rigour in much the same way that unit tests improve coding. I must try this.
Reproducibility makes data science at Stripe feel like working on GitHub, where anyone can obtain and extend others’ work. Instead of islands of analysis, we share our research in a central repository of knowledge. This makes it dramatically easier for anyone on our team to work with our data science research, encouraging independent exploration.

We approach our analyses with the same rigor we apply to production code: our reports feel more like finished products, research is fleshed out and easy to understand, and there are clear programmatic steps from start to finish for every analysis.
stripe  coding  data-science  reproducability  science  jupyter  notebooks  analysis  data  experiments 
november 2016 by jm
Structural and semantic deficiencies in the systemd architecture for real-world service management, a technical treatise
Despite its overarching abstractions, it is semantically non-uniform and its complicated transaction and job scheduling heuristics ordered around a dependently networked object system create pathological failure cases with little debugging context that would otherwise not necessarily occur on systems with less layers of indirection. The use of bus APIs complicate communication with the service manager and lead to duplication of the object model for little gain. Further, the unit file options often carry implicit state or are not sufficiently expressive. There is an imbalance with regards to features of an eager service manager and that of a lazy loading service manager, having rusty edge cases of both with non-generic, manager-specific facilities. The approach to logging and the circularly dependent architecture seem to imply that lots of prior art has been ignored or understudied.
analysis  systemd  linux  unix  ops  init  critiques  software  logging 
november 2015 by jm
VividCortex uses K-Means Clustering to discover related metrics
After selecting an interesting spike in a metric, the algorithm can automate picking out a selection of other metrics which spiked at the same time. I can see that being pretty damn useful
metrics  k-means-clustering  clustering  algorithms  discovery  similarity  vividcortex  analysis  data 
march 2015 by jm
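A rough sketch of the idea in the VividCortex entry above: treat each metric's time series as a vector and let k-means group the ones that move together. This is plain Lloyd's algorithm with a deterministic start, not VividCortex's implementation; the function name and the sample series are made up for illustration.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; returns the cluster index assigned to each point."""
    # Assumption: seed the centroids with the first k points so runs are repeatable.
    centroids = [list(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (squared Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical metrics sampled at four points in time; two of them spike at t=2.
series = [
    [0, 0, 9, 0],   # spiky metric A
    [1, 1, 1, 1],   # flat metric A
    [0, 1, 10, 0],  # spiky metric B
    [2, 2, 2, 2],   # flat metric B
]
clusters = kmeans(series, k=2)
```

With this toy data the two spiking series end up in one cluster and the two flat ones in the other, which is the "what else spiked at the same time" grouping the note describes.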
The Infinite Hows, instead of the Five Whys
John Allspaw with an interesting assertion that we need to ask "how", not "why" in five-whys postmortems:
“Why?” is the wrong question.

In order to learn (which should be the goal of any retrospective or post-hoc investigation) you want multiple and diverse perspectives. You get these by asking people for their own narratives. Effectively, you’re asking “how?“

Asking “why?” too easily gets you to an answer to the question “who?” (which in almost every case is irrelevant) or “takes you to the ‘mysterious’ incentives and motivations people bring into the workplace.”

Asking “how?” gets you to describe (at least some) of the conditions that allowed an event to take place, and provides rich operational data.
ops  five-whys  john-allspaw  questions  postmortems  analysis  root-causes 
november 2014 by jm
Hydra Takes On Hadoop
The intuition behind Hydra is something like this, "I have a lot of data, and there are a lot of things I could try to learn about it -- so many that I'm not even sure what I want to know." It's about the curse of dimensionality -- more dimensions means exponentially more cost for exhaustive analysis. Hydra tries to make it easy to reduce the number of dimensions, or the cost of watching them (via probabilistic data structures), to just the right point where everything runs quickly but can still answer almost any question you think you might care about.


Code: https://github.com/addthis/hydra

Getting Started blog post: https://www.addthis.com/blog/2014/02/18/getting-started-with-hydra/
hyrda  hadoop  data-processing  big-data  trees  clusters  analysis 
april 2014 by jm
error-prone - Catch common Java mistakes as compile-time errors
It's common for even the best programmers to make simple mistakes. And commonly, a refactoring which seems safe can leave behind code which will never do what's intended. We're used to getting help from the compiler, but it doesn't do much beyond static type checking. Using error-prone to augment the compiler's static analysis, you can catch more mistakes before they cost you time, or end up as bugs in production. We use error-prone in Google's Java build system to eliminate classes of serious bugs from entering our code, and we've open-sourced it, so you can too!
analysis  java  static-analysis  code  errors  bugs 
november 2013 by jm
What can data scientists learn from DevOps?
Interesting.

'Rather than continuing to pretend analysis is a one-time, ad hoc action, automate it. [...] you need to maintain the automation machinery, but a cost-benefit analysis will show that the effort rapidly pays off — particularly for complex actions such as analysis that are nontrivial to get right.' (via @fintanr)
via:fintanr  data-science  data  automation  devops  analytics  analysis 
november 2012 by jm
Microsoft's Azure Feb 29th, 2012 outage postmortem
'The leap day bug is that the GA calculated the valid-to date by simply taking the current date and adding one to its year. That meant that any GA that tried to create a transfer certificate on leap day set a valid-to date of February 29, 2013, an invalid date that caused the certificate creation to fail.' This caused cascading failures throughout the fleet. Ouch -- should have been spotted during code review
azure  dev  dates  leap-years  via:fanf  microsoft  outages  post-mortem  analysis  failure 
march 2012 by jm
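The leap day bug quoted in the Azure entry above is easy to reproduce. A minimal sketch, assuming nothing about the real code beyond the "take the current date and add one to its year" description; the function names and the Feb 28 fallback are illustrative, not Microsoft's actual fix.

```python
from datetime import date


def naive_valid_to(issued: date) -> date:
    # The approach the postmortem describes: same month and day, year + 1.
    # On 29 February this asks for a date that does not exist.
    return issued.replace(year=issued.year + 1)


def safe_valid_to(issued: date) -> date:
    # One possible repair (an assumption): fall back to 28 February
    # when the same day does not exist in the following year.
    try:
        return issued.replace(year=issued.year + 1)
    except ValueError:
        return issued.replace(year=issued.year + 1, day=28)


try:
    naive_valid_to(date(2012, 2, 29))
    naive_failed = False
except ValueError:
    # Python refuses to build 29 February 2013.
    naive_failed = True
```

Python raises here instead of handing back a bad date, which is the friendlier failure mode; either way, a test that feeds in 29 February would have caught it.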
Dr. Neal Krawetz explains perceptual hashing
i.e. TinEye and other "images like this one" search engines. Nice explanation.
algorithm  images  analysis  programming  dct  hashing  perceptual-hash  tineye  via:hn  image 
june 2011 by jm
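For a feel of how perceptual hashing works, here is a sketch of the simplest variant, an average hash: one bit per pixel, set when the pixel is brighter than the image's mean. It assumes the image has already been shrunk to a small greyscale grid (normally 8x8; 2x2 below to keep it readable), and it is not the DCT-based hash the tags mention. The names are made up for this example.

```python
def average_hash(pixels):
    """Pack one bit per pixel into an int: 1 if brighter than the mean, else 0."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits; a small distance suggests similar-looking images."""
    return bin(a ^ b).count("1")


# Hypothetical, already-downscaled greyscale grids.
original = [[10, 200], [10, 200]]
brighter = [[20, 210], [20, 210]]   # same picture, slightly lighter
inverted = [[200, 10], [200, 10]]   # light and dark swapped

h_original = average_hash(original)
h_brighter = average_hash(brighter)
h_inverted = average_hash(inverted)
```

The lighter copy hashes identically to the original, while the inverted one differs in every bit, so a search engine can rank candidates by Hamming distance rather than exact match.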
Gamasutra - News - Opinion: Minecraft And The Question Of Luck
'Notch’s luck was that he came across the idea of doing a first-person fortress building game. His alignment was that the game that he wanted to make was culturally connected to [the PC gamer] tribe. While the game may appear ugly, and its purchase process etc seem naive to many a gaming professional, all of those decisions that Notch made along the road to releasing his game were from the point of view of a particular perspective of what games are, what matters and what were the things that he could trust the tribe to figure out for themselves.'
tribes  viral  minecraft  gaming  analysis  games  culture  gamasutra  via:nelson  future  software  marketing  from delicious
february 2011 by jm
Dublin bikes revisited
Fantastic comparative number crunching on the JC Decaux Dublin Bikes scheme, set against the company's schemes in other European cities (Brussels, Lyons, Paris, Seville), and broken down by time of day, busiest stations, rainfall, etc.
bikes  dublin-bikes  cycling  dublin  ireland  jc-decaux  number-crunching  analysis  statistics  from delicious
february 2011 by jm
First logging-as-a-service tool for the cloud wins NovaUCD award - siliconrepublic.com
first, eh? not sure about that. still, good going for Irish startup JLizard, logging in the cloud seems to be hot
logging  metrics  analysis  cloud  ireland  startups  novaucd  from delicious
november 2010 by jm
Petit: Log Analysis
log analyzer; removes common strings and patterns from log files, identifying outliers and hapaxen as "interesting". also does charting of frequencies etc.
logs  logging  analysis  loganalysis  syslog  tools  from delicious
june 2010 by jm
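The Petit approach noted above (strip the common parts of each line, then flag what is left over and rare) can be sketched in a few lines. This is not Petit's code: the normalisation here only masks digit runs, and the function names and sample log lines are invented for illustration.

```python
import re
from collections import Counter


def normalise(line: str) -> str:
    # Assumption: masking runs of digits (PIDs, counters, timestamps) is enough
    # to make repeated messages look identical.
    return re.sub(r"\d+", "#", line.strip())


def hapaxes(lines):
    """Return normalised lines that occur exactly once, in first-seen order."""
    counts = Counter(normalise(line) for line in lines)
    return [pattern for pattern, n in counts.items() if n == 1]


# Hypothetical syslog-style input.
log = [
    "sshd[101]: session opened for user bob",
    "sshd[102]: session opened for user bob",
    "kernel: Out of memory: killed process 4242",
]
interesting = hapaxes(log)
```

The two sshd lines collapse into one pattern seen twice, so only the out-of-memory line survives as "interesting".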
Body By Victoria - Secure Computing: Sec-C
Dr. Neal Krawetz brings the science on detecting Photoshop retouching
pixels  images  forensics  jpeg  photoshop  fake  analysis  detection  from delicious
december 2009 by jm
glTail.rb - realtime logfile visualization
'View real-time data and statistics from any logfile on any server with SSH, in an intuitive and entertaining way', supporting postfix/spamd/clamd logs among loads of others. very cool if a little silly
dataviz  visualization  tail  gltail  opengl  linux  apache  spamd  spamassassin  logs  statistics  sysadmin  analytics  animation  analysis  server  ruby  monitoring  logging  logfiles 
july 2009 by jm

