jm + outliers   2

Introducing practical and robust anomaly detection in a time series
Twitter open-sources an anomaly-spotting R package:
Early detection of anomalies plays a key role in ensuring high-fidelity data is available to our own product teams and those of our data partners. This package helps us monitor spikes in user engagement on the platform surrounding holidays, major sporting events or during breaking news. Beyond surges in social engagement, exogenic factors – such as bots or spammers – may cause an anomaly in number of favorites or followers. The package can be used to find such bots or spam, as well as detect anomalies in system metrics after a new software release. We’re open-sourcing AnomalyDetection because we’d like the public community to evolve the package and learn from it as we have.
statistics  twitter  r  anomaly-detection  outliers  metrics  time-series  spikes  holt-winters 
january 2015 by jm
'Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm' [PDF]
'Unsupervised anomaly detection is the process of finding outliers in data sets without prior training. In this paper, a histogram-based outlier detection (HBOS) algorithm is presented, which scores records in linear time. It assumes independence of the features making it much faster than multivariate approaches at the cost of less precision. A comparative evaluation on three UCI data sets and 10 standard algorithms show, that it can detect global outliers as reliable as state-of-the-art algorithms, but it performs poor on local outlier problems. HBOS is in our experiments up to 5 times faster than clustering based algorithms and up to 7 times faster than nearest-neighbor based methods.'
histograms  anomaly-detection  anomalies  machine-learning  algorithms  via:paperswelove  outliers  unsupervised-learning  hbos 
november 2014 by jm

Copy this bookmark: