jm + via:kragen   8

Crowdsourcing isn’t broken — Backchannel — Medium
'A great compendium by @harper of techniques for handling trolls and griefers in online communities', via kragen
via:kragen  antispam  filtering  trolls  community  crowdsourcing  threadless  harper  griefers  abuse  tips 
february 2015 by jm
"Man vs Machine: Practical Adversarial Detection of Malicious Crowdsourcing Workers" [paper]
"traditional ML techniques are accurate (95%–99%) in detection but can be highly vulnerable to adversarial attacks". ain't that the truth
security  adversarial-attacks  machine-learning  paper  crowdsourcing  via:kragen 
february 2015 by jm
Linus Torvalds and others on Linux's systemd
ZDNet's Steven J. Vaughan-Nichols on the systemd mess (via Kragen)
via:kragen  systemd  linux  ubuntu  gnome  init  ops 
october 2014 by jm
Harry - A Tool for Measuring String Similarity
a small tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings as well as some exotic similarity measures. The focus of Harry lies on implicit similarity measures, that is, comparison functions that do not give rise to an explicit vector space. Examples of such similarity measures are the Levenshtein distance and the Jaro-Winkler distance.
For comparison Harry loads a set of strings from input, computes the specified similarity measure and writes a matrix of similarity values to output. The similarity measure can be computed based on the granularity of characters as well as words contained in the strings. The configuration of this process, such as the input format, the similarity measure and the output format, are specified in a configuration file and can be additionally refined using command-line options.
Harry is implemented using OpenMP, such that the computation time for a set of strings scales linearly with the number of available CPU cores. Moreover, efficient implementations of several similarity measures, effective caching of similarity values and low-overhead locking further speed up the computation.
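
To make the "matrix of similarity values" idea concrete, here is a minimal Python sketch of one of the measures named above (Levenshtein distance) computed pairwise over a set of strings, at either character or word granularity. This is not Harry's code or its command-line interface: the function names `levenshtein` and `distance_matrix` and the `by_words` parameter are hypothetical, and there is no OpenMP parallelism or caching here.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings, or lists of words)."""
    # prev[j] holds the distance between the prefix of a seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def distance_matrix(strings, by_words=False):
    """Pairwise Levenshtein distances; by_words (hypothetical flag)
    switches the granularity from characters to whitespace-split words."""
    items = [s.split() if by_words else s for s in strings]
    return [[levenshtein(x, y) for y in items] for x in items]
```

For example, `distance_matrix(["kitten", "sitting"])` gives a 2x2 symmetric matrix with zeros on the diagonal, and passing `by_words=True` compares the strings word by word instead.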

via kragen.
via:kragen  strings  similarity  levenshtein-distance  algorithms  openmp  jaro-winkler  edit-distance  cli  commandline  hamming-distance  compression 
january 2014 by jm
Finite State Entropy coding
As can be guessed, the higher the compression ratio, the more efficient FSE becomes compared to Huffman, since Huffman can't break the "1 bit per symbol" limit. FSE speed is also very stable, under all probabilities. I'm quite pleased with the result, especially considering that, since the invention of arithmetic coding in the 70's, nothing really new has been brought to this field.

This is still beta stuff, so please consider this first release for testing purposes mostly.

Looking forward to this making it into a production release of some form.
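
The "1 bit per symbol" limit mentioned in the quote can be seen with a little arithmetic. The sketch below does not implement FSE; it only compares the Shannon entropy of a skewed source with the average code length a Huffman code achieves for it. The function names are hypothetical, and the example probabilities are made up for illustration.

```python
import heapq
import math


def shannon_entropy(probs):
    """Lower bound on average bits per symbol for the given distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_average_length(probs):
    """Average bits per symbol of a Huffman code for the distribution.

    Uses the fact that the expected code length equals the sum of the
    weights of all merged (internal) nodes in the Huffman tree.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total


# A highly compressible two-symbol source (illustrative numbers only).
skewed = [0.95, 0.05]
print(shannon_entropy(skewed))         # roughly 0.29 bits per symbol
print(huffman_average_length(skewed))  # 1.0: every symbol still costs a whole bit
```

The gap between the two numbers is the room that an entropy coder working with fractional bits (arithmetic coding, or FSE as described above) can exploit, and it widens as the source becomes more skewed.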
compression  algorithms  via:kragen  fse  finite-state-entropy-coding  huffman  arithmetic-coding 
january 2014 by jm
_Availability in Globally Distributed Storage Systems_ [pdf]
empirical BigTable and GFS failure numbers from Google are orders of magnitude higher than naïve independent-failure models predict. (via kragen)
via:kragen  failure  bigtable  gfs  statistics  outages  reliability 
september 2013 by jm
Skype's principal architect explains why they no longer have end-to-end crypto
Mobile devices can't handle the CPU and constantly-online requirements, and Skype now relies more heavily on dedicated routing supernodes to avoid Windows-client monoculture and p2p network fragility

(via the IP list, via kragen)
skype  p2p  mobile  architecture  networking  internet  snooping  crypto  via:ip  via:kragen  phones  windows 
june 2013 by jm
rendering pcm with simulated phosphor persistence
This is something readily applicable to display of sampled time-series metric data -- it really makes regular patterns visible (and is nicely retro to boot).
When PCM waveforms and similar function plots are displayed on screen, computational speed is often preferred over beauty and information content. For example, Audacity only draws the local maximum envelope amplitude and (what appears to be) RMS power when zoomed out, and when zoomed in, displays a very straightforward linear interpolation between samples.

Analogue oscilloscopes, on the other hand, do things differently. An electron beam scans a phosphor screen at a constant X velocity, lighting a dot everywhere it hits. The dot brightness is proportional to the time the electron beam was directed at it. Because the X speed of the beam is constant and the Y position is modulated by the waveform, brightness gives information about the local derivative of the function. Now how cool is that? It looks like an X-ray of the signal. We can see right away that the beep is roughly a square wave, because there's light on top and bottom of the oscillation envelope but mostly darkness in between. Minute changes in the harmonic content are also visible as interesting banding and ribbons.
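
The dwell-time idea described above can be sketched in a few lines of Python. This is not the linked post's renderer: the function name `render_phosphor`, its parameters, and the use of linear interpolation between samples are all assumptions made for illustration. The "beam" is stepped at a constant X velocity and each step adds a fixed amount of light to whichever cell it lands in, so flat parts of the waveform pile up brightness and steep parts stay dim.

```python
def render_phosphor(samples, width, height, oversample=16):
    """Accumulate beam dwell time into a height x width intensity grid.

    samples: floats in [-1.0, 1.0]. Returns a list of rows, row 0 at the top.
    """
    grid = [[0.0] * width for _ in range(height)]
    n = len(samples)
    if n < 2:
        return grid
    steps = (n - 1) * oversample
    for k in range(steps):
        t = k / oversample            # fractional sample index (constant X speed)
        i = int(t)
        frac = t - i
        # Assumption: linear interpolation stands in for the continuous signal.
        y = samples[i] * (1 - frac) + samples[i + 1] * frac
        col = min(width - 1, int(t / (n - 1) * width))
        row = min(height - 1, max(0, int((1 - y) / 2 * height)))
        grid[row][col] += 1.0         # one time step's worth of light
    return grid


# A crude square wave: bright top and bottom rows, dim rows in between.
square = ([1.0] * 10 + [-1.0] * 10) * 5
image = render_phosphor(square, width=20, height=11)
```

Mapping the accumulated values to pixel brightness (possibly through a gamma or log curve) is left out; the point is only that brightness ends up inversely related to the local slope of the signal.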

(via an _amazing_ kragen post on ghetto electronics)
via:kragen  pcm  waveforms  oscilloscopes  analog  analogue  dataviz  time-series  waves  ui  phosphor  retro 
june 2013 by jm
