Practical machine learning tricks from the KDD 2011 best industry paper
july 2012 by jm
Wow, this is a fantastic paper. It's a Google paper on detecting scam/spam ads using machine learning -- but not just that: it also covers how to build such a classifier out to production scale and make it operationally resilient and, indeed, operable.
I've come across a few of these ideas before, and I'm happy to say I might have reinvented a few (particularly around the feature space), but all of them together make extremely good sense. If I wind up working on large-scale classification again, this is the first paper I'll go back to. Great info! (via Toby diPasquale.)
classification
via:codeslinger
training
machine-learning
google
ops
kdd
best-practices
anti-spam
classifiers
ensemble
map-reduce