jm + classification (10 bookmarks)
These stickers make AI hallucinate things that aren’t there - The Verge
self-driving
cars
ai
adversarial-classification
security
stickers
hacks
vision
surveillance
classification
The sticker “allows attackers to create a physical-world attack without prior knowledge of the lighting conditions, camera angle, type of classifier being attacked, or even the other items within the scene.” So, after such an image is generated, it could be “distributed across the Internet for other attackers to print out and use.”
This is why many AI researchers are worried about how these methods might be used to attack systems like self-driving cars. Imagine a little patch you can stick onto the side of the motorway that makes your sedan think it sees a stop sign, or a sticker that stops you from being identified by AI surveillance systems. “Even if humans are able to notice these patches, they may not understand the intent [and] instead view it as a form of art,” the researchers write.
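For context, a rough sketch of how such an image-agnostic patch is typically optimised, assuming a pretrained torchvision classifier, a hypothetical image `loader`, and an illustrative target class; this is a simplified illustration of the technique, not the authors' code:

```python
# Rough sketch of adversarial-patch optimisation (not the authors' code).
# Assumes a pretrained torchvision classifier and a hypothetical `loader`
# yielding 224x224 images in [0, 1]; normalisation and printability
# constraints are omitted for brevity.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

TARGET = 859                                   # assumed ImageNet index ("toaster")
patch = torch.rand(3, 50, 50, requires_grad=True)
opt = torch.optim.Adam([patch], lr=0.01)

def paste(images, patch):
    """Differentiably paste the patch at a random location in each image."""
    _, _, h, w = images.shape
    ph, pw = patch.shape[1:]
    out = []
    for img in images:
        y = torch.randint(0, h - ph + 1, (1,)).item()
        x = torch.randint(0, w - pw + 1, (1,)).item()
        pad = (x, w - x - pw, y, h - y - ph)   # left, right, top, bottom
        canvas = F.pad(patch.clamp(0, 1), pad)
        mask = F.pad(torch.ones_like(patch), pad)
        out.append(img * (1 - mask) + canvas)
    return torch.stack(out)

for images, _ in loader:                       # `loader` is an assumption
    logits = model(paste(images, patch))
    labels = torch.full((images.shape[0],), TARGET, dtype=torch.long)
    loss = F.cross_entropy(logits, labels)     # push every patched image to TARGET
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the patch is trained across many images and placements (and, in the real attack, lighting and camera transformations), the same printout keeps working regardless of scene or angle.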
january 2018 by jm
Fooling Neural Networks in the Physical World with 3D Adversarial Objects · labsix
This is amazingly weird stuff. Fooling NNs with adversarial objects:
ai
deep-learning
3d-printing
objects
security
hacking
rifles
models
turtles
adversarial-classification
classification
google
inceptionv3
images
image-classification
Here is a 3D-printed turtle that is classified at every viewpoint as a “rifle” by Google’s InceptionV3 image classifier, whereas the unperturbed turtle is consistently classified as “turtle”.
We do this using a new algorithm for reliably producing adversarial examples that cause targeted misclassification under transformations like blur, rotation, zoom, or translation, and we use it to generate both 2D printouts and 3D models that fool a standard neural network at any angle. Our process works for arbitrary 3D models - not just turtles! We also made a baseball that classifies as an espresso at every angle! The examples still fool the neural network when we put them in front of semantically relevant backgrounds; for example, you’d never see a rifle underwater, or an espresso in a baseball mitt.
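The core trick is Expectation Over Transformation: optimise the perturbation against the loss averaged over random transformations so it survives any single one of them. A minimal 2D sketch of that idea, with illustrative helper names and class indices rather than labsix's code:

```python
# Minimal Expectation-Over-Transformation sketch (not labsix's code): average
# the targeted loss over random rotations/zooms/translations/blur so that the
# perturbation still fools the classifier after any one of them.
import torch
import torch.nn.functional as F
import torchvision.models as models
import torchvision.transforms as T

model = models.inception_v3(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

random_transform = T.Compose([
    T.RandomAffine(degrees=30, translate=(0.1, 0.1), scale=(0.8, 1.2)),
    T.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
])

x = load_turtle_image()            # assumed helper: (1, 3, 299, 299) tensor in [0, 1]
target = torch.tensor([764])       # assumed ImageNet index for "rifle"
delta = torch.zeros_like(x, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.01)

for step in range(500):
    # Monte-Carlo estimate of the expected targeted loss over transformations.
    loss = sum(F.cross_entropy(model(random_transform((x + delta).clamp(0, 1))), target)
               for _ in range(8)) / 8
    opt.zero_grad()
    loss.backward()
    opt.step()
    delta.data.clamp_(-16 / 255, 16 / 255)     # illustrative perturbation budget
```

The 3D version samples over poses, lighting and rendering parameters instead of 2D affine transforms, but the objective is the same expectation.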
november 2017 by jm
"Use trees. Not too deep. Mostly ensembles."
snarky summary of 'Data-driven Advice for Applying Machine Learning to Bioinformatics Problems', a recent paper benchmarking ML algorithms
algorithms
machine-learning
bioinformatics
funny
advice
classification
september 2017 by jm
Universal adversarial perturbations
adversarial-classification
spam
image-recognition
ml
machine-learning
dnns
neural-networks
images
classification
perturbation
papers
In today’s paper, Moosavi-Dezfooli et al. show us how to create a _single_ perturbation that causes the vast majority of input images to be misclassified.
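A simplified sketch of the construction, assuming a pretrained torchvision classifier and a hypothetical `loader`; the paper's inner step uses DeepFool to find a minimal per-image perturbation, for which a plain gradient step stands in here:

```python
# Simplified sketch of building a universal perturbation (not the paper's code).
# Walk through images; whenever the current `v` fails to change an image's
# prediction, nudge `v` for that image, then project it back onto a small ball.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

eps = 10 / 255                            # illustrative L-inf budget for v
v = torch.zeros(1, 3, 224, 224)           # the single, image-agnostic perturbation

for epoch in range(5):
    for x, _ in loader:                   # `loader` is an assumed source of images in [0, 1]
        x = x[:1]                         # one image at a time, as in the paper
        clean_pred = model(x).argmax(dim=1)
        if model((x + v).clamp(0, 1)).argmax(dim=1) != clean_pred:
            continue                      # v already fools this image
        delta = v.clone().requires_grad_(True)
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), clean_pred)
        loss.backward()
        v = (v + (2 / 255) * delta.grad.sign()).clamp(-eps, eps)  # ascend, then project
```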
september 2017 by jm
When DNNs go wrong – adversarial examples and what we can learn from them
Excellent paper.
ai
deep-learning
dnns
neural-networks
adversarial-classification
classification
classifiers
machine-learning
papers
[The] results suggest that classifiers based on modern machine learning techniques, even those that obtain excellent performance on the test set, are not learning the true underlying concepts that determine the correct output label. Instead, these algorithms have built a Potemkin village that works well on naturally occurring data, but is exposed as a fake when one visits points in space that do not have high probability in the data distribution.
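The simplest way to visit such low-probability points is the fast gradient sign method from the same line of work; a minimal sketch, assuming a pretrained torchvision classifier:

```python
# Minimal FGSM sketch: one step in the direction that increases the loss
# produces an input a human still reads correctly but the model does not.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

def fgsm(x, label, eps=4 / 255):
    """x: image batch in [0, 1]; label: true class indices."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```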
february 2017 by jm
The NSA’s SKYNET program may be killing thousands of innocent people
Death by Random Forest: this project is a horrible misapplication of machine learning. Truly appalling, when a false positive means death:
terrorism
surveillance
nsa
security
ai
machine-learning
random-forests
horror
false-positives
classification
statistics
The NSA evaluates the SKYNET program using a subset of 100,000 randomly selected people (identified by the MSISDN/IMSI pairs of their mobile phones), and a known group of seven terrorists. The NSA then trained the learning algorithm by feeding it six of the terrorists and tasking SKYNET to find the seventh. This data provides the percentages for false positives in the slide above.
"First, there are very few 'known terrorists' to use to train and test the model," Ball said. "If they are using the same records to train the model as they are using to test the model, their assessment of the fit is completely bullshit. The usual practice is to hold some of the data out of the training process so that the test includes records the model has never seen before. Without this step, their classification fit assessment is ridiculously optimistic."
The reason is that the 100,000 citizens were selected at random, while the seven terrorists are from a known cluster. Under the random selection of a tiny subset of less than 0.1 percent of the total population, the density of the social graph of the citizens is massively reduced, while the "terrorist" cluster remains strongly interconnected. Scientifically-sound statistical analysis would have required the NSA to mix the terrorists into the population set before random selection of a subset—but this is not practical due to their tiny number.
This may sound like a mere academic problem, but, Ball said, it is in fact highly damaging to the quality of the results, and thus ultimately to the accuracy of the classification and assassination of people as "terrorists." A quality evaluation is especially important in this case, as the random forest method is known to overfit its training sets, producing results that are overly optimistic. The NSA's analysis thus does not provide a good indicator of the quality of the method.
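Ball's point about held-out data is easy to demonstrate; a toy sketch with purely synthetic data (nothing to do with the real SKYNET inputs), showing a random forest scoring near-perfectly on its own training set while being no better than chance on unseen records:

```python
# Toy illustration of the hold-out criticism: score a random forest on its own
# training data and it looks excellent even when the labels are pure noise.
# Synthetic data only -- unrelated to the actual SKYNET inputs.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))         # 10,000 "people", 20 uninformative features
y = rng.integers(0, 2, size=10_000)       # labels carry no real signal

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("accuracy on training data:", clf.score(X_train, y_train))  # ~1.0, "ridiculously optimistic"
print("accuracy on held-out data:", clf.score(X_test, y_test))    # ~0.5, i.e. chance
```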
february 2016 by jm
Exclusive: Snowden intelligence docs reveal UK spooks' malware checklist / Boing Boing
This is an excellent essay from Cory Doctorow on mass surveillance in the post-Snowden era, and the difference between HUMINT and SIGINT. So much good stuff, including this (new to me) cite for "Goodhart's law", on secrecy as it affects adversarial classification:
adversarial-classification
classification
surveillance
nsa
gchq
cory-doctorow
privacy
snooping
goodharts-law
google
anti-spam
filtering
spying
snowden
The problem with this is that once you accept this framing, and note the happy coincidence that your paymasters just happen to have found a way to spy on everyone, the conclusion is obvious: just mine all of the data, from everyone to everyone, and use an algorithm to figure out who’s guilty. The bad guys have a Modus Operandi, as anyone who’s watched a cop show knows. Find the MO, turn it into a data fingerprint, and you can just sort the firehose’s output into “terrorist-ish” and “unterrorist-ish.”
Once you accept this premise, then it’s equally obvious that the whole methodology has to be kept from scrutiny. If you’re depending on three “tells” as indicators of terrorist planning, the terrorists will figure out how to plan their attacks without doing those three things.
This even has a name: Goodhart's law. "When a measure becomes a target, it ceases to be a good measure." Google started out by gauging a web page’s importance by counting the number of links they could find to it. This worked well before they told people what they were doing. Once getting a page ranked by Google became important, unscrupulous people set up dummy sites (“link-farms”) with lots of links pointing at their pages.
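The link-counting measure and the way link farms break it fit in a few lines; a toy sketch:

```python
# Toy sketch of Goodhart's law in action: rank pages by inbound-link count,
# then watch a link farm inflate its target once the measure becomes a target.
from collections import Counter

def rank_by_inlinks(links):
    """links: iterable of (source_page, target_page) pairs."""
    return Counter(target for _, target in links).most_common()

organic = [("a.com", "news.com"), ("b.com", "news.com"), ("c.com", "shop.com")]
farm = [(f"farm{i}.example", "shop.com") for i in range(100)]   # dummy link-farm sites

print(rank_by_inlinks(organic))          # news.com leads on merit
print(rank_by_inlinks(organic + farm))   # shop.com now tops the measure
```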
february 2016 by jm
SAMOA, an open source platform for mining big data streams
Yahoo!'s streaming machine learning platform, built on Storm, implementing:
storm
streaming
big-data
realtime
samoa
yahoo
machine-learning
ml
decision-trees
clustering
bagging
classification
As a library, SAMOA contains state-of-the-art implementations of algorithms for distributed machine learning on streams. The first alpha release allows classification and clustering. For classification, we implemented a Vertical Hoeffding Tree (VHT), a distributed streaming version of decision trees tailored for sparse data (e.g., text). For clustering, we included a distributed algorithm based on CluStream. The library also includes meta-algorithms such as bagging.
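SAMOA itself is a Java/Storm library, but the statistical test at the heart of a Hoeffding tree like the VHT is simple enough to sketch on its own; this is the generic Hoeffding-bound split rule, not SAMOA's API:

```python
# The split decision behind Hoeffding trees: with n examples seen at a leaf,
# the true mean of a statistic with range R is within eps of the observed mean
# with probability 1 - delta, so split once the best attribute's gain beats the
# runner-up by more than eps. (Generic rule, not SAMOA's actual API.)
import math

def hoeffding_bound(value_range, delta, n):
    return math.sqrt((value_range ** 2) * math.log(1 / delta) / (2 * n))

def should_split(best_gain, second_best_gain, n, value_range=1.0, delta=1e-7):
    return (best_gain - second_best_gain) > hoeffding_bound(value_range, delta, n)

print(hoeffding_bound(1.0, 1e-7, 5000))   # ~0.040 after 5,000 streamed examples
print(should_split(0.42, 0.39, 5000))     # False: a 0.03 gain gap isn't enough yet
```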
november 2013 by jm
Practical machine learning tricks from the KDD 2011 best industry paper
Wow, this is a fantastic paper. It's a Google paper on detecting scam/spam ads using machine learning -- but not just that, it's about how to build out such a classifier to production scale, and make it operationally resilient, and, indeed, operable.
I've come across a few of these ideas before, and I'm happy to say I might have reinvented a few (particularly around the feature space), but all of them together make extremely good sense. If I wind up working on large-scale classification again, this is the first paper I'll go back to. Great info! (via Toby diPasquale.)
classification
via:codeslinger
training
machine-learning
google
ops
kdd
best-practices
anti-spam
classifiers
ensemble
map-reduce
july 2012 by jm
Official Google Research Blog: Lessons learned developing a practical large scale machine learning system
good info from Google's "Seti" project
google
machine-learning
classification
ai
from delicious
april 2010 by jm