jm + adversarial-classification   9

Targeted Audio Adversarial Examples
This is phenomenal:
We have constructed targeted audio adversarial examples on speech-to-text transcription neural networks: given an arbitrary waveform, we can make a small perturbation that when added to the original waveform causes it to transcribe as any phrase we choose.

In prior work, we constructed hidden voice commands, audio that sounded like noise but transcribed to any phrases chosen by an adversary. With our new attack, we are able to improve this and make an arbitrary waveform transcribe as any target phrase.

The audio examples on this page are impressive -- a little bit of background noise, such as you might hear on a telephone call with high compression, hard to perceive if you aren't listening out for it.

Paper here:

(Via Parker Higgins, )
papers  audio  adversarial-classification  neural-networks  speech-to-text  speech  recognition  voice  attacks  exploits  via:xor 
january 2018 by jm
These stickers make AI hallucinate things that aren’t there - The Verge
The sticker “allows attackers to create a physical-world attack without prior knowledge of the lighting conditions, camera angle, type of classifier being attacked, or even the other items within the scene.” So, after such an image is generated, it could be “distributed across the Internet for other attackers to print out and use.”

This is why many AI researchers are worried about how these methods might be used to attack systems like self-driving cars. Imagine a little patch you can stick onto the side of the motorway that makes your sedan think it sees a stop sign, or a sticker that stops you from being identified up by AI surveillance systems. “Even if humans are able to notice these patches, they may not understand the intent [and] instead view it as a form of art,” the researchers write.
self-driving  cars  ai  adversarial-classification  security  stickers  hacks  vision  surveillance  classification 
january 2018 by jm
Fooling Neural Networks in the Physical World with 3D Adversarial Objects · labsix
This is amazingly weird stuff. Fooling NNs with adversarial objects:
Here is a 3D-printed turtle that is classified at every viewpoint as a “rifle” by Google’s InceptionV3 image classifier, whereas the unperturbed turtle is consistently classified as “turtle”.

We do this using a new algorithm for reliably producing adversarial examples that cause targeted misclassification under transformations like blur, rotation, zoom, or translation, and we use it to generate both 2D printouts and 3D models that fool a standard neural network at any angle. Our process works for arbitrary 3D models - not just turtles! We also made a baseball that classifies as an espresso at every angle! The examples still fool the neural network when we put them in front of semantically relevant backgrounds; for example, you’d never see a rifle underwater, or an espresso in a baseball mitt.
ai  deep-learning  3d-printing  objects  security  hacking  rifles  models  turtles  adversarial-classification  classification  google  inceptionv3  images  image-classification 
november 2017 by jm
Universal adversarial perturbations
in today’s paper Moosavi-Dezfooli et al., show us how to create a _single_ perturbation that causes the vast majority of input images to be misclassified.
adversarial-classification  spam  image-recognition  ml  machine-learning  dnns  neural-networks  images  classification  perturbation  papers 
september 2017 by jm
When DNNs go wrong – adversarial examples and what we can learn from them
Excellent paper.
[The] results suggest that classifiers based on modern machine learning techniques, even those that obtain excellent performance on the test set, are not learning the true underlying concepts that determine the correct output label. Instead, these algorithms have built a Potemkin village that works well on naturally occuring data, but is exposed as a fake when one visits points in space that do not have high probability in the data distribution.
ai  deep-learning  dnns  neural-networks  adversarial-classification  classification  classifiers  machine-learning  papers 
february 2017 by jm
Here's Why Facebook's Trending Algorithm Keeps Promoting Fake News - BuzzFeed News
Kalina Bontcheva leads the EU-funded PHEME project working to compute the veracity of social media content. She said reducing the amount of human oversight for Trending heightens the likelihood of failures, and of the algorithm being fooled by people trying to game it.
“I think people are always going to try and outsmart these algorithms — we’ve seen this with search engine optimization,” she said. “I’m sure that once in a while there is going to be a very high-profile failure.”
Less human oversight means more reliance on the algorithm, which creates a new set of concerns, according to Kate Starbird, an assistant professor at the University of Washington who has been using machine learning and other technology to evaluate the accuracy of rumors and information during events such as the Boston bombings.
“[Facebook is] making an assumption that we’re more comfortable with a machine being biased than with a human being biased, because people don’t understand machines as well,” she said.
facebook  news  gaming  adversarial-classification  pheme  truth  social-media  algorithms  ml  machine-learning  media 
october 2016 by jm
Exclusive: Snowden intelligence docs reveal UK spooks' malware checklist / Boing Boing
This is an excellent essay from Cory Doctorow on mass surveillance in the post-Snowden era, and the difference between HUMINT and SIGINT. So much good stuff, including this (new to me) cite for, "Goodhart's law", on secrecy as it affects adversarial classification:
The problem with this is that once you accept this framing, and note the happy coincidence that your paymasters just happen to have found a way to spy on everyone, the conclusion is obvious: just mine all of the data, from everyone to everyone, and use an algorithm to figure out who’s guilty. The bad guys have a Modus Operandi, as anyone who’s watched a cop show knows. Find the MO, turn it into a data fingerprint, and you can just sort the firehose’s output into ”terrorist-ish” and ”unterrorist-ish.”

Once you accept this premise, then it’s equally obvious that the whole methodology has to be kept from scrutiny. If you’re depending on three ”tells” as indicators of terrorist planning, the terrorists will figure out how to plan their attacks without doing those three things.

This even has a name: Goodhart's law. "When a measure becomes a target, it ceases to be a good measure." Google started out by gauging a web page’s importance by counting the number of links they could find to it. This worked well before they told people what they were doing. Once getting a page ranked by Google became important, unscrupulous people set up dummy sites (“link-farms”) with lots of links pointing at their pages.
adversarial-classification  classification  surveillance  nsa  gchq  cory-doctorow  privacy  snooping  goodharts-law  google  anti-spam  filtering  spying  snowden 
february 2016 by jm
Robot wars break out on poker sites • The Register
'The world's two largest poker sites, PokerStars and Full Tilt Poker, are battling to keep poker bots off their sites.' The anti-abuse arms race in poker hits the news
poker  arms-race  adversarial-classification  bots  abuse  from delicious
november 2010 by jm
Exploring the Spam Arms Race to Characterize Spam Evolution
from last week's CEAS conference; research comparing SpamAssassin releases against the evolution of the surrounding spam environment. Nice work, I always wanted to write up something like this (via JD)
spam  anti-spam  ceas  conference  papers  research  spamassassin  adversarial-classification  evolution  arms-race  via:jd  from delicious
july 2010 by jm

Copy this bookmark: