jm + deep-learning   8

The Case for Learned Index Structures
'Indexes are models: a B-Tree-Index can be seen as a model to map a key to the position of a record within a sorted array, a Hash-Index as a model to map a key to a position of a record within an unsorted array, and a BitMap-Index as a model to indicate if a data record exists or not. In this exploratory research paper, we start from this premise and posit that all existing index structures can be replaced with other types of models, including deep-learning models, which we term learned indexes. The key idea is that a model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records. We theoretically analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures. Our initial results show, that by using neural nets we are able to outperform cache-optimized B-Trees by up to 70% in speed while saving an order-of-magnitude in memory over several real-world data sets. More importantly though, we believe that the idea of replacing core components of a data management system through learned models has far reaching implications for future systems designs and that this work just provides a glimpse of what might be possible.'

Excellent follow-up thread from Henry Robinson: https://threadreaderapp.com/thread/940344992723120128

'The fact that the learned representation is more compact is very neat. But also it's not really a surprise that, given the entire dataset, we can construct a more compact function than a B-tree which is *designed* to support efficient updates.' [...] 'given that the model performs best when trained on the whole data set - I strongly doubt B-trees are the best we can do with the current state-of-the art.'
data-structures  ml  google  b-trees  storage  indexes  deep-learning  henry-robinson 
6 days ago by jm
Fooling Neural Networks in the Physical World with 3D Adversarial Objects · labsix
This is amazingly weird stuff. Fooling NNs with adversarial objects:
Here is a 3D-printed turtle that is classified at every viewpoint as a “rifle” by Google’s InceptionV3 image classifier, whereas the unperturbed turtle is consistently classified as “turtle”.

We do this using a new algorithm for reliably producing adversarial examples that cause targeted misclassification under transformations like blur, rotation, zoom, or translation, and we use it to generate both 2D printouts and 3D models that fool a standard neural network at any angle. Our process works for arbitrary 3D models - not just turtles! We also made a baseball that classifies as an espresso at every angle! The examples still fool the neural network when we put them in front of semantically relevant backgrounds; for example, you’d never see a rifle underwater, or an espresso in a baseball mitt.
ai  deep-learning  3d-printing  objects  security  hacking  rifles  models  turtles  adversarial-classification  classification  google  inceptionv3  images  image-classification 
6 weeks ago by jm
Decoding the Enigma with Recurrent Neural Networks
I am blown away by this -- given that Recurrent Neural Networks are Turing-complete, they can actually automate cryptanalysis given sufficient resources, at least to the degree of simulating the internal workings of the Enigma algorithm given plaintext, ciphertext and key:
The model needed to be very large to capture all the Enigma’s transformations. I had success with a single-celled LSTM model with 3000 hidden units. Training involved about a million steps of batched gradient descent: after a few days on a k40 GPU, I was getting 96-97% accuracy!
machine-learning  deep-learning  rnns  enigma  crypto  cryptanalysis  turing  history  gpus  gradient-descent 
july 2017 by jm
A Neural Network Turned a Book of Flowers Into Shockingly Lovely Dinosaur Art
DeepArt.io, 'powered by an algorithm developed by Leon Gatys and a team from the University of Tübingen in Germany', did a really amazing job here
art  dinosaurs  ai  plants  deep-learning  graphics  cool 
june 2017 by jm
The Dark Secret at the Heart of AI - MIT Technology Review
'The mysterious mind of [NVidia's self-driving car, driven by machine learning] points to a looming issue with artificial intelligence. The car’s underlying AI technology, known as deep learning, has proved very powerful at solving problems in recent years, and it has been widely deployed for tasks like image captioning, voice recognition, and language translation. There is now hope that the same techniques will be able to diagnose deadly diseases, make million-dollar trading decisions, and do countless other things to transform whole industries.

But this won’t happen—or shouldn’t happen—unless we find ways of making techniques like deep learning more understandable to their creators and accountable to their users. Otherwise it will be hard to predict when failures might occur—and it’s inevitable they will. That’s one reason Nvidia’s car is still experimental.

Already, mathematical models are being used to help determine who makes parole, who’s approved for a loan, and who gets hired for a job. If you could get access to these mathematical models, it would be possible to understand their reasoning. But banks, the military, employers, and others are now turning their attention to more complex machine-learning approaches that could make automated decision-making altogether inscrutable. Deep learning, the most common of these approaches, represents a fundamentally different way to program computers. “It is a problem that is already relevant, and it’s going to be much more relevant in the future,” says Tommi Jaakkola, a professor at MIT who works on applications of machine learning. “Whether it’s an investment decision, a medical decision, or maybe a military decision, you don’t want to just rely on a ‘black box’ method.”'
ai  algorithms  ml  machine-learning  legibility  explainability  deep-learning  nvidia 
may 2017 by jm
When DNNs go wrong – adversarial examples and what we can learn from them
Excellent paper.
[The] results suggest that classifiers based on modern machine learning techniques, even those that obtain excellent performance on the test set, are not learning the true underlying concepts that determine the correct output label. Instead, these algorithms have built a Potemkin village that works well on naturally occuring data, but is exposed as a fake when one visits points in space that do not have high probability in the data distribution.
ai  deep-learning  dnns  neural-networks  adversarial-classification  classification  classifiers  machine-learning  papers 
february 2017 by jm
How a Japanese cucumber farmer is using deep learning and TensorFlow
Unfortunately the usual ML problem arises at the end:
One of the current challenges with deep learning is that you need to have a large number of training datasets. To train the model, Makoto spent about three months taking 7,000 pictures of cucumbers sorted by his mother, but it’s probably not enough. "When I did a validation with the test images, the recognition accuracy exceeded 95%. But if you apply the system with real use cases, the accuracy drops down to about 70%. I suspect the neural network model has the issue of "overfitting" (the phenomenon in neural network where the model is trained to fit only to the small training dataset) because of the insufficient number of training images."


In other words, as with ML since we were using it in SpamAssassin, maintaining the training corpus becomes a really big problem. :(
google  machine-learning  tensorflow  cucumbers  deep-learning  ml 
september 2016 by jm
Fast Forward Labs: Fashion Goes Deep: Data Science at Lyst
this is more than just data science really -- this is proper machine learning, with deep learning and a convolutional neural network. serious business
lyst  machine-learning  data-science  ml  neural-networks  supervised-learning  unsupervised-learning  deep-learning 
december 2015 by jm

Copy this bookmark:



description:


tags: