csantos + neuralnetworks   85

[1712.01208] The Case for Learned Index Structures
Updated version (?), I thought I had bookmarked it already.
machine-learning  indexing  NeuralNetworks 
july 2018 by csantos
Back Propagation Math Example
The example in this notebook demonstrates how to implement backpropagation for a multi-layer convolutional neural network. We'll keep things as minimal as possible, using just one sample and a trivial network with just one hidden layer.

We'll also be using automatic differentiation, with ganja.js to help us with the matrix math and multivariate dual numbers. We won't be using an AI library like tensorflow.js, as the point is to show the inner workings, not hide them.

It'll only take about 15 lines of code to tackle this problem, so fear not!
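As a rough illustration of the dual-number idea, here is a minimal sketch in Python rather than the notebook's ganja.js, with made-up weights and a single tanh hidden unit: forward-mode automatic differentiation simply carries a derivative alongside every value.

```python
# Minimal sketch (Python, not the notebook's ganja.js code) of forward-mode
# automatic differentiation with dual numbers for a tiny one-hidden-layer
# network and a single training sample. Weights and inputs are made up.
import math

class Dual:
    """A dual number a + b*eps with eps**2 == 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a + o.a, self.b + o.b)
    def __sub__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a - o.a, self.b - o.b)
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)

def tanh(x):
    t = math.tanh(x.a)
    return Dual(t, (1 - t * t) * x.b)   # chain rule for tanh

def loss(w1, w2, x=0.5, y=1.0):
    h = tanh(w1 * x)          # hidden layer with one unit
    out = w2 * h              # linear output layer
    e = out - y
    return e * e              # squared error

# Seed the dual part of the weight we want the derivative for, keep the other constant.
print(loss(Dual(0.3, 1.0), Dual(-0.2)).b)   # dL/dw1
print(loss(Dual(0.3), Dual(-0.2, 1.0)).b)   # dL/dw2
```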
javascript  AutomaticDifferentiation  backpropagation  NeuralNetworks 
july 2018 by csantos
[1806.10758] Evaluating Feature Importance Estimates
Estimating the influence of a given feature on a model prediction is challenging. We introduce ROAR, RemOve And Retrain, a benchmark to evaluate the accuracy of interpretability methods that estimate input feature importance in deep neural networks. We remove a fraction of the input features deemed most important according to each estimator and measure the change to model accuracy upon retraining. The most accurate estimator is the one that identifies as important those inputs whose removal causes the most damage to model performance, relative to all other estimators. This evaluation produces thought-provoking results: we find that several estimators are less accurate than a random assignment of feature importance. However, averaging a set of squared noisy estimators (a variant of a technique proposed by Smilkov et al. (2017)) leads to significant gains in accuracy for each method considered and far outperforms such a random guess.
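A hedged sketch of the ROAR loop as described in the abstract; `train_model`, `accuracy` and `estimate_importance` are placeholder callables, and replacing removed features by their per-feature mean is my assumption, not necessarily the paper's exact removal scheme.

```python
# Rough sketch of the ROAR procedure described above; the three callables are
# placeholders, and mean-replacement is an assumed "removal" scheme.
import numpy as np

def roar_scores(X_train, y_train, X_test, y_test,
                estimate_importance, train_model, accuracy,
                fractions=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """For each fraction, remove the features ranked most important by the
    estimator, retrain from scratch, and record the resulting test accuracy.
    A better estimator produces a steeper accuracy drop."""
    base_model = train_model(X_train, y_train)
    importance = estimate_importance(base_model, X_train)   # one score per input feature
    order = np.argsort(-importance)                         # most important first
    fill = X_train.mean(axis=0)                             # "removal" = per-feature mean replacement
    scores = {}
    for frac in fractions:
        k = int(frac * X_train.shape[1])
        drop = order[:k]
        Xtr, Xte = X_train.copy(), X_test.copy()
        Xtr[:, drop] = fill[drop]
        Xte[:, drop] = fill[drop]
        retrained = train_model(Xtr, y_train)                # retrain, don't just re-evaluate
        scores[frac] = accuracy(retrained, Xte, y_test)
    return scores
```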
MachineLearning  interpretability  NeuralNetworks 
june 2018 by csantos
RLN keras tutorial
This is a quick tutorial on using the Keras RLN implementation.
First, let's import the dependencies and create the train and test sets. In this tutorial we're using the Boston housing price regression dataset, with additional noise features.
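A minimal sketch of that data-preparation step, assuming the standard Keras Boston housing loader; the number of noise features (50 here) is an arbitrary placeholder and the RLN model itself is omitted.

```python
# Sketch of the data-preparation step described above: Boston housing data
# plus extra pure-noise features. The noise-feature count is illustrative,
# and the RLN layer/model is not shown.
import numpy as np
from tensorflow.keras.datasets import boston_housing

(x_train, y_train), (x_test, y_test) = boston_housing.load_data()

n_noise = 50  # arbitrary choice for this sketch
rng = np.random.RandomState(0)
x_train = np.hstack([x_train, rng.randn(len(x_train), n_noise)])
x_test = np.hstack([x_test, rng.randn(len(x_test), n_noise)])

print(x_train.shape)  # (404, 13 + n_noise)
```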
Statistics  MachineLearning  NeuralNetworks  DeepLearning  keras  regularization 
june 2018 by csantos
[1603.07285] A guide to convolution arithmetic for deep learning
DL people use "convolution" for many different operations, so yes, you'll probably need a guide; your linear systems course might not suffice.
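For reference, the basic relationship the guide works through, as a quick sanity-check function (the standard convolution output-size formula, not code from the guide):

```python
# Output size of a convolution with input size i, kernel k, padding p, stride s.
def conv_out_size(i, k, p=0, s=1):
    return (i + 2 * p - k) // s + 1

print(conv_out_size(28, 5))            # 24: "valid" 5x5 conv on a 28x28 input
print(conv_out_size(28, 3, p=1))       # 28: "same" 3x3 conv
print(conv_out_size(28, 3, p=1, s=2))  # 14: strided conv halves the spatial size
```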
convnet  convolution  machinelearning  NeuralNetworks 
march 2018 by csantos
Deep Learning, Structure and Innate Priors | Abigail See
Video. Yann LeCun and Christopher Manning discuss the role of priors/structure in machine learning.
yannlecun  ChrisManning  watchlist  papers  prior  statistics  NeuralNetworks  MachineLearning 
february 2018 by csantos
[1711.11561] Measuring the tendency of CNNs to Learn Surface Statistical Regularities
Our main finding is that CNNs exhibit a tendency to latch onto the Fourier image statistics of the training dataset, sometimes exhibiting up to a 28% generalization gap across the various test sets. Moreover, we observe that significantly increasing the depth of a network has a very marginal impact on closing the aforementioned generalization gap. Thus we provide quantitative evidence supporting the hypothesis that deep CNNs tend to learn surface statistical regularities in the dataset rather than higher-level abstract concepts.
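For intuition, a rough sketch of the kind of radial Fourier filtering such test sets can be built from (my construction, not the paper's code; the cutoff value is arbitrary):

```python
# Keep only low radial frequencies of an image: mask the shifted 2D DFT with a
# radial low-pass filter, then transform back. Cutoff is an arbitrary fraction
# of the maximum radial frequency.
import numpy as np

def radial_low_pass(img, cutoff=0.25):
    """img: 2D float array; keep radial frequencies below `cutoff` (0..1)."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy = (np.arange(h) - h / 2) / (h / 2)
    xx = (np.arange(w) - w / 2) / (w / 2)
    radius = np.sqrt(yy[:, None] ** 2 + xx[None, :] ** 2)
    f[radius > cutoff] = 0.0
    return np.real(np.fft.ifft2(np.fft.ifftshift(f)))

smooth = radial_low_pass(np.random.rand(32, 32), cutoff=0.2)  # low-frequency version of a noise image
```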
machinelearning  deeplearning  deep-learning  machine-learning  by:YoshuaBengio  NeuralNetworks  generalization  statistics 
january 2018 by csantos
[1711.00489v1] Don't Decay the Learning Rate, Increase the Batch Size
"It is common practice to decay the learning rate. Here we show one can usually obtain the same learning curve on both training and test sets by instead increasing the batch size during training. This procedure is successful for stochastic gradient descent (SGD), SGD with momentum, Nesterov momentum, and Adam. It reaches equivalent test accuracies after the same number of training epochs, but with fewer parameter updates, leading to greater parallelism and shorter training times. We can further reduce the number of parameter updates by increasing the learning rate ϵ and scaling the batch size B∝ϵ. Finally, one can increase the momentum coefficient m and scale B∝1/(1−m), although this tends to slightly reduce the test accuracy. Crucially, our techniques allow us to repurpose existing training schedules for large batch training with no hyper-parameter tuning. We train Inception-ResNet-V2 on ImageNet to 77% validation accuracy in under 2500 parameter updates, efficiently utilizing training batches of 65536 images."
papers  neuralnetworks  optimization  sgd  batch-size  via:abiola  via:arsyed 
december 2017 by csantos
Fader Networks: Manipulating Images by Sliding Attributes
This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space. As a result, after training, our model can generate different realistic versions of an input image by varying the attribute values. By using continuous attribute values, we can choose how much a specific attribute is perceivable in the generated image. This property could allow for applications where users can modify an image using sliding knobs, like faders on a mixing console, to change the facial expression of a portrait, or to update the color of some objects. Compared to the state-of-the-art which mostly relies on training adversarial networks in pixel space by altering attribute values at train time, our approach results in much simpler training schemes and nicely scales to multiple attributes. We present evidence that our model can significantly change the perceived value of the attributes while preserving the naturalness of images.
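A very rough sketch of the conditioning idea (my toy PyTorch version, not the authors' architecture, and omitting the adversarial component that strips attribute information from the latent code): the decoder reconstructs from the latent code concatenated with the attribute values, so at test time the attributes can be slid continuously.

```python
# Toy illustration of decoder conditioning on attributes; the real model uses
# convolutional encoder/decoder plus a latent discriminator, none of which is
# reproduced here.
import torch
import torch.nn as nn

class FaderLikeAutoencoder(nn.Module):
    def __init__(self, img_dim=64 * 64 * 3, latent_dim=128, n_attrs=1):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(img_dim, latent_dim), nn.ReLU())
        # decoder sees the latent code concatenated with the attribute values
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + n_attrs, img_dim), nn.Sigmoid())

    def forward(self, x, attrs):
        z = self.encoder(x.flatten(1))
        return self.decoder(torch.cat([z, attrs], dim=1))

model = FaderLikeAutoencoder()
x = torch.rand(4, 3, 64, 64)
smile = torch.tensor([[0.0], [0.3], [0.7], [1.0]])  # slide the attribute value
out = model(x, smile)                                # four reconstructions, varying attribute
```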
DeepLearning  NeuralNetworks  disentangling 
december 2017 by csantos
[1710.11381] Semantic Interpolation in Implicit Models
In implicit models, one often interpolates between sampled points in latent space. As we show in this paper, care needs to be taken to match-up the distributional assumptions on code vectors with the geometry of the interpolating paths. Otherwise, typical assumptions about the quality and semantics of in-between points may not be justified. Based on our analysis we propose to modify the prior code distribution to put significantly more probability mass closer to the origin. As a result, linear interpolation paths are not only shortest paths, but they are also guaranteed to pass through high-density regions, irrespective of the dimensionality of the latent space. Experiments on standard benchmark image datasets demonstrate clear visual improvements in the quality of the generated samples and exhibit more meaningful interpolation paths.
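A quick numeric illustration of the underlying problem (my example, not from the paper): under a standard Gaussian prior in d dimensions, samples concentrate at norm ≈ √d, while the midpoint of a straight line between two samples sits at norm ≈ √(d/2), off the typical shell.

```python
# Why linear interpolation misbehaves under a standard Gaussian prior:
# endpoints have norm ~ sqrt(d), but the midpoint's norm is ~ sqrt(d/2).
import numpy as np

d = 512
rng = np.random.RandomState(0)
z0, z1 = rng.randn(d), rng.randn(d)
mid = 0.5 * (z0 + z1)

print(np.linalg.norm(z0), np.linalg.norm(z1))  # both close to sqrt(512) ~ 22.6
print(np.linalg.norm(mid))                     # close to sqrt(256) ~ 16
print(np.sqrt(d), np.sqrt(d / 2))
```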
interpolation  latentspace  implicitmodels  gan  imagegeneration  NeuralNetworks  DeepLearning  MachineLearning 
november 2017 by csantos
The Neural Network Zoo - The Asimov Institute
Excellent overview of the various types of neural networks (via @mynameisfiber)
NeuralNetworks  DeepLearning  Visualization 
september 2016 by csantos
Why are Eight Bits Enough for Deep Neural Networks? « Pete Warden's blog
"when we are trying to teach a network, the aim is to have it understand the patterns that are useful evidence and discard the meaningless variations and irrelevant details. that means we expect the network to be able to produce good results despite a lot of noise. dropout is a good example of synthetic grit being thrown into the machinery, so that the final network can function even with very adverse data."
deepLearning  neuralnetworks  machinelearning  robustness  reduced-precision  via:chl 
may 2015 by csantos