[1902.06720] Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
"A longstanding goal in deep learning research has been to precisely characterize training and generalization. However, the often complex loss landscapes of neural networks have made a theory of learning dynamics elusive. In this work, we show that for wide neural networks the learning dynamics simplify considerably and that, in the infinite width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters. Furthermore, mirroring the correspondence between wide Bayesian neural networks and Gaussian processes, gradient-based training of wide neural networks with a squared loss produces test set predictions drawn from a Gaussian process with a particular compositional kernel. While these theoretical results are only exact in the infinite width limit, we nevertheless find excellent empirical agreement between the predictions of the original network and those of the linearized version even for finite practically-sized networks. This agreement is robust across different architectures, optimization methods, and loss functions."
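A rough sketch of what "linear model obtained from the first-order Taylor expansion of the network around its initial parameters" means, in NumPy. Everything concrete here is an assumption made for illustration and is not taken from the paper's code: the one-hidden-layer tanh network, the 1/sqrt(width) output scaling, the input vector, and the helper names `init_params`, `forward`, `grad`, `linearized` and `gap`. The linearized model is f_lin(x; θ) = f(x; θ0) + ⟨∇f(x; θ0), θ − θ0⟩, and the script measures how far it drifts from the real network as θ moves away from θ0.

```python
import numpy as np

def init_params(rng, width, dim):
    # Standard-normal weights; the 1/sqrt(width) scaling lives in forward().
    return {"W": rng.standard_normal((width, dim)),
            "a": rng.standard_normal(width)}

def forward(params, x):
    # Toy one-hidden-layer network: f(x) = a . tanh(W x) / sqrt(width)
    h = np.tanh(params["W"] @ x)
    return params["a"] @ h / np.sqrt(h.size)

def grad(params, x):
    # Analytic gradient of forward() with respect to each parameter block.
    h = np.tanh(params["W"] @ x)
    scale = 1.0 / np.sqrt(h.size)
    return {
        "W": scale * np.outer(params["a"] * (1.0 - h**2), x),
        "a": scale * h,
    }

def linearized(params0, params, x):
    # First-order Taylor expansion around params0:
    # f_lin(x; theta) = f(x; theta0) + <grad f(x; theta0), theta - theta0>
    g = grad(params0, x)
    delta = sum(np.sum(g[k] * (params[k] - params0[k])) for k in params0)
    return forward(params0, x) + delta

rng = np.random.default_rng(0)
theta0 = init_params(rng, width=512, dim=3)      # "initial parameters"
direction = init_params(rng, width=512, dim=3)   # arbitrary perturbation direction
x = np.array([0.5, -1.0, 2.0])                   # hypothetical input

def gap(step):
    # |network output - linearized output| after moving `step` along `direction`.
    theta = {k: theta0[k] + step * direction[k] for k in theta0}
    return abs(forward(theta, x) - linearized(theta0, theta, x))

for step in (1e-1, 1e-2, 1e-3):
    print(f"step={step:g}  gap={gap(step):.3e}")
```

The gap should shrink much faster than the step size, which is the usual signature of a first-order approximation. The paper's claim is stronger than this local check: for wide enough networks the parameters stay close enough to θ0 during training that the linearized model tracks the network throughout.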
Do ImageNet Classifiers Generalize to ImageNet?
"We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been
the focus of intense research for almost a decade, raising the danger of overfitting to excessively
re-used test sets. By closely following the original dataset creation processes, we test to what
extent current classification models generalize to new data. We evaluate a broad range of models
and find accuracy drops of 3% – 15% on CIFAR-10 and 11% – 14% on ImageNet. However,
accuracy gains on the original test sets translate to larger gains on the new test sets. Our results
suggest that the accuracy drops are not caused by adaptivity, but by the models’ inability to
generalize to slightly “harder” images than those found in the original test sets."

--- The astonishing thing to me is the _linear_ relationship between accuracy on the old and new data-set versions. It's uncannily good. (Also: tiny changes in data-preparation make a big difference!)
NLP Learning Series: Text Preprocessing Methods for Deep Learning
Recently, I started working on an NLP competition on Kaggle called the Quora Question Insincerity Challenge. It is an NLP challenge on text classification, and as the problem has become clearer after working through the competition as well as by going through the invaluable kernels put up by the Kaggle ...
Size-Independent Sample Complexity of Neural Networks
"We study the sample complexity of learning neural networks, by providing new bounds on their Rademacher complexity assuming norm constraints on the parameter matrix of each layer. Compared to previous work, these complexity bounds have improved dependence on the network depth, and under some additional assumptions, are fully independent of the network size (both depth and width). These results are derived using some novel techniques, which may be of independent interest."
This repository is a collection of tutorials for MIT Deep Learning courses. More added as courses progress.
GitHub - BrainJS/brain.js: 🤖 Neural networks in JavaScript

brain.js is a library of Neural Networks written in JavaScript.

NEW! A fun and practical introduction to Brain.js

💡 Note: This is a continuation of the harthur/brain repository (which is not maintained anymore). For more details, check out this issue.
A Style-Based Generator Architecture for Generative Adversarial Networks - YouTube
Something downright monstrous: creating and modifying images based on other images using generative adversarial networks.
About the algorithm:
Neural networks - YouTube
A mini video course on neural networks
torch-rnn options & settings
Efficient, reusable RNNs and LSTMs for torch.
