dnn   931


Mute Background Noise | Noise Cancelling Software | krisp
Take calls from wherever you want without being embarrassed by background noise. Get krisp for Mac and use it with any conferencing app!
sound  background  suppression  noise  dnn 
28 days ago by gnuf
[1811.03804] Gradient Descent Finds Global Minima of Deep Neural Networks
Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex. The current paper proves gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). Our analysis relies on the particular structure of the Gram matrix induced by the neural network architecture. This structure allows us to show the Gram matrix is stable throughout the training process and this stability implies the global optimality of the gradient descent algorithm. Our bounds also shed light on the advantage of using ResNet over the fully connected feedforward architecture; our bound requires the number of neurons per layer scaling exponentially with depth for feedforward networks whereas for ResNet the bound only requires the number of neurons per layer scaling polynomially with depth. We further extend our analysis to deep residual convolutional neural networks and obtain a similar convergence result.
dnn  neural-net  analysis  gradient-descent  optimization 
4 weeks ago by arsyed
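The claim above is easy to poke at empirically. Below is a minimal sketch, not the paper's construction or proof: plain full-batch gradient descent on a small, heavily over-parameterized residual MLP fit to a tiny random regression set. The width, depth, learning rate, and step count are illustrative assumptions, not values from the paper.

```python
# Minimal sketch (not the paper's construction): full-batch gradient descent on
# a small, heavily over-parameterized residual MLP usually drives the training
# loss toward zero on a tiny random dataset. All hyperparameters here are
# illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, width, depth = 32, 10, 512, 4          # samples, input dim, layer width, depth
X, y = torch.randn(n, d), torch.randn(n, 1)  # tiny random regression problem

class ResBlock(nn.Module):
    def __init__(self, w):
        super().__init__()
        self.fc = nn.Linear(w, w)
    def forward(self, h):
        return h + torch.relu(self.fc(h))    # residual connection

model = nn.Sequential(nn.Linear(d, width),
                      *[ResBlock(width) for _ in range(depth)],
                      nn.Linear(width, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)  # vanilla gradient descent, full batch

for step in range(2000):
    loss = ((model(X) - y) ** 2).mean()      # squared training loss
    opt.zero_grad(); loss.backward(); opt.step()
print(f"final training loss: {loss.item():.2e}")
```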
vdumoulin/conv_arithmetic: A technical report on convolution arithmetic in the context of deep learning
A technical report on convolution arithmetic in the context of deep learning - vdumoulin/conv_arithmetic
visual  visualisation  neural  network  cnn  dnn  convolution  ai  learning 
7 weeks ago by severin.smith
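The basic arithmetic the report covers reduces to a few closed-form relations. As a hedged sketch of the no-dilation case: for input size i, kernel size k, padding p, and stride s, the output size is floor((i + 2p - k) / s) + 1. The concrete numbers and the PyTorch check below are just illustrative.

```python
# Sketch of the basic convolution-arithmetic relation (no dilation):
#   o = floor((i + 2p - k) / s) + 1
# checked against torch.nn.Conv2d; the example numbers are arbitrary.
import torch
import torch.nn as nn

def conv_output_size(i, k, p=0, s=1):
    return (i + 2 * p - k) // s + 1

i, k, p, s = 32, 3, 1, 2
expected = conv_output_size(i, k, p, s)      # -> 16
out = nn.Conv2d(1, 1, kernel_size=k, padding=p, stride=s)(torch.zeros(1, 1, i, i))
assert out.shape[-1] == expected, (out.shape, expected)
print(f"output size: {expected}")
```

The transposed-convolution and dilated cases in the report follow the same pattern with adjusted effective kernel and input sizes.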
High Performance Computing for Big Data: Methodologies and Applications - Google Livros
High-Performance Computing for Big Data: Methodologies and Applications explores emerging high-performance architectures for data-intensive applications, novel efficient analytical strategies to boost data processing, and cutting-edge applications in diverse fields, such as machine learning, life science, neural networks, and neuromorphic engineering. The book is organized into two main sections. The first section covers Big Data architectures, including cloud computing systems, a...
dnn  energy 
11 weeks ago by carlosviansi
Twitter
Hopefully, trivialization is not part of the journalistic craft for them either.…
DNN  from twitter_favs
august 2018 by reinhard_codes
[1807.03819] Universal Transformers
Self-attentive feed-forward sequence models have been shown to achieve impressive results on sequence modeling tasks, thereby presenting a compelling alternative to recurrent neural networks (RNNs) which has remained the de-facto standard architecture for many sequence modeling problems to date. Despite these successes, however, feed-forward sequence models like the Transformer fail to generalize in many tasks that recurrent models handle with ease (e.g. copying when the string lengths exceed those observed at training time). Moreover, and in contrast to RNNs, the Transformer model is not computationally universal, limiting its theoretical expressivity. In this paper we propose the Universal Transformer which addresses these practical and theoretical shortcomings and we show that it leads to improved performance on several tasks. Instead of recurring over the individual symbols of sequences like RNNs, the Universal Transformer repeatedly revises its representations of all symbols in the sequence with each recurrent step. In order to combine information from different parts of a sequence, it employs a self-attention mechanism in every recurrent step. Assuming sufficient memory, its recurrence makes the Universal Transformer computationally universal. We further employ an adaptive computation time (ACT) mechanism to allow the model to dynamically adjust the number of times the representation of each position in a sequence is revised. Beyond saving computation, we show that ACT can improve the accuracy of the model. Our experiments show that on various algorithmic tasks and a diverse set of large-scale language understanding tasks the Universal Transformer generalizes significantly better and outperforms both a vanilla Transformer and an LSTM in machine translation, and achieves a new state of the art on the bAbI linguistic reasoning task and the challenging LAMBADA language modeling task.
DNN 
july 2018 by foodbaby
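As a rough sketch of the recurrence the abstract describes (with ACT and the per-step timestep/position signals omitted): a single shared self-attention plus transition block is applied repeatedly to the representations of all positions, rather than stacking distinct layers. The dimensions and number of steps below are arbitrary assumptions, not values from the paper.

```python
# Minimal sketch of the Universal Transformer recurrence (ACT omitted):
# the same attention + transition weights are reused at every recurrent step,
# revising the representations of all positions in parallel.
import torch
import torch.nn as nn

d_model, n_heads, n_steps = 64, 4, 6
attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
transition = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                           nn.Linear(4 * d_model, d_model))
norm1, norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)

h = torch.randn(2, 10, d_model)              # (batch, sequence length, d_model)
for _ in range(n_steps):                     # same weights reused each step
    a, _ = attn(h, h, h)                     # self-attention over the sequence
    h = norm1(h + a)
    h = norm2(h + transition(h))             # position-wise transition function
print(h.shape)                               # torch.Size([2, 10, 64])
```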


related tags

adversarial  ai  al  algorithm  analysis  analytics  apple  artificialintelligence  attack  attention  audio  autoland  background  bayesian  benchmark  big_data  blogs  book  brainwave  capsenet  chart  cheatsheet  cloud  cnn  code  coding  control  convnet  convolution  course  data  data_science  deep-learning  deep  deep_learning  deepclustering  deeplearning  demo  dl  docker  dpu  drone  dssm  dynamics  energy  entrepreneurship  evolution  expressivity  facebook  flight  flow  fpga  gallery-module  google  gradient-descent  graphics  grimmeathook  hack  hardware  health  history  ifttt  infosec  interpretation  ir  keras  ktfsc  language-models  learning  learningrate  linear  literature  lstm  ltr  machine-learning  machine  machine_learning  machinelearning  materials  math  microsoft  ml  mobile  model  network  networking  networks  neural-net  neural  neural_networks  neuralnetworks  nips2017  nlp  nn  noise  numerics  optimisation  optimization  overview  papers  perf  practice  probabilisitc  program  programming  project  python  pytorch  research  resources  review  rnn  rnns  robustness  scikit  security  sentence-embeddings  siri  slides  softcore  software-2.0  software  sound  sourcecode  statistics  suppression  survey  teaching  tensor  tensorflow  testing  text-classification  tflearn  themes  theory  tips  toolkit  tpu  tutorial  uav  unread  visual  visualisation  voice  work  writing 
