dnn   857


Hopefully, trivialization is not part of the journalistic craft for them either.…
DNN  from twitter_favs
6 weeks ago by reinhard_codes
[1807.03819] Universal Transformers
Self-attentive feed-forward sequence models have been shown to achieve impressive results on sequence modeling tasks, thereby presenting a compelling alternative to recurrent neural networks (RNNs), which have remained the de-facto standard architecture for many sequence modeling problems to date. Despite these successes, however, feed-forward sequence models like the Transformer fail to generalize in many tasks that recurrent models handle with ease (e.g. copying when the string lengths exceed those observed at training time). Moreover, and in contrast to RNNs, the Transformer model is not computationally universal, limiting its theoretical expressivity. In this paper we propose the Universal Transformer, which addresses these practical and theoretical shortcomings, and we show that it leads to improved performance on several tasks. Instead of recurring over the individual symbols of sequences like RNNs, the Universal Transformer repeatedly revises its representations of all symbols in the sequence with each recurrent step. In order to combine information from different parts of a sequence, it employs a self-attention mechanism in every recurrent step. Assuming sufficient memory, its recurrence makes the Universal Transformer computationally universal. We further employ an adaptive computation time (ACT) mechanism to allow the model to dynamically adjust the number of times the representation of each position in a sequence is revised. Beyond saving computation, we show that ACT can improve the accuracy of the model. Our experiments show that on various algorithmic tasks and a diverse set of large-scale language understanding tasks the Universal Transformer generalizes significantly better and outperforms both a vanilla Transformer and an LSTM in machine translation, and achieves a new state of the art on the bAbI linguistic reasoning task and the challenging LAMBADA language modeling task.
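The core idea in the abstract — recurrence over revision steps rather than over sequence positions — can be sketched in a few lines. This is a toy illustration, not the paper's implementation: it uses a single attention head with no learned projections, a ReLU transition in place of the paper's transition function, and omits the ACT halting mechanism; the function names and the weight matrix `W` are illustrative.

```python
import numpy as np

def self_attention(H):
    # single-head dot-product self-attention over all positions
    # (a deliberate simplification of the paper's multi-head attention)
    scores = H @ H.T / np.sqrt(H.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ H

def universal_transformer_step(H, W):
    # one recurrent step: every position attends to all positions,
    # then a shared position-wise transition function is applied
    A = self_attention(H)
    return np.maximum(A @ W, 0.0)  # ReLU transition, illustrative

def universal_transformer(H, W, steps=3):
    # unlike an RNN, recurrence runs over revision steps, not positions:
    # all symbol representations are revised in parallel at each step
    for _ in range(steps):
        H = universal_transformer_step(H, W)
    return H
```

With ACT, the number of `steps` would instead be chosen per position at run time, which is what the abstract means by dynamically adjusting how often each representation is revised.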
9 weeks ago by foodbaby
[1702.06106] An Attention-Based Deep Net for Learning to Rank
In information retrieval, learning to rank constructs a machine-based ranking model which given a query, sorts the search results by their degree of relevance or importance to the query. Neural networks have been successfully applied to this problem, and in this paper, we propose an attention-based deep neural network which better incorporates different embeddings of the queries and search results with an attention-based mechanism. This model also applies a decoder mechanism to learn the ranks of the search results in a listwise fashion. The embeddings are trained with convolutional neural networks or the word2vec model. We demonstrate the performance of this model with image retrieval and text querying data sets.
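The listwise scoring idea can be sketched as follows. This is a toy stand-in for the paper's model: plain dot-product scoring replaces the learned attention over CNN/word2vec embeddings, and a softmax over the candidate list replaces the decoder; all names are illustrative.

```python
import math

def attention_rank(query, docs):
    # score each result embedding against the query embedding,
    # then softmax the scores into a listwise relevance distribution
    scores = [sum(q * d for q, d in zip(query, doc)) for doc in docs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # return result indices sorted by descending relevance
    order = sorted(range(len(docs)), key=lambda i: -probs[i])
    return order, probs

# toy query and three candidate embeddings
order, probs = attention_rank([1.0, 0.0], [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]])
# order == [1, 2, 0]: the second result matches the query best
```

The listwise aspect is that the softmax normalizes over the whole result list at once, so each result's score depends on its competitors rather than being judged in isolation.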
june 2018 by foodbaby
DeepTest: automated testing of deep-neural-network-driven autonomous cars | the morning paper
In this paper, we design, implement and evaluate DeepTest, a systematic testing tool for automatically detecting erroneous behaviors of DNN-driven vehicles that can potentially lead to fatal crashes. First, our tool is designed to automatically generate test cases leveraging real-world changes in driving conditions like rain, fog, lighting conditions, etc. DeepTest systematically explores different parts of the DNN logic by generating test inputs that maximize the number of activated neurons. DeepTest found thousands of erroneous behaviors under different realistic driving conditions (e.g., blurring, rain, fog, etc.) many of which lead to potentially fatal crashes in three top performing DNNs in the Udacity self-driving car challenge.
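The coverage-guided idea behind DeepTest can be sketched in miniature. This is an illustrative simplification, not the tool itself: real neuron coverage is computed over the layers of an actual DNN, and the "transformations" here are just named activation vectors standing in for rain/fog/blur-transformed images.

```python
def neuron_coverage(activations, threshold=0.0):
    # fraction of neurons whose activation exceeds the threshold for one input;
    # DeepTest-style generation seeks inputs that maximize this metric
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)

def select_transformations(candidates, threshold=0.0):
    # greedily keep synthetic test inputs that activate neurons
    # not yet covered by previously kept inputs
    covered = set()
    kept = []
    for name, acts in candidates:
        new = {i for i, a in enumerate(acts) if a > threshold} - covered
        if new:
            covered |= new
            kept.append(name)
    return kept, covered
```

The greedy loop captures why the approach finds diverse failures: each kept test input exercises a part of the network's logic that earlier inputs did not.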
DNN  testing 
june 2018 by foodbaby
The mostly complete chart of Neural Networks, explained
The zoo of neural network types grows exponentially. One needs a map to navigate between many emerging architectures and approaches. Fortunately, Fjodor van Veen from Asimov institute compiled a…
ml  ai  analytics  big_data  chart  cheatsheet  data_science  deep-learning  deep_learning  dnn 
may 2018 by tranqy


