lstm (691 bookmarks)


Seq2Seq-PyTorch/ at master · MaximumEntropy/Seq2Seq-PyTorch
autoencoder  pytorch  nlp  lstm  deep-learning  seq2seq 
5 weeks ago by nharbour
Tal Perry - A word is worth a thousand pictures: Convolutional methods for text - YouTube
residual connections
dilated convolutions
vanishing gradients

A CNN with residual connections required fewer parameters than an LSTM, was faster to train, and reached the same accuracy.
CNN  vs  LSTM 
5 weeks ago by foodbaby
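The vanishing-gradients point in the note above can be illustrated with a toy computation (the tanh layer, weight value, and depth here are illustrative assumptions, not taken from the talk): in a plain stack of saturating layers the gradient is a product of factors below 1, while a residual layer `x + f(x)` adds a constant 1 from the identity path to each factor, keeping the gradient alive.

```python
import numpy as np

def grad_plain(x, w, depth):
    """Gradient of `depth` stacked tanh(w*x) layers w.r.t. the input.
    Each chain-rule factor is w * sech^2(w*x) < 1, so the product vanishes."""
    g = 1.0
    for _ in range(depth):
        pre = w * x
        g *= w * (1 - np.tanh(pre) ** 2)  # chain rule through one tanh layer
        x = np.tanh(pre)
    return g

def grad_residual(x, w, depth):
    """Same stack, but each layer is x + tanh(w*x): the skip path adds 1
    to every chain-rule factor, so the product cannot shrink below 1."""
    g = 1.0
    for _ in range(depth):
        pre = w * x
        g *= 1.0 + w * (1 - np.tanh(pre) ** 2)  # the 1 comes from the identity path
        x = x + np.tanh(pre)
    return g

# With many layers the plain gradient vanishes; the residual one does not.
print(grad_plain(0.5, 0.9, 50))     # ~0 (vanishes)
print(grad_residual(0.5, 0.9, 50))  # stays at or above 1
```

The same identity-path argument is why the talk's residual CNN trains quickly: gradients reach early layers without being multiplied down through every intermediate nonlinearity.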
An empirical exploration of recurrent network architectures
The Recurrent Neural Network (RNN) is an extremely powerful sequence model that is often difficult to train. The Long Short-Term Memory (LSTM) is a specific RNN architecture whose design makes it much easier to train. While wildly successful in practice, the LSTM's architecture appears to be ad hoc, so it is not clear whether it is optimal, and the significance of its individual components is unclear.

In this work, we aim to determine whether the LSTM architecture is optimal or whether much better architectures exist. We conducted a thorough architecture search where we evaluated over ten thousand different RNN architectures, and identified an architecture that outperforms both the LSTM and the recently-introduced Gated Recurrent Unit (GRU) on some but not all tasks. We found that adding a bias of 1 to the LSTM's forget gate closes the gap between the LSTM and the GRU.
rnn  lstm  gru  sequence-modeling 
6 weeks ago by arsyed
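The forget-gate finding in the abstract above is simple to apply: initialize the forget-gate bias to 1, so that early in training sigmoid(1) ≈ 0.73 of the cell state is carried forward instead of sigmoid(0) = 0.5. A minimal NumPy LSTM step showing where that bias lives (the weight shapes, init scale, and sizes are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W has shape (4h, d+h): stacked input/forget/cell/output blocks."""
    hidden = h.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i = sigmoid(z[:hidden])                 # input gate
    f = sigmoid(z[hidden:2 * hidden])       # forget gate
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell update
    o = sigmoid(z[3 * hidden:])             # output gate
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
d, hidden = 3, 4
W = rng.normal(0, 0.01, size=(4 * hidden, d + hidden))  # small random weights
b = np.zeros(4 * hidden)
b[hidden:2 * hidden] = 1.0  # the paper's trick: forget-gate bias = 1

x = rng.normal(size=d)
h = np.zeros(hidden)
c = np.ones(hidden)
h, c = lstm_step(x, h, c, W, b)
# With small weights, f ≈ sigmoid(1) ≈ 0.73, so most of c survives the step.
print(np.round(c, 3))
```

With bias 0 the cell state would be halved on every step before training has learned anything, which is one plausible reading of why this single init change closes the LSTM/GRU gap in the paper's experiments.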
[1412.3555] Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
In this paper we compare different types of recurrent units in recurrent neural networks (RNNs). In particular, we focus on more sophisticated units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU). We evaluate these recurrent units on the tasks of polyphonic music modeling and speech signal modeling. Our experiments revealed that these advanced recurrent units are indeed better than more traditional recurrent units such as tanh units. We also found the GRU to be comparable to the LSTM.
rnn  lstm  gru  sequence-modeling 
6 weeks ago by arsyed
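One concrete difference behind "GRU comparable to LSTM": the GRU has three gate/candidate blocks where the LSTM has four, so at equal input and hidden sizes it carries exactly 25% fewer recurrent parameters. A quick count (the sizes below are arbitrary examples, and peephole/bias conventions vary by implementation):

```python
def lstm_params(d, h):
    """4 blocks (input, forget, cell, output), each mapping (d+h) -> h, plus a bias."""
    return 4 * (h * (d + h) + h)

def gru_params(d, h):
    """3 blocks (reset, update, candidate), each mapping (d+h) -> h, plus a bias."""
    return 3 * (h * (d + h) + h)

d, h = 300, 512  # example embedding / hidden sizes
print(lstm_params(d, h))                     # 1665024
print(gru_params(d, h))                      # 1248768
print(gru_params(d, h) / lstm_params(d, h))  # 0.75
```

At equal accuracy, as the abstract reports, that 3:4 ratio makes the GRU the cheaper unit; alternatively, a GRU can spend the saved parameters on a larger hidden state for the same budget.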
