lstm   793


Time Series Prediction with LSTM Recurrent Neural Networks in Python with Keras
A powerful type of neural network designed to handle sequence dependence is the recurrent neural network. The Long Short-Term Memory (LSTM) network is a type of recurrent neural network used in deep learning because very large architectures can be trained successfully.
DeepLearning  LSTM  Tutorials 
13 days ago by neuralmarket
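The tutorial above builds its model in Keras; as a refresher on what an LSTM layer actually computes, here is a minimal NumPy sketch of a single LSTM cell forward step (randomly initialized weights and hypothetical sizes for illustration, not the tutorial's code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM forward step.
    x: input (n_in,); h_prev, c_prev: previous hidden/cell state (n_hid,)
    W: (4*n_hid, n_in), U: (4*n_hid, n_hid), b: (4*n_hid,)
    Gate order in the stacked weights: input, forget, output, candidate."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n])        # input gate
    f = sigmoid(z[n:2*n])      # forget gate
    o = sigmoid(z[2*n:3*n])    # output gate
    g = np.tanh(z[3*n:4*n])    # candidate cell state
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(4):  # unroll over a short sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
print(h.shape)  # (5,)
```

The gated cell state `c` is what lets LSTMs carry information across long sequences; a Keras `LSTM` layer applies this same recurrence across every timestep of a batch.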
What Kagglers are using for Text Classification
With the problem of image classification more or less solved by deep learning, text classification is the next developing theme in deep learning. For those who don't know, text classification is a common task in natural language processing that transforms a sequence of text of indefinite length ...
DeepLearning  bidirectional  attention  classification  lstm  text  nlp  ml 
24 days ago by grinful
[1803.01271] An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at this
CNN  vs  RNN  DL-theory  papers  LSTM  GRU  TCN 
28 days ago by foodbaby
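The TCN architecture evaluated in this paper is built from dilated causal convolutions. A small NumPy sketch (illustrative only, not the paper's code) showing how stacking layers with dilations 1, 2, 4 grows the receptive field exponentially:

```python
import numpy as np

def causal_conv1d(x, w, dilation=1):
    """Dilated causal 1-D convolution: the output at time t depends only on
    x[t], x[t-d], x[t-2d], ... (the sequence is left-padded with zeros)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([
        sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# With kernel size k and L layers at dilations 1, 2, ..., 2**(L-1),
# the stack covers 1 + (k-1) * (2**L - 1) past timesteps.
x = np.zeros(16)
x[0] = 1.0                      # unit impulse
y = x
for d in (1, 2, 4):             # three layers, kernel size 2
    y = causal_conv1d(y, np.ones(2), dilation=d)
print(np.nonzero(y)[0])        # impulse response spans steps 0..7
```

The impulse response covers 8 timesteps after only three layers, which is the "longer effective memory" the abstract refers to; causality (no dependence on future inputs) is what makes the convolution usable for sequence modeling.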
> A neural network that transforms a design mock-up into a static website
cnn  jupyter-notebook  lstm  seq2seq  encoder-decoder  keras  deep-learning  floydhub  cnn-keras  jupyter  machine-learning 
4 weeks ago by jefftriplett
[1808.08949] Dissecting Contextual Word Embeddings: Architecture and Representation
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g. LSTM, CNN, or self attention) influences both end task accuracy and qualitative properties of the representations that are learned. We show there is a tradeoff between speed and accuracy, but all architectures learn high quality contextual representations that outperform word embeddings for four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphological based at the word embedding layer, through local syntax based in the lower contextual layers, to longer range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
ML-interpretability  CNN  LSTM  language-models  embeddings  evaluation 
5 weeks ago by foodbaby


related tags

ai  allocation  andrew  anomaly-detection  anomaly  arfima  arxiv  asr  associative  attention  audio  autoencoder  awd-lstm  benchmarks  bi-directional  bidirectional  blogs  books  cell  churn-model  classification  cloud  cnn-keras  cnn  code  coding  color  computervision  crnn  dataset  decoding  deep-learning  deep  deeplearning  detection  development  dl-theory  dnn  dropout  ebooks  embeddings  encoder-decoder  evaluation  explodinggradients  favorite  feature-engineering  floydhub  forecasting  games  gan  generation  generative  github  google  gradients  grid  gru  hyperapp  ifttt  imageprocessing  imagerecognition  inference  jit  jupyter-notebook  jupyter  kaggle  keras  language-modeling  language-models  language  learning  long-memory  machine-learning  machine_learning  machinelearning  memory  ml-interpretability  ml  model  models  music  network  neural-net  neural  neuralnetwork  neuralnetworks  nlp  nn  numpy  ocr  online  pad  padded  padding  paper  papers  prediction  presentation  production  python  pytorch  qrnn  quantitative_finance  r  regression  reinforcement  reinforcementlearning  research  resource  resources  rnn  rnns  scratch  script  seq2seq  sequence-modeling  sequence  sequencelearning  series  speechrecognition  statistics  study-group  tcn  teacher-student  teaching  tensorflow  text  theory  time-series  time  time_series  timeseries  trace  training  transferlearning  transformer  trends  tutorial  tutorials  tweetit  twitter  vanishinggradients  vs  window  wordembedding 
