nharbour + nlp   297

Understanding building blocks of ULMFIT – Kerem Turgutlu – Medium
Last week I had the time to tackle a Kaggle NLP competition: Quora Insincere Questions Classification. As it’s easy to understand from the name, the task is to identify sincere and insincere…
ulmfit  nlp  fast.ai  deep-learning 
11 days ago by nharbour
bheinzerling/bpemb: Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) - bheinzerling/bpemb
embedding  embeddings  bpe  deep-learning  nlp  oov 
29 days ago by nharbour
noisemix/noisemix: NoiseMix - data generation for natural language
NoiseMix - data generation for natural language. Contribute to noisemix/noisemix development by creating an account on GitHub.
data-augmentation  nlp 
10 weeks ago by nharbour
pfnet-research/contextual_augmentation: Contextual augmentation, a text data augmentation using a bidirectional language model.
Contextual augmentation, a text data augmentation using a bidirectional language model. - pfnet-research/contextual_augmentation
data-augmentation  nlp  github 
10 weeks ago by nharbour
EMNLP 2018 | Patrick Lewis
I just got back from EMNLP in Brussels. We were presenting our dataset paper ShARC (a blog post about ShARC will be coming soon). The scale and breadth of the conference was really something, with so many smart people doing amazing things. It was also great to meet, network and talk research with all kinds of academics in NLP. We’ve got some exciting projects planned already, and I’m really just starting out.
enlp  nlp  summary  conference  deep-learning  ruder 
november 2018 by nharbour
Why you should care about byte-level sequence-to-sequence models in NLP
This blogpost explains how byte-level models work, how this brings about the benefits they have, and how they relate to other models — character-level and word-level models, in particular.
rnn  seq2seq  byte  bytes  deep-learning  oov  nlp 
november 2018 by nharbour
huggingface/hmtl: 🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP
🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP - huggingface/hmtl
ruder  embeddings  nlp  deep-learning  code 
november 2018 by nharbour
Semantic Scholar — Allen Institute for Artificial Intelligence
An academic search engine that utilizes artificial intelligence methods to provide highly relevant results and novel tools to filter them with ease.
academic  paper  papers  pdf  extraction  allennlp  api  models  deep-learning  nlp 
november 2018 by nharbour
roamresearch/modern-tensorflow.ipynb at master · roamanalytics/roamresearch
Contribute to roamanalytics/roamresearch development by creating an account on GitHub.
nlp  jupyter  tensorflow  best-practice  deep-learning 
october 2018 by nharbour
How to do multiclass textcat? - Prodigy Support
I think I’m not understanding something basic about the API. If I need to categorize text into 20 classes, do I need to make 20 different datasets? Or do I need to pretrain a spacy model to randomly output those classes …
nlp  prodigy  deep-learning  spacy  multiclass  multiple-labels  labels 
august 2018 by nharbour
cjhutto/vaderSentiment: VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.
sentiment  analysis  nlp  deep-learning  sentiment-analysis  leo 
august 2018 by nharbour
Natural Language Processing is Fun! – Adam Geitgey – Medium
Computers are great at working with structured data like spreadsheets and database tables. But us humans usually communicate in words, not in tables. That’s unfortunate for computers. A lot of…
nlp  intro  introduction  tutorial  deep-learning  ruder 
july 2018 by nharbour
Detecting True & Deceptive Hotel Reviews (article) - DataCamp
In this tutorial, you’ll use a machine learning algorithm to implement a real-life problem in Python.
toxic-comments  hotel-reviews  kaggle  tutorial  nlp  classifier  deep-sleep 
july 2018 by nharbour
Universe · spaCy
This section collects the many great resources developed with or for spaCy. It includes standalone packages, plugins, extensions, educational materials, operational utilities and bindings for other languages.
emd  earth-movers-distance  nlp  spacy  deep-learning 
july 2018 by nharbour
nikitakit/self-attentive-parser: Constituency Parsing with a Self-Attentive Encoder (ACL 2018)
GitHub is where people build software. More than 28 million people use GitHub to discover, fork, and contribute to over 85 million projects.
nlp  parse  parser  tree  deep-learning  spacy  constituency-parse  consitituency 
july 2018 by nharbour
Kyubyong/nlp-datasets-1: Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP)
GitHub is where people build software. More than 28 million people use GitHub to discover, fork, and contribute to over 85 million projects.
nlp  dataset  datasets  deep-learning 
july 2018 by nharbour
Seq2Seq-PyTorch/nmt_autoencoder.py at master · MaximumEntropy/Seq2Seq-PyTorch
GitHub is where people build software. More than 28 million people use GitHub to discover, fork, and contribute to over 85 million projects.
autoencoder  pytorch  nlp  lstm  deep-learning  seq2seq 
july 2018 by nharbour
jacobeisenstein/gt-nlp-class: Course materials for Georgia Tech CS 4650 and 7650, "Natural Language"
GitHub is where people build software. More than 28 million people use GitHub to discover, fork, and contribute to over 85 million projects.
nlp  course 
june 2018 by nharbour

