natural-language-processing   385


How to solve 90% of NLP problems: a step-by-step guide
Whether you are an established company or working to launch a new service, you can always leverage text data to validate, improve, and expand the functionalities of your product. The science of…
nlp  natural-language-processing 
16 days ago by hschilling
[1709.08878] Generating Sentences by Editing Prototypes
We propose a new generative model of sentences that first samples a prototype sentence from the training corpus and then edits it into a new sentence. Compared to traditional models that generate from scratch either left-to-right or by first sampling a latent sentence vector, our prototype-then-edit model improves perplexity on language modeling and generates higher quality outputs according to human evaluation. Furthermore, the model gives rise to a latent edit vector that captures interpretable semantics such as sentence similarity and sentence-level analogies.
machine-learning  natural-language-processing  rather-interesting  representation  to-write-about  nudge-targets  consider:representation  consider:performance-measures  generative-models 
26 days ago by Vaguery
[1705.00441] Learning Topic-Sensitive Word Representations
Distributed word representations are widely used for modeling words in NLP tasks. Most of the existing models generate one representation per word and do not consider different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements for the lexical substitution task indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task.
natural-language-processing  feature-construction  neural-networks  algorithms  nudge-targets  consider:looking-to-see  consider:performance-measures 
28 days ago by Vaguery
mewo2/ketchum: Use word vectors to interactively generate lists of similar words
This is code for generating lists of similar words, using word-vector similarities from the fastText data released by Facebook. It's useful for building collections of words to slot into generative grammars, such as Kate Compton's Tracery.
natural-language-processing  software  library  to-understand  to-do  software-development 
5 weeks ago by Vaguery
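The core operation behind a tool like this — ranking vocabulary words by cosine similarity to a query word's vector — can be sketched in a few lines. The toy vectors below are made up for illustration; a real system would load the pretrained fastText embeddings instead.

```python
import numpy as np

def most_similar(word, vectors, topn=3):
    """Rank the other vocabulary words by cosine similarity to `word`."""
    target = vectors[word]
    scores = {}
    for other, vec in vectors.items():
        if other == word:
            continue
        scores[other] = float(
            np.dot(target, vec) / (np.linalg.norm(target) * np.linalg.norm(vec))
        )
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# Tiny hand-made vectors, purely illustrative (not real embeddings).
toy = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.8, 0.2, 0.1]),
    "car": np.array([0.1, 0.9, 0.3]),
    "bus": np.array([0.0, 0.8, 0.4]),
}

print(most_similar("cat", toy))  # neighbours of "cat", nearest first
```

With real embeddings, the words returned for "cat" would be its semantic neighbours, which is exactly the word list you'd slot into a generative grammar.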
[1710.03370] iVQA: Inverse Visual Question Answering
In recent years, visual question answering (VQA) has become topical as a long-term goal to drive computer vision and multi-disciplinary AI research. The premise of VQA's significance is that both the image and the textual question need to be well understood and mutually grounded in order to infer the correct answer. However, current VQA models perhaps 'understand' less than initially hoped, and instead master the easier task of exploiting cues given away in the question and biases in the answer distribution.
In this paper we propose the inverse problem of VQA (iVQA), and explore its suitability as a benchmark for visuo-linguistic understanding. The iVQA task is to generate a question that corresponds to a given image and answer pair. Since the answers are less informative than the questions, and the questions have less learnable bias, an iVQA model needs to better understand the image to be successful. We pose question generation as a multi-modal dynamic inference process and propose an iVQA model that can gradually adjust its focus of attention guided by both a partially generated question and the answer. For evaluation, apart from existing linguistic metrics, we propose a new ranking metric. This metric compares the ground truth question's rank among a list of distractors, which allows the drawbacks of different algorithms and sources of error to be studied. Experimental results show that our model can generate diverse, grammatically correct and content correlated questions that match the given answer.
artificial-intelligence  image-analysis  rather-interesting  jeopardy-questions  inverse-problems  natural-language-processing  to-write-about  nudge-targets  benchmarks 
12 weeks ago by Vaguery
Cornell NLVR
Cornell Natural Language Visual Reasoning (NLVR) is a language grounding dataset. It contains 92,244 pairs of natural language statements grounded in synthetic images. The task is to determine whether a sentence is true or false about an image.
ai  natural-language-processing 
november 2017 by HighCharisma
[1707.05589] On the State of the Art of Evaluation in Neural Language Models
Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks. However, these have been evaluated using differing code bases and limited computational resources, which represent uncontrolled sources of experimental variation. We reevaluate several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrive at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models. We establish a new state of the art on the Penn Treebank and Wikitext-2 corpora, as well as strong baselines on the Hutter Prize dataset.
natural-language-processing  representation  machine-learning  deep-learning  to-write-about 
november 2017 by Vaguery
[1706.04902] A Survey Of Cross-lingual Word Embedding Models
Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models. We compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent modulo optimization strategies, hyper-parameters, and such. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.
natural-language-processing  representation  review  rather-interesting  to-write-about  to-do  algorithms  feature-extraction 
november 2017 by Vaguery
Neural Language Modeling From Scratch (Part 1)
Language models assign probability values to sequences of words. The three word suggestions that appear above your phone's keyboard, predicting the next word you'll type, are one use of language modeling. In the case shown below, the language model is predicting that "from", "on" and "it" have a high probability of being the next word in the given sentence. Internally, for each word in its vocabulary, the language model computes the probability that it will be the next word, but the user only gets to see the top three most probable words.
machine-learning  natural-language-processing  deep-learning  blog-post 
november 2017 by doneata
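A minimal illustration of the idea described above: a bigram count model that assigns each candidate next word a probability given the previous word, and surfaces the top three, analogous to the keyboard suggestions. The corpus here is a toy example, not the post's neural model.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count word -> next-word transitions over a list of sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def top_next(counts, word, k=3):
    """Top-k most probable next words after `word`, with probabilities."""
    total = sum(counts[word].values())
    return [(w, c / total) for w, c in counts[word].most_common(k)]

corpus = [
    "the cat sat on the mat",
    "the cat ran from the dog",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
print(top_next(model, "the"))  # e.g. "cat" and "dog" each with p = 1/3
```

A neural language model replaces the count table with a learned function of the full preceding context, but the interface — a probability distribution over the vocabulary for the next word — is the same.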
Bixby 2.0: The Start of the Next Paradigm Shift in Devices By Eui-Suk Chung, EVP, Head of Service Intelligence of Mobile Communications Business
Samsung announced the next version of Bixby, its digital voice assistant. Like other companies offering assistant services, Samsung views the assistant as the control centre for connected smart and IoT devices.
@Samsung  Bixby-Samsung  Bixby-2.0  digital-voice-assistant  natural-language-processing  Bixby-SDK  Samsung-Developer-Conference  SDC-2017  connected-devices  #IOT  person:Eui-Suk-Chung 
october 2017 by elliottbledsoe
Corporate Gibberish Generator on AndrewDavidson.com
Welcome to the Corporate Gibberish Generator™ by Andrew Davidson. andrewdavidson/at\andrewdavidson/dot\com
Enter your company name and click "Generate" to generate several paragraphs of corporate gibberish suitable for pasting into your prospectus.
(The gibberish is geared more toward Internet and technology companies.)
branding  corporatism  humor  algorithms  natural-language-processing  generative-art 
october 2017 by Vaguery


