word2vec   1564


Just posted a new blog post on the Community on how to use in Word2Vec
word2vec  from twitter_favs
2 days ago by neuralmarket
Julia Silge - Word Vectors with tidy data principles
Last week I saw Chris Moody’s post on the Stitch Fix blog about calculating word vectors from a corpus of text using word counts and matrix factorization, and I was so excited! This blog post illustrates how to implement that approach to find word vector representations in R using tidy data principles and sparse matrices.
word2vec  pmi  example  counts  matrix  factorization 
10 days ago by foodbaby
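The approach this entry bookmarks (word vectors from word counts, PMI, and matrix factorization) can be sketched in a few lines of NumPy. This is a minimal illustration, not the blog post's R/tidytext code: the toy corpus, whole-sentence co-occurrence window, and dense SVD (rather than a sparse truncated SVD) are simplifications for clarity.

```python
import numpy as np
from collections import Counter
from itertools import combinations

corpus = [
    "cats chase mice",
    "dogs chase cats",
    "mice eat cheese",
    "dogs eat bones",
]

# Count word and word-pair co-occurrences (window = whole sentence, for brevity).
word_counts = Counter()
pair_counts = Counter()
for sent in corpus:
    words = sent.split()
    word_counts.update(words)
    for w1, w2 in combinations(words, 2):
        pair_counts[(w1, w2)] += 1
        pair_counts[(w2, w1)] += 1

vocab = sorted(word_counts)
idx = {w: i for i, w in enumerate(vocab)}
total_words = sum(word_counts.values())
total_pairs = sum(pair_counts.values())

# Positive PMI: max(0, log P(w1, w2) / (P(w1) * P(w2))).
ppmi = np.zeros((len(vocab), len(vocab)))
for (w1, w2), c in pair_counts.items():
    p_joint = c / total_pairs
    p_marg = (word_counts[w1] / total_words) * (word_counts[w2] / total_words)
    ppmi[idx[w1], idx[w2]] = max(0.0, np.log(p_joint / p_marg))

# Truncated SVD of the PPMI matrix: top-k singular directions give word vectors.
U, S, Vt = np.linalg.svd(ppmi)
k = 2
vectors = U[:, :k] * S[:k]  # one k-dimensional vector per vocabulary word
print(vectors.shape)
```

On a real corpus the counts matrix is large and mostly zero, which is why the blog post pairs this method with sparse matrices and a sparse truncated SVD.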
Interested in the resurgence of "SVD as alternative to word2vec"? My article on the topic, incl. benchmarks & code…
word2vec  from twitter_favs
12 days ago by hustwj
Relevance-based Word Embedding - Semantic Scholar
Learning a high-dimensional dense representation for vocabulary terms, also known as a word embedding, has recently attracted much attention in natural language processing and information retrieval tasks. The embedding vectors are typically learned based on term proximity in a large corpus. This means that the objective in well-known word embedding algorithms, e.g., word2vec, is to accurately predict adjacent word(s) for a given word or context. However, this objective is not necessarily equivalent to the goal of many information retrieval (IR) tasks. The primary objective in various IR tasks is to capture relevance instead of term proximity, syntactic, or even semantic similarity. This is the motivation for developing unsupervised relevance-based word embedding models that learn word representations based on query-document relevance information. In this paper, we propose two learning models with different objective functions; one learns a relevance distribution over the vocabulary set for each query, and the other classifies each term as belonging to the relevant or non-relevant class for each query. To train our models, we used over six million unique queries and the top-ranked documents retrieved in response to each query, which are assumed to be relevant to the query. We extrinsically evaluate our learned word representation models using two IR tasks: query expansion and query classification. Both query expansion experiments on four TREC collections and query classification experiments on the KDD Cup 2005 dataset suggest that the relevance-based word embedding models significantly outperform state-of-the-art proximity-based embedding models, such as word2vec and GloVe.
IR  word  embeddings  word2vec 
12 days ago by foodbaby
Query Expansion with Locally-Trained Word Embeddings - Semantic Scholar
Continuous space word embeddings have received a great deal of attention in the natural language processing and machine learning communities for their ability to model term similarity and other relationships. We study the use of term relatedness in the context of query expansion for ad hoc information retrieval. We demonstrate that word embeddings such as word2vec and GloVe, when trained globally, underperform corpus and query specific embeddings for retrieval tasks. These results suggest that other tasks benefiting from global embeddings may also benefit from local embeddings.
IR  query  expansion  word  embeddings  papers  word2vec 
12 days ago by foodbaby
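The basic mechanism behind embedding-based query expansion, which both of the papers above build on, is to expand a query term with its nearest neighbours in embedding space. A minimal sketch, assuming a small hand-made embedding table with made-up values (a real system would load vectors trained globally or, per this paper, locally on the retrieved corpus):

```python
import numpy as np

# Hypothetical pre-trained embeddings; the vectors are illustrative, not real.
embeddings = {
    "car":     np.array([0.9, 0.1, 0.0]),
    "vehicle": np.array([0.8, 0.2, 0.1]),
    "truck":   np.array([0.7, 0.1, 0.2]),
    "banana":  np.array([0.0, 0.9, 0.1]),
    "fruit":   np.array([0.1, 0.8, 0.2]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def expand_query(term, k=2):
    """Return the k vocabulary terms closest to `term` in embedding space."""
    sims = {w: cosine(embeddings[term], v)
            for w, v in embeddings.items() if w != term}
    return sorted(sims, key=sims.get, reverse=True)[:k]

print(expand_query("car"))  # → ['vehicle', 'truck']
```

The locally-trained-embeddings result suggests that which corpus the `embeddings` table is trained on matters as much as the expansion mechanism itself.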


related tags

ai  algorithms  alternative  alternatives  analytics  by:christopher-moody  candidate  classification  clustering  counts  data-science  dataset  deep-learning  deeplearning  del-dup?  dev  documentation  embedding  embeddings  example  expansion  facebook  factorization  finance  food  gauss2vec  gensim  google  ir  job  keras  language  lda  learning  literature  lsa  lsi  machine-learning  machine  machine_learning  machinelearning  matrix  ml  natural  neuralnets  neuralnetwork  nlp  nlproc  nltk  nn  overview  papers  pmi  pretrained  processing  programming  python  query  r  references  search  sentence2vec  sentimentanalysis  similarity  spark  stitchfix  survey  svd  tensor  text  textanalytics  tf-idf  tidytext  topics  tsne  vector  vectors  vsm  word-embeddings  word-vectors  word  wordembedding  wordtensors  wordvectors 
