jm + nlp   6

Spotify’s Discover Weekly: How machine learning finds your new music
Not sure how accurate this is (it's not written by a Spotify employee), but it seems pretty well researched. According to this, Discover Weekly is a mix of 3 different algorithms
discover-weekly  spotify  nlp  music  ai  ml  machine-learning 
5 weeks ago by jm
How to do named entity recognition: machine learning oversimplified
Good explanation of this NLP tokenization/feature-extraction technique. Example result: "Jimi/B-PER Hendrix/I-PER played/O at/O Woodstock/B-LOC ./O"
named-entities  feature-extraction  tokenization  nlp  ml  algorithms  machine-learning 
may 2015 by jm
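For the named-entity bookmark above: a rough sketch of how the "token/TAG" output can be turned back into entity spans, assuming the usual reading where B- starts an entity, I- continues it and O is outside any entity. The function names here are made up for illustration and are not from the linked article.

```python
def parse_tagged(text):
    """Split whitespace-separated 'token/TAG' items into (token, tag) pairs."""
    pairs = []
    for item in text.split():
        # rpartition splits on the LAST slash, so tokens containing '/' survive
        token, _, tag = item.rpartition("/")
        pairs.append((token, tag))
    return pairs


def extract_entities(pairs):
    """Group B-/I- tagged tokens into (entity_text, entity_type) spans."""
    entities = []
    current_tokens, current_type = [], None
    for token, tag in pairs:
        if tag.startswith("B-"):
            # A B- tag always opens a new entity; close any open one first
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_tokens and tag[2:] == current_type:
            # I- continues the open entity, but only if the type matches
            current_tokens.append(token)
        else:
            # O tag (or a stray I-): close any open entity and skip the token
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


tagged = "Jimi/B-PER Hendrix/I-PER played/O at/O Woodstock/B-LOC ./O"
print(extract_entities(parse_tagged(tagged)))
# prints [('Jimi Hendrix', 'PER'), ('Woodstock', 'LOC')]
```

Dropping a stray I- tag is just one possible policy; some decoders treat it as the start of a new entity instead.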
Five Takeaways on the State of Natural Language Processing
Good overview of the state of the art in NLP nowadays. I find word2vec particularly interesting:
Embedding words as real-numbered vectors using a skip-gram, negative-sampling model (word2vec code) was mentioned in nearly every talk I attended. Either companies are using various word2vec implementations directly or they are building diffs off of the basic framework. Trained on large corpora, the vector representations encode concepts in a large dimensional space (usually 200-300 dim).


Quite similar to some tokenization approaches we experimented with in SpamAssassin, so I don't find this too surprising....
word2vec  nlp  tokenization  machine-learning  language  parsing  doc2vec  skip-grams  data-structures  feature-extraction  via:lemonodor 
may 2015 by jm
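For the word2vec bookmark above: a minimal sketch of the training data a skip-gram, negative-sampling model sees, i.e. (center, context) pairs from a sliding window plus a few randomly drawn "negative" words per pair. This is an illustration only, not the word2vec code itself: the helper names are hypothetical, the window is fixed, and negatives are drawn uniformly, whereas word2vec samples them from a smoothed unigram distribution.

```python
import random


def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def negative_samples(vocab, exclude, k, rng):
    """Draw k words uniformly from vocab, never picking a word in `exclude`.

    Assumption: uniform sampling, chosen here only to keep the sketch short.
    """
    candidates = [w for w in vocab if w not in exclude]
    return [rng.choice(candidates) for _ in range(k)]


tokens = "the quick brown fox".split()
vocab = sorted(set(tokens))
rng = random.Random(0)  # fixed seed so runs are repeatable

for center, context in skipgram_pairs(tokens, window=1):
    negatives = negative_samples(vocab, {center, context}, k=2, rng=rng)
    print(center, context, negatives)
```

The model is then trained to score each real (center, context) pair higher than the (center, negative) pairs, and the learned weights are the word vectors.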
How the NSA Converts Spoken Words Into Searchable Text - The Intercept
This hits the nail on the head, IMO:
To Phillip Rogaway, a professor of computer science at the University of California, Davis, keyword-search is probably the “least of our problems.” In an email to The Intercept, Rogaway warned that “When the NSA identifies someone as ‘interesting’ based on contemporary NLP methods, it might be that there is no human-understandable explanation as to why beyond: ‘his corpus of discourse resembles those of others whom we thought interesting’; or the conceptual opposite: ‘his discourse looks or sounds different from most people’s.’ If the algorithms NSA computers use to identify threats are too complex for humans to understand, it will be impossible to understand the contours of the surveillance apparatus by which one is judged. All that people will be able to do is to try your best to behave just like everyone else.”
privacy  security  gchq  nsa  surveillance  machine-learning  liberty  future  speech  nlp  pattern-analysis  cs 
may 2015 by jm
Sirius: An open end-to-end voice and vision personal assistant and its implications for future warehouse scale computers
How to build an Intelligent Personal Assistant:

'Sirius is an open end-to-end standalone speech and vision based intelligent personal assistant (IPA) similar to Apple’s Siri, Google’s Google Now, Microsoft’s Cortana, and Amazon’s Echo. Sirius implements the core functionalities of an IPA including speech recognition, image matching, natural language processing and a question-and-answer system. Sirius is developed by Clarity Lab at the University of Michigan. Sirius is published at the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) 2015.'
sirius  siri  cortana  google-now  echo  ok-google  ipa  assistants  search  video  audio  speech  papers  clarity  nlp  wikipedia 
april 2015 by jm
