Word associations provided for multiple languages based on human entries
datasets  language  nlp  textbook  data 
december 2018 by jerid.francom
This is a book about Natural Language Processing. By "natural language" we mean a language that is used for everyday communication by humans; languages like English, Hindi or Portuguese. In contrast to artificial languages such as programming languages and mathematical notations, natural languages have evolved as they pass from generation to generation, and are hard to pin down with explicit rules. We will take Natural Language Processing — or NLP for short — in a wide sense to cover any kind of computer manipulation of natural language. At one extreme, it could be as simple as counting word frequencies to compare different writing styles. At the other extreme, NLP involves "understanding" complete human utterances, at least to the extent of being able to give useful responses to them.
language  programming  python  textbook  examples 
april 2018 by jerid.francom
R package interface focused on retrieving full text for the user via the Crossref search API.
r  textbook  language  api 
october 2017 by jerid.francom
R package interface to query arXiv, a repository of electronic preprints for computer science, mathematics, physics, quantitative biology, quantitative finance, and statistics.
r  textbook  language  api 
october 2017 by jerid.francom
R package interface to query the Internet Archive.
r  textbook  language  api 
october 2017 by jerid.francom
R package interface to access the Dataverse Network APIs.
r  textbook  language  api 
october 2017 by jerid.francom
R package interface to download and process public domain works from the Project Gutenberg collection.
r  textbook  language  api 
october 2017 by jerid.francom
R package interface to query open access journals, such as PLOS.
r  textbook  language  api  packages 
october 2017 by jerid.francom
R package interface to query the Internet Archive and GDELT Television Explorer.
r  textbook  language  api 
october 2017 by jerid.francom
R package interface to query any OAI-PMH repository, including Zenodo.
r  textbook  repository  api  language 
october 2017 by jerid.francom
R package interface to query the data sharing platform FigShare.
r  textbook  figshare  reproducible  research  publishing  data  language  api 
august 2017 by jerid.francom
Who is talking about the French Open? | R-bloggers
I don’t think rOpenSci’s Jeroen Ooms can ever top the coolness of his magick package but I have to admit other things he’s developed are not bad at all. He’s
twitter  r  language  detection  cldr  tutorials 
june 2017 by jerid.francom
Enron Email Dataset
Enron email data from about 150 users, mostly senior management.
data  enron  corpora  dataset  language  380  textbook 
september 2016 by jerid.francom
Humans may speak a 'universal' language
From nose to knee and red to round, the sounds humans use to construct basic words are similar around the world
150  380  datascience  language  variation 
september 2016 by jerid.francom
How Vector Space Mathematics Reveals the Hidden Sexism in Language
As neural networks tease apart the structure of language, they are finding a hidden gender bias that nobody knew was there.
vector-space-models  word2vec  sexism  language  380 
july 2016 by jerid.francom
Teenagers Are Not Ruining The English Language
If you think that terms like “YOLO” and “fleek” are poisoning the English language, then fret not: The way Western teenagers speak is in fact not ruining the world’s lingua franca. According to linguistics research published by the American Dialect Society, teenagers do not stand out from several other age brackets when it comes to influencing the evolution of English.
variation  language  change  150  socialmedia 
january 2016 by jerid.francom
How to Learn Any Language in Less Than 90 Days
With the right framework for learning faster, anyone can reach conversational fluency in any language in 90 days. Here's how to learn any language in 90 days.
language  learning  hacks  frequency  teaching 
august 2015 by jerid.francom
Linguistic Mapping Reveals How Word Meanings Sometimes Change Overnight | MIT Technology Review
Data mining the way we use words is revealing the linguistic earthquakes that constantly change our language.
linguistics  corpora  nlp  vector-space-models  semantics  language  change  variation 
november 2014 by jerid.francom
Endangered Languages Project
The Endangered Languages Project is a collaborative online platform for sharing knowledge and resources for endangered languages. Join this global effort to conserve linguistic diversity.
language  documentation  maps  linguistics  150  endangeredlanguage 
october 2014 by jerid.francom
The Corpora
The Providence (English) Corpus
The Lyon (French) Corpus
The Demuth Sesotho Corpus
corpora  corpus  children  language  acquisition  brown-university 
october 2014 by jerid.francom
