jerid.francom + textbook   204

TED talks as Data
The files in this folder are the data files released as part of the paper, "TED talks as Data," submitted to the Journal of Cultural Analytics. The first is the exported CSV (from a Google sheet) of a list of TED talks maintained by anonymous authors.
corpus  ted  textbook  resources 
6 days ago by jerid.francom
Enron Email Corpus
This dataset contains data from about 150 users, mostly senior management of Enron, organized into folders. The corpus contains a total of about 0.5M messages. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation.
data  cmc  textbook  email 
7 weeks ago by jerid.francom
Small World of Words Home
Word associations provided for multiple languages based on human entries
datasets  language  nlp  textbook  data 
december 2018 by jerid.francom
CRAN - Package fakeR
R package to simulate datasets from various distributions
textbook  data  simulation  tidyverse 
october 2018 by jerid.francom
Simulating study data
R package to simulate datasets with various distributions
textbook  data  simulation  datasets  tidyverse 
october 2018 by jerid.francom
The aim of this repository is to promote research on the learning of French and Spanish as L2, by making parallel learner corpora for each language freely available to the research community.
corpus  learner  spanish  french  textbook  data  corpora 
october 2018 by jerid.francom
Rachael Tatman | Kaggle
A great series of tutorials on various aspects of R and doing text analytics with R.
textbook  tutorials  r  nlp  textmining  transformation  modeling 
september 2018 by jerid.francom
Tutorials on Advanced Stats and Machine Learning With R
A good introduction to ggplot plotting and regression models for data science.
datascience  r  statistics  tutorial  textbook 
july 2018 by jerid.francom
xkcd: Online Communities 2
XKCD map of language use; spoken versus computer-mediated.
internet  maps  social-media  textbook  380 
june 2018 by jerid.francom
An Introduction to Statistical and Data Sciences via R
A bookdown book which provides a tidyverse approach to data science. Includes basic aspects of the data science workflow and practical statistical coding exercises.
textbook  textbooks  example  r  datascience 
june 2018 by jerid.francom
RecommendR is an app that suggests R packages you might be interested in based on packages you are already considering for a project.
r  packages  shiny  recommendations  textbook 
may 2018 by jerid.francom
Home Page for 20 Newsgroups Data Set
The 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. To the best of my knowledge, it was originally collected by Ken Lang, probably for his Newsweeder: Learning to filter netnews paper, though he does not explicitly mention this collection. The 20 newsgroups collection has become a popular data set for experiments in text applications of machine learning techniques, such as text classification and text clustering.
dataset  text  textbook 
april 2018 by jerid.francom
This is a book about Natural Language Processing. By "natural language" we mean a language that is used for everyday communication by humans; languages like English, Hindi or Portuguese. In contrast to artificial languages such as programming languages and mathematical notations, natural languages have evolved as they pass from generation to generation, and are hard to pin down with explicit rules. We will take Natural Language Processing — or NLP for short — in a wide sense to cover any kind of computer manipulation of natural language. At one extreme, it could be as simple as counting word frequencies to compare different writing styles. At the other extreme, NLP involves "understanding" complete human utterances, at least to the extent of being able to give useful responses to them.
language  programming  python  textbook  examples 
april 2018 by jerid.francom
Introduction to Text Analysis - A Coursebook
This workbook provides a brief introduction to digital text analysis through a series of three-part units. Each unit introduces students to a concept, a tool for or method of digital text analysis, and a series of exercises for practicing the new skills. In some cases, studies of particular projects are presented instead of tools in the third section of each unit.
textbook  examples 
april 2018 by jerid.francom
Parsed Corpora/Treebanks
This list excludes the parsed historical corpora listed above
textbook  treebanks  corpora 
december 2017 by jerid.francom
Stanford CoreNLP
Natural language software from Stanford for providing various lexical, syntactic, and semantic annotations for text.
nlp  textbook  treebanks  tagger  sentiment 
december 2017 by jerid.francom
Linguist's Search Engine
Web interface to various English corpora
web  corpus  search  textbook 
december 2017 by jerid.francom
The Switchboard Dialog Act Corpus
A corpus of 1155 5-minute conversations in American English, comprising 205,000 utterances and 1.4 million words, from the Switchboard corpus of telephone conversations.
textbook  data  corpora  switchboard  spoken 
november 2017 by jerid.francom
Reproducibility · Advanced R.
Creating a reproducible example for getting help from sites like StackOverflow.
r  dput  reproducible  example  reprex  textbook 
november 2017 by jerid.francom
Partitioned data frames for 'dplyr'
r  packages  transformation  parallel  textbook 
november 2017 by jerid.francom
Sentiment lexicon for Portuguese
r  packages  sentiment  portuguese  data  textbook 
november 2017 by jerid.francom
Data Viz Project
Collection of data visualizations for getting inspired and finding the right type.
r  visualization  guide  data  textbook 
november 2017 by jerid.francom
Non invasive pretty printing of R code
r  packages  formatting  code  textbook 
october 2017 by jerid.francom
Journal of Quantitative Linguistics
Journal of Quantitative Linguistics: Vol 24, No 4
journals  publications  textbook 
october 2017 by jerid.francom
Teaching Yourself to Code in DH – the scottbot irregular

Book-length introductions to programming or analytic methods (math / statistics / etc.) aimed at or useful for humanists with limited coding experience.
programming  textbook  digitalhumanities 
october 2017 by jerid.francom
ggplotgui/ at master · gertstulp/ggplotgui · GitHub
An R package that allows the user to create various plots with ggplot2 using a Shiny gui.
textbook  visualization  shiny 
october 2017 by jerid.francom
Load US Census boundary and attribute data as 'tidyverse' and 'sf'-ready data frames in R
r  packages  census  textbook 
october 2017 by jerid.francom
BYU corpora
A repository of corpora that includes billions of words of data.
textbook  repository  corpora 
october 2017 by jerid.francom

