[1907.07073] RadioTalk: a large-scale corpus of talk radio transcripts
RadioTalk, a corpus of radio program transcripts that includes program metadata, turn boundaries, gender, etc.
interesting  data  talk  radio  from twitter_favs
5 weeks ago by jenlowe
curran/data · GitHub
@arnicas Here are some public datasets I've collected over the years; there might be some gems in there for a class.
public  data  dataset  teaching 
january 2016 by jenlowe
RT @tinysubversions: Announcing "corpora", a Github repo for JSON files of short lists of things, to help ppl make weird internet stuff.
text  data 
march 2014 by jenlowe
NYC Public School Parents: inBloom's student and teacher data screenshots
WTF MT @sebastienmarion Privacy concerns? InBloom student metadata: race, eco. status, foster care, disciplinary rec.
data  student  teacher  education  nyc  gates  fuckedup 
july 2013 by jenlowe
Scraping for… by Paul Bradshaw [Leanpub PDF/iPad/Kindle]
@dvsch oooo... not yet! Thanks, Derrick! (ps I didn't recognize you with the new avatar.) link for @dwtkns @blprnt:
journalism  data  scraping 
june 2013 by jenlowe
Invasion of the Data Snatchers
Last month Julia Angwin of The Wall Street Journal disclosed that Attorney General Eric Holder had authorized the National Counterterrorism Center to copy and examine pretty much any information the government has collected about you. In the past, the agency couldn’t store information about ordinary Americans unless they were suspects in or party to a specific investigation. Under the new orders, flight records, lists of Americans hosting foreign-exchange students, financial records of people seeking federally backed mortgages, health records of patients at veterans’ hospitals — pick a database, and this obscure agency has permission to study it for patterns that ostensibly predict terrorist behavior, and to share it with foreign governments, whether or not you are suspected of any wrongdoing. The new rules were subjected to robust official debate — all behind closed doors.
privacy  government  data  sxswtalk 
january 2013 by jenlowe
RE: Open Knowledge Festival—Between Open Data and Public Knowledge
I often use the DIKW model, which generally says that DATA given context becomes INFORMATION, given meaning becomes KNOWLEDGE, given insight becomes WISDOM. I would add “…given organizing becomes ACTION”. Yet, the Open Knowledge Festival was much more concerned with open data than with open knowledge. It seems like this movement is breeding a new phenomenon: the spreadsheet activist. It’s a somewhat apolitical way of dealing with politics and that’s not good. When there was any discussion of contextualizing the data, the automatic solution was always: Data visualization. And you know what I think about that…
open  data  knowledge  via@mushon  spreadsheet  activism  warning  criticism 
october 2012 by jenlowe
sed, awk, grep for JSON. RT @ReaderMeter Can't wait to start playing with jq, a command line processor for #JSON.
commandline  json  data 
october 2012 by jenlowe
