cshalizi + o'neil.cathy   6

[1907.09013] Conscientious Classification: A Data Scientist's Guide to Discrimination-Aware Classification
"Recent research has helped to cultivate growing awareness that machine learning systems fueled by big data can create or exacerbate troubling disparities in society. Much of this research comes from outside of the practicing data science community, leaving its members with little concrete guidance to proactively address these concerns. This article introduces issues of discrimination to the data science community on its own terms. In it, we tour the familiar data mining process while providing a taxonomy of common practices that have the potential to produce unintended discrimination. We also survey how discrimination is commonly measured, and suggest how familiar development processes can be augmented to mitigate systems' discriminatory potential. We advocate that data scientists should be intentional about modeling and reducing discriminatory outcomes. Without doing so, their efforts will result in perpetuating any systemic discrimination that may exist, but under a misleading veil of data-driven objectivity."
to:NB  classifiers  algorithmic_fairness  prediction  to_teach:data-mining  o'neil.cathy 
12 weeks ago by cshalizi
War of the machines, college edition | mathbabe
"OK, so we know that the machine can grade essays written for human consumption pretty accurately. But it hasn’t had to deal with essays written for machine consumption yet. There’s major room for gaming here, and only a matter of time before there’s a competing algorithm to build a great essay. I even know how to train that algorithm. Email me privately and we can make a deal on profit-sharing.
"And considering that students will be able to get their drafts graded as many times as they want, as Mayfield advertised, this will only be easier. If I build an essay that I think should game the machine, by putting in lots of (relevant) long vocabulary words and erudite phrases, then I can always double check by having the system give me a grade. If it doesn’t work, I’ll try again.
"And the essays built this way won’t get caught via the fraud detection software that finds plagiarism, because any good essay-builder will only steal smallish phrases.
"One final point. The fact that the machine-learning grading algorithm only works when it’s been trained on thousands of essays points to yet another depressing trend: large-scale classes with the same exact assignments every semester so last year’s algorithm can be used, in the name of efficiency."

But that means last year’s essay-building algorithm can be used as well. Pretty soon it will just be a war of the machines.
education  machine_learning  text_mining  standardized_testing  o'neil.cathy 
april 2013 by cshalizi
Open data is not a panacea « mathbabe
"Which brings me to my second point about open data. It’s general wisdom that we should hope for the best but prepare for the worst. My feeling is that as we move towards open data we are doing plenty of the hoping part but not enough of the preparing part.
"If there’s one thing I learned working in finance, it’s not to be naive about how information will be used. You’ve got to learn to think like an asshole to really see what to worry about. It’s a skill which I don’t regret having.
"So, if you’re giving me information on where public schools need help, I’m going to imagine using that information to cut off credit for people who live nearby. If you tell me where environmental complaints are being served, I’m going to draw a map and see where they aren’t being served so I can take my questionable business practices there."

--- Where has the concept of "thinking like an asshole" been all my life?
data_mining  open_data  thinking_like_an_asshole  o'neil.cathy  to_teach:data-mining 
december 2012 by cshalizi
Nate Silver confuses cause and effect, ends up defending corruption « mathbabe
I see I am going to have to read this. At the least, I'd like to see how he explains all the financial crises we had before (bad) financial models.
book_reviews  silver.nate  o'neil.cathy  prediction  modeling  data_mining  statistics  the_objective_function_which_can_be_admitted_to_is_not_the_true_objective_function 
december 2012 by cshalizi
