Uncovering Big Bias with Big Data
What follows is the story of how I used Virginia court cases to discover what best predicts defendant outcomes: race or income.
crime  bias  legal 
june 2016
20 lines of code that will beat A/B testing every time
Multi-armed bandits: One strategy that has been shown to perform well time after time in practical problems is the epsilon-greedy method. We always keep track of the number of pulls of the lever and the amount of rewards we have received from that lever. 10% of the time, we choose a lever at random. The other 90% of the time, we choose the lever that has the highest expectation of rewards.
abtesting  statistics 
june 2016
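A rough sketch of the epsilon-greedy strategy quoted in the bandits bookmark above, to make the bookkeeping concrete. This is not the article's code: the class name, method names, and the choice to treat an untried lever as having an expectation of 0.0 are all assumptions made for illustration.

```python
import random


class EpsilonGreedy:
    """Hypothetical epsilon-greedy bandit; names are illustrative, not from the article."""

    def __init__(self, n_levers, epsilon=0.1, rng=None):
        self.epsilon = epsilon             # fraction of pulls spent exploring (10% in the quote)
        self.pulls = [0] * n_levers        # number of pulls per lever
        self.rewards = [0.0] * n_levers    # total reward received per lever
        self.rng = rng or random.Random()

    def expectation(self, lever):
        # Assumption: a lever that has never been pulled has expectation 0.0.
        if self.pulls[lever] == 0:
            return 0.0
        return self.rewards[lever] / self.pulls[lever]

    def choose(self):
        # Explore: with probability epsilon, pick a lever uniformly at random.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.pulls))
        # Exploit: otherwise pick the lever with the highest observed mean reward.
        return max(range(len(self.pulls)), key=self.expectation)

    def update(self, lever, reward):
        # Record one pull of the lever and the reward it returned.
        self.pulls[lever] += 1
        self.rewards[lever] += reward
```

To use it, call `choose()` to pick a variant for each visitor, then `update(lever, reward)` with 1.0 for a conversion and 0.0 otherwise. Setting `epsilon=0.0` makes it purely greedy, which is handy for testing.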
“I thought we’re the good guys?” — Humane Tech — Medium
No matter how good our intentions, people who work in tech will be seen as the new robber barons if we don’t start being more self-critical. Who’s going to step up?
tech  siliconvalley 
june 2016
Stanislav Nikolov on Twitter: "Dancing is a kind of encoding of music"
I'm going to use this metaphor in a talk one day so I'm bookmarking it now
neuralnetworks  music  dancing  word2vec  language 
june 2016
Fear and Loathing of the English Passive by Geoffrey K. Pullum
Writing advisers have been condemning the English passive since the early 20th century. I provide an informal but comprehensive syntactic description of passive clauses in English, and then exhibit numerous published examples of incompetent criticism in which critics reveal that they cannot tell passives from actives. Some seem to confuse the grammatical concept with a rhetorical one involving inadequate attribution of agency or responsibility, but not all examples are thus explained. The specific stylistic charges leveled against the passive are entirely baseless. The evidence demonstrates an extraordinary level of grammatical ignorance among educated English language critics.
english  grammar 
june 2016
The Land of the Free and The Elements of Style by Geoffrey K Pullum
I believe the success of Elements to be one of the worst things to have happened to English language education in America in the past century. The book's style advice, largely vapid and obvious ("Do not overwrite"; "Be clear"), may do little damage; but the numerous statements about grammatical correctness are actually harmful. They are riddled with inaccuracies, uninformed by evidence, and marred by bungled analysis. Elements is a dogmatic bookful of bad usage advice, and the people who rely on it have no idea how badly off-beam its grammatical claims are. In this essay I provide some illustrations, and a review of some of the book's most striking faults.
language  style  usage  english  usa 
june 2016
Uncanny Valley | Issue 25 | n+1
A book-related start-up holds a small and sad library, the shelves half-empty, paperbacks and object-oriented-programming manuals sloping against one another. It reminds me of the people who dressed like Michael Jackson to attend Michael Jackson’s funeral.

Around here, we nonengineers are pressed to prove our value. The hierarchy is pervasive, ingrained in the industry’s dismissal of marketing and its insistence that a good product sells itself; evident in the few “office hours” established for engineers (our scheduled opportunity to approach with questions and bugs); reflected in our salaries and equity allotment, even though it’s harder to find a good copywriter than a liberal-arts graduate with a degree in history and twelve weeks’ training from an uncredentialed coding dojo.

Our soft skills are a necessary inconvenience. We bloat payroll; we dilute conversation; we create process and bureaucracy; we put in requests for yoga classes and Human Resources. We’re a dragnet — though we tend to contribute positively to diversity metrics. There is quiet pity for the MBAs.

VENTURE CAPITALISTS HAVE SPEARHEADED massive innovation in the past few decades, not least of which is their incubation of this generation’s very worst prose style. The internet is choked with blindly ambitious and professionally inexperienced men giving each other anecdote-based instruction and bullet-point advice.
siliconvalley  startup  professional 
may 2016
Algorithms, clickworkers, and the befuddled fury around Facebook Trends | Social Media Collective
We prefer the idea that algorithms run on their own, free of the messy bias, subjectivity, and political aims of people. It’s a seductive and persistent myth, one Facebook has enjoyed and propagated. But it’s simply false.

I’ve already commented on this, and many of those who study the social implications of information technology have made this point abundantly clear (including Pasquale, Crawford, Ananny, Tufekci, boyd, Seaver, McKelvey, Sandvig, Bucher, and nearly every essay on this list). But it persists: in statements made by Facebook, in the explanations offered by journalists, even in the words of Facebook’s critics.

If you still think algorithms are neutral because they’re not people, here’s a list, not even an exhaustive one, of the human decisions that have to be made to produce something like Facebook’s Trending Topics (which, keep in mind, pales in scope and importance to Facebook’s larger algorithmic endeavor, the “news feed” listing your friends’ activity). Some are made by the engineers designing the algorithm, others are made by curators who turn the output of the algorithm into something presentable. If your eyes start to glaze over, that’s the point; read any three points and then move on, they’re enough to dispel the myth. Ready?
facebook  bias 
may 2016
The Twelve-Factor App
Store config in the environment
may 2016
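A minimal sketch of what "store config in the environment" can look like in Python, for the Twelve-Factor bookmark above. The variable names `DATABASE_URL` and `DEBUG`, the defaults, and the `load_config` helper are hypothetical and chosen only for illustration.

```python
import os


def load_config(environ=os.environ):
    """Read settings from environment variables instead of a checked-in config file.

    The keys and defaults here are assumptions, not part of the Twelve-Factor text.
    """
    return {
        # Connection string for the backing service; falls back to a local file.
        "database_url": environ.get("DATABASE_URL", "sqlite:///local.db"),
        # Environment values are always strings, so parse the flag explicitly.
        "debug": environ.get("DEBUG", "false").lower() in {"1", "true", "yes"},
    }
```

Passing the mapping in as an argument keeps the function easy to test: a plain dict can stand in for the real environment.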
Alex Payne — Letter To A Young Programmer Considering A Startup
"I am deeply skeptical of this system. I’m skeptical of this system’s slavering, self-congratulatory fetishization of “disruption” while so obviously becoming the sort of stolid institution it seeks to displace. I’m skeptical of the startup community’s often short-term outlook. I’m particularly skeptical of its callous disregard for both the lives of the people who participate in it and the lives of those who live in the world that startups seek to reshape."
startup  business 
may 2016
Startup advice, briefly - Sam Altman
lol he forgot "be a young white man and learn to write like you have suffered from a series of strokes", which is the #1 piece of YC advice
may 2016
google webfonts helper
A Hassle-Free Way to Self-Host Google Fonts
typography  web 
may 2016
Plutocrats at Work: How Big Philanthropy Undermines Democracy | Dissent Mag
“One hundred years later, big philanthropy still aims to solve the world’s problems—with foundation trustees deciding what is a problem and how to fix it. They may act with good intentions, but they define “good.” The arrangement remains thoroughly plutocratic: it is the exercise of wealth-derived power in the public sphere with minimal democratic controls and civic obligations.”

“The main rationale for both the tax exemption and the charitable contribution tax deduction (created in 1917) is to stimulate private giving. Yet this is a weak rationale when applied to the super-rich; a more effective way to stimulate their giving would be to raise the estate and capital gains taxes. It is a meaningless rationale for the 65 percent of American taxpayers who don’t itemize their deductions and therefore can’t use the charity tax break.”

“Sycophancy is built into the structure of philanthropy: grantees shape their work to please their benefactors; they are perpetual supplicants for future funding. As a result, foundation executives and trustees almost never receive critical feedback. They are treated like royalty, which breeds hubris—the occupational disorder of philanthro-barons”

“When the creator of a mega-foundation says, “I can do what I want because it’s my money,” he or she is wrong. A substantial portion of the wealth—35 percent or more, depending on tax rates—has been diverted from the public treasury, where voters would have determined its use.”
charity  philanthropy  usa  Tax  politics  education 
may 2016
Is active investing a zero sum game?
So in practice, active management is worse than a zero sum game.

The fact is it must hold – ‘alpha’ cannot be magicked out of thin air.

The only place an active manager can go to get more or fewer shares than are held by the market is by dealing with other active investors in that market. (Because the passive investors by definition hold the market).

And then you have to subtract those higher costs.
may 2016
autoreload — IPython 3.2.1 documentation
autoreload reloads modules automatically before entering the execution of code typed at the IPython prompt.
python  jupyter 
may 2016
[1604.00289] Building Machines That Learn and Think Like People
Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models.
arxiv  ai  neuralnetworks  reinforcementlearning 
may 2016
Airline On-Time Performance Data
This database contains scheduled and actual departure and arrival times reported by certified U.S. air carriers that account for at least one percent of domestic scheduled passenger revenues. The data is collected by the Office of Airline Information, Bureau of Transportation Statistics (BTS).
data  travel  usa 
may 2016
Photos, dates, and xargs - All this
Unusually clear/concise explanation of the why of find and xargs
april 2016
Chris Albon
Lots of basic/semi-idiomatic python/pandas snippets
python  pandas 
april 2016
DS12 | Overview
🤔 "Python and R are the legacy languages of data science; however, both were designed during the single processor era and are beginning to show their limitations. That’s why we’ve chosen to teach Scala, DataScience’s programming language of choice. Despite being embraced by companies like Twitter, Netflix, and LinkedIn, Scala is largely perceived as “too difficult” for the average data scientist."
functional  datascience  losangeles  education 
april 2016
GRAIL Text Recognizer
An Active Essay Revisiting the GRAIL Handwriting Recognizer
computers  history  20c  ux 
april 2016
USDA ERS - Natural Amenities Scale
The natural amenities scale is a measure of the physical characteristics of a county area that enhance the location as a place to live. The scale was constructed by combining six measures of climate, topography, and water area that reflect environmental qualities most people prefer. These measures are warm winter, winter sun, temperate summer, low summer humidity, topographic variation, and water area. The data are available for counties in the lower 48 States. The file contains the original measures and standardized scores for each county as well as the amenities scale.
data  usa  geography 
april 2016
Baby Names from Social Security Card Applications-National Level Data - Data.gov
"To safeguard privacy, we restrict our list of names to those with at least 5 occurrences."
data  usa 
april 2016