As far as I can tell, the pitch for data scientists from Silicon Valley is: "Come work here, you can build advertising models and pretend that you're saving the world," while the pitch for data scientists from Wall Street is: "Come work here, you can build trading models and not have to pretend that you're saving the world." I actually think that is a useful sorting metric, and I know which one I would take.
3 days ago
Econometrics Beat: Dave Giles' Blog: Micronumerosity
"Econometrics texts devote many pages to the problem of multicollinearity in multiple regression, but they say little about the closely analogous problem of small sample size in estimating a univariate mean. Perhaps that imbalance is attributable to the lack of an exotic polysyllabic name for 'small sample size'. If so, we can remove that impediment by introducing the term micronumerosity.

Suppose an econometrician set out to write a chapter about small sample size in sampling from a univariate population. Judging from what is now written about multicollinearity, the chapter might look like this:

1. Micronumerosity
The extreme case, 'exact micronumerosity', arises when n = 0, in which case the sample estimate of μ is not unique. (Technically, there is a violation of the rank condition n > 0: the matrix 0 is singular.) The extreme case is easy enough to recognize. 'Near micronumerosity' is more subtle, and yet very serious. It arises when the rank condition n > 0 is barely satisfied. Near micronumerosity is very prevalent in empirical economics."
8 weeks ago
The Long Con - Rick Perlstein on Republicans and lying
These are bedtime stories, meant for childlike minds. Or, more to the point, they are in the business of producing childlike minds. Conjuring up the most garishly insatiable monsters precisely in order to banish them from underneath the bed, they aim to put the target to sleep.
And that, at last, may be the explanation for Mitt Romney’s apparently bottomless penchant for lying in public. If the 2012 GOP nominee lied louder than most—and even more astoundingly than he has during his prior campaigns—it’s just because he felt like he had more to prove to his core following. Lying is an initiation into the conservative elite. In this respect, as in so many others, it’s like multilayer marketing: the ones at the top reap the reward—and then they preen, pleased with themselves for mastering the game. Closing the sale, after all, is mainly a question of riding out the lie: showing that you have the skill and the stones to just brazen it out, and the savvy to ratchet up the stakes higher and higher. Sneering at, or ignoring, your earnest high-minded mandarin gatekeepers—“we’re not going to let our campaign be dictated by fact-checkers,” as one Romney aide put it—is another part of closing the deal. For years now, the story in the mainstream political press has been Romney’s difficulty in convincing conservatives, finally, that he is truly one of them. For these elites, his lying—so dismaying to the opinion-makers at the New York Times, who act like this is something new—is how he has pulled it off once and for all. And at the grassroots, his fluidity with their preferred fables helps them forget why they never trusted the guy in the first place.
8 weeks ago
The Crisis of Attention Theft—Ads That Steal Your Time for Nothing in Return | WIRED
“I tremble for the sanity of a society that talks, on the level of abstract principle, of the precious integrity of the individual mind, and all the while, on the level of concrete fact, forces the individual mind to spend a good part of every day under bombardment with whatever some crowd of promoters want to throw at it.”
9 weeks ago
CS 228 notes
These notes form a concise introductory course on probabilistic graphical models. Probabilistic graphical models are a subfield of machine learning that studies how to describe and reason about the world in terms of probabilities. They are based on Stanford CS228, taught by Stefano Ermon, and are written by Volodymyr Kuleshov, with the help of many students and course staff.
9 weeks ago
jtleek/datasharing: The Leek group guide to data sharing
To facilitate the most efficient and timely analysis this is the information you should pass to a statistician:

1. The raw data.
2. A tidy data set
3. A code book describing each variable and its values in the tidy data set.
4. An explicit and exact recipe you used to go from 1 -> 2,3
10 weeks ago
Reproducible Data Analysis in Jupyter | Pythonic Perambulations
Jupyter notebooks provide a useful environment for interactive exploration of data. A common question I get, though, is how you can progress from this nonlinear, interactive, trial-and-error style of exploration to a more linear and reproducible analysis based on organized, packaged, and tested code. This series of videos presents a case study in how I personally approach reproducible data analysis within the Jupyter notebook.
10 weeks ago
The Long, Lucrative Right-wing Grift Is Blowing Up in the World's Face
Rather rapidly, two things happened: First, Republicans realized they’d radicalized their base to a point where nothing they did in power could satisfy their most fervent constituents. Then—in a much more consequential development—a large portion of the Republican Congressional caucus became people who themselves consume garbage conservative media, and nothing else.

That, broadly, explains the dysfunction of the Obama era, post-Tea Party freakout. Congressional Republicans went from people who were able to turn their bullshit-hose on their constituents, in order to rile them up, to people who pointed it directly at themselves, mouths open.
10 weeks ago
The DADSS Midterm Grading Procedure
score = 1 + ln(p)/ln(4), where p is probability you assign to correct answer
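A minimal sketch of the scoring rule quoted above, to make the arithmetic concrete. The function name `log_score` and the `base` parameter are hypothetical, and the reading that the 4 corresponds to four answer choices (so a uniform guess scores zero) is an assumption, not something the quoted line states.

```python
import math


def log_score(p: float, base: float = 4.0) -> float:
    """Return 1 + ln(p)/ln(base) for the probability p placed on the correct answer.

    `base` defaults to 4 to match the quoted formula. Treating it as the
    number of answer choices is an assumption made for illustration.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if p == 0.0:
        # ln(0) is undefined; the limit of the formula is negative infinity.
        return float("-inf")
    return 1.0 + math.log(p) / math.log(base)


if __name__ == "__main__":
    # Full confidence, a coin flip, a uniform guess over four, and a near miss.
    for p in (1.0, 0.5, 0.25, 0.01):
        print(f"p = {p:<5} score = {log_score(p):.3f}")
```

Under this rule, certainty in the right answer earns the full point, putting 0.25 on it earns nothing, and anything lower goes negative without bound, so a confident wrong answer costs far more than a hedged one.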
10 weeks ago
Equal pay: New York banned employers from asking job candidates about past salaries - The Washington Post
The New York City Council isn't fond of it, either. In a vote Wednesday, it approved legislation that will ban employers from asking job applicants about what they make in their current or past job and could have far-reaching consequences beyond the city as employers try to standardize their practices. It's an idea that's starting to spread: In passing the measure, New York City joins Massachusetts, Puerto Rico and the city of Philadelphia — where the local Chamber of Commerce filed a lawsuit against that measure Thursday — in banning the question from job interviews. More than 20 other city and state legislatures have introduced similar provisions.
10 weeks ago
Why bots aren’t the real AI disruption – Textio Word Nerd
Many of today’s bots are kind of a hipster façade around the same basic command line interfaces consumers abandoned in the 1980s. They require specific syntaxes and understand only a limited vocabulary—but they sure have personality!
While the added convenience of language recognition is a benefit, until bots are capable of performing very complex and novel tasks that richly combine actions and context across the boundaries of apps and sites in unique ways the first time they are asked, we will be limited to trying to remember the 489 commands Siri recognizes. (Yes that link is a man page for Siri. ~sigh~)
10 weeks ago
Astronomers explore uses for AI-generated images : Nature News & Comment
"Generative AIs look promising for basic science, too, says Welling, who is helping to develop software for the Square Kilometre Array (SKA), a radio-astronomy observatory to be built in South Africa and Australia. The SKA will produce such vast amounts of data that its images will need to be compressed into low-noise but patchy data. Generative AI models will help to reconstruct and fill in blank parts of those data, producing the images of the sky that astronomers will examine.

A team led by Rachel Mandelbaum, an astrophysicist at Carnegie Mellon University, has been experimenting with both GANs and VAEs to simulate images of galaxies that look deformed because of gravitational lensing — when the gravity of objects in the foreground distorts space-time and warps light rays. Researchers are planning to survey huge numbers of galaxies to map gravitational lensing across the Universe’s history. This could show how the distribution of the Universe’s matter has changed over time, providing clues to the nature of the dark energy that is thought to have driven cosmic expansion. But to do this, astronomers need software that can reliably separate gravitational lensing from other effects. Synthetic images will improve the programs’ accuracy, Mandelbaum says."

10 weeks ago
The Corporation Does Not Always Have To Win
You are not the corporation. You are the human. It is okay for the corporation to lose a small portion of what it has in terrifying overabundance (money, time, efficiency) in order to preserve what a human has that cannot ever be replaced (dignity, humanity, conscience, life). It is okay for you to prioritize your affinity with your fellow humans over your subservience to the corporation, and to imagine and broker outcomes based on this ordering of things. It is okay for the corporation to lose. It will return to its work of churning the living world into dead sand presently.
11 weeks ago
