gelman.andrew   37

[1507.02646] Pareto Smoothed Importance Sampling
"Importance weighting is a general way to adjust Monte Carlo integration to account for draws from the wrong distribution, but the resulting estimate can be noisy when the importance ratios have a heavy right tail. This routinely occurs when there are aspects of the target distribution that are not well captured by the approximating distribution, in which case more stable estimates can be obtained by modifying extreme importance ratios. We present a new method for stabilizing importance weights using a generalized Pareto distribution fit to the upper tail of the distribution of the simulated importance ratios. The method, which empirically performs better than existing methods for stabilizing importance sampling estimates, includes stabilized effective sample size estimates, Monte Carlo error estimates and convergence diagnostics."
to:NB  monte_carlo  importance_sampling  heavy_tails  computational_statistics  statistics  re:fitness_sampling  gelman.andrew
17 days ago by cshalizi
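For context on the importance-sampling abstract above, here is a minimal sketch of a plain self-normalized importance sampling estimate together with the usual effective sample size, (sum w)^2 / sum w^2. It is not the Pareto smoothing step the paper proposes; the function name and argument layout are assumptions made for illustration.

```python
import math

def self_normalized_is(log_ratios, values):
    """Self-normalized importance sampling estimate and effective sample size.

    log_ratios: log(target density / approximating density) at each draw
    values:     the quantity of interest evaluated at each draw
    (Hypothetical helper for illustration; not code from the paper.)
    """
    # Subtract the largest log ratio before exponentiating, for stability.
    m = max(log_ratios)
    weights = [math.exp(lr - m) for lr in log_ratios]
    total = sum(weights)
    estimate = sum(w * v for w, v in zip(weights, values)) / total
    # Effective sample size: (sum w)^2 / sum w^2.
    ess = total ** 2 / sum(w * w for w in weights)
    return estimate, ess
```

When one ratio dwarfs the rest, the effective sample size falls toward 1, which is the heavy right tail problem the abstract describes.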
When the Revolution Came for Amy Cuddy - The New York Times
Morals under the "to teach" tag:
1. Don't do science like this.
2. Don't be a jerk when criticizing others for doing bad science.
(I realize that I am one to talk about #2.)
january 2018 by cshalizi
Red State/Blue State Divisions in the 2012 Presidential Election
"The so-called “red/blue paradox” is that rich individuals are more likely to vote Republican but rich states are more likely to support the Democrats. Previous research argued that this seeming paradox could be explained by comparing rich and poor voters within each state – the difference in the Republican vote share between rich and poor voters was much larger in low-income, conservative, middle-American states like Mississippi than in high-income, liberal, coastal states like Connecticut. We use exit poll and other survey data to assess whether this was still the case for the 2012 Presidential election. Based on this preliminary analysis, we find that, while the red/blue paradox is still strong, the explanation offered by Gelman et al. no longer appears to hold. We explore several empirical patterns from this election and suggest possible avenues for resolving the questions posed by the new data."
march 2014 by resteorts
Red State/Blue State Divisions in the 2012 Presidential Election
"The so-called “red/blue paradox” is that rich individuals are more likely to vote Republican but rich states are more likely to support the Democrats. Previous research argued that this seeming paradox could be explained by comparing rich and poor voters within each state – the difference in the Republican vote share between rich and poor voters was much larger in low-income, conservative, middle-American states like Mississippi than in high-income, liberal, coastal states like Connecticut. We use exit poll and other survey data to assess whether this was still the case for the 2012 Presidential election. Based on this preliminary analysis, we find that, while the red/blue paradox is still strong, the explanation offered by Gelman et al. no longer appears to hold. We explore several empirical patterns from this election and suggest possible avenues for resolving the questions posed by the new data."
july 2013 by cshalizi
Taylor & Francis Online :: Infovis and Statistical Graphics: Different Goals, Different Looks - Journal of Computational and Graphical Statistics - Volume 22, Issue 1
"The importance of graphical displays in statistical practice has been recognized sporadically in the statistical literature over the past century, with wider awareness following Tukey's Exploratory Data Analysis and Tufte's books in the succeeding decades. But statistical graphics still occupy an awkward in-between position: within statistics, exploratory and graphical methods represent a minor subfield and are not well integrated with larger themes of modeling and inference. Outside of statistics, infographics (also called information visualization or Infovis) are huge, but their purveyors and enthusiasts appear largely to be uninterested in statistical principles.
"We present here a set of goals for graphical displays discussed primarily from the statistical point of view and discuss some inherent contradictions in these goals that may be impeding communication between the fields of statistics and Infovis. One of our constructive suggestions, to Infovis practitioners and statisticians alike, is to try not to cram into a single graph what can be better displayed in two or more. We recognize that we offer only one perspective and intend this article to be a starting point for a wide-ranging discussion among graphic designers, statisticians, and users of statistical methods. The purpose of this article is not to criticize but to explore the different goals that lead researchers in different fields to value different aspects of data visualization."

--- The comment by Wickham looks especially useful.
to:NB  visual_display_of_quantitative_information  statistics  gelman.andrew
march 2013 by cshalizi
Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses « Statistical Modeling, Causal Inference, and Social Science
"Some things I respect
"When it comes to meta-models of statistics, here are two philosophies that I respect:
"1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary.
"2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function.
"Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it around, using robust methods and checking their implicit assumptions.
"I don’t like these
"The statistical philosophies I don’t like so much are those that make strong assumptions with no checking and no robustness. For example, the purely subjective Bayes approach in which it’s illegal to check the fit of a model because it’s supposed to represent your personal belief. I’ve always thought this was ridiculous, first because personal beliefs should be checked where possible, second because it’s hard for me to believe that all these analysts happen to be using logistic regression, normal distributions, and all the other standard tools, out of personal belief. Or the likelihood approach, advocated by those people who refuse to make any assumptions or restrictions on parameters but are willing to rely 100% on the normal distributions, logistic regressions, etc., that they pull out of the toolbox."
march 2013 by cshalizi
[1302.2142] Simulation-efficient shortest probability intervals
"Bayesian highest posterior density (HPD) intervals can be estimated directly from simulations via empirical shortest intervals. Unfortunately, these can be noisy (that is, have a high Monte Carlo error). We derive an optimal weighting strategy using bootstrap and quadratic programming to obtain a more computationally stable HPD, or in general, shortest probability interval (Spin). We prove the consistency of our method. Simulation studies on a range of theoretical and real-data examples, some with symmetric and some with asymmetric posterior densities, show that intervals constructed using Spin have better coverage (relative to the posterior distribution) and lower Monte Carlo error than empirical shortest intervals. We implement the new method in an R package (SPIn) so it can be routinely used in post-processing of Bayesian simulations."
in_NB  confidence_sets  monte_carlo  kith_and_kin  gelman.andrew
february 2013 by cshalizi
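The "empirical shortest interval" that the Spin abstract takes as its noisy baseline can be sketched as below: sort the draws and pick the narrowest window that holds the required share of them. This is only the baseline, not the bootstrap and quadratic-programming weighting of Spin itself; the function name and the choice of ceil(prob * n) draws per window are assumptions for illustration.

```python
import math

def empirical_shortest_interval(draws, prob=0.95):
    """Narrowest interval containing ceil(prob * n) of the sorted draws.

    (Hypothetical helper; the baseline estimator, not the Spin method.)
    """
    n = len(draws)
    if n == 0 or not (0.0 < prob <= 1.0):
        raise ValueError("need at least one draw and 0 < prob <= 1")
    xs = sorted(draws)
    k = math.ceil(prob * n)  # number of draws each candidate window must hold
    # Slide a window of k consecutive order statistics and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]
```

Because the endpoints are single order statistics, the result can jump between runs of the same sampler, which is the Monte Carlo noise the abstract refers to.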
What Too Close to Call Really Means - NYTimes.com
"A political statistician explains why it's hard to get our heads around the presidential polls."
op-ed  politics  political.science  gelman.andrew  statistics
november 2012 by jimmykduong
Why we (usually) don’t have to worry about multiple comparisons
Abstract: "Applied researchers often find themselves making statistical inferences in settings that would seem to require multiple comparisons adjustments. We challenge the Type I error paradigm that underlies these corrections. Moreover we posit that the problem of multiple comparisons can disappear entirely when viewed from a hierarchical Bayesian perspective. We propose building multilevel models in the settings where multiple comparisons arise.
Multilevel models perform partial pooling (shifting estimates toward each other), whereas classical procedures typically keep the centers of intervals stationary, adjusting for multiple comparisons by making the intervals wider (or, equivalently, adjusting the p-values corresponding to intervals of fixed width). Thus, multilevel models address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern."
Gelman.Andrew  bayesian_inference  hierarchical_modeling  multiple_comparisons  Type_S_error  statistical_significance
november 2011 by edanielson
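The partial pooling described in the multiple-comparisons abstract above can be illustrated with the standard normal-normal shrinkage formula, in which each group estimate is a precision-weighted average of its own raw estimate and the group-level mean. This is a sketch under strong assumptions: the group-level mean mu and standard deviation tau are treated as known, whereas a fitted multilevel model would estimate them from the data. The function name is hypothetical.

```python
def partially_pool(estimates, std_errors, mu, tau):
    """Shrink each raw estimate toward mu, weighting by precision.

    estimates:  raw per-group estimates y_j
    std_errors: their standard errors sigma_j
    mu, tau:    group-level mean and sd, ASSUMED known for this sketch
    """
    pooled = []
    prec_group = 1.0 / tau ** 2
    for y, se in zip(estimates, std_errors):
        prec_data = 1.0 / se ** 2
        # Precision-weighted average of the raw estimate and the group mean.
        pooled.append((prec_data * y + prec_group * mu) / (prec_data + prec_group))
    return pooled
```

A smaller tau pulls the estimates closer together, matching the abstract's remark that pooling matters most when group-level variation is low.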
Even the liberal New Republic opposes Occupy Wall Street: What does that mean? — The Monkey Cage
Why, at this late date, would anyone think that the New Republic _wants_ to advance liberal goals? (Which is Andrew's point, I think, when one unwraps all the nested ironies.)
even_the_liberal_new_republic  us_politics  occupy_wall_street  gelman.andrew
october 2011 by cshalizi
