gelman.andrew   37

[1507.02646] Pareto Smoothed Importance Sampling
"Importance weighting is a general way to adjust Monte Carlo integration to account for draws from the wrong distribution, but the resulting estimate can be noisy when the importance ratios have a heavy right tail. This routinely occurs when there are aspects of the target distribution that are not well captured by the approximating distribution, in which case more stable estimates can be obtained by modifying extreme importance ratios. We present a new method for stabilizing importance weights using a generalized Pareto distribution fit to the upper tail of the distribution of the simulated importance ratios. The method, which empirically performs better than existing methods for stabilizing importance sampling estimates, includes stabilized effective sample size estimates, Monte Carlo error estimates and convergence diagnostics."
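--- A minimal sketch of the problem the abstract describes, not of the paper's method: it computes a plain self-normalized importance sampling estimate and the usual effective sample size, with no Pareto smoothing. The target, the proposal, and the function names (`normal_logpdf`, `importance_estimate`) are assumptions made up for illustration.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def importance_estimate(draws, log_target, log_proposal, f):
    """Self-normalized importance sampling estimate of E_target[f(x)].

    Returns the estimate and the effective sample size
    (sum of weights)^2 / (sum of squared weights).
    """
    log_ratios = [log_target(x) - log_proposal(x) for x in draws]
    # Subtract the maximum before exponentiating to avoid overflow;
    # the constant cancels in both the estimate and the ESS.
    largest = max(log_ratios)
    weights = [math.exp(lr - largest) for lr in log_ratios]
    total = sum(weights)
    estimate = sum(w * f(x) for w, x in zip(weights, draws)) / total
    ess = total * total / sum(w * w for w in weights)
    return estimate, ess


# Illustrative setup (an assumption, not from the paper): the target is
# N(0, 1) and the proposal N(0, 0.5^2) is too narrow, so a few draws in
# the tails receive very large importance ratios.
rng = random.Random(0)
draws = [rng.gauss(0.0, 0.5) for _ in range(2000)]
estimate, ess = importance_estimate(
    draws,
    log_target=lambda x: normal_logpdf(x, 0.0, 1.0),
    log_proposal=lambda x: normal_logpdf(x, 0.0, 0.5),
    f=lambda x: x * x,  # second moment; equals 1.0 under the target
)
print(f"estimate={estimate:.3f}, ESS={ess:.1f} of {len(draws)} draws")
```

When the proposal matches the target, every weight is equal and the effective sample size equals the number of draws; the narrow proposal above drives it well below that, which is the heavy-right-tail situation the paper sets out to stabilize.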
to:NB  monte_carlo  importance_sampling  heavy_tails  computational_statistics  statistics  re:fitness_sampling  gelman.andrew 
17 days ago by cshalizi
When the Revolution Came for Amy Cuddy - The New York Times
Morals under the "to teach" tag:
1. Don't do science like this.
2. Don't be a jerk when criticizing others for doing bad science.
(I realize that I am one to talk about #2.)
have_read  social_science_methodology  social_psychology  psychology  replication_crisis  gelman.andrew  popular_social_science  data_analysis  to_teach:undergrad-research 
january 2018 by cshalizi
Red State/Blue State Divisions in the 2012 Presidential Election
"The so-called “red/blue paradox” is that rich individuals are more likely to vote Republican but rich states are more likely to support the Democrats. Previous research argued that this seeming paradox could be explained by comparing rich and poor voters within each state – the difference in the Republican vote share between rich and poor voters was much larger in low-income, conservative, middle-American states like Mississippi than in high-income, liberal, coastal states like Connecticut. We use exit poll and other survey data to assess whether this was still the case for the 2012 Presidential election. Based on this preliminary analysis, we find that, while the red/blue paradox is still strong, the explanation offered by Gelman et al. no longer appears to hold. We explore several empirical patterns from this election and suggest possible avenues for resolving the questions posed by the new data."
to:NB  have_read  us_politics  statistics  to_teach:undergrad-ADA  to_teach:statcomp  kith_and_kin  gelman.andrew  via:cshalizi 
march 2014 by resteorts
Red State/Blue State Divisions in the 2012 Presidential Election
"The so-called “red/blue paradox” is that rich individuals are more likely to vote Republican but rich states are more likely to support the Democrats. Previous research argued that this seeming paradox could be explained by comparing rich and poor voters within each state – the difference in the Republican vote share between rich and poor voters was much larger in low-income, conservative, middle-American states like Mississippi than in high-income, liberal, coastal states like Connecticut. We use exit poll and other survey data to assess whether this was still the case for the 2012 Presidential election. Based on this preliminary analysis, we find that, while the red/blue paradox is still strong, the explanation offered by Gelman et al. no longer appears to hold. We explore several empirical patterns from this election and suggest possible avenues for resolving the questions posed by the new data."
to:NB  have_read  us_politics  statistics  to_teach:undergrad-ADA  to_teach:statcomp  kith_and_kin  gelman.andrew 
july 2013 by cshalizi
Taylor & Francis Online :: Infovis and Statistical Graphics: Different Goals, Different Looks - Journal of Computational and Graphical Statistics - Volume 22, Issue 1
"The importance of graphical displays in statistical practice has been recognized sporadically in the statistical literature over the past century, with wider awareness following Tukey's Exploratory Data Analysis and Tufte's books in the succeeding decades. But statistical graphics still occupy an awkward in-between position: within statistics, exploratory and graphical methods represent a minor subfield and are not well integrated with larger themes of modeling and inference. Outside of statistics, infographics (also called information visualization or Infovis) are huge, but their purveyors and enthusiasts appear largely to be uninterested in statistical principles.
"We present here a set of goals for graphical displays discussed primarily from the statistical point of view and discuss some inherent contradictions in these goals that may be impeding communication between the fields of statistics and Infovis. One of our constructive suggestions, to Infovis practitioners and statisticians alike, is to try not to cram into a single graph what can be better displayed in two or more. We recognize that we offer only one perspective and intend this article to be a starting point for a wide-ranging discussion among graphic designers, statisticians, and users of statistical methods. The purpose of this article is not to criticize but to explore the different goals that lead researchers in different fields to value different aspects of data visualization."

--- The comment by Wickham looks especially useful.
to:NB  visual_display_of_quantitative_information  statistics  gelman.andrew 
march 2013 by cshalizi
Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses « Statistical Modeling, Causal Inference, and Social Science
"Some things I respect
"When it comes to meta-models of statistics, here are two philosophies that I respect:
"1. (My) Bayesian approach, which I associate with E. T. Jaynes, in which you construct models with strong assumptions, ride your models hard, check their fit to data, and then scrap them and improve them as necessary.
"2. At the other extreme, model-free statistical procedures that are designed to work well under very weak assumptions—for example, instead of assuming a distribution is Gaussian, you would just want the procedure to work well under some conditions on the smoothness of the second derivative of the log density function.
"Both the above philosophies recognize that (almost) all important assumptions will be wrong, and they resolve this concern via aggressive model checking or via robustness. And of course there are intermediate positions, such as working with Bayesian models that have been shown to be robust, and then still checking them. Or, to flip it around, using robust methods and checking their implicit assumptions.
"I don’t like these
"The statistical philosophies I don’t like so much are those that make strong assumptions with no checking and no robustness. For example, the purely subjective Bayes approach in which it’s illegal to check the fit of a model because it’s supposed to represent your personal belief. I’ve always thought this was ridiculous, first because personal beliefs should be checked where possible, second because it’s hard for me to believe that all these analysts happen to be using logistic regression, normal distributions, and all the other standard tools, out of personal belief. Or the likelihood approach, advocated by those people who refuse to make any assumptions or restrictions on parameters but are willing to rely 100% on the normal distributions, logistic regressions, etc., that they pull out of the toolbox."
to_teach:undergrad-ADA  statistics  foundations_of_statistics  gelman.andrew 
march 2013 by cshalizi
[1302.2142] Simulation-efficient shortest probability intervals
"Bayesian highest posterior density (HPD) intervals can be estimated directly from simulations via empirical shortest intervals. Unfortunately, these can be noisy (that is, have a high Monte Carlo error). We derive an optimal weighting strategy using bootstrap and quadratic programming to obtain a more computationally stable HPD, or in general, shortest probability interval (Spin). We prove the consistency of our method. Simulation studies on a range of theoretical and real-data examples, some with symmetric and some with asymmetric posterior densities, show that intervals constructed using Spin have better coverage (relative to the posterior distribution) and lower Monte Carlo error than empirical shortest intervals. We implement the new method in an R package (SPIn) so it can be routinely used in post-processing of Bayesian simulations."
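--- A rough sketch of the baseline the abstract starts from, the empirical shortest interval, and not of the Spin weighting itself (no bootstrap, no quadratic programming). The function name `empirical_shortest_interval` and the toy draws are assumptions for illustration.

```python
import math


def empirical_shortest_interval(draws, coverage=0.95):
    """Narrowest interval containing ceil(coverage * n) of the sorted draws."""
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    xs = sorted(draws)
    n = len(xs)
    k = math.ceil(coverage * n)  # number of draws the interval must contain
    if k < 2:
        raise ValueError("need at least two draws inside the interval")
    # Slide a window of k consecutive order statistics and keep the narrowest.
    start = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[start], xs[start + k - 1]


# Toy draws (hypothetical): two of the four values sit close together,
# so the shortest 50% interval is the one spanning that pair.
lower, upper = empirical_shortest_interval([0.0, 5.0, 5.5, 9.0], coverage=0.5)
print(lower, upper)
```

Because the endpoints are individual order statistics, they move around from one simulation run to the next; that Monte Carlo noise is what the paper's weighting strategy is meant to reduce.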
in_NB  confidence_sets  monte_carlo  kith_and_kin  gelman.andrew 
february 2013 by cshalizi
What Too Close to Call Really Means - NYTimes.com
"A political statistician explains why it's hard to get our heads around the presidential polls."
op-ed  politics  political.science  gelman.andrew  statistics 
november 2012 by jimmykduong
Why we (usually) don’t have to worry about multiple comparison
Abstract: "Applied researchers often find themselves making statistical inferences in settings that would seem to require multiple comparisons adjustments. We challenge the Type I error paradigm that underlies these corrections. Moreover we posit that the problem of multiple comparisons can disappear entirely when viewed from a hierarchical Bayesian perspective. We propose building multilevel models in the settings where multiple comparisons arise.
Multilevel models perform partial pooling (shifting estimates toward each other), whereas classical procedures typically keep the centers of intervals stationary, adjusting for multiple comparisons by making the intervals wider (or, equivalently, adjusting the p-values corresponding to intervals of fixed width). Thus, multilevel models address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern."
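A minimal sketch of what "partial pooling" means here, assuming the simplest normal-normal multilevel model with the group-level mean `mu` and standard deviation `tau` treated as known (in a real multilevel model they would be estimated). The function name `partial_pool` and the numbers are hypothetical.

```python
def partial_pool(estimates, std_errors, mu, tau):
    """Posterior means under y_j ~ N(theta_j, se_j^2), theta_j ~ N(mu, tau^2).

    Each group's estimate is a precision-weighted average of its own
    data and the group-level mean, so estimates shift toward each other.
    """
    pooled = []
    for y, se in zip(estimates, std_errors):
        data_precision = 1.0 / se ** 2
        group_precision = 1.0 / tau ** 2
        weighted_sum = data_precision * y + group_precision * mu
        pooled.append(weighted_sum / (data_precision + group_precision))
    return pooled


# Hypothetical numbers: two noisy group estimates on either side of mu = 0.
wide = partial_pool([10.0, -10.0], [1.0, 1.0], mu=0.0, tau=1.0)
narrow = partial_pool([10.0, -10.0], [1.0, 1.0], mu=0.0, tau=0.5)
print(wide, narrow)
```

The smaller `tau` is, the harder the estimates are pulled toward `mu`, which matches the abstract's remark that the adjustment matters most when group-level variation is low.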
Gelman.Andrew  bayesian_inference  hierarchical_modeling  multiple_comparisons  Type_S_error  statistical_significance 
november 2011 by edanielson
Even the liberal New Republic opposes Occupy Wall Street: What does that mean? — The Monkey Cage
Why, at this late date, would anyone think that the New Republic _wants_ to advance liberal goals? (Which is Andrew's point, I think, when one unwraps all the nested ironies.)
even_the_liberal_new_republic  us_politics  occupy_wall_street  gelman.andrew 
october 2011 by cshalizi

