causal_inference   630

Sensitivity analysis for inverse probability weighting estimators via the percentile bootstrap
"To identify the estimand in missing data problems and observational studies, it is common to base the statistical estimation on the ‘missingness at random’ and ‘no unmeasured confounder’ assumptions. However, these assumptions are unverifiable by using empirical data and pose serious threats to the validity of the qualitative conclusions of statistical inference. A sensitivity analysis asks how the conclusions may change if the unverifiable assumptions are violated to a certain degree. We consider a marginal sensitivity model which is a natural extension of Rosenbaum's sensitivity model that is widely used for matched observational studies. We aim to construct confidence intervals based on inverse probability weighting estimators, such that asymptotically the intervals have at least nominal coverage of the estimand whenever the data‐generating distribution is in the collection of marginal sensitivity models. We use a percentile bootstrap and a generalized minimax–maximin inequality to transform this intractable problem into a linear fractional programming problem, which can be solved very efficiently. We illustrate our method by using a real data set to estimate the causal effect of fish consumption on blood mercury level."
to:NB  causal_inference  model_checking  misspecification  statistics  small.dylan 
yesterday by cshalizi
Robust causal structure learning with some hidden variables
"We introduce a new method to estimate the Markov equivalence class of a directed acyclic graph (DAG) in the presence of hidden variables, in settings where the underlying DAG among the observed variables is sparse, and there are a few hidden variables that have a direct effect on many of the observed variables. Building on the so‐called low rank plus sparse framework, we suggest a two‐stage approach which first removes the effect of the hidden variables and then estimates the Markov equivalence class of the underlying DAG under the assumption that there are no remaining hidden variables. This approach is consistent in certain high dimensional regimes and performs favourably when compared with the state of the art, in terms of both graphical structure recovery and total causal effect estimation."
to:NB  causal_discovery  causal_inference  graphical_models  maathuis.marloes  statistics 
yesterday by cshalizi
Novel criteria to exclude the surrogate paradox and their optimalities (Yin, Scandinavian Journal of Statistics)
"When the primary outcome is hard to collect, a surrogate endpoint is typically used as a substitute. However, even when a treatment has a positive average causal effect (ACE) on a surrogate endpoint, which also has a positive ACE on the primary outcome, it is still possible that the treatment has a negative ACE on the primary outcome. Such a phenomenon is called the surrogate paradox and greatly challenges the use of surrogates. In this paper, we provide criteria to exclude the surrogate paradox. Our criteria are optimal in the sense that they are sufficient and “almost necessary” to exclude the paradox: If the conditions are satisfied, the surrogate paradox is guaranteed to be absent, whereas if the conditions fail, there exists a data‐generating process with surrogate paradox that can generate the same observed data. That is, our criteria capture all the observed information to exclude the surrogate paradox."
to:NB  causal_inference  statistics 
yesterday by cshalizi
Flexible Sensitivity Analysis for Observational Studies Without Observable Implications (Journal of the American Statistical Association)
"A fundamental challenge in observational causal inference is that assumptions about unconfoundedness are not testable from data. Assessing sensitivity to such assumptions is therefore important in practice. Unfortunately, some existing sensitivity analysis approaches inadvertently impose restrictions that are at odds with modern causal inference methods, which emphasize flexible models for observed data. To address this issue, we propose a framework that allows (1) flexible models for the observed data and (2) clean separation of the identified and unidentified parts of the sensitivity model. Our framework extends an approach from the missing data literature, known as Tukey’s factorization, to the causal inference setting. Under this factorization, we can represent the distributions of unobserved potential outcomes in terms of unidentified selection functions that posit a relationship between treatment assignment and unobserved potential outcomes. The sensitivity parameters in this framework are easily interpreted, and we provide heuristics for calibrating these parameters against observable quantities. We demonstrate the flexibility of this approach in two examples, where we estimate both average treatment effects and quantile treatment effects using Bayesian nonparametric models for the observed data."
to:NB  partial_identification  causal_inference  sensitivity_analysis  statistics 
3 days ago by cshalizi
Life after Lead: Effects of Early Interventions for Children Exposed to Lead
"Lead pollution is consistently linked to cognitive and behavioral impairments, yet little is known about the benefits of public health interventions for children exposed to lead. This paper estimates the long-term impacts of early-life interventions (e.g. lead remediation, nutritional assessment, medical evaluation, developmental surveillance, and public assistance referrals) recommended for lead-poisoned children. Using linked administrative data from Charlotte, NC, we compare outcomes for children who are similar across observable characteristics but differ in eligibility for intervention due to blood lead test results. We find that the negative outcomes previously associated with early-life exposure can largely be reversed by intervention."

--- The last tag, as usual, is conditional on liking the paper after reading it, and on replication data being available.
to:NB  to_read  lead  cognitive_development  sociology  causal_inference  to_teach:undergrad-ADA 
6 days ago by cshalizi
[1808.09521] Bounds on the conditional and average treatment effect with unobserved confounding factors
"We study estimation of causal effects when the dependence of treatment assignments on unobserved confounding factors is bounded. First, we develop a loss minimization approach to quantify bounds on the conditional average treatment effect under a bounded unobserved confounding model, first studied by Rosenbaum for the average treatment effect. Then, we propose a semi-parametric model to bound the average treatment effect and provide a corresponding inferential procedure, allowing us to derive confidence intervals of the true average treatment effect. Our semi-parametric method extends the classical doubly robust estimator of the average treatment effect, which assumes all confounding variables are observed. As a result, our method allows applications in problems involving covariates of a higher dimension than traditional sensitivity analyses, e.g., covariate matching, allow. We complement our methodological development with optimality results showing that in certain cases, our proposed bounds are tight. In addition to our theoretical results, we perform simulation and real data analyses to investigate the performance of the proposed method, demonstrating accurate coverage of the new confidence intervals in practical finite sample regimes."
to:NB  statistics  causal_inference  partial_identification 
11 days ago by cshalizi
[1806.06802] Almost-Exact Matching with Replacement for Causal Inference
"We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching methods do not pass basic sanity checks: they fail when irrelevant variables are introduced, and tend to be either computationally slow or produce low-quality matches. The method proposed in this work aims to match units on a weighted Hamming distance, taking into account the relative importance of the covariates; the algorithm aims to match units on as many relevant variables as possible. To do this, the algorithm creates a hierarchy of covariate combinations on which to match (similar to downward closure), in the process solving an optimization problem for each unit in order to construct the optimal matches. The algorithm uses a single dynamic program to solve all of the optimization problems simultaneously. Notable advantages of our method over existing matching procedures are its high-quality matches, versatility in handling different data distributions that may have irrelevant variables, and ability to handle missing data by matching on as many available covariates as possible."
to:NB  causal_inference  matching  statistics  computational_statistics 
11 days ago by cshalizi
Analytics, Policy, and Governance | Yale University Press
"The first available textbook on the rapidly growing and increasingly important field of government analytics
"This first textbook on the increasingly important field of government analytics provides invaluable knowledge and training for students of government in the synthesis, interpretation, and communication of “big data,” which is now an integral part of governance and policy making. Integrating all the major components of this rapidly growing field, this invaluable text explores the intricate relationship of data analytics to governance while providing innovative strategies for the retrieval and management of information."
to:NB  data_mining  statistics  causal_inference  us_politics  social_science_methodology  books:noted 
16 days ago by cshalizi
[1905.12020] Matching on What Matters: A Pseudo-Metric Learning Approach to Matching Estimation in High Dimensions
"When pre-processing observational data via matching, we seek to approximate each unit with maximally similar peers that had an alternative treatment status--essentially replicating a randomized block design. However, as one considers a growing number of continuous features, a curse of dimensionality applies making asymptotically valid inference impossible (Abadie and Imbens, 2006). The alternative of ignoring plausibly relevant features is certainly no better, and the resulting trade-off substantially limits the application of matching methods to "wide" datasets. Instead, Li and Fu (2017) recasts the problem of matching in a metric learning framework that maps features to a low-dimensional space that facilitates "closer matches" while still capturing important aspects of unit-level heterogeneity. However, that method lacks key theoretical guarantees and can produce inconsistent estimates in cases of heterogeneous treatment effects. Motivated by straightforward extension of existing results in the matching literature, we present alternative techniques that learn latent matching features through either MLPs or through siamese neural networks trained on a carefully selected loss function. We benchmark the resulting alternative methods in simulations as well as against two experimental data sets--including the canonical NSW worker training program data set--and find superior performance of the neural-net-based methods."
to:NB  matching  causal_inference  statistics  metric_learning 
17 days ago by cshalizi
[1905.11506] Ancestral causal learning in high dimensions with a human genome-wide application
"We consider learning ancestral causal relationships in high dimensions. Our approach is driven by a supervised learning perspective, with discrete indicators of causal relationships treated as labels to be learned from available data. We focus on the setting in which some causal (ancestral) relationships are known (via background knowledge or experimental data) and put forward a general approach that scales to large problems. This is motivated by problems in human biology which are characterized by high dimensionality and potentially many latent variables. We present a case study involving interventional data from human cells with total dimension p∼19,000. Performance is assessed empirically by testing model output against previously unseen interventional data. The proposed approach is highly effective and demonstrably scalable to the human genome-wide setting. We consider sensitivity to background knowledge and find that results are robust to nontrivial perturbations of the input information. We consider also the case, relevant to some applications, where the only prior information available concerns a small number of known ancestral relationships."
to:NB  causal_inference  causal_discovery  statistics  classifiers 
18 days ago by cshalizi
[1905.11622] Robust Nonparametric Difference-in-Differences Estimation
"We consider the problem of treatment effect estimation in difference-in-differences designs where parallel trends hold only after conditioning on covariates. Existing methods for this problem rely on strong additional assumptions, e.g., that any covariates may only have linear effects on the outcome of interest, or that there is no covariate shift between different cross sections taken in the same state. Here, we develop a suite of new methods for nonparametric difference-in-differences estimation that require essentially no assumptions beyond conditional parallel trends and a relevant form of overlap. Our proposals show promising empirical performance across a variety of simulation setups, and are more robust than the standard methods based on either linear regression or propensity weighting."
to:NB  causal_inference  statistics  nonparametrics 
18 days ago by cshalizi
[1905.10857] Causal Discovery and Forecasting in Nonstationary Environments with State-Space Models
"In many scientific fields, such as economics and neuroscience, we are often faced with nonstationary time series, and concerned with both finding causal relations and forecasting the values of variables of interest, both of which are particularly challenging in such nonstationary environments. In this paper, we study causal discovery and forecasting for nonstationary time series. By exploiting a particular type of state-space model to represent the processes, we show that nonstationarity helps to identify causal structure and that forecasting naturally benefits from learned causal knowledge. Specifically, we allow changes in both causal strengths and noise variances in the nonlinear state-space models, which, interestingly, renders both the causal structure and model parameters identifiable. Given the causal model, we treat forecasting as a problem in Bayesian inference in the causal model, which exploits the time-varying property of the data and adapts to new observations in a principled manner. Experimental results on synthetic and real-world data sets demonstrate the efficacy of the proposed methods."
to:NB  causal_inference  causal_discovery  state-space_models  time_series  non-stationarity  statistics  kith_and_kin  glymour.clark  to_read  zhang.kun 
19 days ago by cshalizi
[1905.10848] Gaussian DAGs on network data
"The traditional directed acyclic graph (DAG) model assumes data are generated independently from the underlying joint distribution defined by the DAG. In many applications, however, individuals are linked via a network and thus the independence assumption does not hold. We propose a novel Gaussian DAG model for network data, where the dependence among individual data points (row covariance) is modeled by an undirected graph. Under this model, we develop a maximum penalized likelihood method to estimate the DAG structure and the row correlation matrix. The algorithm iterates between a decoupled lasso regression step and a graphical lasso step. We show with extensive simulated and real network data, that our algorithm improves the accuracy of DAG structure learning by leveraging the information from the estimated row correlations. Moreover, we demonstrate that the performance of existing DAG learning methods can be substantially improved via de-correlation of network data with the estimated row correlation matrix from our algorithm."

--- Pretty sure this has been done before, e.g., by Seth Flaxman.
to:NB  graphical_models  causal_inference  statistics 
19 days ago by cshalizi
[1808.06581] The Deconfounded Recommender: A Causal Inference Approach to Recommendation
"The goal of recommendation is to show users items that they will like. Though usually framed as a prediction, the spirit of recommendation is to answer an interventional question---for each user and movie, what would the rating be if we "forced" the user to watch the movie? To this end, we develop a causal approach to recommendation, one where watching a movie is a "treatment" and a user's rating is an "outcome." The problem is there may be unobserved confounders, variables that affect both which movies the users watch and how they rate them; unobserved confounders impede causal predictions with observational data. To solve this problem, we develop the deconfounded recommender, a way to use classical recommendation models for causal recommendation. Following Wang & Blei [23], the deconfounded recommender involves two probabilistic models. The first models which movies the users watch; it provides a substitute for the unobserved confounders. The second one models how each user rates each movie; it employs the substitute to help account for confounders. This two-stage approach removes bias due to confounding. It improves recommendation and enjoys stable performance against interventions on test sets."
to:NB  causal_inference  collaborative_filtering  blei.david 
19 days ago by cshalizi
[1905.10176] Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments
"We consider the estimation of heterogeneous treatment effects with arbitrary machine learning methods in the presence of unobserved confounders with the aid of a valid instrument. Such settings arise in A/B tests with an intent-to-treat structure, where the experimenter randomizes over which user will receive a recommendation to take an action, and we are interested in the effect of the downstream action. We develop a statistical learning approach to the estimation of heterogeneous effects, reducing the problem to the minimization of an appropriate loss function that depends on a set of auxiliary models (each corresponding to a separate prediction task). The reduction enables the use of all recent algorithmic advances (e.g. neural nets, forests). We show that the estimated effect model is robust to estimation errors in the auxiliary models, by showing that the loss satisfies a Neyman orthogonality criterion. Our approach can be used to estimate projections of the true effect model on simpler hypothesis spaces. When these spaces are parametric, then the parameter estimates are asymptotically normal, which enables construction of confidence sets. We applied our method to estimate the effect of membership on downstream webpage engagement on TripAdvisor, using as an instrument an intent-to-treat A/B test among 4 million TripAdvisor users, where some users received an easier membership sign-up process. We also validate our method on synthetic data and on public datasets for the effects of schooling on income."
to:NB  instrumental_variables  nonparametrics  regression  causal_inference  statistics 
21 days ago by cshalizi
The Real Gold Standard: Measuring Counterfactual Worlds That Matter Most to Social Science and Policy | Annual Review of Criminology
"The randomized experiment has achieved the status of the gold standard for estimating causal effects in criminology and the other social sciences. Although causal identification is indeed important and observational data present numerous challenges to causal inference, we argue that conflating causality with the method used to identify it leads to a cognitive narrowing that diverts attention from what ultimately matters most—the difference between counterfactual worlds that emerge as a consequence of their being subjected to different treatment regimes applied to all eligible population members over a sustained period of time. To address this system-level and long-term challenge, we develop an analytic framework for integrating causality and policy inference that accepts the mandate of causal rigor but is conceptually rather than methodologically driven. We then apply our framework to two substantive areas that have generated high-visibility experimental research and that have considerable policy influence: (a) hot-spots policing and (b) the use of housing vouchers to reduce concentrated disadvantage and thereby crime. After reviewing the research in these two areas in light of our framework, we propose a research path forward and conclude with implications for the interplay of theory, data, and causal understanding in criminology and other social sciences."
to:NB  causal_inference  causality  social_science_methodology  statistics  nagin.dan  kith_and_kin  re:ADAfaEPoV 
21 days ago by cshalizi
Handling Missing Data in Instrumental Variable Methods for Causal Inference | Annual Review of Statistics and Its Application
"In instrumental variable studies, missing instrument data are very common. For example, in the Wisconsin Longitudinal Study, one can use genotype data as a Mendelian randomization–style instrument, but this information is often missing when subjects do not contribute saliva samples or when the genotyping platform output is ambiguous. Here we review missing at random assumptions one can use to identify instrumental variable causal effects, and discuss various approaches for estimation and inference. We consider likelihood-based methods, regression and weighting estimators, and doubly robust estimators. The likelihood-based methods yield the most precise inference and are optimal under the model assumptions, while the doubly robust estimators can attain the nonparametric efficiency bound while allowing flexible nonparametric estimation of nuisance functions (e.g., instrument propensity scores). The regression and weighting estimators can sometimes be easiest to describe and implement. Our main contribution is an extensive review of this wide array of estimators under varied missing-at-random assumptions, along with discussion of asymptotic properties and inferential tools. We also implement many of the estimators in an analysis of the Wisconsin Longitudinal Study, to study effects of impaired cognitive functioning on depression."
to:NB  instrumental_variables  missing_data  statistics  causal_inference  kith_and_kin  kennedy.edward  mauro.jacqueline  small.dylan 
21 days ago by cshalizi
Evaluation of Causal Effects and Local Structure Learning of Causal Networks | Annual Review of Statistics and Its Application
"Causal effect evaluation and causal network learning are two main research areas in causal inference. For causal effect evaluation, we review the two problems of confounders and surrogates. The Yule-Simpson paradox is the idea that the association between two variables may be changed dramatically due to ignoring confounders. We review criteria for confounders and methods of adjustment for observed and unobserved confounders. The surrogate paradox occurs when a treatment has a positive causal effect on a surrogate endpoint, which, in turn, has a positive causal effect on a true endpoint, but the treatment may have a negative causal effect on the true endpoint. Some of the existing criteria for surrogates are subject to the surrogate paradox, and we review criteria for consistent surrogates to avoid the surrogate paradox. Causal networks are used to depict the causal relationships among multiple variables. Rather than discovering a global causal network, researchers are often interested in discovering the causes and effects of a given variable. We review some algorithms for local structure learning of causal networks centering around a given variable."
to:NB  causal_inference  causal_discovery  graphical_models  statistics 
21 days ago by cshalizi
Identification and Extrapolation of Causal Effects with Instrumental Variables | Annual Review of Economics
"Instrumental variables (IV) are widely used in economics to address selection on unobservables. Standard IV methods produce estimates of causal effects that are specific to individuals whose behavior can be manipulated by the instrument at hand. In many cases, these individuals are not the same as those who would be induced to treatment by an intervention or policy of interest to the researcher. The average causal effect for the two groups can differ significantly if the effect of the treatment varies systematically with unobserved factors that are correlated with treatment choice. We review the implications of this type of unobserved heterogeneity for the interpretation of standard IV methods and for their relevance to policy evaluation. We argue that making inferences about policy-relevant parameters typically requires extrapolating from the individuals affected by the instrument to the individuals who would be induced to treatment by the policy under consideration. We discuss a variety of alternatives to standard IV methods that can be used to rigorously perform this extrapolation. We show that many of these approaches can be nested as special cases of a general framework that embraces the possibility of partial identification."

--- Memo to self: Read this before revising the IV sections of ADAfaEPoV.
to:NB  causal_inference  instrumental_variables  partial_identification  statistics  re:ADAfaEPoV  to_read 
21 days ago by cshalizi
[1808.08778] Dynamical systems theory for causal inference with application to synthetic control methods
"In this paper, we adopt results in non-linear time series analysis for causal inference in dynamical settings. We illustrate our approach on synthetic control methods, which are popular in policy analysis with panel data. Synthetic controls regress the pre-intervention outcomes of the treated unit to outcomes from a pool of control units, and then use the same regression model to estimate causal effects post-intervention. In this setting, we propose to screen out control units that have a weak dynamical relationship to the treated unit, according to certain well-established measures of relationship strength in dynamical systems theory. In simulations, we show that our method mitigates bias from adversarial control units. In real-world applications, our approach contributes to more reliable control selection, and thus more robust estimation of treatment effects in panel data."
to:NB  dynamical_systems  causal_inference  statistics 
22 days ago by cshalizi
