**causal_inference** (630 bookmarks)

Sensitivity analysis for inverse probability weighting estimators via the percentile bootstrap

yesterday by cshalizi

"To identify the estimand in missing data problems and observational studies, it is common to base the statistical estimation on the ‘missingness at random’ and ‘no unmeasured confounder’ assumptions. However, these assumptions are unverifiable by using empirical data and pose serious threats to the validity of the qualitative conclusions of statistical inference. A sensitivity analysis asks how the conclusions may change if the unverifiable assumptions are violated to a certain degree. We consider a marginal sensitivity model which is a natural extension of Rosenbaum's sensitivity model that is widely used for matched observational studies. We aim to construct confidence intervals based on inverse probability weighting estimators, such that asymptotically the intervals have at least nominal coverage of the estimand whenever the data‐generating distribution is in the collection of marginal sensitivity models. We use a percentile bootstrap and a generalized minimax–maximin inequality to transform this intractable problem into a linear fractional programming problem, which can be solved very efficiently. We illustrate our method by using a real data set to estimate the causal effect of fish consumption on blood mercury level."
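--- Not the authors' code, but the reduction is easy to caricature: for box-constrained weights the Hajek ratio's extrema have a threshold structure over the sorted outcomes, and a percentile bootstrap of the per-resample extrema gives the interval. A minimal numpy sketch, where the weight bounds `[w/lam, w*lam]` are a deliberate simplification of the marginal sensitivity model:

```python
import numpy as np

def hajek_extrema(y, w, lam):
    """Extrema of sum(a*y)/sum(a) over box constraints a_i in [w_i/lam, w_i*lam].

    At the optimum each a_i sits at a bound according to whether y_i is above
    or below the optimal ratio, so scanning splits of the sorted outcomes is
    exact -- a tiny stand-in for the paper's linear fractional program.
    """
    lo, hi = w / lam, w * lam
    order = np.argsort(-y)                          # outcomes, descending
    y_s, lo_s, hi_s = y[order], lo[order], hi[order]
    best_max, best_min = -np.inf, np.inf
    for k in range(len(y) + 1):
        a = np.concatenate([hi_s[:k], lo_s[k:]])    # large weights on large y
        best_max = max(best_max, a @ y_s / a.sum())
        a = np.concatenate([lo_s[:k], hi_s[k:]])    # large weights on small y
        best_min = min(best_min, a @ y_s / a.sum())
    return best_min, best_max

def percentile_bootstrap_interval(y, w, lam, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap of the per-resample extrema of the IPW estimate."""
    rng = np.random.default_rng(seed)
    n = len(y)
    mins, maxs = np.empty(n_boot), np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                 # resample units
        mins[b], maxs[b] = hajek_extrema(y[idx], w[idx], lam)
    return (np.percentile(mins, 100 * alpha / 2),
            np.percentile(maxs, 100 * (1 - alpha / 2)))
```

At `lam = 1` the interval collapses to the usual Hajek point estimate; widening `lam` widens the partially identified interval.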

to:NB
causal_inference
model_checking
misspecification
statistics
small.dylan
yesterday by cshalizi

Robust causal structure learning with some hidden variables

yesterday by cshalizi

"We introduce a new method to estimate the Markov equivalence class of a directed acyclic graph (DAG) in the presence of hidden variables, in settings where the underlying DAG among the observed variables is sparse, and there are a few hidden variables that have a direct effect on many of the observed variables. Building on the so‐called low rank plus sparse framework, we suggest a two‐stage approach which first removes the effect of the hidden variables and then estimates the Markov equivalence class of the underlying DAG under the assumption that there are no remaining hidden variables. This approach is consistent in certain high dimensional regimes and performs favourably when compared with the state of the art, in terms of both graphical structure recovery and total causal effect estimation."

to:NB
causal_discovery
causal_inference
graphical_models
maathuis.marloes
statistics
yesterday by cshalizi

Novel criteria to exclude the surrogate paradox and their optimalities (Yin, Scandinavian Journal of Statistics)

yesterday by cshalizi

"When the primary outcome is hard to collect, a surrogate endpoint is typically used as a substitute. However, even when a treatment has a positive average causal effect (ACE) on a surrogate endpoint, which also has a positive ACE on the primary outcome, it is still possible that the treatment has a negative ACE on the primary outcome. Such a phenomenon is called the surrogate paradox and greatly challenges the use of surrogates. In this paper, we provide criteria to exclude the surrogate paradox. Our criteria are optimal in the sense that they are sufficient and “almost necessary” to exclude the paradox: If the conditions are satisfied, the surrogate paradox is guaranteed to be absent, whereas if the conditions fail, there exists a data‐generating process with surrogate paradox that can generate the same observed data. That is, our criteria capture all the observed information to exclude the surrogate paradox."

to:NB
causal_inference
statistics
yesterday by cshalizi

Flexible Sensitivity Analysis for Observational Studies Without Observable Implications (Journal of the American Statistical Association)

3 days ago by cshalizi

"A fundamental challenge in observational causal inference is that assumptions about unconfoundedness are not testable from data. Assessing sensitivity to such assumptions is therefore important in practice. Unfortunately, some existing sensitivity analysis approaches inadvertently impose restrictions that are at odds with modern causal inference methods, which emphasize flexible models for observed data. To address this issue, we propose a framework that allows (1) flexible models for the observed data and (2) clean separation of the identified and unidentified parts of the sensitivity model. Our framework extends an approach from the missing data literature, known as Tukey’s factorization, to the causal inference setting. Under this factorization, we can represent the distributions of unobserved potential outcomes in terms of unidentified selection functions that posit a relationship between treatment assignment and unobserved potential outcomes. The sensitivity parameters in this framework are easily interpreted, and we provide heuristics for calibrating these parameters against observable quantities. We demonstrate the flexibility of this approach in two examples, where we estimate both average treatment effects and quantile treatment effects using Bayesian nonparametric models for the observed data."

to:NB
partial_identification
causal_inference
sensitivity_analysis
statistics
3 days ago by cshalizi

Life after Lead: Effects of Early Interventions for Children Exposed to Lead

6 days ago by cshalizi

"Lead pollution is consistently linked to cognitive and behavioral impairments, yet little is known about the benefits of public health interventions for children exposed to lead. This paper estimates the long-term impacts of early-life interventions (e.g. lead remediation, nutritional assessment, medical evaluation, developmental surveillance, and public assistance referrals) recommended for lead-poisoned children. Using linked administrative data from Charlotte, NC, we compare outcomes for children who are similar across observable characteristics but differ in eligibility for intervention due to blood lead test results. We find that the negative outcomes previously associated with early-life exposure can largely be reversed by intervention."

--- The last tag, as usual, is conditional on liking the paper after reading it, and on replication data being available.

to:NB
to_read
lead
cognitive_development
sociology
causal_inference
to_teach:undergrad-ADA

6 days ago by cshalizi

[1808.09521] Bounds on the conditional and average treatment effect with unobserved confounding factors

11 days ago by cshalizi

"We study estimation of causal effects when the dependence of treatment assignments on unobserved confounding factors is bounded. First, we develop a loss minimization approach to quantify bounds on the conditional average treatment effect under a bounded unobserved confounding model, first studied by Rosenbaum for the average treatment effect. Then, we propose a semi-parametric model to bound the average treatment effect and provide a corresponding inferential procedure, allowing us to derive confidence intervals of the true average treatment effect. Our semi-parametric method extends the classical doubly robust estimator of the average treatment effect, which assumes all confounding variables are observed. As a result, our method allows applications in problems involving covariates of a higher dimension than traditional sensitivity analyses, e.g., covariate matching, allow. We complement our methodological development with optimality results showing that in certain cases, our proposed bounds are tight. In addition to our theoretical results, we perform simulation and real data analyses to investigate the performance of the proposed method, demonstrating accurate coverage of the new confidence intervals in practical finite sample regimes."

to:NB
statistics
causal_inference
partial_identification
11 days ago by cshalizi

[1806.06802] Almost-Exact Matching with Replacement for Causal Inference

11 days ago by cshalizi

"We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching methods do not pass basic sanity checks: they fail when irrelevant variables are introduced, and tend to be either computationally slow or produce low-quality matches. The method proposed in this work aims to match units on a weighted Hamming distance, taking into account the relative importance of the covariates; the algorithm aims to match units on as many relevant variables as possible. To do this, the algorithm creates a hierarchy of covariate combinations on which to match (similar to downward closure), in the process solving an optimization problem for each unit in order to construct the optimal matches. The algorithm uses a single dynamic program to solve all of the optimization problems simultaneously. Notable advantages of our method over existing matching procedures are its high-quality matches, versatility in handling different data distributions that may have irrelevant variables, and ability to handle missing data by matching on as many available covariates as possible."
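--- The full algorithm (covariate hierarchy plus dynamic program) is more involved; the core distance step can be sketched as plain nearest-neighbor matching with replacement under a weighted Hamming distance. Here the per-covariate `weights` are assumed given rather than learned:

```python
import numpy as np

def weighted_hamming_match(X, treat, weights):
    """Match each treated unit to its nearest control, with replacement.

    Distance between two units = sum of the covariate weights over the
    covariates on which they disagree (weighted Hamming distance).
    """
    X = np.asarray(X)
    w = np.asarray(weights, dtype=float)
    treated = np.flatnonzero(treat)
    controls = np.flatnonzero(np.logical_not(treat))
    matches = {}
    for i in treated:
        mismatch = (X[controls] != X[i])    # controls x covariates, boolean
        dist = mismatch @ w                 # weighted Hamming distances
        matches[i] = int(controls[np.argmin(dist)])
    return matches
```

A unit with an exact match on all covariates gets distance zero; otherwise the match minimizes the total weight of mismatched covariates.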

to:NB
causal_inference
matching
statistics
computational_statistics
11 days ago by cshalizi

Analytics, Policy, and Governance | Yale University Press

16 days ago by cshalizi

"The first available textbook on the rapidly growing and increasingly important field of government analytics

"This first textbook on the increasingly important field of government analytics provides invaluable knowledge and training for students of government in the synthesis, interpretation, and communication of “big data,” which is now an integral part of governance and policy making. Integrating all the major components of this rapidly growing field, this invaluable text explores the intricate relationship of data analytics to governance while providing innovative strategies for the retrieval and management of information."

to:NB
data_mining
statistics
causal_inference
us_politics
social_science_methodology
books:noted

16 days ago by cshalizi

[1905.12020] Matching on What Matters: A Pseudo-Metric Learning Approach to Matching Estimation in High Dimensions

17 days ago by cshalizi

"When pre-processing observational data via matching, we seek to approximate each unit with maximally similar peers that had an alternative treatment status--essentially replicating a randomized block design. However, as one considers a growing number of continuous features, a curse of dimensionality applies making asymptotically valid inference impossible (Abadie and Imbens, 2006). The alternative of ignoring plausibly relevant features is certainly no better, and the resulting trade-off substantially limits the application of matching methods to "wide" datasets. Instead, Li and Fu (2017) recasts the problem of matching in a metric learning framework that maps features to a low-dimensional space that facilitates "closer matches" while still capturing important aspects of unit-level heterogeneity. However, that method lacks key theoretical guarantees and can produce inconsistent estimates in cases of heterogeneous treatment effects. Motivated by straightforward extension of existing results in the matching literature, we present alternative techniques that learn latent matching features through either MLPs or through siamese neural networks trained on a carefully selected loss function. We benchmark the resulting alternative methods in simulations as well as against two experimental data sets--including the canonical NSW worker training program data set--and find superior performance of the neural-net-based methods."

to:NB
matching
causal_inference
statistics
metric_learning
17 days ago by cshalizi

[1905.11506] Ancestral causal learning in high dimensions with a human genome-wide application

18 days ago by cshalizi

"We consider learning ancestral causal relationships in high dimensions. Our approach is driven by a supervised learning perspective, with discrete indicators of causal relationships treated as labels to be learned from available data. We focus on the setting in which some causal (ancestral) relationships are known (via background knowledge or experimental data) and put forward a general approach that scales to large problems. This is motivated by problems in human biology which are characterized by high dimensionality and potentially many latent variables. We present a case study involving interventional data from human cells with total dimension p∼19,000. Performance is assessed empirically by testing model output against previously unseen interventional data. The proposed approach is highly effective and demonstrably scalable to the human genome-wide setting. We consider sensitivity to background knowledge and find that results are robust to nontrivial perturbations of the input information. We consider also the case, relevant to some applications, where the only prior information available concerns a small number of known ancestral relationships."

to:NB
causal_inference
causal_discovery
statistics
classifiers
18 days ago by cshalizi

[1905.11622] Robust Nonparametric Difference-in-Differences Estimation

18 days ago by cshalizi

"We consider the problem of treatment effect estimation in difference-in-differences designs where parallel trends hold only after conditioning on covariates. Existing methods for this problem rely on strong additional assumptions, e.g., that any covariates may only have linear effects on the outcome of interest, or that there is no covariate shift between different cross sections taken in the same state. Here, we develop a suite of new methods for nonparametric difference-in-differences estimation that require essentially no assumptions beyond conditional parallel trends and a relevant form of overlap. Our proposals show promising empirical performance across a variety of simulation setups, and are more robust than the standard methods based on either linear regression or propensity weighting."

to:NB
causal_inference
statistics
nonparametrics
18 days ago by cshalizi

[1905.10857] Causal Discovery and Forecasting in Nonstationary Environments with State-Space Models

19 days ago by cshalizi

"In many scientific fields, such as economics and neuroscience, we are often faced with nonstationary time series, and concerned with both finding causal relations and forecasting the values of variables of interest, both of which are particularly challenging in such nonstationary environments. In this paper, we study causal discovery and forecasting for nonstationary time series. By exploiting a particular type of state-space model to represent the processes, we show that nonstationarity helps to identify causal structure and that forecasting naturally benefits from learned causal knowledge. Specifically, we allow changes in both causal strengths and noise variances in the nonlinear state-space models, which, interestingly, renders both the causal structure and model parameters identifiable. Given the causal model, we treat forecasting as a problem in Bayesian inference in the causal model, which exploits the time-varying property of the data and adapts to new observations in a principled manner. Experimental results on synthetic and real-world data sets demonstrate the efficacy of the proposed methods."

to:NB
causal_inference
causal_discovery
state-space_models
time_series
non-stationarity
statistics
kith_and_kin
glymour.clark
to_read
zhang.kun
19 days ago by cshalizi

[1905.10848] Gaussian DAGs on network data

19 days ago by cshalizi

"The traditional directed acyclic graph (DAG) model assumes data are generated independently from the underlying joint distribution defined by the DAG. In many applications, however, individuals are linked via a network and thus the independence assumption does not hold. We propose a novel Gaussian DAG model for network data, where the dependence among individual data points (row covariance) is modeled by an undirected graph. Under this model, we develop a maximum penalized likelihood method to estimate the DAG structure and the row correlation matrix. The algorithm iterates between a decoupled lasso regression step and a graphical lasso step. We show with extensive simulated and real network data, that our algorithm improves the accuracy of DAG structure learning by leveraging the information from the estimated row correlations. Moreover, we demonstrate that the performance of existing DAG learning methods can be substantially improved via de-correlation of network data with the estimated row correlation matrix from our algorithm."
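--- The de-correlation step is easy to illustrate in isolation: given an estimated row covariance, whitening the data matrix by the inverse Cholesky factor removes the row dependence, after which any independent-sample DAG learner applies. A numpy-only sketch; the paper estimates the row covariance jointly via penalized likelihood, whereas here `sigma_rows` is simply taken as given:

```python
import numpy as np

def decorrelate_rows(X, sigma_rows):
    """Whiten row dependence: return L^{-1} X, where L L^T = sigma_rows.

    If the rows of X are correlated with covariance sigma_rows, the rows
    of the result are uncorrelated, so standard DAG-structure-learning
    code written for independent samples can then be run on it.
    """
    L = np.linalg.cholesky(sigma_rows)
    return np.linalg.solve(L, X)
```

Sanity check: if `X = L @ Z` with `Z` having independent rows, whitening recovers `Z` exactly.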

--- Pretty sure this has been done before, e.g., by Seth Flaxman.

to:NB
graphical_models
causal_inference
statistics

19 days ago by cshalizi

[1808.06581] The Deconfounded Recommender: A Causal Inference Approach to Recommendation

19 days ago by cshalizi

"The goal of recommendation is to show users items that they will like. Though usually framed as a prediction, the spirit of recommendation is to answer an interventional question---for each user and movie, what would the rating be if we "forced" the user to watch the movie? To this end, we develop a causal approach to recommendation, one where watching a movie is a "treatment" and a user's rating is an "outcome." The problem is there may be unobserved confounders, variables that affect both which movies the users watch and how they rate them; unobserved confounders impede causal predictions with observational data. To solve this problem, we develop the deconfounded recommender, a way to use classical recommendation models for causal recommendation. Following Wang & Blei [23], the deconfounded recommender involves two probabilistic models. The first models which movies the users watch; it provides a substitute for the unobserved confounders. The second one models how each user rates each movie; it employs the substitute to help account for confounders. This two-stage approach removes bias due to confounding. It improves recommendation and enjoys stable performance against interventions on test sets."
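--- The two-stage shape of the method can be sketched crudely; note this is not the authors' model: a plain truncated SVD stands in for their probabilistic exposure model, and per-movie ridge regression stands in for their outcome model.

```python
import numpy as np

def deconfounded_recommender(A, R, k=2, ridge=1e-2):
    """Two-stage sketch of the deconfounded-recommender idea.

    Stage 1: a rank-k fit of the exposure matrix A (who watched what)
    yields a per-user substitute confounder Z. Stage 2: for each movie,
    regress the observed ratings on [1, Z] over the users who watched it,
    so the outcome model adjusts for the substitute confounder.
    """
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    Z = U[:, :k] * s[:k]                        # substitute confounder (users x k)
    n_users, n_items = A.shape
    coefs = np.zeros((n_items, k + 1))
    for j in range(n_items):
        watched = A[:, j] > 0
        D = np.column_stack([np.ones(int(watched.sum())), Z[watched]])
        # ridge-regularized least squares, for stability on sparse items
        G = D.T @ D + ridge * np.eye(k + 1)
        coefs[j] = np.linalg.solve(G, D.T @ R[watched, j])
    return Z, coefs
```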

to:NB
causal_inference
collaborative_filtering
blei.david
19 days ago by cshalizi

[1905.10176] Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments

21 days ago by cshalizi

"We consider the estimation of heterogeneous treatment effects with arbitrary machine learning methods in the presence of unobserved confounders with the aid of a valid instrument. Such settings arise in A/B tests with an intent-to-treat structure, where the experimenter randomizes over which user will receive a recommendation to take an action, and we are interested in the effect of the downstream action. We develop a statistical learning approach to the estimation of heterogeneous effects, reducing the problem to the minimization of an appropriate loss function that depends on a set of auxiliary models (each corresponding to a separate prediction task). The reduction enables the use of all recent algorithmic advances (e.g. neural nets, forests). We show that the estimated effect model is robust to estimation errors in the auxiliary models, by showing that the loss satisfies a Neyman orthogonality criterion. Our approach can be used to estimate projections of the true effect model on simpler hypothesis spaces. When these spaces are parametric, then the parameter estimates are asymptotically normal, which enables construction of confidence sets. We applied our method to estimate the effect of membership on downstream webpage engagement on TripAdvisor, using as an instrument an intent-to-treat A/B test among 4 million TripAdvisor users, where some users received an easier membership sign-up process. We also validate our method on synthetic data and on public datasets for the effects of schooling on income."

to:NB
instrumental_variables
nonparametrics
regression
causal_inference
statistics
21 days ago by cshalizi

The Real Gold Standard: Measuring Counterfactual Worlds That Matter Most to Social Science and Policy | Annual Review of Criminology

21 days ago by cshalizi

"The randomized experiment has achieved the status of the gold standard for estimating causal effects in criminology and the other social sciences. Although causal identification is indeed important and observational data present numerous challenges to causal inference, we argue that conflating causality with the method used to identify it leads to a cognitive narrowing that diverts attention from what ultimately matters most—the difference between counterfactual worlds that emerge as a consequence of their being subjected to different treatment regimes applied to all eligible population members over a sustained period of time. To address this system-level and long-term challenge, we develop an analytic framework for integrating causality and policy inference that accepts the mandate of causal rigor but is conceptually rather than methodologically driven. We then apply our framework to two substantive areas that have generated high-visibility experimental research and that have considerable policy influence: (a) hot-spots policing and (b) the use of housing vouchers to reduce concentrated disadvantage and thereby crime. After reviewing the research in these two areas in light of our framework, we propose a research path forward and conclude with implications for the interplay of theory, data, and causal understanding in criminology and other social sciences."

to:NB
causal_inference
causality
social_science_methodology
statistics
nagin.dan
kith_and_kin
re:ADAfaEPoV
21 days ago by cshalizi

Handling Missing Data in Instrumental Variable Methods for Causal Inference | Annual Review of Statistics and Its Application

21 days ago by cshalizi

"In instrumental variable studies, missing instrument data are very common. For example, in the Wisconsin Longitudinal Study, one can use genotype data as a Mendelian randomization–style instrument, but this information is often missing when subjects do not contribute saliva samples or when the genotyping platform output is ambiguous. Here we review missing at random assumptions one can use to identify instrumental variable causal effects, and discuss various approaches for estimation and inference. We consider likelihood-based methods, regression and weighting estimators, and doubly robust estimators. The likelihood-based methods yield the most precise inference and are optimal under the model assumptions, while the doubly robust estimators can attain the nonparametric efficiency bound while allowing flexible nonparametric estimation of nuisance functions (e.g., instrument propensity scores). The regression and weighting estimators can sometimes be easiest to describe and implement. Our main contribution is an extensive review of this wide array of estimators under varied missing-at-random assumptions, along with discussion of asymptotic properties and inferential tools. We also implement many of the estimators in an analysis of the Wisconsin Longitudinal Study, to study effects of impaired cognitive functioning on depression."

to:NB
instrumental_variables
missing_data
statistics
causal_inference
kith_and_kin
kennedy.edward
mauro.jacqueline
small.dylan
21 days ago by cshalizi

Evaluation of Causal Effects and Local Structure Learning of Causal Networks | Annual Review of Statistics and Its Application

21 days ago by cshalizi

"Causal effect evaluation and causal network learning are two main research areas in causal inference. For causal effect evaluation, we review the two problems of confounders and surrogates. The Yule-Simpson paradox is the idea that the association between two variables may be changed dramatically due to ignoring confounders. We review criteria for confounders and methods of adjustment for observed and unobserved confounders. The surrogate paradox occurs when a treatment has a positive causal effect on a surrogate endpoint, which, in turn, has a positive causal effect on a true endpoint, but the treatment may have a negative causal effect on the true endpoint. Some of the existing criteria for surrogates are subject to the surrogate paradox, and we review criteria for consistent surrogates to avoid the surrogate paradox. Causal networks are used to depict the causal relationships among multiple variables. Rather than discovering a global causal network, researchers are often interested in discovering the causes and effects of a given variable. We review some algorithms for local structure learning of causal networks centering around a given variable."

to:NB
causal_inference
causal_discovery
graphical_models
statistics
21 days ago by cshalizi

Identification and Extrapolation of Causal Effects with Instrumental Variables | Annual Review of Economics

21 days ago by cshalizi

"Instrumental variables (IV) are widely used in economics to address selection on unobservables. Standard IV methods produce estimates of causal effects that are specific to individuals whose behavior can be manipulated by the instrument at hand. In many cases, these individuals are not the same as those who would be induced to treatment by an intervention or policy of interest to the researcher. The average causal effect for the two groups can differ significantly if the effect of the treatment varies systematically with unobserved factors that are correlated with treatment choice. We review the implications of this type of unobserved heterogeneity for the interpretation of standard IV methods and for their relevance to policy evaluation. We argue that making inferences about policy-relevant parameters typically requires extrapolating from the individuals affected by the instrument to the individuals who would be induced to treatment by the policy under consideration. We discuss a variety of alternatives to standard IV methods that can be used to rigorously perform this extrapolation. We show that many of these approaches can be nested as special cases of a general framework that embraces the possibility of partial identification."

--- Memo to self: Read this before revising the IV sections of ADAfaEPoV.

to:NB
causal_inference
instrumental_variables
partial_identification
statistics
re:ADAfaEPoV
to_read

21 days ago by cshalizi

[1808.08778] Dynamical systems theory for causal inference with application to synthetic control methods

22 days ago by cshalizi

"In this paper, we adopt results in non-linear time series analysis for causal inference in dynamical settings. We illustrate our approach on synthetic control methods, which are popular in policy analysis with panel data. Synthetic controls regress the pre-intervention outcomes of the treated unit to outcomes from a pool of control units, and then use the same regression model to estimate causal effects post-intervention. In this setting, we propose to screen out control units that have a weak dynamical relationship to the treated unit, according to certain well-established measures of relationship strength in dynamical systems theory. In simulations, we show that our method mitigates bias from adversarial control units. In real-world applications, our approach contributes to more reliable control selection, and thus more robust estimation of treatment effects in panel data."
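--- The screening idea can be caricatured in a few lines: score each control unit's pre-intervention relationship to the treated unit and drop the weak ones before fitting the synthetic-control regression. Plain absolute correlation stands in here for the dynamical-systems measures (e.g. convergent cross mapping) the paper actually uses:

```python
import numpy as np

def screen_controls(y_treated_pre, Y_controls_pre, thresh=0.5):
    """Indices of control units with a strong pre-period relationship
    to the treated unit; |correlation| is a crude stand-in for the
    paper's dynamical relationship-strength measures."""
    scores = np.array([abs(np.corrcoef(y_treated_pre, y_c)[0, 1])
                       for y_c in Y_controls_pre])
    return np.flatnonzero(scores >= thresh)
```

The surviving indices would then be the donor pool for the usual pre-period regression step.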

to:NB
dynamical_systems
causal_inference
statistics
22 days ago by cshalizi
