AI Needs More Why
An article on Judea Pearl's "The Book of Why".
Quite an accessible introduction.
9 weeks ago by drmeme
[1812.03253] Counterfactuals uncover the modular structure of deep generative models
Deep generative models such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) are important tools to capture and investigate the properties of complex empirical data. However, the complexity of their inner elements makes their functioning challenging to assess and modify. In this respect, these architectures behave as black box models. In order to better understand the function of such networks, we analyze their modularity based on the counterfactual manipulation of their internal variables. Experiments with face images support that modularity between groups of channels is achieved to some degree within convolutional layers of vanilla VAE and GAN generators. This helps understand the functional organization of these systems and allows designing meaningful transformations of the generated images without further training.
december 2018 by arsyed
[1811.00164] Deep Counterfactual Regret Minimization
Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large imperfect-information games. It iteratively traverses the game tree in order to converge to a Nash equilibrium. In order to deal with extremely large games, CFR typically uses domain-specific heuristics to simplify the target game in a process known as abstraction. This simplified game is solved with tabular CFR, and its solution is mapped back to the full game. This paper introduces Deep Counterfactual Regret Minimization (Deep CFR), a form of CFR that obviates the need for abstraction by instead using deep neural networks to approximate the behavior of CFR in the full game. We show that Deep CFR is principled and achieves strong performance in large poker games. This is the first non-tabular variant of CFR to be successful in large games.
december 2018 by arsyed
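For readers new to CFR, the regret-minimization step it builds on can be sketched with plain regret matching on rock-paper-scissors. This is not Deep CFR and not code from the paper: the game, the function names, and the self-play loop are assumptions chosen for illustration, and there is no game tree, abstraction, or neural network involved.

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors (illustrative toy game)


def payoff(a, b):
    """Payoff to the player choosing a against b: 1 win, 0 tie, -1 loss."""
    return [0, 1, -1][(a - b) % 3]


def strategy_from_regrets(regrets):
    """Regret matching: play in proportion to positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / len(regrets)] * len(regrets)


def train(iterations, seed=0):
    """Self-play for both players; returns each player's average strategy."""
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        actions = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for p in range(2):
            opponent_action = actions[1 - p]
            realized = payoff(actions[p], opponent_action)
            for a in range(ACTIONS):
                # Regret: how much better action a would have done than
                # the action actually played against the same opponent move.
                regrets[p][a] += payoff(a, opponent_action) - realized
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, avg in enumerate(train(20000)):
        print(player, [round(x, 3) for x in avg])
```

The average strategies should drift toward the uniform mix, which is the Nash equilibrium of this toy game. Tabular CFR applies the same update at every information set of a game tree, and the abstract describes Deep CFR as replacing the table with a neural network approximation.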
Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search | OpenReview
Abstract: Learning policies on data synthesized by models can in principle quench the thirst of reinforcement learning algorithms for large amounts of real experience, which is often costly to acquire. However, simulating plausible experience de novo is a hard problem for many complex environments, often resulting in biases for model-based policy evaluation and search. Instead of de novo synthesis of data, here we assume logged, real experience and model alternative outcomes of this experience under counterfactual actions, i.e. actions that were not actually taken. Based on this, we propose the Counterfactually-Guided Policy Search (CF-GPS) algorithm for learning policies in POMDPs from off-policy experience. It leverages structural causal models for counterfactual evaluation of arbitrary policies on individual off-policy episodes. CF-GPS can improve on vanilla model-based RL algorithms by making use of available logged data to de-bias model predictions. In contrast to off-policy algorithms based on Importance Sampling which re-weight data, CF-GPS leverages a model to explicitly consider alternative outcomes, allowing the algorithm to make better use of experience data. We find empirically that these advantages translate into improved policy evaluation and search results on a non-trivial grid-world task. Finally, we show that CF-GPS generalizes the previously proposed Guided Policy Search and that reparameterization-based algorithms such as Stochastic Value Gradient can be interpreted as counterfactual methods.
november 2018 by arsyed
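The counterfactual step that a structural causal model makes possible can be sketched on a one-equation toy model. Nothing here comes from the CF-GPS paper: the linear mechanism `outcome = 2 * action + noise` and both function names are assumptions made up for the example. The point is only the standard recipe: infer the noise from what was observed, swap in the alternative action, and recompute the outcome.

```python
def outcome(action, noise):
    """Hypothetical structural equation: outcome = 2 * action + noise."""
    return 2.0 * action + noise


def counterfactual_outcome(observed_action, observed_outcome, alt_action):
    """What would the outcome have been under alt_action, same noise?"""
    # Abduction: recover the noise consistent with the logged episode.
    noise = observed_outcome - 2.0 * observed_action
    # Action and prediction: replace the action and re-run the mechanism.
    return outcome(alt_action, noise)


if __name__ == "__main__":
    # Logged episode: action 1.0 led to outcome 3.0, so the noise was 1.0.
    print(counterfactual_outcome(1.0, 3.0, alt_action=2.0))
```

Because the inferred noise is reused, the answer is specific to the logged episode. Re-weighting the logged data, as importance sampling does, gives no such per-episode answer.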
Eddie Murphy and the Dangers of Counterfactual Causal Thinking About Detecting Racial Discrimination by Issa Kohler-Hausmann :: SSRN
"The model of discrimination animating some of the most common approaches to detecting discrimination in both law and social science—the counterfactual causal model—is wrong. In that model, racial discrimination is detected by measuring the “treatment effect of race,” where the treatment is conceptualized as manipulating the raced status of otherwise identical units (e.g., a person, a neighborhood, a school). Most objections to talking about race as a cause in the counterfactual model have been raised in terms of manipulability. If we cannot manipulate a person’s race at the moment of a police stop, traffic encounter, or prosecutorial charging decision, then it is impossible to detect if the person’s race was the sole cause of an unfavorable outcome. But this debate has proceeded on the wrong terms. The counterfactual causal model of discrimination is not wrong because we can’t work around the practical limits of manipulation, as evidenced by both Eddie Murphy’s comic genius in the SNL skit “White Like Me” and the entire genre of audit and correspondence studies. It is wrong because to fit the rigor of the counterfactual model of a clearly defined treatment on otherwise identical units, we must reduce race to only the signs of the category, meaning we must think race is skin color, or phenotype, or other ways we identify group status. And that is a concept mistake if one subscribes to a constructivist, as opposed to biological or genetic, conception of race. I argue that the counterfactual causal model of discrimination is based on a flawed theory of (1) what the category of race references and how it produces effects in the world and (2) what is meant when we say it is wrong to make decisions of import because of race. We cannot detect actions as discriminatory by identifying a relation of counterfactual causality; we can only do so by reasoning about its distinctive wrongfulness by referencing what constitutes the very categories that are the objects of concern."
october 2018 by arsyed
The seven tools of causal inference with reflections on machine learning
The usual great synopsis by Adrian Colyer at The Morning Paper of Judea Pearl's paper on the differences between machine learning models and structural causal models.

See the original paper at
september 2018 by drmeme
Toward Predicting the Outcome of an A/B Experiment for Search Relevance
A standard approach to estimating online click-based metrics of a ranking function is to run it in a controlled experiment on live users. While reliable and popular in practice, configuring and running an online experiment is cumbersome and time-intensive. In this work, inspired by recent successes of offline evaluation techniques for recommender systems, we study an alternative that uses historical search log to reliably predict online click-based metrics of a new ranking function, without actually running it on live users. To tackle novel challenges encountered in Web search, variations of the basic techniques are proposed. The first is to take advantage of diversified behavior of a search engine over a long period of time to simulate randomized data collection, so that our approach can be used at very low cost. The second is to replace exact matching (of recommended items in previous work) by fuzzy matching (of search result pages) to increase data efficiency, via a better trade-off of bias and variance. Extensive experimental results based on large-scale real search data from a major commercial search engine in the US market demonstrate our approach is promising and has potential for wide use in Web search.
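The basic offline technique the abstract builds on can be sketched as an inverse-propensity estimate with exact matching. This is a generic illustration, not the paper's estimator: `LogRecord`, its fields, and `ips_click_estimate` are hypothetical names, and the sketch assumes each log entry records the probability with which the logging system showed the result. The fuzzy matching the abstract proposes is not implemented here.

```python
from typing import Callable, NamedTuple, Sequence


class LogRecord(NamedTuple):
    """One logged impression (hypothetical schema for this sketch)."""
    query: str
    shown: str         # result page or item that was actually displayed
    propensity: float  # probability the logging system displayed it
    clicked: int       # 1 if the user clicked, else 0


def ips_click_estimate(
    logs: Sequence[LogRecord], new_ranker: Callable[[str], str]
) -> float:
    """Estimate the click rate new_ranker would get, using logs only."""
    if not logs:
        raise ValueError("need at least one log record")
    total = 0.0
    for rec in logs:
        # Exact matching: a record counts only if the new ranker would
        # have shown exactly what was logged for this query.
        if new_ranker(rec.query) == rec.shown:
            total += rec.clicked / rec.propensity
    return total / len(logs)


if __name__ == "__main__":
    logs = [
        LogRecord("q", "A", 0.5, 1),
        LogRecord("q", "B", 0.5, 0),
        LogRecord("q", "A", 0.5, 1),
        LogRecord("q", "B", 0.5, 1),
    ]
    print(ips_click_estimate(logs, lambda query: "A"))
    print(ips_click_estimate(logs, lambda query: "B"))
```

Exact matching discards every record where the new ranker disagrees with the log, so the estimate gets noisy when matches are rare. That is the data-efficiency problem the abstract says fuzzy matching addresses, at the cost of some bias.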
