Global Wealth Inequality

5 weeks ago by rvenkat

This article reviews the recent literature on the dynamics of global wealth inequality. I first reconcile available estimates of wealth inequality in the United States. Both surveys and tax data show that wealth inequality has increased dramatically since the 1980s, with a top 1% wealth share around 40% in 2016 vs. 25–30% in the 1980s. Second, I discuss the fast-growing literature on wealth inequality across the world. Evidence points towards a rise in global wealth concentration: for China, Europe, and the United States combined, the top 1% wealth share has increased from 28% in 1980 to 33% today, while the bottom 75% share hovered around 10%. Recent studies, however, may under-estimate the level and rise of inequality, as financial globalization makes it increasingly hard to measure wealth at the top. I discuss how new data sources (leaks from financial institutions, tax amnesties, and macroeconomic statistics of tax havens) can be leveraged to better capture the wealth of the rich.

wealth
inequality
world_trends
economics
globalization
finance
review
5 weeks ago by rvenkat

[1507.00477] Mechanistic Models in Computational Social Science

6 weeks ago by rvenkat

Quantitative social science is not only about regression analysis or, in general, data inference. Computer simulations of social mechanisms have an over 60 years long history. They have been used for many different purposes -- to test scenarios, to test the consistency of descriptive theories (proof-of-concept models), to explore emergent phenomena, for forecasting, etc. In this essay, we sketch these historical developments, the role of mechanistic models in the social sciences and the influences from the natural and formal sciences. We argue that mechanistic computational models form a natural common ground for social and natural sciences, and look forward to possible future information flow across the social-natural divide.

-- Pretty straightforward; nothing interesting here.

computational_social_science
explanation
simulation
review
6 weeks ago by rvenkat

[1810.08810] The Frontiers of Fairness in Machine Learning

8 weeks ago by rvenkat

The last few years have seen an explosion of academic and popular interest in algorithmic fairness. Despite this interest and the volume and velocity of work that has been produced recently, the fundamental science of fairness in machine learning is still in a nascent state. In March 2018, we convened a group of experts as part of a CCC visioning workshop to assess the state of the field, and distill the most promising research directions going forward. This report summarizes the findings of that workshop. Along the way, it surveys recent theoretical work in the field and points towards promising directions for research.

review
algorithmic_fairness
machine_learning
8 weeks ago by rvenkat

[1801.07351] Tracking network dynamics: a survey of distances and similarity metrics

10 weeks ago by rvenkat

From longitudinal biomedical studies to social networks, graphs have emerged as a powerful framework for describing evolving interactions between agents in complex systems. In such studies, after pre-processing, the data can be represented by a set of graphs, each representing a system's state at different points in time. The analysis of the system's dynamics depends on the selection of the appropriate analytical tools. After characterizing similarities between states, a critical step lies in the choice of a distance between graphs capable of reflecting such similarities. While the literature offers a number of distances that one could a priori choose from, their properties have been little investigated and no guidelines regarding the choice of such a distance have yet been provided. In particular, most graph distances consider that the nodes are exchangeable and do not take into account node identities. Accounting for the alignment of the graphs enables us to enhance these distances' sensitivity to perturbations in the network and detect important changes in graph dynamics. Thus the selection of an adequate metric is a decisive --yet delicate--practical matter.

In the spirit of Goldenberg, Zheng and Fienberg's seminal 2009 review, the purpose of this article is to provide an overview of commonly-used graph distances and an explicit characterization of the structural changes that they are best able to capture. We use as a guiding thread to our discussion the application of these distances to the analysis of both a longitudinal microbiome dataset and a brain fMRI study. We show examples of using permutation tests to detect the effect of covariates on the graphs' variability. Synthetic examples provide intuition as to the qualities and drawbacks of the different distances. Above all, we provide some guidance for choosing one distance over another in certain types of applications.
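As a small illustration of what a distance between aligned graphs looks like, the sketch below compares two snapshots of the same node set by their edge lists. The Hamming and Jaccard distances are chosen here only as familiar examples; the abstract does not name them, and the function names and toy snapshots are hypothetical.

```python
def edge_set(edges):
    """Normalise an undirected edge list so that (u, v) and (v, u) coincide."""
    return {frozenset(e) for e in edges}


def hamming_distance(edges_a, edges_b):
    """Number of edges present in exactly one of the two graphs."""
    return len(edge_set(edges_a) ^ edge_set(edges_b))


def jaccard_distance(edges_a, edges_b):
    """One minus the fraction of shared edges among all edges seen in either graph."""
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    if not union:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - len(a & b) / len(union)


# Toy snapshots (assumed data): node labels are shared across time points,
# so the graphs are already aligned.
snapshot_t0 = [(1, 2), (2, 3), (3, 4)]
snapshot_t1 = [(2, 1), (2, 3), (1, 4)]

print(hamming_distance(snapshot_t0, snapshot_t1))
print(jaccard_distance(snapshot_t0, snapshot_t1))
```

Both distances rely on node identities, which is the aligned setting the abstract contrasts with distances that treat nodes as exchangeable.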

temporal_networks
review
network_data_analysis
teaching
?
for_friends
10 weeks ago by rvenkat

Scientific communication in a post-truth society | PNAS

december 2018 by rvenkat

Within the scientific community, much attention has focused on improving communications between scientists, policy makers, and the public. To date, efforts have centered on improving the content, accessibility, and delivery of scientific communications. Here we argue that in the current political and media environment faulty communication is no longer the core of the problem. Distrust in the scientific enterprise and misperceptions of scientific knowledge increasingly stem less from problems of communication and more from the widespread dissemination of misleading and biased information. We describe the profound structural shifts in the media environment that have occurred in recent decades and their connection to public policy decisions and technological changes. We explain how these shifts have enabled unscrupulous actors with ulterior motives increasingly to circulate fake news, misinformation, and disinformation with the help of trolls, bots, and respondent-driven algorithms. We document the high degree of partisan animosity, implicit ideological bias, political polarization, and politically motivated reasoning that now prevail in the public sphere and offer an actual example of how clearly stated scientific conclusions can be systematically perverted in the media through an internet-based campaign of disinformation and misinformation. We suggest that, in addition to attending to the clarity of their communications, scientists must also develop online strategies to counteract campaigns of misinformation and disinformation that will inevitably follow the release of findings threatening to partisans on either end of the political spectrum.

-- restricts itself to a smaller subset of problems; ignores the fact that politically motivated disinformation and misinformation coexist alongside the more innocuous-looking, socially sanctioned campaigns of hype conducted by researchers and universities themselves.

science_journalism
misinformation
disinformation
public_perception_of_science
review
via:nyhan
december 2018 by rvenkat

[1804.10068] Quantum machine learning for data scientists

november 2018 by rvenkat

This text aims to present and explain quantum machine learning algorithms to a data scientist in an accessible and consistent way. The algorithms and equations presented are not written in rigorous mathematical fashion, instead, the pressure is put on examples and step by step explanation of difficult topics. This contribution gives an overview of selected quantum machine learning algorithms, however there is also a method of scores extraction for quantum PCA algorithm proposed as well as a new cost function in feed-forward quantum neural networks is introduced. The text is divided into four parts: the first part explains the basic quantum theory, then quantum computation and quantum computer architecture are explained in section two. The third part presents quantum algorithms which will be used as subroutines in quantum machine learning algorithms. Finally, the fourth section describes quantum machine learning algorithms with the use of knowledge accumulated in previous parts.

-- Ah, even before they have anything useful to offer, there is an article aimed at the data scientists! Not recommended (I did not read it)

quantum_computing
machine_learning
review
november 2018 by rvenkat

Quantum machine learning | Nature

november 2018 by rvenkat

Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

-- recommend Aaronson's moderation along with this paper

https://www.nature.com/articles/nphys3272

quantum_computing
machine_learning
review
november 2018 by rvenkat

The Institutional Turn in Comparative Authoritarianism | British Journal of Political Science | Cambridge Core

november 2018 by rvenkat

The institutional turn in comparative authoritarianism has generated wide interest. This article reviews three prominent books on authoritarian institutions and their central theoretical propositions about the origins, functions and effects of dominant party institutions on authoritarian rule. Two critical perspectives on political institutions, one based on rationalist theories of institutional design and the other based on a social conflict theory of political economy, suggest that authoritarian institutions are epiphenomenal on more fundamental political, social and/or economic relations. Such approaches have been largely ignored in this recent literature, but each calls into question the theoretical and empirical claims that form the basis of institutionalist approaches to authoritarian rule. A central implication of this article is that authoritarian institutions cannot be studied separately from the concrete problems of redistribution and policy making that motivate regime behavior.

comparative
political_science
authoritarianism
institutions
review
via:henryfarrell
november 2018 by rvenkat

Glia as architects of central nervous system formation and function | Science

november 2018 by rvenkat

Glia constitute roughly half of the cells of the central nervous system (CNS) but were long-considered to be static bystanders to its formation and function. Here we provide an overview of how the diverse and dynamic functions of glial cells orchestrate essentially all aspects of nervous system formation and function. Radial glia, astrocytes, oligodendrocyte progenitor cells, oligodendrocytes, and microglia each influence nervous system development, from neuronal birth, migration, axon specification, and growth through circuit assembly and synaptogenesis. As neural circuits mature, distinct glia fulfill key roles in synaptic communication, plasticity, homeostasis, and network-level activity through dynamic monitoring and alteration of CNS structure and function. Continued elucidation of glial cell biology, and the dynamic interactions of neurons and glia, will enrich our understanding of nervous system formation, health, and function.

neuroscience
review
philosophy_of_science
november 2018 by rvenkat

[1809.10756] An Introduction to Probabilistic Programming

october 2018 by rvenkat

This document is designed to be a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages.

We start with a discussion of model-based reasoning and explain why conditioning as a foundational computation is central to the fields of probabilistic machine learning and artificial intelligence. We then introduce a simple first-order probabilistic programming language (PPL) whose programs define static-computation-graph, finite-variable-cardinality models. In the context of this restricted PPL we introduce fundamental inference algorithms and describe how they can be implemented in the context of models denoted by probabilistic programs.

In the second part of this document, we introduce a higher-order probabilistic programming language, with a functionality analogous to that of established programming languages. This affords the opportunity to define models with dynamic computation graphs, at the cost of requiring inference methods that generate samples by repeatedly executing the program. Foundational inference algorithms for this kind of probabilistic programming language are explained in the context of an interface between program executions and an inference controller.

This document closes with a chapter on advanced topics which we believe to be, at the time of writing, interesting directions for probabilistic programming research; directions that point towards a tight integration with deep neural network research and the development of systems for next-generation artificial intelligence applications.

probabilistic_programming
machine_learning
tutorial
review
via:droy
october 2018 by rvenkat

Active Matter

august 2018 by rvenkat

--collection of Nature journal articles on the topic.

review
active_matter
non-equilibrium
dynamical_system
self_organization
emergence
for_friends
teaching
august 2018 by rvenkat

[1808.00023] The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning

august 2018 by rvenkat

The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes---like race, gender, and their proxies---are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area.
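To make definition (2), classification parity, concrete, here is a minimal sketch that compares false positive rates across groups. The function names and the toy labels are assumptions for illustration; nothing here comes from the paper's own code or data.

```python
def false_positive_rate(y_true, y_pred):
    """Share of true negatives (label 0) that were predicted positive (1)."""
    negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not negatives:
        raise ValueError("no negative examples; false positive rate is undefined")
    return sum(negatives) / len(negatives)


def fpr_by_group(y_true, y_pred, groups):
    """False positive rate computed separately for each protected group."""
    rates = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        rates[g] = false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return rates


# Toy data (assumed): binary outcomes and decisions for two groups.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

print(fpr_by_group(y_true, y_pred, groups))
```

Classification parity on this metric would require the per-group rates to be equal; the abstract's argument is that enforcing such equality can work against the groups it is meant to protect.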

machine_learning
algorithms
bias
ethics
privacy
review
for_friends
august 2018 by rvenkat

Crane : The Ubiquitous Ewens Sampling Formula

july 2018 by rvenkat

Ewens’s sampling formula exemplifies the harmony of mathematical theory, statistical application, and scientific discovery. The formula not only contributes to the foundations of evolutionary molecular genetics, the neutral theory of biodiversity, Bayesian nonparametrics, combinatorial stochastic processes, and inductive inference but also emerges from fundamental concepts in probability theory, algebra, and number theory. With an emphasis on its far-reaching influence throughout statistics and probability, we highlight these and many other consequences of Ewens’s seminal discovery.
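The abstract does not write the formula out. In its standard form, a partition of n items with a_j blocks of size j has probability n! / (θ(θ+1)···(θ+n−1)) · ∏_j θ^{a_j} / (j^{a_j} a_j!). The sketch below evaluates this and checks that it sums to one over all partitions of n; the function names are hypothetical.

```python
from collections import Counter
from math import factorial


def ewens_probability(block_sizes, theta):
    """Ewens sampling formula for a partition given as a list of block sizes."""
    n = sum(block_sizes)
    counts = Counter(block_sizes)  # counts[j] = a_j, the number of blocks of size j
    rising = 1.0
    for i in range(n):
        rising *= theta + i  # theta * (theta + 1) * ... * (theta + n - 1)
    prob = factorial(n) / rising
    for j, a_j in counts.items():
        prob *= theta ** a_j / (j ** a_j * factorial(a_j))
    return prob


def partitions(n, max_part=None):
    """Yield every integer partition of n as a non-increasing list of parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            yield [k] + rest


total = sum(ewens_probability(p, theta=1.5) for p in partitions(5))
print(total)
```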

probability
statistics
review
combinatorics
july 2018 by rvenkat

Ding, Li : Causal Inference: A Missing Data Perspective

july 2018 by rvenkat

Inferring causal effects of treatments is a central goal in many disciplines. The potential outcomes framework is a main statistical approach to causal inference, in which a causal effect is defined as a comparison of the potential outcomes of the same units under different treatment conditions. Because for each unit at most one of the potential outcomes is observed and the rest are missing, causal inference is inherently a missing data problem. Indeed, there is a close analogy in the terminology and the inferential framework between causal inference and missing data. Despite the intrinsic connection between the two subjects, statistical analyses of causal inference and missing data also have marked differences in aims, settings and methods. This article provides a systematic review of causal inference from the missing data perspective. Focusing on ignorable treatment assignment mechanisms, we discuss a wide range of causal inference methods that have analogues in missing data analysis, such as imputation, inverse probability weighting and doubly robust methods. Under each of the three modes of inference—Frequentist, Bayesian and Fisherian randomization—we present the general structure of inference for both finite-sample and super-population estimands, and illustrate via specific examples. We identify open questions to motivate more research to bridge the two fields.
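Inverse probability weighting, one of the methods the abstract lists, can be sketched in a few lines. This is a minimal Horvitz-Thompson style estimate of the average treatment effect that assumes the propensity scores are known; the function name and the toy numbers are assumptions for illustration, not the article's examples.

```python
def ipw_ate(treatment, outcome, propensity):
    """Inverse-probability-weighted estimate of the average treatment effect.

    treatment:  0/1 indicators of which units were treated
    outcome:    observed outcomes (the unobserved potential outcome is missing)
    propensity: P(treated | covariates) for each unit, strictly between 0 and 1
    """
    n = len(outcome)
    treated_mean = sum(
        t * y / e for t, y, e in zip(treatment, outcome, propensity)
    ) / n
    control_mean = sum(
        (1 - t) * y / (1 - e) for t, y, e in zip(treatment, outcome, propensity)
    ) / n
    return treated_mean - control_mean


# Toy data (assumed): a coin-flip assignment, so every propensity is 0.5.
treatment = [1, 0, 1, 0]
outcome = [3.0, 1.0, 5.0, 1.0]
propensity = [0.5, 0.5, 0.5, 0.5]

print(ipw_ate(treatment, outcome, propensity))
```

Each observed outcome is weighted by the inverse of the probability that it was observed, which is the same device used for missing data under an ignorable missingness mechanism.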

causal_inference
statistics
review
july 2018 by rvenkat

Aldous : Elo Ratings and the Sports Model: A Neglected Topic in Applied Probability?

july 2018 by rvenkat

In a simple model for sports, the probability A beats B is a specified function of their difference in strength. One might think this would be a staple topic in Applied Probability textbooks (like the Galton–Watson branching process model, for instance) but it is curiously absent. Our first purpose is to point out that the model suggests a wide range of questions, suitable for “undergraduate research” via simulation but also challenging as professional research. Our second, more specific, purpose concerns Elo-type rating algorithms for tracking changing strengths. There has been little foundational research on their accuracy, despite a much-copied “30 matches suffice” claim, which our simulation study casts doubt upon.
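The kind of simulation the abstract has in mind can be sketched as follows. The abstract only says the win probability is "a specified function" of the strength difference; the logistic curve with scale 400 and the update constant K = 32 used below are conventional Elo choices assumed here, and all names are hypothetical.

```python
import random


def win_probability(strength_diff, scale=400.0):
    """Assumed logistic model: P(A beats B) as a function of strength_A - strength_B."""
    return 1.0 / (1.0 + 10.0 ** (-strength_diff / scale))


def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return the updated (rating_a, rating_b) after a single match."""
    expected_a = win_probability(rating_a - rating_b)
    delta = k * ((1.0 if a_won else 0.0) - expected_a)
    return rating_a + delta, rating_b - delta


def simulate(true_strengths, n_matches, seed=0):
    """Play random pairings under the true strengths and track Elo ratings."""
    rng = random.Random(seed)
    ratings = [1500.0] * len(true_strengths)
    for _ in range(n_matches):
        i, j = rng.sample(range(len(true_strengths)), 2)
        p_i_wins = win_probability(true_strengths[i] - true_strengths[j])
        i_won = rng.random() < p_i_wins
        ratings[i], ratings[j] = elo_update(ratings[i], ratings[j], i_won)
    return ratings


print(simulate([1400.0, 1500.0, 1600.0], n_matches=200))
```

Comparing the returned ratings with the true strengths for different values of `n_matches` is one way to probe how many matches are needed before the ratings become informative.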

probability
statistics
review
july 2018 by rvenkat

[1611.00814] Information-theoretic thresholds from the cavity method

july 2018 by rvenkat

Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we establish a formula for the mutual information in statistical inference problems induced by random graphs and we show that the mutual information holds the key to understanding certain important phase transitions in random graph models. We work out several concrete applications of these general results. For instance, we pinpoint the exact condensation phase transition in the Potts antiferromagnet on the random graph, thereby improving prior approximate results [Contucci et al.: Communications in Mathematical Physics 2013]. Further, we prove the conjecture from [Krzakala et al.: PNAS 2007] about the condensation phase transition in the random graph coloring problem for any number q≥3 of colors. Moreover, we prove the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al.: Phys. Rev. E 2011]. Additionally, our general result implies the conjectured formula for the mutual information in Low-Density Generator Matrix codes [Montanari: IEEE Transactions on Information Theory 2005].

review
spin_glass
information_theory
phase_transition
july 2018 by rvenkat

On the nature and use of models in network neuroscience | Nature Reviews Neuroscience

july 2018 by rvenkat

Network theory provides an intuitively appealing framework for studying relationships among interconnected brain mechanisms and their relevance to behaviour. As the space of its applications grows, so does the diversity of meanings of the term network model. This diversity can cause confusion, complicate efforts to assess model validity and efficacy, and hamper interdisciplinary collaboration. In this Review, we examine the field of network neuroscience, focusing on organizing principles that can help overcome these challenges. First, we describe the fundamental goals in constructing network models. Second, we review the most common forms of network models, which can be described parsimoniously along the following three primary dimensions: from data representations to first-principles theory; from biophysical realism to functional phenomenology; and from elementary descriptions to coarse-grained approximations. Third, we draw on biology, philosophy and other disciplines to establish validation principles for these models. We close with a discussion of opportunities to bridge model types and point to exciting frontiers for future pursuits.

review
networks
neuroscience
july 2018 by rvenkat

[1309.6928] Structure and dynamics of core-periphery networks

july 2018 by rvenkat

Recent studies uncovered important core/periphery network structures characterizing complex sets of cooperative and competitive interactions between network nodes, be they proteins, cells, species or humans. Better characterization of the structure, dynamics and function of core/periphery networks is a key step of our understanding cellular functions, species adaptation, social and market changes. Here we summarize the current knowledge of the structure and dynamics of "traditional" core/periphery networks, rich-clubs, nested, bow-tie and onion networks. Comparing core/periphery structures with network modules, we discriminate between global and local cores. The core/periphery network organization lies in the middle of several extreme properties, such as random/condensed structures, clique/star configurations, network symmetry/asymmetry, network assortativity/disassortativity, as well as network hierarchy/anti-hierarchy. These properties of high complexity together with the large degeneracy of core pathways ensuring cooperation and providing multiple options of network flow re-channelling greatly contribute to the high robustness of complex systems. Core processes enable a coordinated response to various stimuli, decrease noise, and evolve slowly. The integrative function of network cores is an important step in the development of a large variety of complex organisms and organizations. In addition to these important features and several decades of research interest, studies on core/periphery networks still have a number of unexplored areas.

review
networks
dynamics
july 2018 by rvenkat

[1802.09679] A guide to Brownian motion and related stochastic processes

july 2018 by rvenkat

This is a guide to the mathematical theory of Brownian motion and related stochastic processes, with indications of how this theory is related to other branches of mathematics, most notably the classical theory of partial differential equations associated with the Laplace and heat operators, and various generalizations thereof. As a typical reader, we have in mind a student, familiar with the basic concepts of probability based on measure theory, at the level of the graduate texts of Billingsley and Durrett, and who wants a broader perspective on the theory of Brownian motion and related stochastic processes than can be found in these texts.

tutorial
review
brownian
stochastic_processes
july 2018 by rvenkat

[1401.4770] Opinion Exchange Dynamics

july 2018 by rvenkat

We survey a range of models of opinion exchange. From the introduction: "The exchange of opinions between individuals is a fundamental social interaction... Moreover, many models in this field are an excellent playground for mathematicians, especially those working in probability, algorithms and combinatorics. The goal of this survey is to introduce such models to mathematicians, and especially to those working in discrete mathematics, information theory, optimization, probability and statistics."

opinion_dynamics
opinion_formation
interating_particle_system
game_theory
review
july 2018 by rvenkat

Economic Consequences of Network Structure

july 2018 by rvenkat

We survey the literature on the economic consequences of the structure of social networks. We develop a taxonomy of "macro" and "micro" characteristics of social-interaction networks and discuss both the theoretical and empirical findings concerning the role of those characteristics in determining learning, diffusion, decisions, and resulting behaviors. We also discuss the challenges of accounting for the endogeneity of networks in assessing the relationship between the patterns of interactions and behaviors.

economics
social_networks
networks
review
matthew.jackson
teaching
july 2018 by rvenkat

Statistical Causality from a Decision-Theoretic Perspective | Annual Review of Statistics and Its Application

july 2018 by rvenkat

We present an overview of the decision-theoretic framework of statistical causality, which is well suited for formulating and solving problems of determining the effects of applied causes. The approach is described in detail, and it is related to and contrasted with other current formulations, such as structural equation models and potential responses. Topics and applications covered include confounding, the effect of treatment on the treated, instrumental variables, and dynamic treatment strategies.

causal_inference
decison_theory
statistics
review
philip.dawid
july 2018 by rvenkat

[1703.10007] A Course in Interacting Particle Systems

july 2018 by rvenkat

These lecture notes give an introduction to the theory of interacting particle systems. The main subjects are the construction using generators and graphical representations, the mean field limit, stochastic order, duality, and the relation to oriented percolation. An attempt is made to give a large number of examples beyond the classical voter, contact and Ising processes and to illustrate these based on numerical simulations.

interating_particle_system
tutorial
review
july 2018 by rvenkat

Persistence and first-passage properties in nonequilibrium systems: Advances in Physics: Vol 62, No 3

june 2018 by rvenkat

In this review, we discuss the persistence and the related first-passage properties in extended many-body nonequilibrium systems. Starting with simple systems with one or few degrees of freedom, such as random walk and random acceleration problems, we progressively discuss the persistence properties in systems with many degrees of freedom. These systems include spin models undergoing phase-ordering dynamics, diffusion equation, fluctuating interfaces, etc. Persistence properties are nontrivial in these systems as the effective underlying stochastic process is non-Markovian. Several exact and approximate methods have been developed to compute the persistence of such non-Markov processes over the last two decades, as reviewed in this article. We also discuss various generalizations of the local site persistence probability. Persistence in systems with quenched disorder is discussed briefly. Although the main emphasis of this review is on the theoretical developments on persistence, we briefly touch upon various experimental systems as well.

first-passage-time
non-equilibrium
dynamics
review
satya.majumdar
june 2018 by rvenkat

The Exit Problem for Randomly Perturbed Dynamical Systems | SIAM Journal on Applied Mathematics | Vol. 33, No. 2 | Society for Industrial and Applied Mathematics

june 2018 by rvenkat

The cumulative effect on dynamical systems, of even very small random perturbations, may be considerable after sufficiently long times. For example, even if the corresponding deterministic system has an asymptotically stable equilibrium point, random effects can cause the trajectories of the system to leave any bounded domain with probability one. In this paper we consider the effect of small random perturbations of the type referred to as Gaussian white noise, on a (deterministic) dynamical system $\dot x = b(x)$. The vector $x(t)$ then becomes a stochastic process $x_\varepsilon (t)$ which satisfies the stochastic differential equation $dx_\varepsilon = b(x_\varepsilon )dt + \varepsilon \sigma (x_\varepsilon )dw$. Here $w(t)$ is the n dimensional Wiener process (Brownian motion), $b(x)$ is a vector field, $\sigma (x)$ is the diffusion matrix and $\varepsilon \ne 0$ is a small real parameter. We give the first complete formal solution of the following problem originally posed by Kolmogorov: Find the asymptotic expansion in $\varepsilon $ of (i) the probability distribution of the points on the boundary of a domain, where trajectories of the perturbed system first exit, and (ii) of the expected exit times. Our method is to relate the solutions of the above problems to the solutions of various singularly perturbed elliptic boundary value problems with turning points, whose solutions are then constructed asymptotically.
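The exit quantities in the abstract can also be estimated numerically. The sketch below integrates a one-dimensional instance of $dx_\varepsilon = b(x_\varepsilon)dt + \varepsilon \sigma(x_\varepsilon)dw$ with the Euler-Maruyama scheme and records when and where a path first leaves an interval. This is a Monte Carlo illustration, not the asymptotic method of the paper; the drift $b(x) = -x$, the interval $(-1, 1)$, the step size and the function names are all assumptions.

```python
import math
import random


def first_exit(b, sigma, eps, x0=0.0, lo=-1.0, hi=1.0,
               dt=0.01, max_steps=1_000_000, rng=None):
    """Simulate one path until it leaves (lo, hi).

    Returns (exit_time, exit_point), or (None, last_x) if the step budget runs out.
    """
    rng = rng or random.Random(0)
    x = x0
    sqrt_dt = math.sqrt(dt)
    for step in range(1, max_steps + 1):
        dw = rng.gauss(0.0, 1.0) * sqrt_dt  # Wiener increment over one step
        x += b(x) * dt + eps * sigma(x) * dw
        if x <= lo or x >= hi:
            return step * dt, x
    return None, x


def mean_exit_time(eps, n_paths=200, seed=0):
    """Average exit time from (-1, 1) for the assumed drift b(x) = -x, sigma = 1."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_paths):
        t, _x = first_exit(lambda x: -x, lambda x: 1.0, eps, rng=rng)
        if t is not None:
            times.append(t)
    return sum(times) / len(times)


print(mean_exit_time(eps=1.0, n_paths=50))
```

Here the origin is an asymptotically stable equilibrium of the deterministic system, yet the noisy paths still leave the interval, and repeating the estimate for smaller `eps` shows the exit time growing rapidly.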

review
first-passage-time
june 2018 by rvenkat

The Exit Problem: A New Approach to Diffusion Across Potential Barriers | SIAM Journal on Applied Mathematics | Vol. 36, No. 3 | Society for Industrial and Applied Mathematics

june 2018 by rvenkat

We consider the problem of a Brownian particle confined in a potential well of forces, which escapes the potential barrier as the result of white noise forces acting on it. The problem is characterized by a diffusion process in a force field and is described by Langevin’s stochastic differential equation. We consider potential wells with many transition states and compute the expected exit time of the particle from the well as well as the probability distribution of the exit points. Our method relates these quantities to the solutions of certain singularly perturbed elliptic boundary value problems which are solved asymptotically. Our results are then applied to the calculation of chemical reaction rates by considering the breaking of chemical bonds caused by random molecular collisions, and to the calculation of the diffusion matrix in crystals by considering random atomic migration in the periodic force field of the crystal lattice, caused by thermal vibrations of the lattice.

review
first-passage-time
june 2018 by rvenkat

A Direct Approach to the Exit Problem | SIAM Journal on Applied Mathematics | Vol. 50, No. 2 | Society for Industrial and Applied Mathematics

june 2018 by rvenkat

This paper considers the problem of exit for a dynamical system driven by small white noise, from the domain of attraction of a stable state. A direct singular perturbation analysis of the forward equation is presented, based on Kramers’ approach, in which the solution to the stationary Fokker–Planck equation is constructed, for a process with absorption at the boundary and a source at the attractor. In this formulation the boundary and matching conditions fully determine the uniform expansion of the solution, without resorting to “external” selection criteria for the expansion coefficients, such as variational principles or the Lagrange identity, as in our previous theory. The exit density and the mean first passage time to the boundary are calculated from the solution of the stationary Fokker–Planck equation as the probability current density and as the inverse of the total flux on the boundary, respectively. As an application, a uniform expansion is constructed for the escape rate in Kramers’ problem of activated escape from a potential well for the full range of the dissipation parameter.

review
first-passage-time
june 2018 by rvenkat

Singular Perturbation Methods in Stochastic Differential Equations of Mathematical Physics | SIAM Review | Vol. 22, No. 2 | Society for Industrial and Applied Mathematics

june 2018 by rvenkat

Stochastic differential equations are used as models for various physical phenomena, such as chemical reactions, atomic migration in crystals, thermal fluctuations in electrical networks, noisy signals in radio transmission, etc. First passage times of solutions of such equations from certain domains and the distribution of the exit points are computed from the solutions of singularly perturbed elliptic boundary value problems. Physical interpretation of these quantities is given. Applications in communication theory and in reliability of structures are shown.

review
first-passage-time
stochastic_processes
dynamical_system
june 2018 by rvenkat

Stochastic Actor-Oriented Models for Network Dynamics | Annual Review of Statistics and Its Application

june 2018 by rvenkat

This article discusses the stochastic actor-oriented model for analyzing panel data of networks. The model is defined as a continuous-time Markov chain, observed at two or more discrete time moments. It can be regarded as a generalized linear model with a large amount of missing data. Several estimation methods are discussed. After presenting the model for evolution of networks, attention is given to coevolution models. These use the same approach of a continuous-time Markov chain observed at a small number of time points, but now with an extended state space. The state space can be, for example, the combination of a network and nodal variables, or a combination of several networks. This leads to models for the dynamics of multivariate networks. The article emphasizes the approach to modeling and algorithmic issues for estimation; some attention is given to comparison with other models.

review
networks
dynamics
june 2018 by rvenkat

[1605.00316] Directional Statistics in Machine Learning: a Brief Review

june 2018 by rvenkat

The modern data analyst must cope with data encoded in various forms, such as vectors, matrices, strings, graphs, or more. Consequently, statistical and machine learning models tailored to different data encodings are important. We focus on data encoded as normalized vectors, so that their "direction" is more important than their magnitude. Specifically, we consider high-dimensional vectors that lie either on the surface of the unit hypersphere or on the real projective plane. For such data, we briefly review common mathematical models prevalent in machine learning, while also outlining some technical aspects, software, applications, and open mathematical challenges.

review
statistics
machine_learning
differential_geometry
june 2018 by rvenkat

Optimization Methods for Large-Scale Machine Learning | SIAM Review | Vol. 60, No. 2 | Society for Industrial and Applied Mathematics

may 2018 by rvenkat

This paper provides a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications. Through case studies on text classification and the training of deep neural networks, we discuss how optimization problems arise in machine learning and what makes them challenging. A major theme of our study is that large-scale machine learning represents a distinctive setting in which the stochastic gradient (SG) method has traditionally played a central role while conventional gradient-based nonlinear optimization techniques typically falter. Based on this viewpoint, we present a comprehensive theory of a straightforward, yet versatile SG algorithm, discuss its practical behavior, and highlight opportunities for designing algorithms with improved performance. This leads to a discussion about the next generation of optimization methods for large-scale machine learning, including an investigation of two main streams of research on techniques that diminish noise in the stochastic directions and methods that make use of second-order derivative approximations.
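As a rough illustration of the stochastic gradient (SG) method the abstract refers to, here is a minimal sketch under simple assumptions: a one-feature least-squares model, one sample per update, and a constant step size. The function `sgd_least_squares` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def sgd_least_squares(xs, ys, lr=0.1, epochs=200, rng=random):
    """Fit y ~ w * x + c by stochastic gradient descent, one sample per step."""
    w, c = 0.0, 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)  # visit the samples in a fresh random order each epoch
        for i in idx:
            err = (w * xs[i] + c) - ys[i]  # residual on a single sample
            # Gradient of 0.5 * err**2 with respect to (w, c) is (err * x, err).
            w -= lr * err * xs[i]
            c -= lr * err
    return w, c


if __name__ == "__main__":
    xs = [0.0, 0.25, 0.5, 0.75, 1.0]
    ys = [2.0 * x + 1.0 for x in xs]  # noise-free data on the line y = 2x + 1
    print(sgd_least_squares(xs, ys, lr=0.5, rng=random.Random(0)))
```

Each update touches a single sample, so the cost per step does not grow with the dataset size, which is the property that makes the method attractive at large scale.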

review
optimization
machine_learning
may 2018 by rvenkat

Configuring Random Graph Models with Fixed Degree Sequences | SIAM Review | Vol. 60, No. 2 | Society for Industrial and Applied Mathematics

may 2018 by rvenkat

Random graph null models have found widespread application in diverse research communities analyzing network datasets, including social, information, and economic networks, as well as food webs, protein-protein interactions, and neuronal networks. The most popular random graph null models, called configuration models, are defined as uniform distributions over a space of graphs with a fixed degree sequence. Commonly, properties of an empirical network are compared to properties of an ensemble of graphs from a configuration model in order to quantify whether empirical network properties are meaningful or whether they are instead a common consequence of the particular degree sequence. In this work we study the subtle but important decisions underlying the specification of a configuration model, and we investigate the role these choices play in graph sampling procedures and a suite of applications. We place particular emphasis on the importance of specifying the appropriate graph labeling---stub-labeled or vertex-labeled---under which to consider a null model, a choice that closely connects the study of random graphs to the study of random contingency tables. We show that the choice of graph labeling is inconsequential for studies of simple graphs, but can have a significant impact on analyses of multigraphs or graphs with self-loops. The importance of these choices is demonstrated through a series of three in-depth vignettes, analyzing three different network datasets under many different configuration models and observing substantial differences in study conclusions under different models. We argue that in each case, only one of the possible configuration models is appropriate. While our work focuses on undirected static networks, it aims to guide the study of directed networks, dynamic networks, and all other network contexts that are suitably studied through the lens of random graph null models.
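To make the stub-labeled picture concrete, the sketch below pairs half-edges ("stubs") uniformly at random, which is the textbook stub-matching procedure. It is an illustration only and not necessarily one of the samplers studied in the paper; the function names are hypothetical. The result may contain self-loops and multi-edges, so it samples from the loopy multigraph space rather than from simple graphs.

```python
import random
from collections import Counter


def stub_matching(degrees, rng=random):
    """Pair half-edges uniformly at random for a given degree sequence.

    Returns a list of edges (u, v); self-loops and multi-edges may occur.
    """
    if sum(degrees) % 2:
        raise ValueError("degree sequence must have an even sum")
    # Vertex v contributes degrees[v] stubs.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list are joined into edges.
    return list(zip(stubs[0::2], stubs[1::2]))


def degree_sequence(edges, n):
    """Recompute vertex degrees; a self-loop adds 2 to its vertex."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return [counts[i] for i in range(n)]


if __name__ == "__main__":
    degrees = [3, 2, 2, 1]
    edges = stub_matching(degrees, random.Random(1))
    print(edges, degree_sequence(edges, len(degrees)))
```

Comparing a statistic of an empirical network with its distribution over many such samples is the null-model comparison the abstract describes.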

networks
review
simulation
may 2018 by rvenkat

[1703.10146] Community detection and stochastic block models: recent developments

april 2018 by rvenkat

The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and provides generally a fertile ground to study the statistical and computational tradeoffs that arise in network and data sciences.

This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a., detection). The main results discussed are the phase transitions for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters and the gap between information-theoretic and computational thresholds.

The note also covers some of the algorithms developed in the quest of achieving the limits, in particular two-round algorithms via graph-splitting, semi-definite programming, linearized belief propagation, classical and nonbacktracking spectral methods. A few open problems are also discussed.
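For readers new to the model, a minimal sketch of sampling from a two-parameter SBM follows. It assumes an undirected graph, fixed community labels, and just two edge probabilities (`p_in` within a community, `p_out` across communities); the function name and parameters are hypothetical, and the general SBM allows an arbitrary matrix of connection probabilities.

```python
import random


def sample_sbm(labels, p_in, p_out, rng=random):
    """Sample an undirected graph with planted communities.

    labels[i] is the community of vertex i. Each pair of vertices is joined
    independently, with probability p_in if the two labels agree and p_out
    otherwise.
    """
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return edges


if __name__ == "__main__":
    labels = [0] * 5 + [1] * 5
    print(len(sample_sbm(labels, 0.8, 0.1, random.Random(0))))
```

Community detection is the inverse task: recover `labels` (exactly, partially, or better than chance) from the edges alone.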

networks
block_model
teaching
community_detection
review
april 2018 by rvenkat

[1508.01303] Modern temporal network theory: A colloquium

april 2018 by rvenkat

The power of any kind of network approach lies in the ability to simplify a complex system so that one can better understand its function as a whole. Sometimes it is beneficial, however, to include more information than in a simple graph of only nodes and links. Adding information about times of interactions can make predictions and mechanistic understanding more accurate. The drawback, however, is that there are not so many methods available, partly because temporal networks is a relatively young field, partly because it is more difficult to develop such methods than for static networks. In this colloquium, we review the methods to analyze and model temporal networks and processes taking place on them, focusing mainly on the last three years. This includes the spreading of infectious disease, opinions, rumors, in social networks; information packets in computer networks; various types of signaling in biology, and more. We also discuss future directions.

temporal_networks
review
networks
teaching
april 2018 by rvenkat

[1803.08823] A high-bias, low-variance introduction to Machine Learning for physicists

april 2018 by rvenkat

Machine Learning (ML) is one of the most exciting and dynamic areas of modern research and application. The purpose of this review is to provide an introduction to the core concepts and tools of machine learning in a manner easily understood and intuitive to physicists. The review begins by covering fundamental concepts in ML and modern statistics such as the bias-variance tradeoff, overfitting, regularization, and generalization before moving on to more advanced topics in both supervised and unsupervised learning. Topics covered in the review include ensemble models, deep learning and neural networks, clustering and data visualization, energy-based models (including MaxEnt models and Restricted Boltzmann Machines), and variational methods. Throughout, we emphasize the many natural connections between ML and statistical physics. A notable aspect of the review is the use of Python notebooks to introduce modern ML/statistical packages to readers using physics-inspired datasets (the Ising Model and Monte-Carlo simulations of supersymmetric decays of proton-proton collisions). We conclude with an extended outlook discussing possible uses of machine learning for furthering our understanding of the physical world as well as open problems in ML where physicists may be able to contribute. (Notebooks are available at this https URL)

tutorial
review
machine_learning
deep_learning
statistical_mechanics
physics
python
teaching
april 2018 by rvenkat

[1609.00066] A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution

march 2018 by rvenkat

The Poisson distribution has been widely studied and used for modeling univariate count-valued data. Multivariate generalizations of the Poisson distribution that permit dependencies, however, have been far less popular. Yet, real-world high-dimensional count-valued data found in word counts, genomics, and crime statistics, for example, exhibit rich dependencies, and motivate the need for multivariate distributions that can appropriately model this data. We review multivariate distributions derived from the univariate Poisson, categorizing these models into three main classes: 1) where the marginal distributions are Poisson, 2) where the joint distribution is a mixture of independent multivariate Poisson distributions, and 3) where the node-conditional distributions are derived from the Poisson. We discuss the development of multiple instances of these classes and compare the models in terms of interpretability and theory. Then, we empirically compare multiple models from each class on three real-world datasets that have varying data characteristics from different domains, namely traffic accident data, biological next generation sequencing data, and text data. These empirical experiments develop intuition about the comparative advantages and disadvantages of each class of multivariate distribution that was derived from the Poisson. Finally, we suggest new research directions as explored in the subsequent discussion section.

count_models
statistics
data_analysis
review
march 2018 by rvenkat

Network centrality: an introduction

march 2018 by rvenkat

--This, Newman et al, and Jackson et al paper for student projects?

networks
teaching
software
python
review
march 2018 by rvenkat

The science of fake news | Science

march 2018 by rvenkat

The rise of fake news highlights the erosion of long-standing institutional bulwarks against misinformation in the internet age. Concern over the problem is global. However, much remains unknown regarding the vulnerabilities of individuals, institutions, and society to manipulations by malicious actors. A new system of safeguards is needed. Below, we discuss extant social and computer science research regarding belief in fake news and the mechanisms by which it spreads. Fake news has a long history, but we focus on unanswered scientific questions raised by the proliferation of its most recent, politically oriented incarnation. Beyond selected references in the text, suggested further reading can be found in the supplementary materials.

review
report
misinformation
disinformation
contagion
journalism
news_media
networks
dmce
teaching
march 2018 by rvenkat

[1710.07035] Generative Adversarial Networks: An Overview

february 2018 by rvenkat

Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.

tutorial
review
deep_learning
adversarial_examples
february 2018 by rvenkat

[1607.00699] The State of Applied Econometrics - Causality and Policy Evaluation

february 2018 by rvenkat

In this paper we discuss recent developments in econometrics that we view as important for empirical researchers working on policy evaluation questions. We focus on three main areas, where in each case we highlight recommendations for applied work. First, we discuss new research on identification strategies in program evaluation, with particular focus on synthetic control methods, regression discontinuity, external validity, and the causal interpretation of regression methods. Second, we discuss various forms of supplementary analyses to make the identification strategies more credible. These include placebo analyses as well as sensitivity and robustness analyses. Third, we discuss recent advances in machine learning methods for causal effects. These advances include methods to adjust for differences between treated and control units in high-dimensional settings, and methods for identifying and estimating heterogeneous treatment effects.

causal_inference
intervention
econometrics
review
susan.athey
february 2018 by rvenkat

Social Mobilization | Annual Review of Psychology

february 2018 by rvenkat

This article reviews research from several behavioral disciplines to derive strategies for prompting people to perform behaviors that are individually costly and provide negligible individual or social benefits but are meaningful when performed by a large number of individuals. Whereas the term social influence encompasses all the ways in which people influence other people, social mobilization refers specifically to principles that can be used to influence a large number of individuals to participate in such activities. The motivational force of social mobilization is amplified by the fact that others benefit from the encouraged behaviors, and its overall impact is enhanced by the fact that people are embedded within social networks. This article may be useful to those interested in the provision of public goods, collective action, and prosocial behavior, and we give special attention to field experiments on election participation, environmentally sustainable behaviors, and charitable giving.

https://scholar.harvard.edu/files/todd_rogers/files/rogers_goldstein_fox.2017.pdf

collective_action
political_economy
public_goods
social_behavior
intervention
review
social_networks
networks
dmce
teaching
via:nyhan
february 2018 by rvenkat

[1204.6265] Statistical inference for dynamical systems: a review

january 2018 by rvenkat

The topic of statistical inference for dynamical systems has been studied extensively across several fields. In this survey we focus on the problem of parameter estimation for non-linear dynamical systems. Our objective is to place results across distinct disciplines in a common setting and highlight opportunities for further research.

statistics
dynamical_system
differential_geometry
topological_data_analysis
review
january 2018 by rvenkat

Natural Experiments in Macroeconomics

november 2017 by rvenkat

A growing literature relies on natural experiments to establish causal effects in macroeconomics. In diverse applications, natural experiments have been used to verify underlying assumptions of conventional models, quantify specific model parameters, and identify mechanisms that have major effects on macroeconomic quantities but are absent from conventional models. We discuss and compare the use of natural experiments across these different applications and summarize what they have taught us about such diverse subjects as the validity of the Permanent Income Hypothesis, the size of the fiscal multiplier, and about the effects of institutions, social structure, and culture on economic growth. We also outline challenges for future work in each of these fields, give guidance for identifying useful natural experiments, and discuss the strengths and weaknesses of the approach.

https://www.wiwi.uni-frankfurt.de/profs/fuchs/staff/fuchs/paper/PaperNaturalExperimentsMacroNBER.pdf

review
natural_experiment
econometrics
causal_inference
via:noahpinion
november 2017 by rvenkat

Islam and Economic Performance: Historical and Contemporary Links

october 2017 by rvenkat

This essay critically evaluates the analytic literature concerned with causal connections between Islam and economic performance. It focuses on works since 1997, when this literature was last surveyed. Among the findings are the following: Ramadan fasting by pregnant women harms prenatal development; Islamic charities mainly benefit the middle class; Islam affects educational outcomes less through Islamic schooling than through structural factors that handicap learning as a whole; Islamic finance hardly affects Muslim financial behavior; and low generalized trust depresses Muslim trade. The last feature reflects the Muslim world’s delay in transitioning from personal to impersonal exchange. The delay resulted from the persistent simplicity of the private enterprises formed under Islamic law. Weak property rights reinforced the private sector’s stagnation by driving capital out of commerce and into rigid waqfs. Waqfs limited economic development through their inflexibility and democratization by restraining the development of civil society. Parts of the Muslim world conquered by Arab armies are especially undemocratic, which suggests that early Islamic institutions, including slave-based armies, were particularly critical to the persistence of authoritarian patterns of governance. States have themselves contributed to the persistence of authoritarianism by treating Islam as an instrument of governance. As the world started to industrialize, non-Muslim subjects of Muslim-governed states pulled ahead of their Muslim neighbors by exercising the choice of law they enjoyed under Islamic law in favor of a Western legal system.

islam
culture
norms
institutions
economic_sociology
economic_history
review
october 2017 by rvenkat

[1709.09636] Randomized experiments to detect and estimate social influence in networks

october 2017 by rvenkat

Estimation of social influence in networks can be substantially biased in observational studies due to homophily and network correlation in exposure to exogenous events. Randomized experiments, in which the researcher intervenes in the social system and uses randomization to determine how to do so, provide a methodology for credibly estimating causal effects of social behaviors. In addition to addressing questions central to the social sciences, these estimates can form the basis for effective marketing and public policy.

In this review, we discuss the design space of experiments to measure social influence through combinations of interventions and randomizations. We define an experiment as a combination of (1) a target population of individuals connected by an observed interaction network, (2) a set of treatments whereby the researcher will intervene in the social system, (3) a randomization strategy which maps individuals or edges to treatments, and (4) a measurement of an outcome of interest after treatment has been assigned. We review experiments that demonstrate potential experimental designs and we evaluate their advantages and tradeoffs for answering different types of causal questions about social influence. We show how randomization also provides a basis for statistical inference when analyzing these experiments.

review
networks
influence
social_networks
homophily
contagion
causal_inference
intervention
experimental_design
teaching
october 2017 by rvenkat

Science and data science

august 2017 by rvenkat

Data science has attracted a lot of attention, promising to turn vast amounts of data into useful predictions and insights. In this article, we ask why scientists should care about data science. To answer, we discuss data science from three perspectives: statistical, computational, and human. Although each of the three is a critical component of data science, we argue that the effective combination of all three components is the essence of what data science is about.

review
statistics
machine_learning
data_science
big_data
methods
david.blei
august 2017 by rvenkat

Law and Psychology Grows Up, Goes Online, and Replicates by Kristin Firth, David A. Hoffman, Tess Wilkinson‐Ryan :: SSRN

august 2017 by rvenkat

Over the last thirty years, legal scholars have increasingly deployed experimental studies, particularly hypothetical scenarios, to test intuitions about legal reasoning and behavior. That movement has accelerated in the last decade, facilitated in large part by cheap and convenient Internet participant recruiting platforms like Amazon Mechanical Turk. The widespread use of MTurk subjects, a practice that dramatically lowers the barriers to entry for experimental research, has been controversial. At the same time, law and psychology’s home discipline is experiencing a public crisis of confidence widely discussed in terms of the “replication crisis.” At present, law and psychology research is arguably in a new era, in which it is both an accepted feature of the legal landscape and also a target of fresh skepticism. The moment is ripe for taking stock.

In this paper, we bring an empirical approach to these problems. Using three canonical law and psychology findings, we document the challenges and the feasibility of reproducing results across platforms. We evaluate the extent to which we are able to reproduce the original findings with contemporary subject pools (MTurk, other national online platforms, and in-person labs). We partially replicate the results, and show marked similarities in subject responses across platforms. In the context of the experiments here, we conclude that meaningful replication requires active intervention in order to keep the materials relevant and sensible. The second aim is to compare Turk subjects to the original samples and to the replication samples. We find, consistent with the weight of recent evidence, that MTurk samples are highly reliable and useful. Subjects are highly similar to subjects on other online platforms and in-person samples, but they differ in their high level of attentiveness. Finally, we review the growing replication literature across disciplines, as well as our firsthand experience, to propose a set of standard practices for the publication of results in law and psychology.

review
empirical_legal_studies
amazon_turk
replication_of_studies
methods
sociology
social_psychology
august 2017 by rvenkat

Political Theory of the Firm

august 2017 by rvenkat

The revenues of large companies often rival those of national governments, and some companies have annual revenues higher than many national governments. Among the largest corporations in 2015, some had private security forces that rivaled the best secret services, public relations offices that dwarfed a US presidential campaign headquarters, more lawyers than the US Justice Department, and enough money to capture (through campaign donations, lobbying, and even explicit bribes) a majority of the elected representatives. The only powers these large corporations missed were the power to wage war and the legal power of detaining people, although their political influence was sufficiently large that many would argue that, at least in certain settings, large corporations can exercise those powers by proxy. Yet in economics, the commonly prevailing view of the firm ignores all these elements of politics and power. We must recognize that large firms have considerable power to influence the rules of the game. I call attention to the risk of a "Medici vicious circle," in which economic and political power reinforce each other. The possibility and extent of a "Medici vicious circle" depends upon several nonmarket factors. I discuss how they should be incorporated in a broader "Political Theory" of the firm.

political_economy
review
microeconomics
?
via:noahpinion
august 2017 by rvenkat

[1706.08440] Challenges to estimating contagion effects from observational data

july 2017 by rvenkat

A growing body of literature attempts to learn about contagion using observational (i.e. non-experimental) data collected from a single social network. While the conclusions of these studies may be correct, the methods rely on assumptions that are likely--and sometimes guaranteed to be--false, and therefore the evidence for the conclusions is often weaker than it seems. Developing methods that do not need to rely on implausible assumptions is an incredibly challenging and important open problem in statistics. Appropriate methods don't (yet!) exist, so researchers hoping to learn about contagion from observational social network data are sometimes faced with a dilemma: they can abandon their research program, or they can use inappropriate methods. This chapter will focus on the challenges and the open problems and will not weigh in on that dilemma, except to mention here that the most responsible way to use any statistical method, especially when it is well-known that the assumptions on which it rests do not hold, is with a healthy dose of skepticism, with honest acknowledgment and deep understanding of the limitations, and with copious caveats about how to interpret the results.

causal_inference
contagion
homophily
observational_studies
review
social_networks
teaching
?
via:cshalizi
july 2017 by rvenkat

The State of Applied Econometrics: Causality and Policy Evaluation

may 2017 by rvenkat

In this paper, we discuss recent developments in econometrics that we view as important for empirical researchers working on policy evaluation questions. We focus on three main areas, in each case, highlighting recommendations for applied work. First, we discuss new research on identification strategies in program evaluation, with particular focus on synthetic control methods, regression discontinuity, external validity, and the causal interpretation of regression methods. Second, we discuss various forms of supplementary analyses, including placebo analyses as well as sensitivity and robustness analyses, intended to make the identification strategies more credible. Third, we discuss some implications of recent advances in machine learning methods for causal effects, including methods to adjust for differences between treated and control units in high-dimensional settings, and methods for identifying and estimating heterogenous treatment effects.

econometrics
methods
machine_learning
causal_inference
review
may 2017 by rvenkat

Machine Learning: An Applied Econometric Approach

may 2017 by rvenkat

Machines are increasingly doing "intelligent" things. Face recognition algorithms use a large dataset of photos labeled as having a face or not to estimate a function that predicts the presence y of a face from pixels x. This similarity to econometrics raises questions: How do these new empirical tools fit with what we know? As empirical economists, how can we use them? We present a way of thinking about machine learning that gives it its own place in the econometric toolbox. Machine learning not only provides new tools, it solves a different problem. Specifically, machine learning revolves around the problem of prediction, while many economic applications revolve around parameter estimation. So applying machine learning to economics requires finding relevant tasks. Machine learning algorithms are now technically easy to use: you can download convenient packages in R or Python. This also raises the risk that the algorithms are applied naively or their output is misinterpreted. We hope to make them conceptually easier to use by providing a crisper understanding of how these algorithms work, where they excel, and where they can stumble—and thus where they can be most usefully applied.

econometrics
economics
methods
machine_learning
prediction
review
sendhil.mullainathan
may 2017 by rvenkat

Top Income Inequality in the 21st Century: Some Cautionary Notes

april 2017 by rvenkat

We revisit recent empirical evidence about the rise in top income inequality in the United States, drawing attention to four key issues that we believe are critical for an informed discussion about changing inequality since 1980. Our goal is to inform researchers, policy makers, and journalists who are interested in top income inequality.

inequality
economics
review
public_policy
united_states_of_america
april 2017 by rvenkat

Is Deontology a Heuristic? On Psychology, Neuroscience, Ethics, and Law by Cass R. Sunstein :: SSRN

march 2017 by rvenkat

A growing body of psychological and neuroscientific research links dual-process theories of cognition with moral reasoning (and implicitly to legal reasoning as well). The relevant research appears to show that at least some deontological judgments are connected with rapid, automatic, emotional processing, and that consequentialist judgments (including utilitarianism) are connected with slower, more deliberative thinking. These findings are consistent with the claim that deontological thinking is best understood as a moral heuristic – one that generally works well, but that also misfires. If this claim is right, it may have large implications for many debates in politics, morality, and law, including those involving the role of retribution, the free speech principle, religious liberty, the idea of fairness, and the legitimacy of cost-benefit analysis. Nonetheless, psychological and neuroscientific research cannot rule out the possibility that consequentialism is wrong and that deontology is right. It tells us about the psychology of moral and legal judgment, but it does no more. On the largest questions, it leaves moral and legal debates essentially as they were before.

cass.sunstein
moral_psychology
heuristics
judgment_decision-making
moral_philosophy
dmce
teaching
review
march 2017 by rvenkat

[1703.06843] The Role of Network Analysis in Industrial and Applied Mathematics

march 2017 by rvenkat

Many problems in industry --- and in the social, natural, information, and medical sciences --- involve discrete data and benefit from approaches from subjects such as network science, information theory, optimization, probability, and statistics. Because the study of networks is concerned explicitly with connectivity between different entities, it has become very prominent in industrial settings, and this importance has been accentuated further amidst the modern data deluge. In this article, we discuss the role of network analysis in industrial and applied mathematics, and we give several examples of network science in industry.

networks
applied_math
review
teaching
?
march 2017 by rvenkat

[1605.01483] Spectral Properties of Hypergraph Laplacian and Approximation Algorithms

march 2017 by rvenkat

The celebrated Cheeger's Inequality establishes a bound on the edge expansion of a graph via its spectrum. This inequality is central to a rich spectral theory of graphs, based on studying the eigenvalues and eigenvectors of the adjacency matrix (and other related matrices) of graphs. It has remained open to define a suitable spectral model for hypergraphs whose spectra can be used to estimate various combinatorial properties of the hypergraph.

In this paper we introduce a new hypergraph Laplacian operator generalizing the Laplacian matrix of graphs. In particular, the operator is induced by a diffusion process on the hypergraph, such that within each hyperedge, measure flows from vertices having maximum weighted measure to those having minimum. Since the operator is non-linear, we have to exploit other properties of the diffusion process to recover a spectral property concerning the "second eigenvalue" of the resulting Laplacian. Moreover, we show that higher order spectral properties cannot hold in general using the current framework.

We consider a stochastic diffusion process, in which each vertex also experiences Brownian noise from outside the system. We show a relationship between the second eigenvalue and the convergence behavior of the process.

We show that various hypergraph parameters like multi-way expansion and diameter can be bounded using this operator's spectral properties. Since higher order spectral properties do not hold for the Laplacian operator, we instead use the concept of procedural minimizers to consider higher order Cheeger-like inequalities. For any positive integer k, we give a polynomial time algorithm to compute an O(log r)-approximation to the k-th procedural minimizer, where r is the maximum cardinality of a hyperedge. We show that this approximation factor is optimal under the SSE hypothesis for constant values of k.

hypergraph
networks
review
?
march 2017 by rvenkat

Economics: The architecture of inequality : Nature : Nature Research

march 2017 by rvenkat

-- review of five books on the topic

inequality
economics
history
review
march 2017 by rvenkat

Angus Deaton, Nobel Laureate, on Trump, Poverty, and Opioids - The Atlantic

march 2017 by rvenkat

-- read his critical remarks on Gordon's thesis of the non-replicability of technological innovations and his one-sentence takes on other books that trace white poverty.

inequality
economics
health
book
review
united_states_of_america
nationalism
sociology
civilizing_process
?
human_progress
the_atlantic
march 2017 by rvenkat

The Administrative State: Law, Democracy, and Knowledge by Adrian Vermeule :: SSRN

february 2017 by rvenkat

This is a chapter for the forthcoming Oxford Handbook of the United States Constitution. I provide and compare three organizing frameworks for the administrative state. The first examines its constitutionality, the second its democratic credentials, the third its epistemic and technocratic capacities. After describing each, I examine their interaction, and suggest that the administrative state is the setting for an endlessly shifting series of alliances between and among constitutionalists, democrats and technocrats.

democracy
administrative_state
bureaucracy
collective_cognition
law
review
february 2017 by rvenkat

[1702.00467] The Computer Science and Physics of Community Detection: Landscapes, Phase Transitions, and Hardness

february 2017 by rvenkat

Community detection in graphs is the problem of finding groups of vertices which are more densely connected than they are to the rest of the graph. This problem has a long history, but it is currently motivated by social and biological networks. While there are many ways to formalize it, one of the most popular is as an inference problem, where there is a planted "ground truth" community structure around which the graph is generated probabilistically. Our task is then to recover the ground truth knowing only the graph.

Recently it was discovered, first heuristically in physics and then rigorously in probability and computer science, that this problem has a phase transition at which it suddenly becomes impossible. Namely, if the graph is too sparse, or the probabilistic process that generates it is too noisy, then no algorithm can find a partition that is correlated with the planted one---or even tell if there are communities, i.e., distinguish the graph from a purely random one with high probability. Above this information-theoretic threshold, there is a second threshold beyond which polynomial-time algorithms are known to succeed; in between, there is a regime in which community detection is possible, but conjectured to be exponentially hard.

For computer scientists, this field offers a wealth of new ideas and open questions, with connections to probability and combinatorics, message-passing algorithms, and random matrix theory. Perhaps more importantly, it provides a window into the cultures of statistical physics and statistical inference, and how those cultures think about distributions of instances, landscapes of solutions, and hardness.

cris_moore
community_detection
networks
review
phase_transition
statistical_mechanics
computational_complexity
statistics
information_theory
february 2017 by rvenkat

Social Theory and Public Opinion | Annual Review of Sociology

february 2017 by rvenkat

Any study of public opinion must consider the ontological status of the public being represented. In this review, we outline several empirical problems in current public opinion research and illustrate them with a contemporary case: public opinion about same-sex marriage. We then briefly trace historical attempts to grapple with the public in public opinion and then present the most thoroughgoing critiques and defenses of polling. We detail four approaches to the ontology and epistemology of public opinion. We argue for a conceptualization of public opinion that relies upon polling techniques alongside other investigative modes but that understands public opinion as dynamic, reactive, and collective. Publics are shaped by techniques that represent them, including public opinion research.

public_opinion
cultural_cognition
political_science
sociology
opinion_formation
methods
critique
review
dmce
teaching
february 2017 by rvenkat

The Computational Foundation of Life | Quanta Magazine

february 2017 by rvenkat

-- nice summary of a recent Santa Fe workshop

-- contains nice summary of recent work on the topic.

biology
philosophy
thermodynamics
non-equilibrium
statistical_mechanics
review
quanta_mag
february 2017 by rvenkat

Social Network Sites: Definition, History, and Scholarship - boyd - 2007 - Journal of Computer-Mediated Communication - Wiley Online Library

january 2017 by rvenkat

Social network sites (SNSs) are increasingly attracting the attention of academic and industry researchers intrigued by their affordances and reach. This special theme section of the Journal of Computer-Mediated Communication brings together scholarship on these emergent phenomena. In this introductory article, we describe features of SNSs and propose a comprehensive definition. We then present one perspective on the history of such sites, discussing key changes and developments. After briefly summarizing existing scholarship concerning SNSs, we discuss the articles in this special section and conclude with considerations for future research.

media_studies
social_media
social_networks
dana.boyd
review
networks
teaching
january 2017 by rvenkat

Deliberative Democratic Theory and Empirical Political Science | Annual Review of Political Science

january 2017 by rvenkat

Although empirical studies of deliberative democracy have proliferated in the past decade, too few have addressed the questions that are most significant in the normative theories. At the same time, many theorists have tended too easily to dismiss the empirical findings. More recently, some theorists and empiricists have been paying more attention to each other's work. Nevertheless, neither is likely to produce the more comprehensive understanding of deliberative democracy we need unless both develop a clearer conception of the elements of deliberation, the conflicts among those elements, and the structural relationships in deliberative systems.

democracy
collective_cognition
political_science
political_psychology
review
via:duncan.watts
january 2017 by rvenkat

Nudge Theory in Action - Behavioral Design in | Sherzod Abdukadirov | Palgrave Macmillan

january 2017 by rvenkat

-- this is salesman-like overselling.

behavioral_economics
policy
book
review
dmce
teaching
i_remain_skeptical
january 2017 by rvenkat

[1611.08083] Survey of Expressivity in Deep Neural Networks

january 2017 by rvenkat

We survey results on neural network expressivity described in "On the Expressive Power of Deep Neural Networks". The paper motivates and develops three natural measures of expressiveness, which all display an exponential dependence on the depth of the network. In fact, all of these measures are related to a fourth quantity, trajectory length. This quantity grows exponentially in the depth of the network, and is responsible for the depth sensitivity observed. These results translate to consequences for networks during and after training. They suggest that parameters earlier in a network have greater influence on its expressive power -- in particular, given a layer, its influence on expressivity is determined by the remaining depth of the network after that layer. This is verified with experiments on MNIST and CIFAR-10. We also explore the effect of training on the input-output map, and find that it trades off between the stability and expressivity.

deep_learning
review
machine_learning
generative_model
i_remain_skeptical
january 2017 by rvenkat

[1610.01361] Explosive transitions in complex networks' structure and dynamics: percolation and synchronization

january 2017 by rvenkat

Percolation and synchronization are two phase transitions that have been extensively studied since already long ago. A classic result is that, in the vast majority of cases, these transitions are of the second-order type, i.e. continuous and reversible. Recently, however, explosive phenomena have been reported in complex networks' structure and dynamics, which rather remind first-order (discontinuous and irreversible) transitions. Explosive percolation, which was discovered in 2009, corresponds to an abrupt change in the network's structure, and explosive synchronization (which is concerned, instead, with the abrupt emergence of a collective state in the networks' dynamics) was studied as early as the first models of globally coupled phase oscillators were taken into consideration. The two phenomena have stimulated investigations and debates, attracting attention in many relevant fields. So far, various substantial contributions and progresses (including experimental verifications) have been made, which have provided insights on what structural and dynamical properties are needed for inducing such abrupt transformations, as well as have greatly enhanced our understanding of phase transitions in networked systems. Our intention is to offer here a monographic review on the main-stream literature, with the twofold aim of summarizing the existing results and pointing out possible directions for future research.

-- first-order phase transitions in networks.

networks
dynamics
synchronization
percolation
review
teaching
?
january 2017 by rvenkat

Aaronson's survey of P-NP question

january 2017 by rvenkat

-- helps if you've been following his blog and have read some of his more philosophical writings.

computational_complexity
review
survey
scott.aaronson
january 2017 by rvenkat

[1612.03281] Random walks and diffusion on networks

january 2017 by rvenkat

Random walks are ubiquitous in the sciences, and they are interesting from both theoretical and practical perspectives. They are one of the most fundamental types of stochastic processes; can be used to model numerous phenomena, including diffusion, interactions, and opinions among humans and animals; and can be used to extract information about important entities or dense groups of entities in a network. Random walks have been studied for many decades on both regular lattices and (especially in the last couple of decades) on networks with a variety of structures. In the present article, we survey the theory and applications of random walks on networks, restricting ourselves to simple cases of single and non-adaptive random walkers. We distinguish three main types of random walks: discrete-time random walks, node-centric continuous-time random walks, and edge-centric continuous-time random walks. We first briefly survey random walks on a line, and then we consider random walks on various types of networks. We extensively discuss applications of random walks, including ranking of nodes (e.g., PageRank), community detection, respondent-driven sampling, and opinion models such as voter models.

ctmc
networks
review
random_walk
opinion_dynamics
teaching
?
january 2017 by rvenkat

[1010.5017] Collective motion

january 2017 by rvenkat

We review the observations and the basic laws describing the essential aspects of collective motion -- being one of the most common and spectacular manifestations of coordinated behavior. Our aim is to provide a balanced discussion of the various facets of this highly multidisciplinary field, including experiments, mathematical methods and models for simulations, so that readers with a variety of backgrounds could get both the basics and a broader, more detailed picture of the field. The observations we report on include systems consisting of units ranging from macromolecules through metallic rods and robots to groups of animals and people. Some emphasis is put on models that are simple and realistic enough to reproduce the numerous related observations and are useful for developing concepts for a better understanding of the complexity of systems consisting of many simultaneously moving entities. As such, these models allow the establishing of a few fundamental principles of flocking. In particular, it is demonstrated, that in spite of considerable differences, a number of deep analogies exist between equilibrium statistical physics systems and those made of self-propelled (in most cases living) units. In both cases only a few well defined macroscopic/collective states occur and the transitions between these states follow a similar scenario, involving discontinuity and algebraic divergences.

interating_particle_system
complex_system
collective_dynamics
self_organization
emergence
review
january 2017 by rvenkat

The Economic and Fiscal Consequences of Immigration | The National Academies Press

december 2016 by rvenkat

More than 40 million people living in the United States were born in other countries, and almost an equal number have at least one foreign-born parent. Together, the first generation (foreign-born) and second generation (children of the foreign-born) comprise almost one in four Americans. It comes as little surprise, then, that many U.S. residents view immigration as a major policy issue facing the nation. Not only does immigration affect the environment in which everyone lives, learns, and works, but it also interacts with nearly every policy area of concern, from jobs and the economy, education, and health care, to federal, state, and local government budgets.

The changing patterns of immigration and the evolving consequences for American society, institutions, and the economy continue to fuel public policy debate that plays out at the national, state, and local levels. The Economic and Fiscal Consequences of Immigration assesses the impact of dynamic immigration processes on economic and fiscal outcomes for the United States, a major destination of world population movements. This report will be a fundamental resource for policy makers and law makers at the federal, state, and local levels but extends to the general public, nongovernmental organizations, the business community, educational institutions, and the research community.

nap
review
immigration
us_politics
economics
united_states_of_america
december 2016 by rvenkat
