[1312.0117] Stochastic manifolds
"Malliavin Calculus can be seen as a differential calculus on Wiener spaces. We present the notion of stochastic manifold for which the Malliavin Calculus plays the same role as the classical differential calculus for the differential manifolds. The set of the paths in a Riemmanian compact manifold is then seen as a particular case of the above structure."
to:NB  differential_geometry  stochastic_processes
yesterday
[1311.7482] Local adaptation and genetic effects on fitness: Calculations for exponential family models with random effects
"Random effects are implemented for aster models using two approximations taken from Breslow and Clayton [J. Amer. Statist. Assoc. 88 (1993) 9-25]. Random effects are analytically integrated out of the Laplace approximation to the complete data log likelihood, giving a closed-form expression for an approximate missing data log likelihood. Third and higher derivatives of the complete data log likelihood with respect to the random effects are ignored, giving a closed-form expression for second derivatives of the approximate missing data log likelihood, hence approximate observed Fisher information. This method is applicable to any exponential family random effects model. It is implemented in the CRAN package aster (R Core Team [R: A Language and Environment for Statistical Computing (2012) R Foundation for Statistical Computing], Geyer [R package aster (2012) this http URL]). Applications are analyses of local adaptation in the invasive California wild radish (Raphanus sativus) and the slender wild oat (Avena barbata) and of additive genetic variance for fitness in the partridge pea (Chamaecrista fasciculata)."
to:NB  statistics  estimation  hierarchical_statistical_models  laplace_approximation  exponential_families  geyer.charles_j.  genetics
yesterday
Wengrow, D.: The Origins of Monsters: Image and Cognition in the First Age of Mechanical Reproduction. (eBook and Cloth)
"It has often been claimed that "monsters"--supernatural creatures with bodies composed from multiple species--play a significant part in the thought and imagery of all people from all times. The Origins of Monsters advances an alternative view. Composite figurations are intriguingly rare and isolated in the art of the prehistoric era. Instead it was with the rise of cities, elites, and cosmopolitan trade networks that "monsters" became widespread features of visual production in the ancient world. Showing how these fantastic images originated and how they were transmitted, David Wengrow identifies patterns in the records of human image-making and embarks on a search for connections between mind and culture.
"Wengrow asks: Can cognitive science explain the potency of such images? Does evolutionary psychology hold a key to understanding the transmission of symbols? How is our making and perception of images influenced by institutions and technologies? Wengrow considers the work of art in the first age of mechanical reproduction, which he locates in the Middle East, where urban life began. Comparing the development and spread of fantastic imagery across a range of prehistoric and ancient societies, including Mesopotamia, Egypt, Greece, and China, he explores how the visual imagination has been shaped by a complex mixture of historical and universal factors."
to:NB  books:noted  mythology  art_history  archaeology  ancient_history  monsters  cognitive_science  epidemiology_of_representations
yesterday
Everything Was New Once - The Edge of the American West - The Chronicle of Higher Education
"Just as we have visions of the present now so too did the past have views of the present then, ones may seem somewhat, but were just as present for our past selves as what we think. In fact, given that most folks don’t constantly adjust to the new wave of modernity rushing through their lives, our “present” is a curious amalgam of things past and up-to-date. We use smartphones to listen to “80s favorites” (or 70s or 60s etc). The 1960s is long in the past, perceptually, but most of the passenger airliners we fly were designed then. We pick things up as we go along, and they remain part of the present for us for much longer than they do the rest of the world. The personal present is the accreted collection of past collective presents, acquired along the way – consciously and not – from childhood to grave. That mosaic, writ over millions of peoples into a collective perception, suggests that events and items don’t truly stop being the “present” and become “history” until the generations that experienced them directly passes from the scene. A war, to take one example, remains an active part of culture until long after its end date. The examples are too obvious: Vietnam and World War II are still ongoing presences in American cultural discourse and will be for some time.
"It may be AD 2013, but for most people, I think, it is many other years as well."
uses_of_the_past  history  memory  modernity
yesterday
AER (103,7) p. 2643 - Adverse Selection and Inertia in Health Insurance Markets: When Nudging Hurts
"This paper investigates consumer inertia in health insurance markets, where adverse selection is a potential concern. We leverage a major change to insurance provision that occurred at a large firm to identify substantial inertia, and develop and estimate a choice model that also quantifies risk preferences and ex ante health risk. We use these estimates to study the impact of policies that nudge consumers toward better decisions by reducing inertia. When aggregated, these improved individual-level choices substantially exacerbate adverse selection in our setting, leading to an overall reduction in welfare that doubles the existing welfare loss from adverse selection."
to:NB  market_failures_in_everything  re:nudging
2 days ago
AER (103,7) p. 2790 - "Reverse Bayesianism": A Choice-Based Theory of Growing Awareness
"This article introduces a new approach to modeling the expanding universe of decision makers in the wake of growing awareness, and invokes the axiomatic approach to model the evolution of decision makers' beliefs as awareness grows. The expanding universe is accompanied by extension of the set of acts, the preference relations over which are linked by a new axiom, invariant risk preferences, asserting that the ranking of lotteries is independent of the set of acts under consideration. The main results are representation theorems and rules for updating beliefs over expanding state spaces and events that have the flavor of "reverse Bayesianism."
to:NB  decision_theory  bayesianism
2 days ago
AER (103,7) p. 3001 - Conclusions Regarding Cross-Group Differences in Happiness Depend on Difficulty of Reaching Respondents
"A growing literature explores differences in subjective well-being across demographic groups, often relying on surveys with high nonresponse rates. By using the reported number of call attempts made to participants in the University of Michigan's Surveys of Consumers, we show that comparisons among easy-to-reach respondents differ from comparisons among hard-to-reach ones. Notably, easy-to-reach women are happier than easy-to-reach men, but hard-to-reach men are happier than hard-to-reach women, and conclusions of a survey could reverse with more attempted calls. Better alternatives to comparing group sample averages might include putting greater weight on hard-to-reach respondents or even extrapolating trends in responses."

- This could make a good teaching example, if it pans out.
to:NB  data_analysis  surveys  statistics  happiness  to_be_shot_after_a_fair_trial
2 days ago
[1312.0451] Consistency of weighted majority votes
"We revisit the classical decision-theoretic problem of weighted expert voting from a statistical learning perspective. In particular, we examine the consistency (both asymptotic and finitary) of the optimal Nitzan-Paroush weighted majority and related rules. In the case of known expert competence levels, we give precise necessary and sufficient conditions for consistency. When the competence levels are unknown, they must be empirically estimated. We provide frequentist and Bayesian analyses for this situation and discuss open problems presented by both approaches. Some of our proof techniques are non-standard and may be of independent interest. Experimental results are provided."
to:NB  to_read  learning_theory  ensemble_methods  kith_and_kin  kontorovich.aryeh  re:democratic_cognition
3 days ago
Lecué : Empirical risk minimization is optimal for the convex aggregation problem
"Let F be a finite model of cardinality M and denote by conv(F) its convex hull. The problem of convex aggregation is to construct a procedure having a risk as close as possible to the minimal risk over conv(F). Consider the bounded regression model with respect to the squared risk denoted by R(⋅). If fˆERM-Cn denotes the empirical risk minimization procedure over conv(F), then we prove that for any x>0, with probability greater than 1−4exp(−x),

R(fˆERM-Cn)≤minf∈conv(F)R(f)+c0max(ψ(C)n(M),xn),

where c0>0 is an absolute constant and ψ(C)n(M) is the optimal rate of convex aggregation defined in (In Computational Learning Theory and Kernel Machines (COLT-2003) (2003) 303–313 Springer) by ψ(C)n(M)=M/n when M≤n√ and ψ(C)n(M)=log(eM/n√)/n‾‾‾‾‾‾‾‾‾‾‾‾√ when M>n√."
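--- A small sketch of the piecewise rate quoted in the abstract, to make the two regimes concrete. The function names are hypothetical, and the default c0 = 1.0 is only a placeholder, since the abstract says nothing about the absolute constant beyond c0 > 0.

```python
import math

def convex_aggregation_rate(M, n):
    """psi_n^(C)(M): M/n when M <= sqrt(n), else sqrt(log(e*M/sqrt(n)) / n)."""
    root_n = math.sqrt(n)
    if M <= root_n:
        return M / n
    return math.sqrt(math.log(math.e * M / root_n) / n)

def excess_risk_bound(M, n, x, c0=1.0):
    """Residual term c0 * max(psi_n^(C)(M), x/n) from the oracle inequality.

    c0 = 1.0 is a placeholder; the true constant is not given in the abstract.
    """
    return c0 * max(convex_aggregation_rate(M, n), x / n)

# Small dictionary regime (M <= sqrt(n)) versus large dictionary regime.
small = convex_aggregation_rate(10, 100)
large = convex_aggregation_rate(100, 100)
```

The bound holds with probability greater than 1 - 4exp(-x), so a larger x buys more confidence at the price of the x/n term.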
to:NB  ensemble_methods  learning_theory  statistics
3 days ago
Dubarry , Le Corff : Non-asymptotic deviation inequalities for smoothed additive functionals in nonlinear state-space models
"The approximation of fixed-interval smoothing distributions is a key issue in inference for general state-space hidden Markov models (HMM). This contribution establishes non-asymptotic bounds for the Forward Filtering Backward Smoothing (FFBS) and the Forward Filtering Backward Simulation (FFBSi) estimators of fixed-interval smoothing functionals. We show that the rate of convergence of the Lq-mean errors of both methods depends on the number of observations T and the number of particles N only through the ratio T/N for additive functionals. In the case of the FFBS, this improves recent results providing bounds depending on T/N‾‾√."
to:NB  filtering  state_estimation  state-space_models  markov_models  stochastic_processes  deviation_inequalities  statistical_inference_for_stochastic_processes  statistics
3 days ago
Sadeghi : Stable mixed graphs
"In this paper, we study classes of graphs with three types of edges that capture the modified independence structure of a directed acyclic graph (DAG) after marginalisation over unobserved variables and conditioning on selection variables using the m-separation criterion. These include MC, summary, and ancestral graphs. As a modification of MC graphs, we define the class of ribbonless graphs (RGs) that permits the use of the m-separation criterion. RGs contain summary and ancestral graphs as subclasses, and each RG can be generated by a DAG after marginalisation and conditioning. We derive simple algorithms to generate RGs, from given DAGs or RGs, and also to generate summary and ancestral graphs in a simple way by further extension of the RG-generating algorithm. This enables us to develop a parallel theory on these three classes and to study the relationships between them as well as the use of each class."
3 days ago
Preuß , Vetter , Dette : A test for stationarity based on empirical processes
"In this paper we investigate the problem of testing the assumption of stationarity in locally stationary processes. The test is based on an estimate of a Kolmogorov–Smirnov type distance between the true time varying spectral density and its best approximation through a stationary spectral density. Convergence of a time varying empirical spectral process indexed by a class of certain functions is proved, and furthermore the consistency of a bootstrap procedure is shown which is used to approximate the limiting distribution of the test statistic. Compared to other methods proposed in the literature for the problem of testing for stationarity the new approach has at least two advantages: On one hand, the test can detect local alternatives converging to the null hypothesis at any rate gT→0 such that gTT1/2→∞, where T denotes the sample size. On the other hand, the estimator is based on only one regularization parameter while most alternative procedures require two. Finite sample properties of the method are investigated by means of a simulation study, and a comparison with several other tests is provided which have been proposed in the literature."
to:NB  empirical_processes  stochastic_processes  statistical_inference_for_stochastic_processes  hypothesis_testing  statistics  non-stationarity
3 days ago
From Alexandria, Through Baghdad - Surveys and Studies in the Ancient Greek and Medieval Islamic
"This book honors the career of historian of mathematics J.L. Berggren, his scholarship, and service to the broader community. The first part, of value to scholars, graduate students, and interested readers, is a survey of scholarship in the mathematical sciences in ancient Greece and medieval Islam. It consists of six articles (three by Berggren himself) covering research from the middle of the 20th century to the present. The remainder of the book contains studies by eminent scholars of the ancient and medieval mathematical sciences. They serve both as examples of the breadth of current approaches and topics, and as tributes to Berggren's interests by his friends and colleagues."

--- Berggren's _Episodes in the Mathematics of Medieval Islam_ is great...
books:noted  to:NB  islamic_civilization  history_of_mathematics  cultural_exchange
3 days ago
[1311.6622] Ray-Knight Theorem: a short proof
"We provide a short proof of the Ray-Knight second generalized Theorem, using a martingale which can be seen (on the positive quadrant) as the Radon-Nikodym derivate of the reversed vertex-reinforced jump process measure with respect to the Markov jump process with the same conductances."
to:NB  stochastic_processes  markov_models  martingales  measure_theory  probability
5 days ago
Karl Marx or Pope Francis? — Crooked Timber
"Pope Francis’s new Apostolic Exhortation, Evangelii Gaudium, has been getting some attention today, mostly thanks to its reiteration of some long-standing Catholic doctrine on social justice and the market. So, here is a quiz to see whether you can distinguish statements by Pope Francis from statements by Karl Marx. I figured someone was likely to do this anyway, so why not be first to the market? It’s fair to say that the Pope and Karl Marx differ significantly on numerous points of theory as well as on what people asking questions at job talks refer to as the policy implications of their views..."
funny:malicious  funny:academic  catholicism  marx.karl  socialism  healy.kieran
10 days ago
[1311.6238] Exact inference after model selection via the Lasso
"We develop a framework for inference after model selection based on the Lasso. At the core of this framework is a result that characterizes the exact (non-asymptotic) distribution of a pivot computed from the Lasso solution. This pivot allows us to (i) devise a test statistic that has an exact (non-asymptotic) $\unif(0,1)$ distribution under the null hypothesis that all relevant variables have been included in the model, and (ii) construct valid confidence intervals for the selected coefficients that account for the selection procedure."
to:NB  model_selection  confidence_sets  sparsity  lasso  hypothesis_testing  regression  statistics
10 days ago
[1311.6443] Probabilistic generation of random networks taking into account information on motifs occurrence
"Because of the huge number of graphs possible even with a small number of nodes, inference on network structure is known to be a challenging problem. Generating large random directed graphs with prescribed probabilities of occurrences of some meaningful patterns (motifs) is also difficult. We show how to generate such random graphs according to a formal probabilistic representation, using fast Markov chain Monte Carlo methods to sample them. As an illustration, we generate realistic graphs with several hundred nodes mimicking a gene transcription interaction network in Escherichia coli."
to:NB  network_sampling
10 days ago
[1311.6425] Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching
"Graph matching is a challenging problem with very important applications in a wide range of fields, from image and video analysis to biological and biomedical problems. We propose a robust graph matching algorithm inspired in sparsity-related techniques. We cast the problem, resembling group or collaborative sparsity formulations, as a non-smooth convex optimization problem that can be efficiently solved using augmented Lagrangian techniques. The method can deal with weighted or unweighted graphs, as well as multimodal data, where different graphs represent different types of data. The proposed approach is also naturally integrated with collaborative graph inference techniques, solving general network inference problems where the observed variables, possibly coming from different modalities, are not in correspondence. The algorithm is tested and compared with state-of-the-art graph matching techniques in both synthetic and real graphs. We also present results on multimodal graphs and applications to collaborative inference of brain connectivity from alignment-free functional magnetic resonance imaging (fMRI) data. The code is publicly available."
to:NB  to_read  network_data_analysis  optimization  sparsity  re:network_differences
10 days ago
Licentiousness § Unqualified Offerings
"An idea that I have half-formed in my head is that if you want a freer and more stable society, one less prone to centralized abuses (whether from central state authorities or corporate headquarters) you want a society with multiple centers of power, multiple allegiances, multiple sources of authority, as a type of check and balance.  The doctors are there and they have their allegiance to their oath and their professional organization, and the lawyers have theirs, and the academics have their allegiance to their professional societies, and a lot of community activities are happening through churches or fraternal orders or national charities that have their own codes, and the workplace is a negotiation between management and the union.
"There’s a lot that can go wrong in all of this, so don’t interpret it as a simple-minded call to return to some mythical version of the 1950’s.  (I mean, come on, they didn’t even have grunge rock in the 1950’s!)  Certainly the guilds can also become rent-seekers and work to keep alternative, competent providers from competing.  (e.g. the wars between MDs and nurse-practicioners)  However, it’s something I’m tossing around as I mull through Twilight of the Elites: Is the problem really that our elites are pseudo-meritocratic, or is the  problem that they are a disconnected monoculture?  Is the problem that there are a bunch of Ivy-League businessmen in Team Blue, or that they don’t sit alongside enough representatives of labor and other distinct interests wielding comparable power in Team Blue?"
inequality  elites  professions  class_struggles_in_america  our_decrepit_institutions  whats_gone_wrong_with_america  re:democratic_cognition
14 days ago
Vocation of the elites § Unqualified Offerings
"Education isn’t really the root of the pathologies in our elite class.  Honestly, I’m glad that some of our elite class passes through top educational institutions.  I could think of worse ways to shape them.  The real problem is that there are few routes to sit at the table.  By all means, have some Ivy Leaguers there.  But have those Ivy Leaguers in a wider range of endeavors than just the FIRE sector.  Have people who came up through organized labor.  Have people who came up through professions and vocations, through entrepreneurial activity outside the FIRE sector, and through numerous other paths.  (But keep the journalists out of the elite class.  The last thing we want is journalists who like sitting at fancy dinners with their rich buddies.  The proper place for a journalist is digging up dirt outside the banquet hall, while the rich are unaware what they’re up to.)
"This monoculture, this lack of variety in the paths to the top, is reflected in the obsession with sending everyone to college, and the accompanying reduction of emphasis on vocational education.  Yes, we still have community colleges doing a lot of vocational education, but they are all too often judged on transfer rates to 4-year schools (a point often lamented by Dean Dad).  When there’s only one path to the top, and the function of the education system is to legitimize the top by making it look accessible, preparation for vocations will suffer.  Meanwhile, you get degree inflation, PhD over-production, JD over-production, etc."
education  academia  inequality  elites  class_struggles_in_america  whats_gone_wrong_with_america  re:democratic_cognition
14 days ago
How Citizen Representatives Address the Epistemic Challenges of Democratic Citizenship by Mark E. Warren, John Gastil :: SSRN
"In this essay, we take a closer look at the kinds of trust judgments citizens need to be able to make in epistemic divisions of labour in democratic self-government. We distinguish kinds of political trust judgments, with a focus on the epistemic demands they place on citizens, as well sources of institutional support for citizen trust judgments, such as professional certifications and ethics. Though trust judgments have always been functionally necessary for representative democracies, the institutional conditions of trustee representation have always been weak, and the rise of the sceptical citizen has eroded the bases of deferential forms of trust. We propose that well-designed minipublics have the characteristics necessary to serve as institutional supports for citizens’ trust judgments. In particular, we review two cases in which minipublics have, in fact, functioned as trustee representatives: the British Columbia Citizens’ Assembly and the Oregon Citizens’ Initiative Review. These new forms of trustee representation cannot close the gap between complex societies and democratic citizenship, but they may very well narrow it."
to:NB  democracy  deliberative_democracy  collective_cognition  via:henry_farrell  re:democratic_cognition
14 days ago
The Partisan Dynamics of Networks, Coalitions, and Organizations in the Antiwar Movement after 9/11 by Michael T. Heaney :: SSRN
"This paper is Chapter 5 of a book manuscript titled, Party in the Street: the Antiwar Movement and the Democratic Party after 9/11. The book develops the concept of the party in the street, which is the space in which social movements and political parties intersect. We argue that this space is relevant to the politics of movements and parties because the actors within it – which we call movement-partisans – are conflicted in their identities as participants in the movement and participants in the parties. Because partisan identification is, in general, stronger than movement identification, partisan identities tend to drive the mobilizations cycles of social movements. The book manuscript tests this idea at three levels: activists, organizations, and legislators. This particular chapter focuses organizations. It looks at the partisan dynamics of organizational networks, coalitions of organizations, and three organizational case studies. Each of these sets of evidence supports the proposition that partisan identification shapes the way that organizations participated in the antiwar movement."
to:NB  the_continuing_crises  social_networks  social_movements  political_science  us_politics  us-iraq_war  via:henry_farrell
14 days ago
When Experimentalist Governance Meets Science-Based Regulations; the Case of Food Safety Regulations by Susanne Wengle :: SSRN
"This paper presents a detailed examination of a central regulatory mechanism and of the politics of regulation shaping food economies. Food safety regulations in the US rely on a science-based regulatory system known as HACCP, which bears central features of what Charles Sabel and Jonathan Zeitlin have identified as experimentalist governance. Theoretically, the paper examines what the reliance on science means for the promise of an experimentalist policy regime to enable a new form of politics. Based on interviews with meat producers and USDA regulators, I found that HACCP’s reliance on a particular scientific system acts as an effective divider between producers who can interpret and produce this kind of science, and others, for whom this is challenging. There is clear evidence that a significant number of small processors were unable to adapt to the regulatory system’s requirements. In so far as the HACCP-based food safety regulations delineate the kind of producer that can viably exist in the system and contributed to the demise of another set of producers, the regulation has created an outcome.
"The politics of HACCP, then, revolve around these effects of the scientification of food safety regulations. I also show that some of the most salient political arguments surrounding food safety regulations are not addressed through the institutionalized channels of the regulatory system. This is the case, I argue, because they are not commensurable with the scientific system underlying HACCP – their merits cannot be evaluated with its units of measurement, nor with kind of data it produces. The combination of an experimentalist policy regime with a science-based regulatory system then, is something like a test case for experimentalism’s ability to learn from difference, to realize its democratic promise and to overcome the enduring dilemmas that arise at the nexus of science, regulation and politics. I conclude with the argument that if experimentalist policy arrangements rely on science-based regulation, special caution is warranted to recognize experiences and arguments backed by multiple systems of reasoning. These are arguably timely observations, as the Obama administration has embraced science-based regulatory arrangements with as much enthusiasm as Reagan once did."
to:NB  regulation  evidence_based  political_science  via:henry_farrell
14 days ago
Family-based training program improves brain function, cognition, and behavior in lower socioeconomic status preschoolers
"Using information from research on the neuroplasticity of selective attention and on the central role of successful parenting in child development, we developed and rigorously assessed a family-based training program designed to improve brain systems for selective attention in preschool children. One hundred forty-one lower socioeconomic status preschoolers enrolled in a Head Start program were randomly assigned to the training program, Head Start alone, or an active control group. Electrophysiological measures of children’s brain functions supporting selective attention, standardized measures of cognition, and parent-reported child behaviors all favored children in the treatment program relative to both control groups. Positive changes were also observed in the parents themselves. Effect sizes ranged from one-quarter to half of a standard deviation. These results lend impetus to the further development and broader implementation of evidence-based education programs that target at-risk families."
to:NB  to_read  re:g_paper  cognitive_development  experimental_psychology  to_be_shot_after_a_fair_trial  inequality
14 days ago
"On the causal interpretation of race in regressions adjusting for conf" by Tyler J. VanderWeele and Whitney Robinson
"We consider different possible interpretations of the “effect of race” when regressions are run with race as an exposure variable, controlling also for various confounding and mediating variables. When adjustment is made for socioeconomic status early in a person's life, we discuss under what contexts the regression coefficients for race can be interpreted as corresponding to the extent to which a racial disparity would remain if various socioeconomic distributions early in life across racial groups could be equalized. When adjustment is also made for adult socioeconomic status, we note how the overall disparity can be decomposed into the portion that would be eliminated by equalizing adult socioeconomic status across racial groups and the portion of the disparity that would remain even if adult socioeconomic status across racial groups were equalized. We also discuss a stronger interpretation of the “effect of race” involving the joint effects of skin color, parental skin color, genetic background and cultural context when such variables are thought to be hypothetically manipulable and if adequate control for confounding were possible. We discuss some of the challenges with such an interpretation. Further discussion is given as to how the use of selected populations in examining racial disparities can additionally complicate the interpretation of the effects."
14 days ago
Protocols of Liberty: Communication Innovation and the American Revolution, Warner
"The fledgling United States fought a war to achieve independence from Britain, but as John Adams said, the real revolution occurred “in the minds and hearts of the people” before the armed conflict ever began. Putting the practices of communication at the center of this intellectual revolution, Protocols of Liberty shows how American patriots—the Whigs—used new forms of communication to challenge British authority before any shots were fired at Lexington and Concord.
"To understand the triumph of the Whigs over the Brit-friendly Tories, William B. Warner argues that it is essential to understand the communication systems that shaped pre-Revolution events in the background. He explains the shift in power by tracing the invention of a new political agency, the Committee of Correspondence; the development of a new genre for political expression, the popular declaration; and the emergence of networks for collective political action, with the Continental Congress at its center. From the establishment of town meetings to the creation of a new postal system and, finally, the Declaration of Independence, Protocols of Liberty reveals that communication innovations contributed decisively to nation-building and continued to be key tools in later American political movements, like abolition and women’s suffrage, to oppose local custom and state law."
to:NB  books:noted  history_of_ideas  epidemiology_of_representations  american_history
14 days ago
Trick or Treat: A History of Halloween, Morton
"Every year, children and adults alike take to the streets dressed as witches, demons, animals, celebrities, and more. They carve pumpkins and play pranks, and the braver ones watch scary movies and go on ghost tours. There are parades, fireworks displays, cornfield mazes, and haunted houses—and, most important, copious amounts of bite-sized candy. The popularity of Halloween has spread around the globe to places as diverse as Russia, China, and Japan, but its association with death and the supernatural and its inevitable commercialization has made it one of our most misunderstood holidays. How did it become what it is today?"
to:NB  books:noted  cultural_history  halloween
14 days ago
PAutomaC: a probabilistic automata and hidden Markov models learning competition - Online First - Springer
"Approximating distributions over strings is a hard learning problem. Typical techniques involve using finite state machines as models and attempting to learn these; these machines can either be hand built and then have their weights estimated, or built by grammatical inference techniques: the structure and the weights are then learned simultaneously. The Probabilistic Automata learning Competition (PAutomaC), run in 2012, was the first grammatical inference challenge that allowed the comparison between these methods and algorithms. Its main goal was to provide an overview of the state-of-the-art techniques for this hard learning problem. Both artificial data and real data were presented and contestants were to try to estimate the probabilities of strings. The purpose of this paper is to describe some of the technical and intrinsic challenges such a competition has to face, to give a broad state of the art concerning both the problems dealing with learning grammars and finite state machines and the relevant literature. This paper also provides the results of the competition and a brief description and analysis of the different approaches the main participants used."
to:NB  markov_models  automata_theory  grammar_induction  statistics  machine_learning  to_read  re:AoS_project
14 days ago
[1310.0532] Perfect Clustering for Stochastic Blockmodel Graphs via Adjacency Spectral Embedding
"Vertex clustering in a stochastic blockmodel graph has wide applicability and has been the subject of extensive research. In this paper, we provide a short proof that the adjacency spectral embedding can be used to obtain perfect clustering for the stochastic blockmodel."
14 days ago
[1310.1495] Role of Normalization in Spectral Clustering for Stochastic Blockmodels
"Spectral Clustering clusters elements using the top few eigenvectors of their (possibly normalized) similarity matrix. The quality of Spectral Clustering is closely tied to the convergence properties of these principal eigenvectors. This rate of convergence has been shown to be identical for both the normalized and unnormalized variants ([17]). However normalization for Spectral Clustering is the common practice ([16], [19]). Indeed, our experiments also show that normalization improves prediction accuracy. In this paper, for the popular Stochastic Blockmodel, we theoretically show that under spectral embedding, normalization shrinks the variance of points in a class by a constant fraction. As a byproduct of our work, we also obtain sharp deviation bounds of empirical principal eigenvalues of graphs generated from a Stochastic Blockmodel."
to:NB  to_read  spectral_clustering  network_data_analysis  community_discovery  re:smoothing_adjacency_matrices  bickel.peter  sarkar.purnamrita  heard_the_talk
14 days ago
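- Note to self on the two spectral-clustering entries above: a minimal sketch, assuming a two-block stochastic blockmodel, of clustering by the second eigenvector of the raw and of the degree-normalized adjacency matrix. The block sizes, edge probabilities, seed, hand-rolled power iteration, and sign-based split are all illustrative assumptions of mine, not the procedure of either paper.

```python
import random

def sample_sbm(n_per_block, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a two-block blockmodel."""
    n = 2 * n_per_block
    labels = [0] * n_per_block + [1] * n_per_block
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                A[i][j] = A[j][i] = 1.0
    return A, labels

def normalize(A):
    """Return D^{-1/2} A D^{-1/2}; isolated vertices keep a zero row."""
    scale = [sum(row) ** -0.5 if sum(row) > 0 else 0.0 for row in A]
    n = len(A)
    return [[scale[i] * A[i][j] * scale[j] for j in range(n)] for i in range(n)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_eigenvectors(M, k, rng, iters=200):
    """Top-k eigenvectors by |eigenvalue|, via power iteration with deflation."""
    found = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(len(M))]
        for _ in range(iters):
            # Project out the eigenvectors already found.
            for u in found:
                c = dot(u, v)
                v = [a - c * b for a, b in zip(v, u)]
            w = [dot(row, v) for row in M]
            norm = dot(w, w) ** 0.5
            v = [x / norm for x in w]
        found.append(v)
    return found

def sign_split_accuracy(M, labels, rng):
    """Cluster by the sign of the second eigenvector; score up to label swap."""
    second = top_eigenvectors(M, 2, rng)[1]
    guess = [1 if x > 0 else 0 for x in second]
    agree = sum(g == t for g, t in zip(guess, labels)) / len(labels)
    return max(agree, 1.0 - agree)

rng = random.Random(0)
A, labels = sample_sbm(30, 0.6, 0.1, rng)  # hypothetical parameters
acc_raw = sign_split_accuracy(A, labels, rng)
acc_norm = sign_split_accuracy(normalize(A), labels, rng)
print("unnormalized:", acc_raw, "normalized:", acc_norm)
```

With blocks this well separated both variants should recover the partition almost exactly, so this only shows the mechanics; it says nothing about the variance comparison the second abstract is about.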
Alexanderian : On spectral methods for variance based sensitivity analysis
"Consider a mathematical model with a finite number of random parameters. Variance based sensitivity analysis provides a framework to characterize the contribution of the individual parameters to the total variance of the model response. We consider the spectral methods for variance based sensitivity analysis which utilize representations of square integrable random variables in a generalized polynomial chaos basis. Taking a measure theoretic point of view, we provide a rigorous and at the same time intuitive perspective on the spectral methods for variance based sensitivity analysis. Moreover, we discuss approximation errors incurred by fixing inessential random parameters, when approximating functions with generalized polynomial chaos expansions."
to:NB  stochastic_processes  sensitivity_analysis
14 days ago
[1311.4158] Unsupervised Learning of Invariant Representations in Hierarchical Architectures
"Representations that are invariant to translation, scale and other transformations, can considerably reduce the sample complexity of learning, allowing recognition of new object classes from very few examples - a hallmark of human recognition. Empirical estimates of one-dimensional projections of the distribution induced by a group of affine transformations are proven to represent a unique and invariant signature associated with an image. We show how projections yielding invariant signatures for future images can be learned automatically, and updated continuously, during unsupervised visual experience. A module performing filtering and pooling, like simple and complex cells as proposed by Hubel and Wiesel, can compute such estimates. Under this view, a pooling stage estimates a one-dimensional probability distribution. Invariance from observations through a restricted window is equivalent to a sparsity property w.r.t. a transformation, which yields templates that are a) Gabor for optimal simultaneous invariance to translation and scale or b) very specific for complex, class-dependent transformations such as rotation in depth of faces. Hierarchical architectures consisting of this basic Hubel-Wiesel module inherit its properties of invariance, stability, and discriminability while capturing the compositional organization of the visual world in terms of wholes and parts, and are invariant to complex transformations that may only be locally affine. The theory applies to several existing deep learning convolutional architectures for image and speech recognition. It also suggests that the main computational goal of the ventral stream of visual cortex is to provide a hierarchical representation of new objects which is invariant to transformations, stable, and discriminative for recognition - this representation may be learned in an unsupervised way from natural visual experience."
to:NB  neuroscience  abstraction  representation  machine_learning
14 days ago
[1311.4175] Estimation in High-dimensional Vector Autoregressive Models
"Vector Autoregression (VAR) is a widely used method for learning complex interrelationship among the components of multiple time series. Over the years it has gained popularity in the fields of control theory, statistics, economics, finance, genetics and neuroscience. We consider the problem of estimating stable VAR models in a high-dimensional setting, where both the number of time series and the VAR order are allowed to grow with sample size. In addition to the "curse of dimensionality" introduced by a quadratically growing dimension of the parameter space, VAR estimation poses considerable challenges due to the temporal and cross-sectional dependence in the data. Under a sparsity assumption on the model transition matrices, we establish estimation and prediction consistency of ℓ1-penalized least squares and likelihood based methods. Exploiting spectral properties of stationary VAR processes, we develop novel theoretical techniques that provide deeper insight into the effect of dependence on the convergence rates of the estimates. We study the impact of error correlations on the estimation problem and develop fast, parallelizable algorithms for penalized likelihood based VAR estimates."
to:NB  time_series  sparsity  statistics  re:your_favorite_dsge_sucks
15 days ago
The trajectory, structure and origin of the Chelyabinsk asteroidal impactor : Nature : Nature Publishing Group
"Earth is continuously colliding with fragments of asteroids and comets of various sizes. The largest encounter in historical times occurred over the Tunguska river in Siberia in 1908, producing [1, 2] an airburst of energy equivalent to 5–15 megatons of trinitrotoluene (1 kiloton of trinitrotoluene represents an energy of 4.185 × 10^12 joules). Until recently, the next most energetic airburst events occurred over Indonesia [3] in 2009 and near the Marshall Islands [4] in 1994, both with energies of several tens of kilotons. Here we report an analysis of selected video records of the Chelyabinsk superbolide [5] of 15 February 2013, with energy equivalent to 500 kilotons of trinitrotoluene, and details of its atmospheric passage. We found that its orbit was similar to the orbit of the two-kilometre-diameter asteroid 86039 (1999 NC43), to a degree of statistical significance sufficient to suggest that the two were once part of the same object. The bulk strength—the ability to resist breakage—of the Chelyabinsk asteroid, of about one megapascal, was similar to that of smaller meteoroids [6] and corresponds to a heavily fractured single stone. The asteroid broke into small pieces between the altitudes of 45 and 30 kilometres, preventing more-serious damage on the ground. The total mass of surviving fragments larger than 100 grams was lower than expected [7]."
to:NB  asteroids  astronomy  to:blog
15 days ago
A 500-kiloton airburst over Chelyabinsk and an enhanced hazard from small impactors : Nature : Nature Publishing Group
"Most large (over a kilometre in diameter) near-Earth asteroids are now known, but recognition that airbursts (or fireballs resulting from nuclear-weapon-sized detonations of meteoroids in the atmosphere) have the potential to do greater damage [1] than previously thought has shifted an increasing portion of the residual impact risk (the risk of impact from an unknown object) to smaller objects [2]. Above the threshold size of impactor at which the atmosphere absorbs sufficient energy to prevent a ground impact, most of the damage is thought to be caused by the airburst shock wave [3], but owing to lack of observations this is uncertain [4, 5]. Here we report an analysis of the damage from the airburst of an asteroid about 19 metres (17 to 20 metres) in diameter southeast of Chelyabinsk, Russia, on 15 February 2013, estimated to have an energy equivalent of approximately 500 (±100) kilotons of trinitrotoluene (TNT, where 1 kiloton of TNT = 4.185×10^12 joules). We show that a widely referenced technique [4, 5, 6] of estimating airburst damage does not reproduce the observations, and that the mathematical relations [7] based on the effects of nuclear weapons—almost always used with this technique—overestimate blast damage. This suggests that earlier damage estimates [5, 6] near the threshold impactor size are too high. We performed a global survey of airbursts of a kiloton or more (including Chelyabinsk), and find that the number of impactors with diameters of tens of metres may be an order of magnitude higher than estimates based on other techniques [8, 9]. This suggests a non-equilibrium (if the population were in a long-term collisional steady state the size-frequency distribution would either follow a single power law or there must be a size-dependent bias in other surveys) in the near-Earth asteroid population for objects 10 to 50 metres in diameter, and shifts more of the residual impact risk to these sizes."
to:NB  heavy_tails  asteroids  astronomy  to:blog
15 days ago
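- Note to self on the two Chelyabinsk entries above: a quick worked conversion of the quoted energies into joules, using only the factor both abstracts give (1 kiloton of TNT = 4.185 × 10^12 J). The function and variable names are my own, for illustration.

```python
# Conversion factor quoted in both abstracts: 1 kiloton of TNT = 4.185e12 joules.
JOULES_PER_KILOTON = 4.185e12

def kilotons_to_joules(kilotons):
    """Convert an energy in kilotons of TNT to joules."""
    return kilotons * JOULES_PER_KILOTON

# Chelyabinsk: approximately 500 (+/- 100) kilotons.
chelyabinsk_j = kilotons_to_joules(500)
chelyabinsk_low_j = kilotons_to_joules(400)
chelyabinsk_high_j = kilotons_to_joules(600)

# Tunguska: 5 to 15 megatons, i.e. 5,000 to 15,000 kilotons.
tunguska_low_j = kilotons_to_joules(5_000)
tunguska_high_j = kilotons_to_joules(15_000)

print(f"Chelyabinsk: {chelyabinsk_j:.4e} J")  # 2.0925e+15 J
print(f"Tunguska / Chelyabinsk: {tunguska_low_j / chelyabinsk_j:.0f}x "
      f"to {tunguska_high_j / chelyabinsk_j:.0f}x")
```

So the central Chelyabinsk estimate is about 2.1 × 10^15 J, and the quoted Tunguska range is 10 to 30 times that.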
Shang , Cheng : Local and global asymptotic inference in smoothing spline models
"This article studies local and global inference for smoothing spline estimation in a unified asymptotic framework. We first introduce a new technical tool called functional Bahadur representation, which significantly generalizes the traditional Bahadur representation in parametric models, that is, Bahadur [Ann. Inst. Statist. Math. 37 (1966) 577–580]. Equipped with this tool, we develop four interconnected procedures for inference: (i) pointwise confidence interval; (ii) local likelihood ratio testing; (iii) simultaneous confidence band; (iv) global likelihood ratio testing. In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat. Soc. Ser. B Stat. Methodol. 45 (1983) 133–150] and Nychka [J. Amer. Statist. Assoc. 83 (1988) 1134–1143]. We also discuss a version of the Wilks phenomenon arising from local/global likelihood ratio testing. It is also worth noting that our simultaneous confidence bands are the first ones applicable to general quasi-likelihood models. Furthermore, issues relating to optimality and efficiency are carefully addressed. As a by-product, we discover a surprising relationship between periodic and nonperiodic smoothing splines in terms of inference."
to:NB  splines  nonparametrics  confidence_sets  hypothesis_testing  to_read
16 days ago
[1311.1869] Optimization, Learning, and Games with Predictable Sequences
"We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences. First, we recover the Mirror Prox algorithm for offline optimization, prove an extension to Hölder-smooth functions, and apply the results to saddle-point type problems. Next, we prove that a version of Optimistic Mirror Descent (which has a close relation to the Exponential Weights algorithm) can be used by two strongly-uncoupled players in a finite zero-sum matrix game to converge to the minimax equilibrium at the rate of O((log T)/T). This addresses a question of Daskalakis et al 2011. Further, we consider a partial information version of the problem. We then apply the results to convex programming and exhibit a simple algorithm for the approximate Max Flow problem."
to:NB  low-regret_learning  game_theory  learning_theory  rakhlin.sasha
16 days ago
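- Note to self on the predictable-sequences entry above: a minimal sketch of two players running an optimistic variant of Exponential Weights against each other in a zero-sum matrix game, each using its most recent loss vector as the prediction of the next one. The payoff matrix, the fixed step size `eta`, and the round count are hypothetical choices of mine; this is the textbook fixed-step form, not the adaptive algorithm the paper analyzes, and it makes no claim about the O((log T)/T) rate.

```python
import math

# Hypothetical 2x2 game: the row player pays PAYOFF[i][j] to the column player.
# Its unique equilibrium is (0.4, 0.6) for both players, with value 0.2.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]

def softmax(scores):
    """Numerically stable normalized exponentials."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def optimistic_hedge(payoff, eta=0.1, rounds=5000):
    """Self-play; returns the time-averaged strategies of both players."""
    n, m = len(payoff), len(payoff[0])
    cum_loss, last_loss = [0.0] * n, [0.0] * n  # row player minimizes
    cum_gain, last_gain = [0.0] * m, [0.0] * m  # column player maximizes
    avg_x, avg_y = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        # Optimism: count the latest observed vector once more as a prediction.
        x = softmax([-eta * (cum_loss[i] + last_loss[i]) for i in range(n)])
        y = softmax([eta * (cum_gain[j] + last_gain[j]) for j in range(m)])
        last_loss = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        last_gain = [sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        cum_loss = [c + l for c, l in zip(cum_loss, last_loss)]
        cum_gain = [c + g for c, g in zip(cum_gain, last_gain)]
        avg_x = [a + xi / rounds for a, xi in zip(avg_x, x)]
        avg_y = [a + yj / rounds for a, yj in zip(avg_y, y)]
    return avg_x, avg_y

def duality_gap(payoff, x, y):
    """Best-response value against x minus best-response value against y."""
    n, m = len(payoff), len(payoff[0])
    col_best = max(sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m))
    row_best = min(sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n))
    return col_best - row_best

avg_x, avg_y = optimistic_hedge(PAYOFF)
gap = duality_gap(PAYOFF, avg_x, avg_y)
print("average row strategy:", avg_x)
print("average column strategy:", avg_y)
print("duality gap:", gap)
```

The duality gap is zero exactly at a minimax equilibrium, so a small gap for the averaged strategies is the thing to look for.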
[1311.3485] A New Algorithm for Distributed Nonparametric Sequential Detection
"We consider nonparametric sequential hypothesis testing problem when the distribution under the null hypothesis is fully known but the alternate hypothesis corresponds to some other unknown distribution with some loose constraints. We propose a simple algorithm to address the problem. These problems are primarily motivated from wireless sensor networks and spectrum sensing in Cognitive Radios. A decentralized version utilizing spatial diversity is also proposed. Its performance is analysed and asymptotic properties are proved. The simulated and analysed performance of the algorithm is compared with an earlier algorithm addressing the same problem with similar assumptions. We also modify the algorithm for optimizing performance when information about the prior probabilities of occurrence of the two hypotheses are known."
to:NB  hypothesis_testing  nonparametrics  goodness-of-fit  statistics
16 days ago
JEP (27,4) p. 165 - Gifts of Mars: Warfare and Europe's Early Rise to Riches
"Western Europe surged ahead of the rest of the world long before technological growth became rapid. Europe in 1500 already had incomes twice as high on a per capita basis as Africa, and one-third greater than most of Asia. In this essay, we explain how Europe's tumultuous politics and deadly penchant for warfare translated into a sustained advantage in per capita incomes. We argue that Europe's rise to riches was driven by the nature of its politics after 1350 -- it was a highly fragmented continent characterized by constant warfare and major religious strife. No other continent in recorded history fought so frequently, for such long periods, killing such a high proportion of its population. When it comes to destroying human life, the atomic bomb and machine guns may be highly efficient, but nothing rivaled the impact of early modern Europe's armies spreading hunger and disease. War therefore helped Europe's precocious rise to riches because the survivors had more land per head available for cultivation. Our interpretation involves a feedback loop from higher incomes to more war and higher land-labor ratios, a loop set in motion by the Black Death in the middle of the 14th century."

- That's an... interesting notion.
to:NB  early_modern_european_history  economics  war  mother_courage_raises_the_west
16 days ago
Literate Testing in R | Data Analysis Visually Enforced
Nicely-named functions for testing some kinds of numerical properties.
R  programming  to_teach:statcomp
17 days ago
[1311.2645] Program Evaluation with High-Dimensional Data
"We consider estimation of policy relevant treatment effects in a data-rich environment where there may be many more control variables available than there are observations. In addition to allowing many control variables, the setting we consider allows heterogeneous treatment effects, endogenous receipt of treatment, and function-valued outcomes. To make informative inference possible, we assume that reduced form predictive relationships are approximately sparse. That is, we require that the relationship between the covariates and the outcome, treatment status, and instrument status can be captured up to a small approximation error using a small number of controls whose identities are unknown to the researcher. This condition allows estimation and inference for a wide variety of treatment parameters to proceed after selection of an appropriate set of control variables formed by selecting controls separately for each reduced form relationship and then appropriately combining this set of reduced form predictive models and associated selected controls. We provide conditions under which post-selection inference is uniformly valid across a wide-range of models and show that a key condition underlying uniform validity of post-selection inference allowing for imperfect model selection is the use of approximately unbiased estimating equations. We illustrate the use of the proposed treatment effect estimation methods with an application to estimating the effect of 401(k) participation on accumulated assets."
to:NB  causal_inference  public_policy  regression  statistics  high-dimensional_statistics
17 days ago
AER (103,6) p. 2121 - The China Syndrome: Local Labor Market Effects of Import Competition in the United States
"We analyze the effect of rising Chinese import competition between 1990 and 2007 on US local labor markets, exploiting cross-market variation in import exposure stemming from initial differences in industry specialization and instrumenting for US imports using changes in Chinese imports by other high-income countries. Rising imports cause higher unemployment, lower labor force participation, and reduced wages in local labor markets that house import-competing manufacturing industries. In our main specification, import competition explains one-quarter of the contemporaneous aggregate decline in US manufacturing employment. Transfer benefits payments for unemployment, disability, retirement, and healthcare also rise sharply in more trade-exposed labor markets."
to:NB  economics  economic_policy  globalization  class_struggles_in_america
17 days ago
Nadeem, S.: Dead Ringers: How Outsourcing Is Changing the Way Indians Understand Themselves. (eBook, Paper and Cloth)
"In the Indian outsourcing industry, employees are expected to be "dead ringers" for the more expensive American workers they have replaced--complete with Westernized names, accents, habits, and lifestyles that are organized around a foreign culture in a distant time zone. Dead Ringers chronicles the rise of a workforce for whom mimicry is a job requirement and a passion. In the process, the book deftly explores the complications of hybrid lives and presents a vivid portrait of a workplace where globalization carries as many downsides as advantages.
"Shehzad Nadeem writes that the relatively high wages in the outsourcing sector have empowered a class of cultural emulators. These young Indians indulge in American-style shopping binges at glittering malls, party at upscale nightclubs, and arrange romantic trysts at exurban cafés. But while the high-tech outsourcing industry is a matter of considerable pride for India, global corporations view the industry as a low-cost, often low-skill sector. Workers use the digital tools of the information economy not to complete technologically innovative tasks but to perform grunt work and rote customer service. Long hours and the graveyard shift lead to health problems and social estrangement. Surveillance is tight, management is overweening, and workers are caught in a cycle of hope and disappointment.
"Through lively ethnographic detail and subtle analysis of interviews with workers, managers, and employers, Nadeem demonstrates the culturally transformative power of globalization and its effects on the lives of the individuals at its edges."
to:NB  books:noted  india  globalization  class_struggles_in_america  cultural_exchange  ethnography
18 days ago
How Literature Plays with the Brain: The Neuroscience of Reading and Art
""Literature matters," says Paul B. Armstrong, "for what it reveals about human experience, and the very different perspective of neuroscience on how the brain works is part of that story." In How Literature Plays with the Brain, Armstrong examines the parallels between certain features of literary experience and functions of the brain. His central argument is that literature plays with the brain through experiences of harmony and dissonance which set in motion oppositions that are fundamental to the neurobiology of mental functioning. These oppositions negotiate basic tensions in the operation of the brain between the drive for pattern, synthesis, and constancy and the need for flexibility, adaptability, and openness to change.
"The challenge, Armstrong argues, is to account for the ability of readers to find incommensurable meanings in the same text, for example, or to take pleasure in art that is harmonious or dissonant, symmetrical or distorted, unified or discontinuous and disruptive.
"How Literature Plays with the Brain is the first book to use the resources of neuroscience and phenomenology to analyze aesthetic experience. For the neuroscientific community, the study suggests that different areas of research—the neurobiology of vision and reading, the brain-body interactions underlying emotions—may be connected to a variety of aesthetic and literary phenomena. For critics and students of literature, the study engages fundamental questions within the humanities: What is aesthetic experience? What happens when we read a literary work? How does the interpretation of literature relate to other ways of knowing?"
to:NB  books:noted  appropriations_of_neuroscience  literary_criticism  literary_theory
18 days ago
Picking up the pieces: Causal states in noisy data, and how to recover them
"Automatic structure discovery is desirable in many Markov model applications where a good topology (states and transitions) is not known a priori. CSSR is an established pattern discovery algorithm for stationary and ergodic stochastic symbol sequences that learns a predictively optimal Markov representation consisting of so-called causal states. By means of a novel algebraic criterion, we prove that the causal states of a simple process disturbed by random errors frequently are too complex to be learned fully, making CSSR diverge. In fact, the causal state representation of many hidden Markov models, representing simple but noise-disturbed data, has infinite cardinality. We also report that these problems can be solved by endowing CSSR with the ability to make approximations. The resulting algorithm, robust causal states (RCS), is able to recover the underlying causal structure from data corrupted by random substitutions, as is demonstrated both theoretically and in an experiment. The algorithm has potential applications in areas such as error correction and learning stochastic grammars."

- Huh.
to:NB  to_read  markov_models  grammar_induction  re:AoS_project
18 days ago
[1207.0865] Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels
"Variational methods for parameter estimation are an active research area, potentially offering computationally tractable heuristics with theoretical performance bounds. We build on recent work that applies such methods to network data, and establish asymptotic normality rates for parameter estimates of stochastic blockmodel data, by either maximum likelihood or variational estimation. The result also applies to various sub-models of the stochastic blockmodel found in the literature."
in_NB  likelihood  estimation  community_discovery  network_data_analysis  statistics  bickel.peter  choi.david  kith_and_kin  variational_inference
18 days ago
Divvy: Fast and Intuitive Exploratory Data Analysis
"Divvy is an application for applying unsupervised machine learning techniques (clustering and dimensionality reduction) to the data analysis process. Divvy provides a novel UI that allows researchers to tighten the action-perception loop of changing algorithm parameters and seeing a visualization of the result. Machine learning researchers can use Divvy to publish easy to use reference implementations of their algorithms, which helps the machine learning field have a greater impact on research practices elsewhere."
to:NB  visual_display_of_quantitative_information  data_mining  clustering  to_teach:data-mining
18 days ago
Phys. Rev. E 88, 052802 (2013): Normalized modularity optimization method for community identification with degree adjustment
"As a fundamental problem in network study, community identification has attracted much attention from different fields. Representing a seminal work in this area, the modularity optimization method has been widely applied and studied. However, this method has issues in resolution limit and extreme degeneracy and may not perform well for networks with unbalanced structures. Although several methods have been proposed to overcome these limitations, they are all based on the original idea of defining modularity through comparing the total number of edges within the putative communities in the observed network with that in an equivalent randomly generated network. In this paper, we show that this modularity definition is not suitable to analyze some networks such as those with unbalanced structures. Instead, we propose to define modularity through the average degree within the communities and formulate modularity as comparing the sum of average degree within communities of the observed network to that of an equivalent randomly generated network. In addition, we also propose a degree-adjusted approach for further improvement when there are unbalanced structures. We analyze the theoretical properties of our degree adjusted method. Numerical experiments for both artificial networks and real networks demonstrate that average degree plays an important role in network community identification, and our proposed methods have better performance than existing ones."
to:NB  community_discovery
18 days ago
[1311.2878] Selection Effects in Online Sharing: Consequences for Peer Adoption
"Most models of social contagion take peer exposure to be a corollary of adoption, yet in many settings, the visibility of one's adoption behavior happens through a separate decision process. In online systems, product designers can define how peer exposure mechanisms work: adoption behaviors can be shared in a passive, automatic fashion, or occur through explicit, active sharing. The consequences of these mechanisms are of substantial practical and theoretical interest: passive sharing may increase total peer exposure but active sharing may expose higher quality products to peers who are more likely to adopt.
"We examine selection effects in online sharing through a large-scale field experiment on Facebook that randomizes whether or not adopters share Offers (coupons) in a passive manner. We derive and estimate a joint discrete choice model of adopters' sharing decisions and their peers' adoption decisions. Our results show that active sharing enables a selection effect that exposes peers who are more likely to adopt than the population exposed under passive sharing.
"We decompose the selection effect into two distinct mechanisms: active sharers expose peers to higher quality products, and the peers they share with are more likely to adopt independently of product quality. Simulation results show that the user-level mechanism comprises the bulk of the selection effect. The study's findings are among the first to address downstream peer effects induced by online sharing mechanisms, and can inform design in settings where a surplus of sharing could be viewed as costly."
to:NB  social_influence  social_media  experimental_economics  re:homophily_and_confounding  bakshy.eytan
18 days ago
Sparse Markov Chains for Sequence Data - Jääskinen - 2013 - Scandinavian Journal of Statistics - Wiley Online Library
"Finite memory sources and variable-length Markov chains have recently gained popularity in data compression and mining, in particular, for applications in bioinformatics and language modelling. Here, we consider denser data compression and prediction with a family of sparse Bayesian predictive models for Markov chains in finite state spaces. Our approach lumps transition probabilities into classes composed of invariant probabilities, such that the resulting models need not have a hierarchical structure as in context tree-based approaches. This can lead to a substantially higher rate of data compression, and such non-hierarchical sparse models can be motivated for instance by data dependence structures existing in the bioinformatics context. We describe a Bayesian inference algorithm for learning sparse Markov models through clustering of transition probabilities. Experiments with DNA sequence and protein data show that our approach is competitive in both prediction and classification when compared with several alternative methods on the basis of variable memory length."
to:NB  markov_models  statistical_inference_for_stochastic_processes  statistics  time_series  re:AoS_project
18 days ago
[1310.7320] High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing
"In a recent article (Proc. Natl. Acad. Sci., 110(36), 14557-14562), El Karoui et al. study the distribution of robust regression estimators in the regime in which the number of parameters p is of the same order as the number of samples n. Using numerical simulations and 'highly plausible' heuristic arguments, they unveil a striking new phenomenon. Namely, the regression coefficients contain an extra Gaussian noise component that is not explained by classical concepts such as the Fisher information matrix. We show here that this phenomenon can be characterized rigorously using techniques that were developed by the authors for analyzing the Lasso estimator under high-dimensional asymptotics. We introduce an approximate message passing (AMP) algorithm to compute M-estimators and deploy state evolution to evaluate the operating characteristics of AMP and so also M-estimates. Our analysis clarifies that the 'extra Gaussian noise' encountered in this problem is fundamentally similar to phenomena already studied for regularized least squares in the setting n<p."
to:NB  estimation  fisher_information  statistics  high-dimensional_statistics
18 days ago
Simpson , Bowman , Laurienti : Analyzing complex functional brain networks: Fusing statistics and network science to understand the brain
"Complex functional brain network analyses have exploded over the last decade, gaining traction due to their profound clinical implications. The application of network science (an interdisciplinary offshoot of graph theory) has facilitated these analyses and enabled examining the brain as an integrated system that produces complex behaviors. While the field of statistics has been integral in advancing activation analyses and some connectivity analyses in functional neuroimaging research, it has yet to play a commensurate role in complex network analyses. Fusing novel statistical methods with network-based functional neuroimage analysis will engender powerful analytical tools that will aid in our understanding of normal brain function as well as alterations due to various brain disorders. Here we survey widely used statistical and network science tools for analyzing fMRI network data and discuss the challenges faced in filling some of the remaining methodological gaps. When applied and interpreted correctly, the fusion of network scientific and statistical methods has a chance to revolutionize the understanding of brain function."
to:NB  functional_connectivity  neuroscience  statistics  to_read  re:functional_communities
18 days ago
[1311.3494] Fundamental Limits of Online and Distributed Algorithms for Statistical Learning and Estimation
"Many machine learning approaches are characterized by information constraints on how they interact with the training data. These include memory and sequential access constraints (e.g. fast first-order methods to solve stochastic optimization problems); communication constraints (e.g. distributed learning); partial access to the underlying data (e.g. missing features and multi-armed bandits) and more. However, currently we have little understanding how such information constraints fundamentally affect our performance, independent of the learning problem semantics. For example, are there learning problems where any algorithm which has small memory footprint (or can use any bounded number of bits from each example, or has certain communication constraints) will perform worse than what is possible without such constraints? In this paper, we describe how a single set of results implies positive answers to the above, for a variety of settings."
to:NB  computational_complexity  learning_theory  information_theory  statistics  to_read
18 days ago
[1311.3475] Social Influence and the Collective Dynamics of Opinion Formation
"Social influence is the process by which individuals adapt their opinion, revise their beliefs, or change their behavior as a result of social interactions with other people. In our strongly interconnected society, social influence plays a prominent role in many self-organized phenomena such as herding in cultural markets, the spread of ideas and innovations, and the amplification of fears during epidemics. Yet, the mechanisms of opinion formation remain poorly understood, and existing physics-based models lack systematic empirical validation. Here, we report two controlled experiments showing how participants answering factual questions revise their initial judgments after being exposed to the opinion and confidence level of others. Based on the observation of 59 experimental subjects exposed to peer-opinion for 15 different items, we draw an influence map that describes the strength of peer influence during interactions. A simple process model derived from our observations demonstrates how opinions in a group of interacting people can converge or split over repeated interactions. In particular, we identify two major attractors of opinion: (i) the expert effect, induced by the presence of a highly confident individual in the group, and (ii) the majority effect, caused by the presence of a critical mass of laypeople sharing similar opinions. Additional simulations reveal the existence of a tipping point at which one attractor will dominate over the other, driving collective opinion in a given direction. These findings have implications for understanding the mechanisms of public opinion formation and managing conflicting situations in which self-confident and better informed minorities challenge the views of a large uninformed majority."
to:NB  experimental_psychology  experimental_sociology  decision-making  social_influence  social_life_of_the_mind  collective_cognition  re:democratic_cognition  to_read
18 days ago
[1311.2503] Predictable Feature Analysis
"Every organism in an environment, whether biological, robotic or virtual, must be able to predict certain aspects of its environment in order to survive or perform whatever task is intended. It needs a model that is capable of estimating the consequences of possible actions, so that planning, control, and decision-making become feasible. For scientific purposes, such models are usually created in a problem specific manner using differential equations and other techniques from control- and system-theory. In contrast to that, we aim for an unsupervised approach that builds up the desired model in a self-organized fashion. Inspired by Slow Feature Analysis (SFA), our approach is to extract sub-signals from the input, that behave as predictable as possible. These "predictable features" are highly relevant for modeling, because predictability is a desired property of the needed consequence-estimating model by definition. In our approach, we measure predictability with respect to a certain prediction model. We focus here on the solution of the arising optimization problem and present a tractable algorithm based on algebraic methods which we call Predictable Feature Analysis (PFA). We prove that the algorithm finds the globally optimal signal, if this signal can be predicted with low error. To deal with cases where the optimal signal has a significant prediction error, we provide a robust, heuristically motivated variant of the algorithm and verify it empirically. Additionally, we give formal criteria a prediction-model must meet to be suitable for measuring predictability in the PFA setting and also provide a suitable default-model along with a formal proof that it meets these criteria."

- Extremely similar to my student Georg Goerg's "forecastable component analysis" (which is duly cited).
to:NB  time_series  prediction  statistics
18 days ago
[1311.2483] Global Sensitivity Analysis with Dependence Measures
"Global sensitivity analysis with variance-based measures suffers from several theoretical and practical limitations, since they focus only on the variance of the output and handle multivariate variables in a limited way. In this paper, we introduce a new class of sensitivity indices based on dependence measures which overcomes these insufficiencies. Our approach originates from the idea to compare the output distribution with its conditional counterpart when one of the input variables is fixed. We establish that this comparison yields previously proposed indices when it is performed with Csiszar f-divergences, as well as sensitivity indices which are well-known dependence measures between random variables. This leads us to investigate completely new sensitivity indices based on recent state-of-the-art dependence measures, such as distance correlation and the Hilbert-Schmidt independence criterion. We also emphasize the potential of feature selection techniques relying on such dependence measures as alternatives to screening in high dimension."
to:NB  sensitivity_analysis  stochastic_models  two-sample_tests  statistics
18 days ago
[1311.1797] Sensitivity analysis for multidimensional and functional outputs
"Let X:=(X1,…,Xp) be random objects (the inputs), defined on some probability space (Ω,,ℙ) and valued in some measurable space E=E1×…×Ep. Further, let Y:=Y=f(X1,…,Xp) be the output. Here, f is a measurable function from E to some Hilbert space ℍ (ℍ could be either of finite or infinite dimension). In this work, we give a natural generalization of the Sobol indices (that are classically defined when Y∈ℝ ), when the output belongs to ℍ. These indices have very nice properties. First, they are invariant. under isometry and scaling. Further they can be, as in dimension 1, easily estimated by using the so-called Pick and Freeze method. We investigate the asymptotic behaviour of such estimation scheme."
to:NB  sensitivity_analysis  stochastic_models
18 days ago
[1309.6415] Stratified Graphical Models - Context-Specific Independence in Graphical Models
"Theory of graphical models has matured over more than three decades to provide the backbone for several classes of models that are used in a myriad of applications such as genetic mapping of diseases, credit risk evaluation, reliability and computer security, etc. Despite of their generic applicability and wide adoptance, the constraints imposed by undirected graphical models and Bayesian networks have also been recognized to be unnecessarily stringent under certain circumstances. This observation has led to the proposal of several generalizations that aim at more relaxed constraints by which the models can impose local or context-specific dependence structures. Here we consider an additional class of such models, termed as stratified graphical models. We develop a method for Bayesian learning of these models by deriving an analytical expression for the marginal likelihood of data under a specific subclass of decomposable stratified models. A non-reversible Markov chain Monte Carlo approach is further used to identify models that are highly supported by the posterior distribution over the model space. Our method is illustrated and compared with ordinary graphical models through application to several real and synthetic datasets."
to:NB  statistics  graphical_models  causal_discovery
18 days ago
[1212.3587] Detecting Time-dependent Structure in Network Data via a New Class of Latent Process Models
"We introduce a new class of latent process models for dynamic relational network data with the goal of detecting time-dependent structure. Network data are often observed over time, and static network models for such data may fail to capture relevant dynamic features. We present a new technique for identifying the emergence or disappearance of distinct subpopulations of vertices. In this formulation, a network is observed over time, with attributed edges appearing at random times. At unknown time points, subgroups of vertices may exhibit a change in behavior. Such changes may take the form of a change in the overall probability of connection within or between subgroups, or a change in the distribution of edge attributes. A mixture distribution for latent vertex positions is used to detect heterogeneities in connectivity behavior over time and over vertices. The probability of edges with various attributes at a given time is modeled using a latent-space stochastic process associated with each vertex. A random dot product model is used to describe the dependency structure of the graph. As an application we analyze the Enron email corpus."
to:NB  time-series  network_data_analysis  re:network_differences  to_read
18 days ago
Permanental Partition Models and Markovian Gibbs Structures - Springer
"We study both time-invariant and time-varying Gibbs distributions for configurations of particles into disjoint clusters. Specifically, we introduce and give some fundamental properties for a class of partition models, called permanental partition models, whose distributions are expressed in terms of the α-permanent of a similarity matrix parameter. We show that, in the time-invariant case, the permanental partition model is a refinement of the celebrated Pitman–Ewens distribution; whereas, in the time-varying case, the permanental model refines the Ewens cut-and-paste Markov chains (J. Appl. Probab. 43(3):778–791, 2011). By a special property of the α-permanent, the partition function can be computed exactly, allowing us to make several precise statements about this general model, including a characterization of exchangeable and consistent permanental models."
to:NB  stochastic_processes  statistical_mechanics  re:your_favorite_ergm_sucks
18 days ago
Choose what you like or like what you choose? Identifying Influence and Homophily out of Individual Decisions
"We investigate the microfoundations of the identification problem related to social influence and homophily. Focusing on the individual decision making of interacting individuals, we investigate how they affect each other’s behaviors. We propose simple and direct measures of homophily and influence by making use of individual preferences of these interacting individuals. Since in many occasions, preferences are not easily observed, we extend our analysis to the observables, decision outcomes. In order to infer the underlying preferences of interacting individuals out of their decision outcomes, we follow a foundational approach. We analyze the behavioral characteristics of individual de- cision making that includes interaction and finally we make use of
the tools that are provided by revealed preference theory in order to uncover the underlying preferences of the individuals. Based on re- vealed preference analysis, we revisit our measurement techniques for homophily and influence."

- Gets only partial identification.
to:NB  to_read  decision_theory  social_influence  re:homophily_and_confounding
18 days ago
[1311.3492] High-dimensional learning of linear causal networks via inverse covariance estimation
"We establish a new framework for statistical estimation of directed acyclic graphs (DAGs) when data are generated from a linear, possibly non-Gaussian structural equation model. Our framework consists of two parts: (1) inferring the moralized graph from the support of the inverse covariance matrix; and (2) selecting the best-scoring graph amongst DAGs that are consistent with the moralized graph. We show that when the error variances are known or estimated to close enough precision, the true DAG is the unique minimizer of the score computed using the reweighted squared l_2-loss. Our population-level results have implications for the identifiability of linear SEMs when the error covariances are specified up to a constant multiple. On the statistical side, we establish rigorous conditions for high-dimensional consistency of our two-part algorithm, defined in terms of a "gap" between the true DAG and the next best candidate. Finally, we demonstrate that dynamic programming may be used to select the optimal DAG in linear time when the treewidth of the moralized graph is bounded."
to:NB  high-dimensional_statistics  causal_discovery  graphical_models  buhlmann.peter  statistics
19 days ago
[1311.3576] Reproducing kernel Hilbert space based estimation of systems of ordinary differential equations
"Non-linear systems of differential equations have attracted the interest in fields like system biology, ecology or biochemistry, due to their flexibility and their ability to describe dynamical systems. Despite the importance of such models in many branches of science they have not been the focus of systematic statistical analysis until recently. In this work we propose a general approach to estimate the parameters of systems of differential equations measured with noise. Our methodology is based on the maximization of the penalized likelihood where the system of differential equations is used as a penalty. To do so, we use a Reproducing Kernel Hilbert Space approach that allows to formulate the estimation problem as an unconstrained numeric maximization problem easy to solve. The proposed method is tested with synthetically simulated data and it is used to estimate the unobserved transcription factor CdaR in Steptomyes coelicolor using gene expression data of the genes it regulates."
to:NB  dynamical_systems  statistical_inference_for_stochastic_processes  hilbert_space  statistics
19 days ago
When and Why Noise Correlations Are Important in Neural Decoding
"Information may be encoded both in the individual activity of neurons and in the correlations between their activities. Understanding whether knowledge of noise correlations is required to decode all the encoded information is fundamental for constructing computational models, brain–machine interfaces, and neuroprosthetics. If correlations can be ignored with tolerable losses of information, the readout of neural signals is simplified dramatically. To that end, previous studies have constructed decoders assuming that neurons fire independently and then derived bounds for the information that is lost. However, here we show that previous bounds were not tight and overestimated the importance of noise correlations. In this study, we quantify the exact loss of information induced by ignoring noise correlations and show why previous estimations were not tight. Further, by studying the elementary parts of the decoding process, we determine when and why information is lost on a single-response basis. We introduce the minimum decoding error to assess the distinctive role of noise correlations under natural conditions. We conclude that all of the encoded information can be decoded without knowledge of noise correlations in many more situations than previously thought."
to:NB  neural_data_analysis  neural_coding_and_decoding
19 days ago
A Plug-in Approach to Neyman-Pearson Classification
"The Neyman-Pearson (NP) paradigm in binary classification treats type I and type II errors with different priorities. It seeks classifiers that minimize type II error, subject to a type I error constraint under a user specified level α. In this paper, plug-in classifiers are developed under the NP paradigm. Based on the fundamental Neyman-Pearson Lemma, we propose two related plug-in classifiers which amount to thresholding respectively the class conditional density ratio and the regression function. These two classifiers handle different sampling schemes. This work focuses on theoretical properties of the proposed classifiers; in particular, we derive oracle inequalities that can be viewed as finite sample versions of risk bounds. NP classification can be used to address anomaly detection problems, where asymmetry in errors is an intrinsic property. As opposed to a common practice in anomaly detection that consists of thresholding normal class density, our approach does not assume a specific form for anomaly distributions. Such consideration is particularly necessary when the anomaly class density is far from uniformly distributed."
to:NB  hypothesis_testing  anomaly_detection  learning_theory  neyman-pearson  classifiers
19 days ago
[1207.6076] Equivalence of distance-based and RKHS-based statistics in hypothesis testing
"We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, we can interpret the MMD as energy distance with respect to some negative-type semimetric. This equivalence readily extends to distance covariance using kernels on the product space. We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests."
to:NB  kernel_methods  hilbert_space  independence_testing  two-sample_tests  statistics  nonparametrics  to_read
19 days ago
[1311.2972] Learning Mixtures of Discrete Product Distributions using Spectral Decompositions
"We study the problem of learning a distribution from samples, when the underlying distribution is a mixture of product distributions over discrete domains. This problem is motivated by several practical applications such as crowd-sourcing, recommendation systems, and learning Boolean functions. The existing solutions either heavily rely on the fact that the number of components in the mixtures is finite or have sample/time complexity that is exponential in the number of components. In this paper, we introduce a polynomial time/sample complexity method for learning a mixture of r discrete product distributions over {1,2,…,ℓ}n, for general ℓ and r. We show that our approach is statistically consistent and further provide finite sample guarantees.
"We use techniques from the recent work on tensor decompositions for higher-order moment matching. A crucial step in these moment matching methods is to construct a certain matrix and a certain tensor with low-rank spectral decompositions. These tensors are typically estimated directly from the samples. The main challenge in learning mixtures of discrete product distributions is that these low-rank tensors cannot be obtained directly from the sample moments. Instead, we reduce the tensor estimation problem to: a) estimating a low-rank matrix using only off-diagonal block elements; and b) estimating a tensor using a small number of linear measurements. Leveraging on recent developments in matrix completion, we give an alternating minimization based method to estimate the low-rank matrix, and formulate the tensor completion problem as a least-squares problem."
to:NB  mixture_models  spectral_methods  tensors  statistics
19 days ago
Wall Street Isn’t Worth It | Jacobin
"It may be true, in some cases, that the relative wages of different workers reflect the relative market values of the goods and services they produce after managers and capital owners have taken their cut. But when the magnitude of this cut is unrelated to social contribution, and actually constitutes the bulk of total value, the value of the residual paid to those actually engaged in production is not related to social contribution, either. The dominance of the financial sector distorts all prices and wages in the economy, not just those directly related to the activities of the financial sector."
market_failures_in_everything  markets_as_collective_calculating_devices  finance  financialization  economics  quiggin.john
21 days ago
[1311.2038] The Rate of Convergence for Approximate Bayesian Computation
"Approximate Bayesian Computation (ABC) is a popular computational method for likelihood-free Bayesian inference. The term "likelihood-free" refers to problems where the likelihood is intractable to compute or estimate directly, but where it is possible to generate simulated data X relatively easily given a candidate set of parameters θ simulated from a prior distribution. Parameters which generate simulated data within some tolerance δ of the observed data x∗ are regarded as plausible, and a collection of such θ is used to estimate the posterior distribution θ|X=x∗. Suitable choice of δ is vital for ABC methods to return good approximations to θ in reasonable computational time.
"While ABC methods are widely used in practice, particularly in population genetics, study of the mathematical properties of ABC estimators is still in its infancy. We prove that ABC estimates converge to the exact solution under very weak assumptions and, under slightly stronger assumptions, quantify the rate of this convergence. Our results can be used to guide the choice of the tolerance parameter δ."
to:NB  approximate_bayesian_computation  statistics
21 days ago

