Robert Wiebe's Self-Rule and American Democracy | Waggish
"What emerged with industrialization in the United States was a three-class system, conflictual but not revolutionary: one class geared to national institutions and policies, one dominating local affairs, and one sunk beneath both of those in the least rewarding jobs and least stable environments–in the terminology of my account, a national class, a local middle class, and a lower class. New hierarchies ordered relations inside both the national and lower middle class; both of those classes, in turn, counted on hierarchies to control the lower class."
in_NB  books:noted  us_politics  american_history  class_struggles_in_america  democracy
2 days ago
Artificial sweeteners induce glucose intolerance by altering the gut microbiota : Nature : Nature Publishing Group
"Non-caloric artificial sweeteners (NAS) are among the most widely used food additives worldwide, regularly consumed by lean and obese individuals alike. NAS consumption is considered safe and beneficial owing to their low caloric content, yet supporting scientific data remain sparse and controversial. Here we demonstrate that consumption of commonly used NAS formulations drives the development of glucose intolerance through induction of compositional and functional alterations to the intestinal microbiota. These NAS-mediated deleterious metabolic effects are abrogated by antibiotic treatment, and are fully transferrable to germ-free mice upon faecal transplantation of microbiota configurations from NAS-consuming mice, or of microbiota anaerobically incubated in the presence of NAS. We identify NAS-altered microbial metabolic pathways that are linked to host susceptibility to metabolic disease, and demonstrate similar NAS-induced dysbiosis and glucose intolerance in healthy human subjects. Collectively, our results link NAS consumption, dysbiosis and metabolic abnormalities, thereby calling for a reassessment of massive NAS usage."
2 days ago
Pleistocene cave art from Sulawesi, Indonesia : Nature : Nature Publishing Group
"Archaeologists have long been puzzled by the appearance in Europe ~40–35 thousand years (kyr) ago of a rich corpus of sophisticated artworks, including parietal art (that is, paintings, drawings and engravings on immobile rock surfaces) and portable art (for example, carved figurines), and the absence or scarcity of equivalent, well-dated evidence elsewhere, especially along early human migration routes in South Asia and the Far East, including Wallacea and Australia, where modern humans (Homo sapiens) were established by 50 kyr ago. Here, using uranium-series dating of coralloid speleothems directly associated with 12 human hand stencils and two figurative animal depictions from seven cave sites in the Maros karsts of Sulawesi, we show that rock art traditions on this Indonesian island are at least compatible in age with the oldest European art. The earliest dated image from Maros, with a minimum age of 39.9 kyr, is now the oldest known hand stencil in the world. In addition, a painting of a babirusa (‘pig-deer’) made at least 35.4 kyr ago is among the earliest dated figurative depictions worldwide, if not the earliest one. Among the implications, it can now be demonstrated that humans were producing rock art by ~40 kyr ago at opposite ends of the Pleistocene Eurasian world."
to:NB  archaeology  human_evolution
2 days ago
On inference of causality for discrete state models in a multiscale context
"Discrete state models are a common tool of modeling in many areas. E.g., Markov state models as a particular representative of this model family became one of the major instruments for analysis and understanding of processes in molecular dynamics (MD). Here we extend the scope of discrete state models to the case of systematically missing scales, resulting in a nonstationary and nonhomogeneous formulation of the inference problem. We demonstrate how the recently developed tools of nonstationary data analysis and information theory can be used to identify the simultaneously most optimal (in terms of describing the given data) and most simple (in terms of complexity and causality) discrete state models. We apply the resulting formalism to a problem from molecular dynamics and show how the results can be used to understand the spatial and temporal causality information beyond the usual assumptions. We demonstrate that the most optimal explanation for the appropriately discretized/coarse-grained MD torsion angles data in a polypeptide is given by the causality that is localized both in time and in space, opening new possibilities for deploying percolation theory and stochastic subgridscale modeling approaches in the area of MD."
to:NB  to_read  re:AoS_project  time_series  markov_models  statistics
2 days ago
Pre-Industrial Inequality - Milanovic - 2010 - The Economic Journal - Wiley Online Library
"Is inequality largely the result of the Industrial Revolution? Or, were pre-industrial incomes as unequal as they are today? This article infers inequality across individuals within each of the 28 pre-industrial societies, for which data were available, using what are known as social tables. It applies two new concepts: the inequality possibility frontier and the inequality extraction ratio. They compare the observed income inequality to the maximum feasible inequality that, at a given level of income, might have been ‘extracted’ by those in power. The results give new insights into the connection between inequality and economic development in the very long run."
to:NB  to_read  economics  economic_history  inequality  great_transformation
2 days ago
[0902.3837] Innovated higher criticism for detecting sparse signals in correlated noise
"Higher criticism is a method for detecting signals that are both sparse and weak. Although first proposed in cases where the noise variables are independent, higher criticism also has reasonable performance in settings where those variables are correlated. In this paper we show that, by exploiting the nature of the correlation, performance can be improved by using a modified approach which exploits the potential advantages that correlation has to offer. Indeed, it turns out that the case of independent noise is the most difficult of all, from a statistical viewpoint, and that more accurate signal detection (for a given level of signal sparsity and strength) can be obtained when correlation is present. We characterize the advantages of correlation by showing how to incorporate them into the definition of an optimal detection boundary. The boundary has particularly attractive properties when correlation decays at a polynomial rate or the correlation matrix is Toeplitz."
to:NB  multiple_testing  hypothesis_testing  time_series  have_skimmed  jin.jiashun  hall.peter
2 days ago
Higher criticism in the context of unknown distribution, non-independence and classification
"Higher criticism has been proposed as a tool for highly multiple hypothesis testing or signal detection, initially in cases where the distribution of a test statistic (or the noise in a signal) is known and the component tests are statistically independent. In this paper we explore the extent to which the assumptions of known distribution and independence can be relaxed, and we consider too the application of higher criticism to classification. It is shown that effective distribution approximations can be achieved by using a threshold approach; that is, by disregarding data components unless their significance level exceeds a sufficiently high value. This method exploits the good relative accuracy of approximations to light-tailed distributions. In particular, it can be effective when the true distribution is founded on something like a Studentised mean, or on an average of related type, which is commonly the case in practice. The issue of dependence among vector components is also shown not to be a serious difficulty in many circumstances."
to:NB  have_skimmed  multiple_testing  statistics  hall.peter  hypothesis_testing  statistical_inference_for_stochastic_processes  re:network_differences
2 days ago
[math/0410072] Higher criticism for detecting sparse heterogeneous mixtures
"Higher criticism, or second-level significance testing, is a multiple-comparisons concept mentioned in passing by Tukey. It concerns a situation where there are many independent tests of significance and one is interested in rejecting the joint null hypothesis. Tukey suggested comparing the fraction of observed significances at a given \alpha-level to the expected fraction under the joint null. In fact, he suggested standardizing the difference of the two quantities and forming a z-score; the resulting z-score tests the significance of the body of significance tests. We consider a generalization, where we maximize this z-score over a range of significance levels 0<\alpha\leq\alpha_0.
"We are able to show that the resulting higher criticism statistic is effective at resolving a very subtle testing problem: testing whether n normal means are all zero versus the alternative that a small fraction is nonzero. The subtlety of this 'sparse normal means' testing problem can be seen from work of Ingster and Jin, who studied such problems in great detail. In their studies, they identified an interesting range of cases where the small fraction of nonzero means is so small that the alternative hypothesis exhibits little noticeable effect on the distribution of the p-values either for the bulk of the tests or for the few most highly significant tests.
"In this range, when the amplitude of nonzero means is calibrated with the fraction of nonzero means, the likelihood ratio test for a precisely specified alternative would still succeed in separating the two hypotheses."

--- It makes a lot more sense that the name would come from someone like Tukey.
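
A minimal sketch of the statistic the abstract describes, not the authors' code: at each candidate level \alpha, compare the observed fraction of p-values at or below \alpha with \alpha itself, standardize the difference, and take the maximum over 0<\alpha\leq\alpha_0. Evaluating only at the observed p-values, the function name `higher_criticism`, and the default `alpha0=0.5` are assumptions made for illustration.

```python
import math

def higher_criticism(pvalues, alpha0=0.5):
    """Largest standardized excess of small p-values over levels in (0, alpha0].

    Returns -inf if no p-value lies in (0, alpha0].
    """
    n = len(pvalues)
    best = float("-inf")
    for i, p in enumerate(sorted(pvalues), start=1):
        if p > alpha0:
            break  # only levels up to alpha0 are considered
        if p <= 0.0 or p >= 1.0:
            continue  # the standardization is undefined at 0 and 1
        # i / n is the observed fraction of p-values <= p (exact without ties);
        # under the joint null that fraction has mean p and variance p(1-p)/n.
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best

# Evenly spread p-values (null-like) versus the same set with three tiny ones.
null_like = [(i - 0.5) / 100 for i in range(1, 101)]
with_signal = [1e-6] * 3 + null_like[3:]
hc_null = higher_criticism(null_like)
hc_signal = higher_criticism(with_signal)
```

With these made-up inputs the null-like set gives a small value and the set with a few very small p-values gives a large one, which is the behavior the abstract relies on.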
to:NB  multiple_testing  hypothesis_testing  empirical_processes  statistics  donoho.david  jin.jiashun  tukey.john_w.  have_read  re:network_differences
2 days ago
Human preferences for sexually dimorphic faces may be evolutionarily novel
"A large literature proposes that preferences for exaggerated sex typicality in human faces (masculinity/femininity) reflect a long evolutionary history of sexual and social selection. This proposal implies that dimorphism was important to judgments of attractiveness and personality in ancestral environments. It is difficult to evaluate, however, because most available data come from large-scale, industrialized, urban populations. Here, we report the results for 12 populations with very diverse levels of economic development. Surprisingly, preferences for exaggerated sex-specific traits are only found in the novel, highly developed environments. Similarly, perceptions that masculine males look aggressive increase strongly with development and, specifically, urbanization. These data challenge the hypothesis that facial dimorphism was an important ancestral signal of heritable mate value. One possibility is that highly developed environments provide novel opportunities to discern relationships between facial traits and behavior by exposing individuals to large numbers of unfamiliar faces, revealing patterns too subtle to detect with smaller samples."

--- Another possibility is that these are recent cultural conventions! (Maybe they have some subtle way of ruling that out...)
to:NB  psychology  evolutionary_psychology  cultural_differences  to_be_shot_after_a_fair_trial
4 days ago
No evidence for genetic assortative mating beyond that due to population stratification
"Domingue et al. (1) use genome-wide SNPs to show in non-Hispanic US whites that spouses are genetically more similar than random pairs of individuals. We argue that, although this reported result is descriptively true, the spousal genetic similarity can be explained by assortment on shared ancestry (i.e., population stratification) and thus does not reflect genetic assortative mating as interpreted by Domingue et al. This greatly affects the implications of the findings for understanding assortative mating in humans."
to:NB  human_genetics
4 days ago
HCN ice in Titan's high-altitude southern polar cloud : Nature : Nature Publishing Group
"Titan’s middle atmosphere is currently experiencing a rapid change of season after northern spring arrived in 2009. A large cloud was observed for the first time above Titan’s southern pole in May 2012, at an altitude of 300 kilometres. A temperature maximum was previously observed there, and condensation was not expected for any of Titan’s atmospheric gases. Here we report that this cloud is composed of micrometre-sized particles of frozen hydrogen cyanide (HCN ice). The presence of HCN particles at this altitude, together with temperature determinations from mid-infrared observations, indicate a dramatic cooling of Titan’s atmosphere inside the winter polar vortex in early 2012. Such cooling is in contrast to previously measured high-altitude warming in the polar vortex, and temperatures are a hundred degrees colder than predicted by circulation models. These results show that post-equinox cooling at the winter pole of Titan is much more efficient than previously thought."
to:NB  titan  astronomy
4 days ago
[1001.0591] Comparing Distributions and Shapes using the Kernel Distance
"Starting with a similarity function between objects, it is possible to define a distance metric on pairs of objects, and more generally on probability distributions over them. These distance metrics have a deep basis in functional analysis, measure theory and geometric measure theory, and have a rich structure that includes an isometric embedding into a (possibly infinite dimensional) Hilbert space. They have recently been applied to numerous problems in machine learning and shape analysis.
"In this paper, we provide the first algorithmic analysis of these distance metrics. Our main contributions are as follows: (i) We present fast approximation algorithms for computing the kernel distance between two point sets P and Q that run in near-linear time in the size of P \cup Q (note that an explicit calculation would take quadratic time). (ii) We present polynomial-time algorithms for approximately minimizing the kernel distance under rigid transformation; they run in time O(n + poly(1/epsilon, log n)). (iii) We provide several general techniques for reducing complex objects to convenient sparse representations (specifically to point sets or sets of point sets) which approximately preserve the kernel distance. In particular, this allows us to reduce problems of computing the kernel distance between various types of objects such as curves, surfaces, and distributions to computing the kernel distance between point sets. These take advantage of the reproducing kernel Hilbert space and a new relation linking binary range spaces to continuous range spaces with bounded fat-shattering dimension."
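
For orientation, a sketch of the explicit quadratic-time calculation the abstract mentions, using the usual construction D(P,Q)^2 = \kappa(P,P) + \kappa(Q,Q) - 2\kappa(P,Q). The Gaussian kernel, the bandwidth `sigma`, and the averaging over pairs (treating each point set as an empirical distribution) are assumptions chosen for illustration; the paper's fast approximation algorithms are not reproduced here.

```python
import math

def gaussian_kernel(p, q, sigma=1.0):
    """Similarity between two points (assumed Gaussian kernel)."""
    d2 = sum((a - b) ** 2 for a, b in zip(p, q))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def kappa(P, Q, sigma=1.0):
    """Average kernel similarity over all pairs: O(|P| * |Q|) kernel calls."""
    total = sum(gaussian_kernel(p, q, sigma) for p in P for q in Q)
    return total / (len(P) * len(Q))

def kernel_distance(P, Q, sigma=1.0):
    """Explicit kernel distance between two point sets."""
    sq = kappa(P, P, sigma) + kappa(Q, Q, sigma) - 2.0 * kappa(P, Q, sigma)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding error

square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(x + 0.5, y) for x, y in square]
d_same = kernel_distance(square, square)
d_shift = kernel_distance(square, shifted)
```

The double loop in `kappa` is the quadratic cost that the near-linear algorithms in the paper are meant to avoid.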
to:NB  to_read  kernel_estimators  two-sample_tests  statistics  probability  re:network_differences
4 days ago
[1307.7760] Geometric Inference on Kernel Density Estimates
"We show that geometric inference of a point cloud can be calculated by examining its kernel density estimate. This intermediate step results in the inference being statistically robust to noise and allows for large computational gains and scalability (e.g. on 100 million points). In particular, by first creating a coreset for the kernel density estimate, the data representing the final geometric and topological structure has size depending only on the error tolerance, not on the size of the original point set or the complexity of the structure. To achieve this result, we study how to replace distance to a measure, as studied by Chazal, Cohen-Steiner, and Merigot, with the kernel distance. The kernel distance is monotonic with the kernel density estimate (sublevel sets of the kernel distance are superlevel sets of the kernel density estimate), thus allowing us to examine the kernel density estimate in this manner. We show it has several computational and stability advantages. Moreover, we provide an algorithm to estimate its topology using weighted Vietoris-Rips complexes."
to:NB  geometry  kernel_estimators  density_estimation  statistics  computational_statistics
4 days ago
Quality and efficiency for kernel density estimates in large data
"Kernel density estimates are important for a broad variety of applications. Their construction has been well-studied, but existing techniques are expensive on massive datasets and/or only provide heuristic approximations without theoretical guarantees. We propose randomized and deterministic algorithms with quality guarantees which are orders of magnitude more efficient than previous algorithms. Our algorithms do not require knowledge of the kernel or its bandwidth parameter and are easily parallelizable. We demonstrate how to implement our ideas in a centralized setting and in MapReduce, although our algorithms are applicable to any large-scale data processing framework. Extensive experiments on large real datasets demonstrate the quality, efficiency, and scalability of our techniques."

--- Ungated version: http://www.cs.utah.edu/~lifeifei/papers/kernelsigmod13.pdf
4 days ago
Norbert Wiener, 1894-1964
Memorial issue of the Bulletin of the AMS, now open access...
to:NB  wiener.norbert  lives_of_the_scientists  mathematics  stochastic_processes
6 days ago
Nonparametric Estimation of Küllback-Leibler Divergence
"In this letter, we introduce an estimator of Küllback-Leibler divergence based on two independent samples. We show that on any finite alphabet, this estimator has an exponentially decaying bias and that it is consistent and asymptotically normal. To explain the importance of this estimator, we provide a thorough analysis of the more standard plug-in estimator. We show that it is consistent and asymptotically normal, but with an infinite bias. Moreover, if we modify the plug-in estimator to remove the rare events that cause the bias to become infinite, the bias still decays at a rate no faster than . Further, we extend our results to estimating the symmetrized Küllback-Leibler divergence. We conclude by providing simulation results, which show that the asymptotic properties of these estimators hold even for relatively small sample sizes."

--- Trivial, but: _Kullback_ didn't spell his name with an umlaut --- why here?
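
A small sketch of the standard plug-in estimator discussed in the abstract, to make the infinite-bias remark concrete: whenever a letter shows up in the first sample but not in the second, the estimate is infinite, and on a finite sample that happens with positive probability. The function name `plug_in_kl` and the toy samples are hypothetical; the letter's own estimator is not reproduced here.

```python
import math
from collections import Counter

def plug_in_kl(sample_p, sample_q):
    """Plug-in estimate of KL(P || Q) from two samples over a finite alphabet."""
    counts_p, counts_q = Counter(sample_p), Counter(sample_q)
    n, m = len(sample_p), len(sample_q)
    total = 0.0
    for letter, count in counts_p.items():
        p_hat = count / n
        q_hat = counts_q[letter] / m  # Counter returns 0 for unseen letters
        if q_hat == 0.0:
            return math.inf  # letter seen under P but never under Q
        total += p_hat * math.log(p_hat / q_hat)
    return total

kl_equal = plug_in_kl("aabb", "abab")     # same empirical frequencies
kl_finite = plug_in_kl("aa", "ab")        # 1 * log(1 / 0.5)
kl_infinite = plug_in_kl("ab", "aa")      # 'b' never appears in the second sample
```

Letters absent from the first sample contribute nothing, following the convention 0 log 0 = 0.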
to:NB  entropy_estimation  information_theory  statistics  nonparametrics
6 days ago
Arguments, More than Confidence, Explain the Good Performance of Reasoning Groups by Emmanuel Trouche, Emmanuel Sander, Hugo Mercier :: SSRN
to:NB  to_read  cognitive_science  experimental_psychology  social_life_of_the_mind  collective_cognition  re:democratic_cognition  mercier.hugo
7 days ago
The Virtues of Ingenuity: Reasoning and Arguing without Bias - Springer
This paper describes and defends the “virtues of ingenuity”: detachment, lucidity, thoroughness. Philosophers traditionally praise these virtues for their role in the practice of using reasoning to solve problems and gather information. Yet, reasoning has other, no less important uses. Conviction is one of them. A recent revival of rhetoric and argumentative approaches to reasoning (in psychology, philosophy and science studies) has highlighted the virtues of persuasiveness and cast a new light on some of its apparent vices—bad faith, deluded confidence, confirmation and myside biases. Those traits, it is often argued, will no longer look so detrimental once we grasp their proper function: arguing in order to persuade, rather than thinking in order to solve problems. Some of these biases may even have a positive impact on intellectual life. Seen in this light, the virtues of ingenuity may well seem redundant. Defending them, I argue that the vices of conviction are not innocuous. If generalized, they would destabilize argumentative practices. Argumentation is a common good that is threatened when every arguer pursues conviction at the expense of ingenuity. Bad faith, myside biases and delusions of all sorts are neither called for nor explained by argumentative practices. To avoid a collapse of argumentation, mere civil virtues (respect, humility or honesty) do not suffice: we need virtues that specifically attach to the practice of making conscious inferences.
to:NB  to_read  rhetoric  epistemology  social_life_of_the_mind  re:democratic_cognition  via:?
7 days ago
globalinequality: Ahistoricism in Acemoglu-Robinson
Hinting at the MacLeod thesis, that the historical role of Communism was to lay the groundwork for capitalism...
have_read  economic_growth  institutions  economic_policy  development_economics  communism  china  china:prc  ussr
9 days ago
Revisiting the Impact of Teachers
"Chetty, Friedman, and Rockoff (hereafter CFR) use teacher switching as a quasi-experiment to test for bias from student sorting in value added (VA) models of teacher effectiveness. They conclude that VA estimates are not meaningfully biased by student sorting (CFR 2014a). A companion paper finds that high-VA teachers have large effects on students’ later outcomes (CFR 2014b). I reproduce CFR’s analysis in data from North Carolina. Their key reported results are all successfully replicated. Further investigation, however, reveals that the quasi-experiment is invalid: Teacher switching is correlated with changes in students’ prior grade scores that bias the key coefficient toward a finding of no bias. Estimates that adjust for changes in students’ prior achievement find evidence of moderate bias in VA scores, in the middle of the range suggested by Rothstein (2009). The association between VA and long-run outcomes is not robust and quite sensitive to controls."

--- Is the data available for either this or CFR? If not, perhaps, the last tag is a mistake.
9 days ago
[1408.4102] Estimation of Monotone Treatment Effects in Network Experiments
"Randomized experiments on social networks are a trending research topic. Such experiments pose statistical challenges due to the possibility of interference between units. We propose a new method for estimating attributable treatment effects under interference. The method does not require partial interference, but instead uses an identifying assumption that is similar to requiring nonnegative treatment effects. Observed pre-treatment social network information can be used to customize the test statistic, so as to increase power without making assumptions on the data generating process. The inversion of the test statistic is a combinatorial optimization problem which has a tractable relaxation, yielding conservative estimates of the attributable effect."
statistics  network_data_analysis  causal_inference  experimental_design  re:experiments_on_networks  kith_and_kin  choi.david_s.  in_NB
10 days ago
Evaluating link prediction methods - Online First - Springer
"Link prediction is a popular research area with important applications in a variety of disciplines, including biology, social science, security, and medicine. The fundamental requirement of link prediction is the accurate and effective prediction of new links in networks. While there are many different methods proposed for link prediction, we argue that the practical performance potential of these methods is often unknown because of challenges in the evaluation of link prediction, which impact the reliability and reproducibility of results. We describe these challenges, provide theoretical proofs and empirical examples demonstrating how current methods lead to questionable conclusions, show how the fallacy of these conclusions is illuminated by methods we propose, and develop recommendations for consistent, standard, and applicable evaluation metrics. We also recommend the use of precision-recall threshold curves and associated areas in lieu of receiver operating characteristic curves due to complications that arise from extreme imbalance in the link prediction classification problem."
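
A worked example of the imbalance point, with invented counts rather than anything from the paper: in a sparse network with 100 true future links among 1,000,000 candidate pairs, a predictor can sit at a respectable point on the ROC curve while almost all of its predicted links are wrong. The function name `confusion_rates` and every number below are hypothetical.

```python
def confusion_rates(tp, fp, fn, tn):
    """Return (true positive rate, false positive rate, precision)."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return tpr, fpr, precision

# Hypothetical predictor: recovers 80 of the 100 true links,
# and wrongly flags 9,999 of the 999,900 non-links.
tpr, fpr, precision = confusion_rates(tp=80, fp=9_999, fn=20, tn=989_901)
```

Here the ROC coordinates are a true positive rate of 0.8 at a false positive rate of about 0.01, yet fewer than 1 in 100 predicted links is real, which is the kind of gap a precision-recall view exposes.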
to:NB  network_data_analysis  statistics  cross-validation  link_prediction  re:XV_for_networks
11 days ago
Pathways to Exploration: Rationales and Approaches for a U.S. Program of Human Space Exploration | The National Academies Press
"The United States has publicly funded its human spaceflight program on a continuous basis for more than a half-century, through three wars and a half-dozen recessions, from the early Mercury and Gemini suborbital and Earth orbital missions, to the lunar landings, and thence to the first reusable winged crewed spaceplane that the United States operated for three decades. Today the United States is the major partner in a massive orbital facility - the International Space Station - that is becoming the focal point for the first tentative steps in commercial cargo and crewed orbital space flights. And yet, the long-term future of human spaceflight beyond this project is unclear. Pronouncements by multiple presidents of bold new ventures by Americans to the Moon, to Mars, and to an asteroid in its native orbit, have not been matched by the same commitment that accompanied President Kennedy's now fabled 1961 speech - namely, the substantial increase in NASA funding needed to make it happen. Are we still committed to advancing human spaceflight? What should a long-term goal be, and what does the United States need to do to achieve it?"
to:NB  books:noted  space_exploration
12 days ago
Identifying the Culprit: Assessing Eyewitness Identification | The National Academies Press
"Eyewitnesses play an important role in criminal cases when they can identify culprits. Estimates suggest that tens of thousands of eyewitnesses make identifications in criminal investigations each year. Research on factors that affect the accuracy of eyewitness identification procedures has given us an increasingly clear picture of how identifications are made, and more importantly, an improved understanding of the principled limits on vision and memory that can lead to failure of identification. Factors such as viewing conditions, duress, elevated emotions, and biases influence the visual perception experience. Perceptual experiences are stored by a system of memory that is highly malleable and continuously evolving, neither retaining nor divulging content in an informational vacuum. As such, the fidelity of our memories to actual events may be compromised by many factors at all stages of processing, from encoding to storage and retrieval. Unknown to the individual, memories are forgotten, reconstructed, updated, and distorted. Complicating the process further, policies governing law enforcement procedures for conducting and recording identifications are not standard, and policies and practices to address the issue of misidentification vary widely. These limitations can produce mistaken identifications with significant consequences. What can we do to make certain that eyewitness identification convicts the guilty and exonerates the innocent?"
to:NB  books:noted  psychology  law
12 days ago
Causal tracking reliabilism and the Gettier problem - Springer
"This paper argues that reliabilism can handle Gettier cases once it restricts knowledge producing reliable processes to those that involve a suitable causal link between the subject’s belief and the fact it references. Causal tracking reliabilism (as this version of reliabilism is called) also avoids the problems that refuted the causal theory of knowledge, along with problems besetting more contemporary theories (such as virtue reliabilism and the “safety” account of knowledge). Finally, causal tracking reliabilism allows for a response to Linda Zagzebski’s challenge that no theory of knowledge can both eliminate the possibility of Gettier cases while also allowing fully warranted but false beliefs."
to:NB  epistemology
14 days ago
Hume’s definitions of ‘Cause’: Without idealizations, within the bounds of science - Springer
"Interpreters have found it exceedingly difficult to understand how Hume could be right in claiming that his two definitions of ‘cause’ are essentially the same. As J. A. Robinson points out, the definitions do not even seem to be extensionally equivalent. Don Garrett offers an influential solution to this interpretative problem, one that attributes to Hume the reliance on an ideal observer. I argue that the theoretical need for an ideal observer stems from an idealized concept of definition, which many interpreters, including Garrett, attribute to Hume. I argue that this idealized concept of definition indeed demands an unlimited or infinite ideal observer. But there is substantial textual evidence indicating that Hume disallows the employment of idealizations in general in the sciences. Thus Hume would reject the idealized concept of definition and its corresponding ideal observer. I then put forward an expert-relative reading of Hume’s definitions of ‘cause’, which also renders both definitions extensionally equivalent. On the expert-relative reading, the meaning of ‘cause’ changes with better observations and experiments, but it also allows Humean definitions to play important roles within our normative practices. Finally, I consider and reject Henry Allison’s argument that idealized definitions and their corresponding infinite minds are necessary for expert reflection on the limitations of current science."
to:NB  causality  hume.david  history_of_ideas  philosophy
14 days ago
The New Spirit of Capitalism
"Why is the critique of capitalism so ineffective today? In this major work, the sociologists Eve Chiapello and Luc Boltanski suggest that we should be addressing the crisis of anticapitalist critique by exploring its very roots.
"Via an unprecedented analysis of management texts which influenced the thinking of employers and contributed to reorganization of companies over the last decades, the authors trace the contours of a new spirit of capitalism. From the middle of the 1970s onwards, capitalism abandoned the hierarchical Fordist work structure and developed a new network-based form of organization which was founded on employee initiative and relative work autonomy, but at the cost of material and psychological security.
"This new spirit of capitalism triumphed thanks to a remarkable recuperation of the “artistic critique”—that which, after May 1968, attacked the alienation of everyday life by capitalism and bureaucracy. At the same time, the “social critique” was disarmed by the appearance of neocapitalism and remained fixated on the old schemas of hierarchical production."
to:NB  books:noted  social_criticism  sociology  the_wired_ideology  capitalism  management
18 days ago
Vidyasagar, M.: Hidden Markov Processes: Theory and Applications to Biology
"This book explores important aspects of Markov and hidden Markov processes and the applications of these ideas to various problems in computational biology. The book starts from first principles, so that no previous knowledge of probability is necessary. However, the work is rigorous and mathematical, making it useful to engineers and mathematicians, even those not interested in biological applications. A range of exercises is provided, including drills to familiarize the reader with concepts and more advanced problems that require deep thinking about the theory. Biological applications are taken from post-genomic biology, especially genomics and proteomics.
"The topics examined include standard material such as the Perron-Frobenius theorem, transient and recurrent states, hitting probabilities and hitting times, maximum likelihood estimation, the Viterbi algorithm, and the Baum-Welch algorithm. The book contains discussions of extremely useful topics not usually seen at the basic level, such as ergodicity of Markov processes, Markov Chain Monte Carlo (MCMC), information theory, and large deviation theory for both i.i.d. and Markov processes. The book also presents state-of-the-art realization theory for hidden Markov models. Among biological applications, it offers an in-depth look at the BLAST (Basic Local Alignment Search Tool) algorithm, including a comprehensive explanation of the underlying theory. Other applications such as profile hidden Markov models are also explored."
to:NB  books:noted  markov_models  state-space_models  em_algorithm  large_deviations  stochastic_processes  statistical_inference_for_stochastic_processes  statistics  genomics  bioinformatics  vidyasagar.mathukumali
18 days ago
AER (104,10) p. 3115 - Financial Networks and Contagion
"We study cascades of failures in a network of interdependent financial organizations: how discontinuous changes in asset values (e.g., defaults and shutdowns) trigger further failures, and how this depends on network structure. Integration (greater dependence on counterparties) and diversification (more counterparties per organization) have different, nonmonotonic effects on the extent of cascades. Diversification connects the network initially, permitting cascades to travel; but as it increases further, organizations are better insured against one another's failures. Integration also faces trade-offs: increased dependence on other organizations versus less sensitivity to own investments. Finally, we illustrate the model with data on European debt cross-holdings."
to:NB  social_networks  finance  economics  jackson.matthew_o.
18 days ago
Reconstructing Macroeconomic Theory to Manage Economic Policy
"Macroeconomics has not done well in recent years: The standard models didn't predict the Great Recession; and even said it couldn't happen. After the bubble burst, the models did not predict the full consequences.
"The paper traces the failures to the attempts, beginning in the 1970s, to reconcile macro and microeconomics, by making the former adopt the standard competitive micro-models that were under attack even then, from theories of imperfect and asymmetric information, game theory, and behavioral economics.
"The paper argues that any theory of deep downturns has to answer these questions: What is the source of the disturbances? Why do seemingly small shocks have such large effects? Why do deep downturns last so long? Why is there such persistence, when we have the same human, physical, and natural resources today as we had before the crisis?
"The paper presents a variety of hypotheses which provide answers to these questions, and argues that models based on these alternative assumptions have markedly different policy implications, including large multipliers. It explains why the apparent liquidity trap today is markedly different from that envisioned by Keynes in the Great Depression, and why the Zero Lower Bound is not the central impediment to the effectiveness of monetary policy in restoring the economy to full employment."
to:NB  to_read  macroeconomics  economics  stiglitz.joseph  financial_crisis_of_2007--
18 days ago
Sociality, Hierarchy, Health: Comparative Biodemography: Papers from a Workshop | The National Academies Press
"Sociality, Hierarchy, Health: Comparative Biodemography is a collection of papers that examine cross-species comparisons of social environments with a focus on social behaviors along with social hierarchies and connections, to examine their effects on health, longevity, and life histories. This report covers a broad spectrum of nonhuman animals, exploring a variety of measures of position in social hierarchies and social networks, drawing links among these factors to health outcomes and trajectories, and comparing them to those in humans. Sociality, Hierarchy, Health revisits both the theoretical underpinnings of biodemography and the empirical findings that have emerged over the past two decades."
to:NB  books:noted  social_networks  medicine  inequality  sociology  natural_science_of_the_human_species  ethology  primates
20 days ago
IEEE Xplore Abstract - Control theoretic smoothing splines
"Some of the relationships between optimal control and statistics are examined. We produce generalized, smoothing splines by solving an optimal control problem for linear control systems, minimizing the L2-norm of the control signal, while driving the scalar output of the control system close to given, prespecified interpolation points. We then prove a convergence result for the smoothing splines, using results from the theory of numerical quadrature. Finally, we show, in simulations, that our approach works in practice as well as in theory"
to:NB  splines  control_theory  smoothing  statistics  via:arsyed
20 days ago
Knox, P., ed.: Atlas of Cities (Hardcover).
"More than half the world’s population lives in cities, and that proportion is expected to rise to three-quarters by 2050. Urbanization is a global phenomenon, but the way cities are developing, the experience of city life, and the prospects for the future of cities vary widely from region to region. The Atlas of Cities presents a unique taxonomy of cities that looks at different aspects of their physical, economic, social, and political structures; their interactions with each other and with their hinterlands; the challenges and opportunities they present; and where cities might be going in the future.
"Each chapter explores a particular type of city—from the foundational cities of Greece and Rome and the networked cities of the Hanseatic League, through the nineteenth-century modernization of Paris and the industrialization of Manchester, to the green and “smart” cities of today. Expert contributors explore how the development of these cities reflects one or more of the common themes of urban development: the mobilizing function (transport, communication, and infrastructure); the generative function (innovation and technology); the decision-making capacity (governance, economics, and institutions); and the transformative capacity (society, lifestyle, and culture)."
to:NB  books:noted  cities  visual_display_of_quantitative_information  geography
20 days ago
Network-based statistical comparison of citation topology of bibliographic databases : Scientific Reports : Nature Publishing Group
"Modern bibliographic databases provide the basis for scientific research and its evaluation. While their content and structure differ substantially, there exist only informal notions on their reliability. Here we compare the topological consistency of citation networks extracted from six popular bibliographic databases including Web of Science, CiteSeer and arXiv.org. The networks are assessed through a rich set of local and global graph statistics. We first reveal statistically significant inconsistencies between some of the databases with respect to individual statistics. For example, the introduced field bow-tie decomposition of DBLP Computer Science Bibliography substantially differs from the rest due to the coverage of the database, while the citation information within arXiv.org is the most exhaustive. Finally, we compare the databases over multiple graph statistics using the critical difference diagram. The citation topology of DBLP Computer Science Bibliography is the least consistent with the rest, while, not surprisingly, Web of Science is significantly more reliable from the perspective of consistency. This work can serve either as a reference for scholars in bibliometrics and scientometrics or a scientific evaluation guideline for governments and research agencies."

--- The methods don't look very compelling, but the last tag applies.
to:NB  network_data_analysis  bibliometry  citation_networks  re:network_differences
21 days ago
PLOS ONE: Classic Maya Bloodletting and the Cultural Evolution of Religious Rituals: Quantifying Patterns of Variation in Hieroglyphic Texts
"Religious rituals that are painful or highly stressful are hypothesized to be costly signs of commitment essential for the evolution of complex society. Yet few studies have investigated how such extreme ritual practices were culturally transmitted in past societies. Here, we report the first study to analyze temporal and spatial variation in bloodletting rituals recorded in Classic Maya (ca. 250–900 CE) hieroglyphic texts. We also identify the sociopolitical contexts most closely associated with these ancient recorded rituals. Sampling an extensive record of 2,480 hieroglyphic texts, this study identifies every recorded instance of the logographic sign for the word ch’ahb’ that is associated with ritual bloodletting. We show that documented rituals exhibit low frequency whose occurrence cannot be predicted by spatial location. Conversely, network ties better capture the distribution of bloodletting rituals across the southern Maya region. Our results indicate that bloodletting rituals by Maya nobles were not uniformly recorded, but were typically documented in association with antagonistic statements and may have signaled royal commitments among connected polities."

--- Not a context in which I ever expected to find myself cited.
Also: did they look for other words (glyphs, I guess) that might refer to such sacrifices? They might just be seeing a linguistic difference, rather than one of practices.
have_read  maya_civilization  archaeology  social_networks  social_influence  homophily  re:homophily_and_confounding  in_NB  to:blog
24 days ago
""It's the bravest satellite," Hadfield says. "It's very brave, and very lucky." And it's been going for a long time, and "it's a very tough little spaceship, and it knows what it's doing.""
space_exploration  robots_and_robotics  awww  to:blog
24 days ago
The Slack Wire: Liza Featherstone on Focus Groups
"We are not monads, with a fixed set of preferences. As Liza says in the talk, human beings are profoundly social creatures -- our selves don't exist in isolation from others. (This is why solitary confinement is a form of torture.) Capitalism is intolerable but it has, historically, produced genuine progress in science and technology, and there's a sense in which focus groups could be an example. It's grotesque that this insight -- that people's beliefs and desires only emerge in exchange with others -- has been mainly used to sell soft drinks and candidates. But it's a real insight nonetheless."
track_down_references  social_life_of_the_mind  institutions  collective_cognition  re:democratic_cognition  marketing  focus_groups
25 days ago
Metric Embedding, Hyperbolic Space, and Social Networks
"We consider the problem of embedding an undirected graph into hyperbolic space with minimum distortion. A fundamental problem in its own right, it has also drawn a great deal of interest from applied communities interested in empirical analysis of large-scale graphs. In this paper, we establish a connection between distortion and quasi-cyclicity of graphs, and use it to derive lower and upper bounds on metric distortion. Two particularly simple and natural graphs with large quasi-cyclicity are n-node cycles and n × n square lattices, and our lower bound shows that any hyperbolic-space embedding of these graphs incurs a multiplicative distortion of at least Ω(n/log n). This is in sharp contrast to Euclidean space, where both of these graphs can be embedded with only constant multiplicative distortion. We also establish a relation between quasi-cyclicity and δ-hyperbolicity of a graph as a way to prove upper bounds on the distortion. Using this relation, we show that graphs with small quasi-cyclicity can be embedded into hyperbolic space with only constant additive distortion. Finally, we also present an efficient (linear-time) randomized algorithm for embedding a graph with small quasi-cyclicity into hyperbolic space, so that with high probability at least a (1 − &epsis;) fraction of the node-pairs has only constant additive distortion. Our results also give a plausible theoretical explanation for why social networks have been observed to embed well into hyperbolic space: they tend to have small quasi-cyclicity."
geometry  hyperbolic_geometry  network_data_analysis  in_NB
25 days ago
Expert testimony supporting post-sentence civil incarceration of violent sexual offenders
"Many states have laws permitting the civil incarceration of violent sexual offenders after they have served their sentence. These laws must balance two strong interests: those of society in preventing the offender from committing further violent sexual offences, and those of the individual whose liberty is curtailed in the fear that he may, in the future, commit another offence.
"This article reviews the purposes of civil incarceration, and the criteria used in state law permitting it, highlighting the important role played by estimates of the probability that the individual will recidivate. It then examines the methods used to estimate this probability, and the way these methods are presented by experts in court. A comparison of this expert testimony with the Daubert and Frye standards shows how questionable current practice is. We conclude with a discussion of the choices facing society."
25 days ago
Consistent Specification Testing Via Nonparametric Series Regression
"This paper proposes two consistent one-sided specification tests for parametric regression models, one based on the sample covariance between the residual from the parametric model and the discrepancy between the parametric and nonparametric fitted values; the other based on the difference in sums of squared residuals between the parametric and nonparametric models. We estimate the nonparametric model by series regression. The new test statistics converge in distribution to a unit normal under correct specification and grow to infinity faster than the parametric rate (n<sup>-1/2</sup>) under misspecification, while avoiding weighting, sample splitting, and non-nested testing procedures used elsewhere in the literature. Asymptotically, our tests can be viewed as a test of the joint hypothesis that the true parameters of a series regression model are zero, where the dependent variable is the residual from the parametric model, and the series terms are functions of the explanatory variables, chosen so as to support nonparametric estimation of a conditional expectation. We specifically consider Fourier series and regression splines, and present a Monte Carlo study of the finite sample performance of the new tests in comparison to consistent tests of Bierens (1990), Eubank and Spiegelman (1990), Jayasuriya (1990), Wooldridge (1992), and Yatchew (1992); the results show the new tests have good power, performing quite well in some situations. We suggest a joint Bonferroni procedure that combines a new test with those of Bierens and Wooldridge to capture the best features of the three approaches."
26 days ago
Forecasting economic time series using flexible versus fixed specification and linear versus nonlinear econometric models
"Nine macroeconomic variables are forecast in a real-time scenario using a variety of flexible specification, fixed specification, linear, and nonlinear econometric models. All models are allowed to evolve through time, and our analysis focuses on model selection and performance. In the context of real-time forecasts, flexible specification models (including linear autoregressive models with exogenous variables and nonlinear artificial neural networks) appear to offer a useful and viable alternative to less flexible fixed specification linear models for a subset of the economic variables which we examine, particularly at forecast horizons greater than 1-step ahead. We speculate that one reason for this result is that the economy is evolving (rather slowly) over time. This feature cannot easily be captured by fixed specification linear models, however, and manifests itself in the form of evolving coefficient estimates. We also provide additional evidence supporting the claim that models which ‘win’ based on one model selection criterion (say a squared error measure) do not necessarily win when an alternative selection criterion is used (say a confusion rate measure), thus highlighting the importance of the particular cost function which is used by forecasters and ‘end-users’ to evaluate their models. A wide variety of different model selection criteria and statistical tests are used to illustrate our findings."
to:NB  economics  macroeconomics  prediction  white.halbert  to_read  re:your_favorite_dsge_sucks
26 days ago
Forecast evaluation with shared data sets
"Data sharing is common practice in forecasting experiments in situations where fresh data samples are difficult or expensive to generate. This means that forecasters often analyze the same data set using a host of different models and sets of explanatory variables. This practice introduces statistical dependencies across forecasting studies that can severely distort statistical inference. Here we examine a new and inexpensive recursive bootstrap procedure that allows forecasters to account explicitly for these dependencies. The procedure allows forecasters to merge empirical evidence and draw inference in the light of previously accumulated results. In an empirical example, we merge results from predictions of daily stock prices based on (1) technical trading rules and (2) calendar rules, demonstrating both the significance of problems arising from data sharing and the simplicity of accounting for data sharing using these new methods."
to:NB  statistics  prediction  model_checking  to_read  white.halbert
26 days ago
Maximum Likelihood Estimation of Misspecified Models
"This paper examines the consequences and detection of model misspecification when using maximum likelihood techniques for estimation and inference. The quasi-maximum likelihood estimator (OMLE) converges to a well defined limit, and may or may not be consistent for particular parameters of interest. Standard tests (Wald, Lagrange Multiplier, or Likelihood Ratio) are invalid in the presence of misspecification, but more general statistics are given which allow inferences to be drawn robustly. The properties of the QMLE and the information matrix are exploited to yield several useful tests for model misspecification."
to:NB  likelihood  estimation  misspecification  statistics  white.halbert
26 days ago
Comments on testing economic theories and the use of model selection criteria
"This paper outlines difficulties with testing economic theories, particularly that the theories may be vague, may relate to a decision interval different from the observation period, and may need a metric to convert a complicated testing situation to an easier one. We argue that it is better to use model selection procedures rather than formal hypothesis testing when deciding on model specification. This is because testing favors the null hypothesis, typically uses an arbitrary choice of significance level, and researchers using the same data can end up with different final models."
to:NB  model_selection  hypothesis_testing  statistics  economics  social_science_methodology  white.halbert
26 days ago
Methodology of Econometrics
"The methodology of econometrics is not the study of particular econometric techniques, but a meta-study of how econometrics contributes to economic science. As such it is part of the philosophy of science. The essay begins by reviewing the salient points of the main approaches to the philosophy of science – particularly, logical positivism, Popper’s falsificationism, Lakatos methodology of scientific research programs, and the semantic approach – and orients econometrics within them. The principal methodological issues for econometrics are the application of probability theory to economics and the mapping between economic theory and probability models. Both are raised in Haavelmo’s (1944) seminal essay. Using that essay as a touchstone, the various recent approaches to econometrics are surveyed – those of the Cowles Commission, the vector autoregression program, the LSE approach, calibration, and a set of common, but heterogeneous approaches encapsulated as the “textbook econometrics.” Finally, the essay considers the light shed by econometric methodology on the main epistemological and ontological questions raised in the philosophy of science."
to:NB  econometrics  economics  statistics  philosophy_of_science
26 days ago
Tests of Conditional Predictive Ability - Giacomini - 2006 - Econometrica - Wiley Online Library
"We propose a framework for out-of-sample predictive ability testing and forecast selection designed for use in the realistic situation in which the forecasting model is possibly misspecified, due to unmodeled dynamics, unmodeled heterogeneity, incorrect functional form, or any combination of these. Relative to the existing literature (Diebold and Mariano (1995) and West (1996)), we introduce two main innovations: (i) We derive our tests in an environment where the finite sample properties of the estimators on which the forecasts may depend are preserved asymptotically. (ii) We accommodate conditional evaluation objectives (can we predict which forecast will be more accurate at a future date?), which nest unconditional objectives (which forecast was more accurate on average?), that have been the sole focus of previous literature. As a result of (i), our tests have several advantages: they capture the effect of estimation uncertainty on relative forecast performance, they can handle forecasts based on both nested and nonnested models, they allow the forecasts to be produced by general estimation methods, and they are easy to compute. Although both unconditional and conditional approaches are informative, conditioning can help fine-tune the forecast selection to current economic conditions. To this end, we propose a two-step decision rule that uses current information to select the best forecast for the future date of interest. We illustrate the usefulness of our approach by comparing forecasts from leading parameter-reduction methods for macroeconomic forecasting using a large number of predictors."
to:NB  prediction  model_checking  statistics  white.halbert
26 days ago
Sound and fury: McCloskey and significance testing in economics
"For more than 20 years, Deidre McCloskey has campaigned to convince the economics profession that it is hopelessly confused about statistical significance. She argues that many practices associated with significance testing are bad science and that most economists routinely employ these bad practices: ‘Though to a child they look like science, with all that really hard math, no science is being done in these and 96 percent of the best empirical economics …’ (McCloskey 1999). McCloskey's charges are analyzed and rejected. That statistical significance is not economic significance is a jejune and uncontroversial claim, and there is no convincing evidence that economists systematically mistake the two. Other elements of McCloskey's analysis of statistical significance are shown to be ill‐founded, and her criticisms of practices of economists are found to be based in inaccurate readings and tendentious interpretations of those economists' work. Properly used, significance tests are a valuable tool for assessing signal strength, for assisting in model specification, and for determining causal structure."
to:NB  economics  econometrics  social_science_methodology  statistics  hypothesis_testing
26 days ago
Cambridge Journals Online - Journal of the History of Economic Thought - Abstract - On the Reception of Haavelmo’s Econometric Thought
"The significance of Haavelmo’s “The Probability Approach in Econometrics” (1944), the foundational document of modern econometrics, has been interpreted in widely different ways. Some regard it as a blueprint for a provocative (but ultimately unsuccessful) program dominated by the need for a priori theoretical identification of econometric models. Others focus more on statistical adequacy than on theoretical identification. They see its deepest insights as unduly neglected. The present article uses bibliometric techniques and a close reading of econometrics articles and textbooks to trace the way in which the economics profession received, interpreted, and transmitted Haavelmo’s ideas. A key irony is that the first group calls for a reform of econometric thinking that goes several steps beyond Haavelmo’s initial vision; the second group argues that essentially what the first group advocates was already in Haavelmo’s “Probability Approach” from the beginning."
to:NB  history_of_science  economics  history_of_economics  social_science_methodology  statistics  econometrics
26 days ago
[1211.4246] What Regularized Auto-Encoders Learn from the Data Generating Distribution
"What do auto-encoders learn about the underlying data generating distribution? Recent work suggests that some auto-encoder variants do a good job of capturing the local manifold structure of data. This paper clarifies some of these previous observations by showing that minimizing a particular form of regularized reconstruction error yields a reconstruction function that locally characterizes the shape of the data generating density. We show that the auto-encoder captures the score (derivative of the log-density with respect to the input). It contradicts previous interpretations of reconstruction error as an energy function. Unlike previous results, the theorems provided here are completely generic and do not depend on the parametrization of the auto-encoder: they show what the auto-encoder would tend to if given enough capacity and examples. These results are for a contractive training criterion we show to be similar to the denoising auto-encoder training criterion with small corruption noise, but with contraction applied on the whole reconstruction function rather than just encoder. Similarly to score matching, one can consider the proposed training criterion as a convenient alternative to maximum likelihood because it does not involve a partition function. Finally, we show how an approximate Metropolis-Hastings MCMC can be setup to recover samples from the estimated distribution, and this is confirmed in sampling experiments."
26 days ago
ON THE VECTOR AUTOREGRESSIVE SIEVE BOOTSTRAP - Meyer - 2014 - Journal of Time Series Analysis - Wiley Online Library
"The concept of autoregressive sieve bootstrap is investigated for the case of vector autoregressive (VAR) time series. This procedure fits a finite-order VAR model to the given data and generates residual-based bootstrap replicates of the time series. The paper explores the range of validity of this resampling procedure and provides a general check criterion, which allows to decide whether the VAR sieve bootstrap asymptotically works for a specific statistic or not. In the latter case, we will point out the exact reason that causes the bootstrap to fail."
time_series  statistics  bootstrap  in_NB
27 days ago
A Gentle Introduction to Optimization | Optimization, OR and risk analysis | Cambridge University Press
"Optimization is an essential technique for solving problems in areas as diverse as accounting, computer science and engineering. Assuming only basic linear algebra and with a clear focus on the fundamental concepts, this textbook is the perfect starting point for first- and second-year undergraduate students from a wide range of backgrounds and with varying levels of ability. Modern, real-world examples motivate the theory throughout. The authors keep the text as concise and focused as possible, with more advanced material treated separately or in starred exercises. Chapters are self-contained so that instructors and students can adapt the material to suit their own needs and a wide selection of over 140 exercises gives readers the opportunity to try out the skills they gain in each section. Solutions are available for instructors. The book also provides suggestions for further reading to help students take the next step to more advanced material."

--- It's the could-be-taught-to-freshmen part which interests me if it's true.
books:noted  optimization  re:freshman_seminar_on_optimization  in_NB
4 weeks ago
An Evaluation of Course Evaluations
"Student ratings of teaching have been used, studied, and debated for almost a century. This article examines student ratings of teaching from a statistical perspective. The common practice of relying on averages of student teaching evaluation scores as the primary measure of teaching effectiveness for promotion and tenure decisions should be abandoned for substantive and statistical reasons: There is strong evidence that student responses to questions of “effectiveness” do not measure teaching effectiveness. Response rates and response variability matter. And comparing averages of categorical responses, even if the categories are represented by numbers, makes little sense. Student ratings of teaching are valuable when they ask the right questions, report response rates and score distributions, and are balanced by a variety of other sources and methods to evaluate teaching."