cshalizi + have_read   1420

[1302.0890] Local Log-linear Models for Capture-Recapture
"Log-linear models are often used to estimate the size of a closed population using capture-recapture data. When capture probabilities are related to auxiliary covariates, one may select a separate model based on each of several post-strata. We extend post-stratification to its logical extreme by selecting a local log-linear model for each observed unit, while smoothing to achieve stability. Our local models serve a dual purpose: In addition to estimating the size of the population, we estimate the rate of missingness as a function of covariates. A simulation demonstrates the superiority of our method when the generating model varies over the covariate space. Data from the Breeding Bird Survey is used to illustrate the method."

--- When did the title change from "Smooth Poststratification"?
to:NB  have_read  surveys  smoothing  statistics  estimation  kurtz.zachary  kith_and_kin 
3 days ago by cshalizi
Back to the Future: Review of Bit by Bit by Matt Salganik
"When I heard a few years ago that Salganik was writing a textbook, I was surprised and a little disappointed that this would be a distraction from his cutting edge research in areas like information cascades and respondent driven sampling. I was a fool. Just as chapter 5 of the book describes how computational approaches can enable mass collaboration on research projects by spreading the work from credentialed experts to masses of people with low or unknown skill, Bit by Bit itself will do more for computational social science by spreading the heretofore tacit knowledge of the field than a top researcher could accomplish directly. I strongly recommend Bit by Bit and fully expect it will be the standard methods textbook for computational social science until advances in the field render it dated. If we are lucky, we will benefit from a new edition every five to ten years so the book can keep pace with a rapidly evolving field. However for now it is incredibly current and I highly recommend it to any social scientist who teaches, practices, or aspires to practice or even just understand computational social science."
book_reviews  have_read  social_science_methodology  sociology  rossman.gabriel 
8 days ago by cshalizi
Confabulation in the humanities - Matthew Lincoln, PhD
Now, realize that this doesn't _just_ apply to interpreting quantitative analyses, but also to more traditionally-humanistic explanations...
data_analysis  humanities  everything_is_obvious_once_you_know_the_answer  to_teach  via:?  have_read 
8 days ago by cshalizi
[1901.10861] A Simple Explanation for the Existence of Adversarial Examples with Small Hamming Distance
"The existence of adversarial examples in which an imperceptible change in the input can fool well trained neural networks was experimentally discovered by Szegedy et al in 2013, who called them "Intriguing properties of neural networks". Since then, this topic had become one of the hottest research areas within machine learning, but the ease with which we can switch between any two decisions in targeted attacks is still far from being understood, and in particular it is not clear which parameters determine the number of input coordinates we have to change in order to mislead the network. In this paper we develop a simple mathematical framework which enables us to think about this baffling phenomenon from a fresh perspective, turning it into a natural consequence of the geometry of ℝ^n with the L0 (Hamming) metric, which can be quantitatively analyzed. In particular, we explain why we should expect to find targeted adversarial examples with Hamming distance of roughly m in arbitrarily deep neural networks which are designed to distinguish between m input classes."
in_NB  adversarial_examples  have_read 
9 days ago by cshalizi
Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors
"Context-predicting models (more commonly known as embeddings or neural language models) are the new kids on the distributional semantics block. Despite the buzz surrounding these models, the literature is still lacking a systematic comparison of the predictive models with classic, count-vector-based distributional semantic approaches. In this paper, we perform such an extensive evaluation, on a wide range of lexical semantics tasks and across many parameter settings. The results, to our own surprise, show that the buzz is fully justified, as the context-predicting models obtain a thorough and resounding victory against their count-based counterparts."
to:NB  have_read  natural_language_processing  text_mining  word2vec  data_mining  to_teach:data-mining 
16 days ago by cshalizi
[1402.3722] word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method
"The word2vec software of Tomas Mikolov and colleagues (this https URL ) has gained a lot of traction lately, and provides state-of-the-art word embeddings. The learning models behind the software are described in two research papers. We found the description of the models in these papers to be somewhat cryptic and hard to follow. While the motivations and presentation may be obvious to the neural-networks language-modeling crowd, we had to struggle quite a bit to figure out the rationale behind the equations.
"This note is an attempt to explain equation (4) (negative sampling) in "Distributed Representations of Words and Phrases and their Compositionality" by Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado and Jeffrey Dean."
to:NB  natural_language_processing  text_mining  statistics  neural_networks  data_mining  word2vec  have_read  to_teach:data-mining 
16 days ago by cshalizi
[1301.3781] Efficient Estimation of Word Representations in Vector Space
"We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities."

--- The last tag is added with an air of "do I really have to?"
to:NB  have_read  neural_networks  text_mining  word2vec  data_mining  to_teach:data-mining 
16 days ago by cshalizi
Supercentenarians and the oldest-old are concentrated into regions with no birth certificates and short lifespans | bioRxiv
"The observation of individuals attaining remarkable ages, and their concentration into geographic sub-regions or ‘blue zones’, has generated considerable scientific interest. Proposed drivers of remarkable longevity include high vegetable intake, strong social connections, and genetic markers. Here, we reveal new predictors of remarkable longevity and ‘supercentenarian’ status. In the United States, supercentenarian status is predicted by the absence of vital registration. The state-specific introduction of birth certificates is associated with a 69-82% fall in the number of supercentenarian records. In Italy, which has more uniform vital registration, remarkable longevity is instead predicted by low per capita incomes and a short life expectancy. Finally, the designated ‘blue zones’ of Sardinia, Okinawa, and Ikaria corresponded to regions with low incomes, low literacy, high crime rate and short life expectancy relative to their national average. As such, relative poverty and short lifespan constitute unexpected predictors of centenarian and supercentenarian status, and support a primary role of fraud and error in generating remarkable human age records."

--- This is a lovely little case study.
to:NB  have_read  data_collection  demography  bureaucracy  statistics  to_teach  via:kjhealy  fraud 
16 days ago by cshalizi
now publishers - The Bias Bias in Behavioral Economics
"Behavioral economics began with the intention of eliminating the psychological blind spot in rational choice theory and ended up portraying psychology as the study of irrationality. In its portrayal, people have systematic cognitive biases that are not only as persistent as visual illusions but also costly in real life—meaning that governmental paternalism is called upon to steer people with the help of “nudges.” These biases have since attained the status of truisms. In contrast, I show that such a view of human nature is tainted by a “bias bias,” the tendency to spot biases even when there are none. This may occur by failing to notice when small sample statistics differ from large sample statistics, mistaking people’s random error for systematic error, or confusing intelligent inferences with logical errors. Unknown to most economists, much of psychological research reveals a different portrayal, where people appear to have largely fine-tuned intuitions about chance, frequency, and framing. A systematic review of the literature shows little evidence that the alleged biases are potentially costly in terms of less health, wealth, or happiness. Getting rid of the bias bias is a precondition for psychology to play a positive role in economics."
in_NB  gigerenzer.gerd  cognitive_science  decision-making  behavioral_economics  psychology  heuristics  rationality  via:gelman  have_read  re:anti-nudge 
16 days ago by cshalizi
[1907.04713] Entropy and Compression: A simple proof of an inequality of Khinchin-Ornstein-Shields
"We prove that Entropy is a lower bound for the average compression ratio of any lossless compressor by giving a simple proof of an inequality that is a slight variation of an inequality first proved by A. I. Khinchin in 1953. The same idea leads to a simple proof of the analogous Ornstein-Shields pointwise inequality of 1990. Our proof is simpler than the ones (of the same pointwise inequality) given by Shields in 1996."

--- This is a very nice use of the typical-set idea, plus some clever book-keeping (essentially: it's not possible to give many typical strings short code-words, because there aren't many short code-words).
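The counting step in that note can be made concrete. What follows is only a minimal sketch in Python (my choice of language; the function names are hypothetical and nothing here comes from the paper): there are just 2^k - 1 binary strings shorter than k bits, counting the empty string, so a lossless (injective) code can give at most that many source strings a code-word under k bits.

```python
from itertools import product


def short_codewords(max_len):
    """List every binary string of length < max_len (the empty string included)."""
    return ["".join(bits)
            for n in range(max_len)
            for bits in product("01", repeat=n)]


def max_fraction_compressed(n_strings, max_len):
    """Upper bound on the share of n_strings distinct source strings that any
    injective code can map to code-words shorter than max_len bits."""
    n_short = 2 ** max_len - 1  # 1 + 2 + 4 + ... + 2**(max_len - 1)
    return min(1.0, n_short / n_strings)


if __name__ == "__main__":
    print(len(short_codewords(4)))               # number of code-words under 4 bits
    print(max_fraction_compressed(2 ** 20, 10))  # share of 2**20 strings codable in < 10 bits
```

With 2^20 typical strings, at most 1023 of them (under 0.1%) can get code-words shorter than 10 bits, which is the book-keeping the note alludes to.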
to:NB  information_theory  have_read 
4 weeks ago by cshalizi
[1703.04467] spmoran: An R package for Moran's eigenvector-based spatial regression analysis
"This study illustrates how to use "spmoran," which is an R package for Moran's eigenvector-based spatial regression analysis for up to millions of observations. This package estimates fixed or random effects eigenvector spatial filtering models and their extensions including a spatially varying coefficient model, a spatial unconditional quantile regression model, and low rank spatial econometric models. These models are estimated computationally efficiently."

--- ETA after reading: The approach sounds interesting enough that I want to track down the references that actually explain it, rather than just the software.
in_NB  spatial_statistics  regression  statistics  to_teach:data_over_space_and_time  R  have_read 
4 weeks ago by cshalizi
[1907.07582] Testing for Unobserved Heterogeneity via k-means Clustering
"Clustering methods such as k-means have found widespread use in a variety of applications. This paper proposes a formal testing procedure to determine whether a null hypothesis of a single cluster, indicating homogeneity of the data, can be rejected in favor of multiple clusters. The test is simple to implement, valid under relatively mild conditions (including non-normality, and heterogeneity of the data in aspects beyond those in the clustering analysis), and applicable in a range of contexts (including clustering when the time series dimension is small, or clustering on parameters other than the mean). We verify that the test has good size control in finite samples, and we illustrate the test in applications to clustering vehicle manufacturers and U.S. mutual funds."
in_NB  hypothesis_testing  model_selection  model_checking  clustering  statistics  time_series  have_read 
5 weeks ago by cshalizi
[1905.02175] Adversarial Examples Are Not Bugs, They Are Features
"Adversarial examples have attracted significant attention in machine learning, but the reasons for their existence and pervasiveness remain unclear. We demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans. After capturing these features within a theoretical framework, we establish their widespread existence in standard datasets. Finally, we present a simple setting where we can rigorously tie the phenomena we observe in practice to a misalignment between the (human-specified) notion of robustness and the inherent geometry of the data."

--- I'm not as convinced as they are that they've managed to create networks using only "robust" features that aren't vulnerable to new adversarial attacks. But I _am_ convinced that they're able to create non-robust features and show they generalize to the original data set.
in_NB  adversarial_examples  have_read 
9 weeks ago by cshalizi
[1811.00645] The Holdout Randomization Test: Principled and Easy Black Box Feature Selection
"We consider the problem of feature selection using black box predictive models. For example, high-throughput devices in science are routinely used to gather thousands of features for each sample in an experiment. The scientist must then sift through the many candidate features to find explanatory signals in the data, such as which genes are associated with sensitivity to a prospective therapy. Often, predictive models are used for this task: the model is fit, error on held out data is measured, and strong performing models are assumed to have discovered some fundamental properties of the system. A model-specific heuristic is then used to inspect the model parameters and rank important features, with top features reported as "discoveries." However, such heuristics provide no statistical guarantees and can produce unreliable results. We propose the holdout randomization test (HRT) as a principled approach to feature selection using black box predictive models. The HRT is model agnostic and produces a valid p-value for each feature, enabling control over the false discovery rate (or Type I error) for any predictive model. Further, the HRT is computationally efficient and, in simulations, has greater power than a competing knockoffs-based approach."
in_NB  cross-validation  variable_selection  statistics  blei.david  have_read 
12 weeks ago by cshalizi
[1905.11753] Non-Markovian out-of-equilibrium dynamics: A general numerical procedure to construct time-dependent memory kernels for coarse-grained observables
"We present a numerical method to compute non-equilibrium memory kernels based on experimental data or molecular dynamics simulations. The procedure uses a recasting of the non-stationary generalized Langevin equation, in which we expand the memory kernel in a series that can be reconstructed iteratively. Each term in the series can be computed based solely on knowledge of the two-time auto-correlation function of the observable of interest. As a proof of principle, we apply the method to crystallization from a super-cooled Lennard Jones melt. We analyze the nucleation and growth dynamics of crystallites and observe that the memory kernel has a time extent that is about one order of magnitude larger than the typical timescale needed for a particle to be attached to the crystallite in the growth regime."

To-do after reading: Compare to Wiener's procedure for modeling nonlinear, non-Markov, possibly non-stationary processes in terms of memory kernels, where you measure the response to white noise, and successive terms in the series are uncorrelated.

- After skimming: they presume the time-evolution of the observable is a time-dependent linear term, plus a time-dependent linear memory kernel, plus a residual, and then go through some fancy footwork to write the memory kernel in terms of an infinite series of functionals of the covariance function. But I'm much more curious why everything physically significant isn't shunted into the residual process... (They do not address the numerical stability of getting enough terms in their infinite series, let alone the sample complexity of actually doing it from simulated or real trajectories.)
to:NB  non-equilibrium  statistical_mechanics  stochastic_processes  have_read 
12 weeks ago by cshalizi
Academe’s Extinction Event: Failure, Whiskey, and Professional Collapse at the MLA - The Chronicle of Higher Education
As usual with this sort of writing, it's very hard to separate the author's idiosyncratic personal issues (I am myself very fond of a good sazerac, but hoo boy) from the actual evidence or analysis.
academia  via:civilstat  have_read  our_decrepit_institutions 
may 2019 by cshalizi
[1710.03296] Testing for Network and Spatial Autocorrelation
"Testing for dependence has been a well-established component of spatial statistical analyses for decades. In particular, several popular test statistics have desirable properties for testing for the presence of spatial autocorrelation in continuous variables. In this paper we propose two contributions to the literature on tests for autocorrelation. First, we propose a new test for autocorrelation in categorical variables. While some methods currently exist for assessing spatial autocorrelation in categorical variables, the most popular method is unwieldy, somewhat ad hoc, and fails to provide grounds for a single omnibus test. Second, we discuss the importance of testing for autocorrelation in network, rather than spatial, data, motivated by applications in social network data. We demonstrate that existing tests for autocorrelation in spatial data for continuous variables and our new test for categorical variables can both be used in the network setting."
heard_the_talk  have_read  ogburn.elizabeth  kith_and_kin  statistics  spatial_statistics  network_data_analysis  to_teach:baby-nets  to_teach:data_over_space_and_time  re:neutral_cultural_networks  in_NB 
april 2019 by cshalizi
Nonparametric Instrumental Regression
"The focus of the paper is the nonparametric estimation of an instrumental regression function P defined by conditional moment restrictions stemming from a structural econometric model: E[Y-P(Z)|W]=0 and involving endogenous variables Y and Z and instruments W. The function P is the solution of an ill-posed inverse problem and we propose an estimation procedure based on Tikhonov regularization. The paper analyses identification and overidentification of this model and presents asymptotic properties of the estimated nonparametric instrumental regression function."

--- Was this ever published? It definitely seems like the most elegant approach to nonparametric IVs I've seen (French econometricians!).
to:NB  have_read  regression  instrumental_variables  nonparametrics  inverse_problems  causal_inference  re:ADAfaEPoV  econometrics 
april 2019 by cshalizi
The Multivariate Poisson-Log Normal Distribution on JSTOR
"The statistical analysis of multivariate counts has proved difficult because of the lack of a parametric class of distributions supporting a rich enough correlation structure. With increasing availability of powerful computing facilities an obvious candidate for consideration is now the multivariate log normal mixture of independent Poisson distributions, the multivariate Poisson-log normal distribution. The properties of this discrete multivariate distribution are studied and its uses in a variety of applications to multivariate count data are illustrated."
to:NB  have_read  multivariate_distributions  statistics 
april 2019 by cshalizi
Adam Tooze · Is this the end of the American century?: America Pivots · LRB 4 April 2019
"As of today, two years into the Trump presidency, it is a gross exaggeration to talk of an end to the American world order. The two pillars of its global power – military and financial – are still firmly in place. What has ended is any claim on the part of American democracy to provide a political model. This is certainly a historic break. Trump closes the chapter begun by Woodrow Wilson in the First World War, with his claim that American democracy articulated the deepest feelings of liberal humanity. A hundred years later, Trump has for ever personified the sleaziness, cynicism and sheer stupidity that dominates much of American political life. What we are facing is a radical disjunction between the continuity of basic structures of power and their political legitimation.
"If America’s president mounted on a golf buggy is a suitably ludicrous emblem of our current moment, the danger is that it suggests far too pastoral a scenario: American power trundling to retirement across manicured lawns. That is not our reality. Imagine instead the president and his buggy careening around the five-acre flight deck of a $13 billion, Ford-class, nuclear-powered aircraft carrier engaged in ‘dynamic force deployment’ to the South China Sea. That better captures the surreal revival of great-power politics that hangs over the present. Whether this turns out to be a violent and futile rearguard action, or a new chapter in the age of American world power, remains to be seen."
tooze.adam  the_continuing_crises  us_politics  american_hegemony  have_read 
april 2019 by cshalizi
Rich club organization and intermodule communication in the cat connectome. - PubMed - NCBI
"Macroscopic brain networks have been shown to display several properties of an efficient communication architecture. In light of global communication, the formation of a densely connected neural "rich club" of hubs is of particular interest, because brain hubs have been suggested to play a key role in enabling short communication pathways within neural networks. Here, analyzing the cat connectome as reconstructed from tract tracing data (Scannell et al., 1995), we provide several lines of evidence of an important role of the structural rich club to interlink functional domains. First, rich club hub nodes were found to be mostly present at the boundaries between functional communities and well represented among intermodule hubs, displaying a diverse connectivity profile. Second, rich club connections, linking nodes of the rich club, and feeder connections, linking non-rich club nodes to rich club nodes, were found to comprise 86% of the intermodule connections, whereas local connections between peripheral nodes mostly spanned between nodes of the same functional community. Third, almost 90% of all intermodule communication paths were found to follow a sequence or "path motif" that involved rich club or feeder edges and thus traversed a rich club node. Together, our findings provide evidence of the structural rich club to form a central infrastructure for intermodule communication in the brain."
to:NB  have_read  neuroscience  functional_connectivity  network_data_analysis  re:friday_science_cat_blogging 
march 2019 by cshalizi
Do ImageNet Classifiers Generalize to ImageNet?
"We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% – 15% on CIFAR-10 and 11% – 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models’ inability to generalize to slightly “harder” images than those found in the original test sets."

--- The astonishing thing to me is the _linear_ relationship between accuracy on the old and new data-set versions. It's uncannily good. (Also: tiny changes in data-preparation make a big difference!)
to:NB  have_read  classifiers  neural_networks  data_sets  to_teach:data-mining 
february 2019 by cshalizi
How a Feel-Good AI Story Went Wrong in Flint - The Atlantic
Interesting (and depressing) in so many ways. (The least of which is grist for my "AI is really ML, and ML is really regression" mill.)
classifiers  data_mining  our_decrepit_institutions  infrastructure  public_policy  have_read  via:?  to_teach:data-mining  to_teach:data_over_space_and_time 
january 2019 by cshalizi
[1612.07545] A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search
"Approximate Nearest Neighbor Search (ANNS) is a fundamental problem in many areas of machine learning and data mining. During the past decade, numerous hashing algorithms are proposed to solve this problem. Every proposed algorithm claims outperform other state-of-the-art hashing methods. However, the evaluation of these hashing papers was not thorough enough, and those claims should be re-examined. The ultimate goal of an ANNS method is returning the most accurate answers (nearest neighbors) in the shortest time. If implemented correctly, almost all the hashing methods will have their performance improved as the code length increases. However, many existing hashing papers only report the performance with the code length shorter than 128. In this paper, we carefully revisit the problem of search with a hash index, and analyze the pros and cons of two popular hash index search procedures. Then we proposed a very simple but effective two level index structures and make a thorough comparison of eleven popular hashing algorithms. Surprisingly, the random-projection-based Locality Sensitive Hashing (LSH) is the best performed algorithm, which is in contradiction to the claims in all the other ten hashing papers. Despite the extreme simplicity of random-projection-based LSH, our results show that the capability of this algorithm has been far underestimated. For the sake of reproducibility, all the codes used in the paper are released on GitHub, which can be used as a testing platform for a fair comparison between various hashing algorithms."
to:NB  data_mining  approximation  nearest_neighbors  locality-sensitive_hashing  hashing  have_read  via:vaguery  random_projections  k-means  databases 
january 2019 by cshalizi
Diffusion by Continuous Movements - Taylor - 1922 - Proceedings of the London Mathematical Society - Wiley Online Library
Apparently (?) the original source for what I've been calling the "world's simplest ergodic theorem" (http://bactra.org/weblog/668.html), and the associated calculation of the correlation time. (This would explain why one of the places I learned it was Frisch's book on turbulence.)

--- Reference via Eshel's _Spatiotemporal Data Analysis_ (review forthcoming), though that mangled the bibliographic information.
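For the correlation-time calculation mentioned in the first note, here is a minimal sketch in Python, assuming the result in question is the standard variance formula for the time average of a weakly stationary sequence. The function name and the example autocovariances are illustrative assumptions, not taken from Taylor's paper. For n consecutive values with autocovariance γ(h), Var(mean) = (1/n) [γ(0) + 2 Σ_{h=1}^{n-1} (1 - h/n) γ(h)], which shrinks to zero as n grows whenever the correlations decay fast enough.

```python
def var_time_average(acov, n):
    """Variance of the mean of n consecutive values of a weakly stationary
    sequence whose autocovariance at lag h is acov(h)."""
    total = acov(0)
    for h in range(1, n):
        # Lag h occurs in (n - h) ordered pairs on each side of the diagonal.
        total += 2 * (1 - h / n) * acov(h)
    return total / n


if __name__ == "__main__":
    def white(h):
        # Uncorrelated noise with unit variance.
        return 1.0 if h == 0 else 0.0

    def geometric(h):
        # Geometrically decaying correlations (illustrative choice).
        return 0.5 ** abs(h)

    for n in (10, 100, 1000):
        print(n, var_time_average(white, n), var_time_average(geometric, n))
```

For uncorrelated noise this reduces to γ(0)/n; with correlations, the variance still falls like 1/n but is inflated by a factor set by the sum of the autocorrelations, which is where a correlation time enters.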
stochastic_processes  turbulence  ergodic_theory  probability  have_skimmed  taylor.g.i.  physics  re:almost_none  to_teach:data_over_space_and_time  in_NB  have_read 
december 2018 by cshalizi
Who Does Ross Douthat Think He Is? – Hmm Daily
Much good stuff here, of which I will just highlight two bits:

"Except Ross Douthat is not that kind of Catholic. He is a convert, whose ancestry runs right through the Protestant establishment, including his great-grandfather having been the governor of Connecticut. Calling himself a Catholic in the discussion of historic power and opportunity was a Rachel Dolezal–grade feat of impersonation. To the extent there is a story to be told about the decline of the cultural dominance of the Protestant ruling class, it would be the story of how Ross Douthat came to identify as Catholic, without ceding any power or influence along the way."

And:

"Douthat presents that version of things as a speculative alternative history:
'So it’s possible to imagine adaptation rather than surrender as a different WASP strategy across the 1960s and 1970s. In such a world the establishment would have still admitted more blacks, Jews, Catholics and Hispanics (and more women) to its ranks … but it would have done so as a self-consciously elite-crafting strategy, rather than under the pseudo-democratic auspices of the SAT and the high school resume and the dubious ideal of “merit.” '
"What is the difference between “a self-consciously elite-crafting strategy” and “the SAT and the high school resume and the dubious ideal of ‘merit'”? This is exactly what the Ivies did: they adapted their conception of the elite to include more different demographic groups, whose elite status was to be measured with tests and resumes."
have_read  us_politics  why_oh_why_cant_we_have_a_better_intelligentsia 
december 2018 by cshalizi
Solving Differential Equations in R: Package deSolve | Soetaert | Journal of Statistical Software
"In this paper we present the R package deSolve to solve initial value problems (IVP) written as ordinary differential equations (ODE), differential algebraic equations (DAE) of index 0 or 1 and partial differential equations (PDE), the latter solved using the method of lines approach. The differential equations can be represented in R code or as compiled code. In the latter case, R is used as a tool to trigger the integration and post-process the results, which facilitates model development and application, whilst the compiled code significantly increases simulation speed. The methods implemented are efficient, robust, and well documented public-domain Fortran routines. They include four integrators from the ODEPACK package (LSODE, LSODES, LSODA, LSODAR), DVODE and DASPK2.0. In addition, a suite of Runge-Kutta integrators and special-purpose solvers to efficiently integrate 1-, 2- and 3-dimensional partial differential equations are available. The routines solve both stiff and non-stiff systems, and include many options, e.g., to deal in an efficient way with the sparsity of the Jacobian matrix, or finding the root of equations. In this article, our objectives are threefold: (1) to demonstrate the potential of using R for dynamic modeling, (2) to highlight typical uses of the different methods implemented and (3) to compare the performance of models specified in R code and in compiled code for a number of test cases. These comparisons demonstrate that, if the use of loops is avoided, R code can efficiently integrate problems comprising several thousands of state variables. Nevertheless, the same problem may be solved from 2 to more than 50 times faster by using compiled code compared to an implementation using only R code. Still, amongst the benefits of R are a more flexible and interactive implementation, better readability of the code, and access to R’s high-level procedures. 
deSolve is the successor of package odesolve which will be deprecated in the future; it is free software and distributed under the GNU General Public License, as part of the R software project."
to:NB  dynamical_systems  computational_statistics  R  to_teach:data_over_space_and_time  have_read 
december 2018 by cshalizi
Demographic Models for Projecting Population and Migration: Methods for African Historical Analysis | Manning | Journal of World-Historical Information
"This study presents methods for projecting population and migration over time in cases where empirical data are missing or undependable. The methods are useful for cases in which the researcher has details of population size and structure for a limited period of time (most obviously, the end point), with scattered evidence on other times. It enables estimation of population size, including its structure in age, sex, and status, either forward or backward in time. The program keeps track of all the details. The calculated data can be reported or sampled and compared to empirical findings at various times and places to expected values based on other procedures of estimation.
"The application of these general methods that is developed here is the projection of African populations backwards in time from 1950, since 1950 is the first date for which consistently strong demographic estimates are available for national-level populations all over the African continent. The models give particular attention to migration through enslavement, which was highly important in Africa from 1650 to 1900. Details include a sensitivity analysis showing relative significance of input variables and techniques for calibrating various dimensions of the projection with each other. These same methods may be applicable to quite different historical situations, as long as the data conform in structure to those considered here."

--- The final for the Kids.
to:NB  have_read  demography  history  africa  imperialism  slavery  great_transformation  to_teach:data_over_space_and_time  simulation  manning.patrick 
december 2018 by cshalizi
[1808.04739] Simulating Markov random fields with a conclique-based Gibbs sampler
"For spatial and network data, we consider models formed from a Markov random field (MRF) structure and the specification of a conditional distribution for each observation. At issue, fast simulation from such MRF models is often an important consideration, particularly when repeated generation of large numbers of data sets is required (e.g., for approximating sampling distributions). However, a standard Gibbs strategy for simulating from MRF models involves single-updates, performed with the conditional distribution of each observation in a sequential manner, whereby a Gibbs iteration may become computationally involved even for relatively small samples. As an alternative, we describe a general way to simulate from MRF models using Gibbs sampling with "concliques" (i.e., groups of non-neighboring observations). Compared to standard Gibbs sampling, this simulation scheme can be much faster by reducing Gibbs steps and by independently updating all observations per conclique at once. We detail the simulation method, establish its validity, and assess its computational performance through numerical studies, where speed advantages are shown for several spatial and network examples."

--- Slides: http://andeekaplan.com/phd-thesis/slides/public.pdf
--- There's an R package on Github but I couldn't get it to compile...
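--- A minimal sketch of the conclique idea, not the authors' code or their R package. Assumptions of mine: a four-nearest-neighbor Ising model on a torus with even side lengths, so that the two checkerboard colors form concliques (no two sites of the same color are neighbors). The names neighbor_sum and conclique_gibbs_sweep are hypothetical. Each Gibbs iteration then takes two vectorized updates instead of one update per site.

```python
import numpy as np

def neighbor_sum(x):
    """Sum of the four nearest neighbors, with periodic (torus) boundaries."""
    return (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
            np.roll(x, 1, 1) + np.roll(x, -1, 1))

def conclique_gibbs_sweep(x, beta, rng):
    """One Gibbs iteration: update each checkerboard conclique in one step.

    Assumes an Ising model with P(x_i = +1 | neighbors) =
    1 / (1 + exp(-2 * beta * sum of neighbors)), and even grid dimensions.
    """
    rows, cols = np.indices(x.shape)
    for color in (0, 1):
        mask = (rows + cols) % 2 == color
        # Conditional probabilities depend only on the other conclique,
        # so every site of this color can be drawn independently at once.
        p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * neighbor_sum(x)))
        draws = np.where(rng.random(x.shape) < p_plus, 1, -1)
        x = np.where(mask, draws, x)
    return x

rng = np.random.default_rng(0)
x = rng.choice([-1, 1], size=(8, 8))
for _ in range(50):
    x = conclique_gibbs_sweep(x, beta=0.3, rng=rng)
```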
in_NB  random_fields  simulation  stochastic_processes  spatial_statistics  network_data_analysis  markov_models  statistics  computational_statistics  to_teach:data_over_space_and_time  have_read 
december 2018 by cshalizi
5601 Notes: The Sandwich Estimator
I believe the subscript n inside the sums defining V_n and J_n should be i. Otherwise, this is terrific (unsurprisingly).
to:NB  to_teach  have_read  statistics  estimation  fisher_information  misspecification  geyer.charles 
october 2018 by cshalizi
Quantile Regression
"Quantile regression, as introduced by Koenker and Bassett (1978), may be viewed as an extension of classical least squares estimation of conditional mean models to the estimation of an ensemble of models for several conditional quantile functions. The central special case is the median regression estimator which minimizes a sum of absolute errors. Other conditional quantile functions are estimated by minimizing an asymmetrically weighted sum of absolute errors. Quantile regression methods are illustrated with applications to models for CEO pay, food expenditure, and infant birthweight."
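
--- A minimal sketch of the "asymmetrically weighted sum of absolute errors" in the simplest, intercept-only case, where minimizing it gives a sample quantile. The function names and the toy data are mine, not Koenker's. Residuals at or above zero get weight tau and negative residuals get weight 1 - tau, so tau = 0.5 is half the plain sum of absolute errors.

```python
import numpy as np

def check_loss(residuals, tau):
    """Asymmetric absolute error: tau * r if r >= 0, (tau - 1) * r if r < 0."""
    r = np.asarray(residuals, dtype=float)
    return np.sum(np.where(r >= 0, tau * r, (tau - 1) * r))

def sample_quantile_by_minimization(y, tau):
    """Return a data point that minimizes the check loss over constants.

    The loss is piecewise linear and convex in the constant, so a minimum
    is always attained at one of the observations.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau) for c in y]
    return y[int(np.argmin(losses))]

y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
median = sample_quantile_by_minimization(y, 0.5)   # not pulled toward 100
upper = sample_quantile_by_minimization(y, 0.9)
```

Replacing the constant with a linear function of covariates, and minimizing over its coefficients, gives the regression version; that step needs a linear-programming or similar solver and is left out here.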
to:NB  have_read  regression  statistics  econometrics 
october 2018 by cshalizi
Lognormal-de Wijsian Geostatistics for Ore Evaluation
Krige on kriging. I have to admit I hadn't fully realized that the historical context was "keep South Africa going"...
in_NB  have_read  spatial_statistics  prediction  statistics  geology  to_teach:data_over_space_and_time 
september 2018 by cshalizi
[math/0506080] Two new Markov order estimators
"We present two new methods for estimating the order (memory depth) of a finite alphabet Markov chain from observation of a sample path. One method is based on entropy estimation via recurrence times of patterns, and the other relies on a comparison of empirical conditional probabilities. The key to both methods is a qualitative change that occurs when a parameter (a candidate for the order) passes the true order. We also present extensions to order estimation for Markov random fields."
in_NB  markov_models  statistical_inference_for_stochastic_processes  model_selection  recurrence_times  entropy_estimation  information_theory  stochastic_processes  have_read  have_talked_about  random_fields 
september 2018 by cshalizi
A personal essay on Bayes factors
I would have said nobody blogs like this anymore, and I am very happy to be very wrong.
have_read  model_selection  bayesianism  statistics  psychology  social_science_methodology  via:tslumley 
september 2018 by cshalizi
Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning
"Randomized neural networks are immortalized in this AI Koan: In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. 'What are you doing?' asked Minsky. 'I am training a randomly wired neural net to play tic-tac-toe,' Sussman replied. 'Why is the net wired randomly?' asked Minsky. Sussman replied, 'I do not want it to have any preconceptions of how to play.' Minsky then shut his eyes. 'Why do you close your eyes?' Sussman asked his teacher. 'So that the room will be empty,' replied Minsky. At that moment, Sussman was enlightened. We analyze shallow random networks with the help of concentration of measure inequalities. Specifically, we consider architectures that compute a weighted sum of their inputs after passing them through a bank of arbitrary randomized nonlinearities. We identify conditions under which these networks exhibit good classification performance, and bound their test error in terms of the size of the dataset and the number of random nonlinearities."

--- Have I never bookmarked this before?
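--- A minimal sketch of "a weighted sum of inputs passed through a bank of randomized nonlinearities". The choices here are my assumptions, not the paper's setup: random cosine features with Gaussian frequencies and uniform phases, a toy regression target rather than classification, and ordinary least squares for the output weights. Only the output weights are fit; the nonlinearities are drawn once and never trained.

```python
import numpy as np

def random_features(X, W, b):
    """Bank of random nonlinearities: cos(x . w_k + b_k) for each column k."""
    return np.cos(X @ W + b)

rng = np.random.default_rng(0)

# Toy data (hypothetical): recover sin(x) on [-3, 3] from 200 points.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])

# Draw the random nonlinearities once; they are not optimized afterwards.
n_features = 100
W = rng.normal(size=(1, n_features))
b = rng.uniform(0, 2 * np.pi, size=n_features)

# Fit only the weights of the weighted sum, by least squares.
Phi = random_features(X, W, b)
alpha, *_ = np.linalg.lstsq(Phi, y, rcond=None)
train_mse = np.mean((Phi @ alpha - y) ** 2)
```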
in_NB  approximation  kernel_methods  random_projections  statistics  prediction  classifiers  rahimi.ali  recht.benjamin  machine_learning  have_read 
september 2018 by cshalizi
[1205.4591] Forecastable Component Analysis (ForeCA)
"I introduce Forecastable Component Analysis (ForeCA), a novel dimension reduction technique for temporally dependent signals. Based on a new forecastability measure, ForeCA finds an optimal transformation to separate a multivariate time series into a forecastable and an orthogonal white noise space. I present a converging algorithm with a fast eigenvector solution. Applications to financial and macro-economic time series show that ForeCA can successfully discover informative structure, which can be used for forecasting as well as classification. The R package ForeCA (this http URL) accompanies this work and is publicly available on CRAN."
to:NB  have_read  time_series  kith_and_kin  goerg.georg  prediction  statistics  to_teach:data_over_space_and_time 
september 2018 by cshalizi
Safe spaces, academic freedom, and the university as a complex association - Bleeding Heart Libertarians
This is great, but I am less convinced than Levy is that (at least some of) the demands aren't for making the _whole_ university into safe spaces for sub-associations.
academia  academic_freedom  freedom_of_expression  levy.jacob_t.  have_read  via:? 
september 2018 by cshalizi
Analysis of a complex of statistical variables into principal components.
"The problem is stated in detail, a method of analysis is derived and its geometrical meaning shown, methods of solution are illustrated and certain derivative problems are discussed. (To be concluded in October issue.)"

--- In which Harold Hotelling re-invents principal components analysis, 32 years after Karl Pearson. (Part 2: http://dx.doi.org/10.1037/h0070888)
to:NB  have_read  principal_components  data_analysis  hotelling.harold  re:ADAfaEPoV 
september 2018 by cshalizi
On lines and planes of closest fit to systems of points in space (K. Pearson, 1901)
In which Karl Pearson invents principal components analysis, with the entirely sensible objective of finding low-dimensional approximations to high-dimensional data. (i.e., basically the way I teach it!)
to:NB  principal_components  data_analysis  pearson.karl  re:ADAfaEPoV  have_read 
september 2018 by cshalizi
Parzen : On Estimation of a Probability Density Function and Mode
In which Parzen introduces kernel density estimation, three years after Rosenblatt introduced it _in the same journal_.
to:NB  statistics  density_estimation  have_read  parzen.emanuel  re:ADAfaEPoV 
september 2018 by cshalizi
Rosenblatt : Remarks on Some Nonparametric Estimates of a Density Function (1956)
"This note discusses some aspects of the estimation of the density function of a univariate probability distribution. All estimates of the density function satisfying relatively mild conditions are shown to be biased. The asymptotic mean square error of a particular class of estimates is evaluated."

--- In which Rosenblatt introduces kernel density estimation.
to:NB  statistics  density_estimation  have_read  rosenblatt.murray  re:ADAfaEPoV 
september 2018 by cshalizi
cultural cognition project - Cultural Cognition Blog - Return of the chick sexers . . .
"To put it in terms used to appraise scientific methods, we know the professional judgment of the chick sexer is not only reliable—consistently attuned to whatever it is that appropriately trained members of their craft are unconsciously discerning—but also valid: that is, we know that the thing the chick sexers are seeing (or measuring, if we want to think of them as measuring instruments of a special kind) is the thing we want to ascertain (or measure), viz., the gender of the chicks.
"In the production of lawyers, we have reliability only, without validity—or at least without validation.  We do successfully (remarkably!) train lawyers to make out the same patterns when they focus their gaze at the “mystifying cloud of words” that Cardozo identified the law as comprising. But we do nothing to assure that what they are discerning is the form of justice that the law is held forth as embodying.
"Observers fret—and scholars using empirical methods of questionable reliability and validity purport to demonstrate—that judges are mere “politicians in robes,” whose decisions reflect the happenstance of their partisan predilections.
"That anxiety that judges will disagree based on their “ideologies” bothers me not a bit.
"What does bother me—more than just a bit—is the prospect that the men and women I’m training to be lawyers and judges will, despite the diversity of their political and moral sensibilities, converge on outcomes that defy the basic liberal principles that we expect to animate our institutions.
"The only thing that I can hope will stop that from happening is for me to tell them that this is how it works.  Because if it troubles me, I have every reason to think that they, as reflective decent people committed to respecting the freedom & reason of others, will find some of this troubling too.
"Not so troubling that they can’t become good lawyers. 
"But maybe troubling enough that they won't stop being reflective moral people in their careers as lawyers; troubling enough so that if they find themselves in a position to do so, they will enrich the stock of virtuous-lawyer prototypes that populate our situation sense  by doing something  that they, as reflective, moral people—“conservative” or “liberal”—recognize is essential to reconciling being a “good lawyer” with being a member of a profession essential to the good of a liberal democratic regime."

--- Preach, preach! (But this is also one turn away from seeing the legal sensibility as itself ideological, in the service of particular social interests...)
have_read  cognition  expertise  cultural_transmission_of_cognitive_tools  tacit_knowledge  professions  ideology  moral_responsibility  kahan.dan  via:tsuomela 
september 2018 by cshalizi
[1808.00023] The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
"The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes---like race, gender, and their proxies---are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area."

--- ETA: This is a really good and convincing paper.
in_NB  prediction  algorithmic_fairness  goel.sharad  via:rvenkat  have_read  heard_the_talk 
august 2018 by cshalizi
If we already understood the brain, would we even know it? – [citation needed]
"What I’m suggesting is that, when we say things like “we don’t really understand the brain yet”, we’re not really expressing factual statements about the collective sum of neuroscience knowledge currently held by all human beings. What each of us really means is something more like there are questions I personally am able to pose about the brain that seem to make sense in my head, but that I don’t currently know the answer to–and I don’t think I could piece together the answer even if you handed me a library of books containing all of the knowledge we’ve accumulated about the brain."
have_read  complexity  emergence  explanation  neuroscience  yarkoni.tal 
august 2018 by cshalizi
A re-replication of a psychological classic provides a cautionary tale about overhyped science – Research Digest
Ummm. If the effects being studied are this fragile, why on Earth would we think they have real-world importance? Even very fragile, hard-to-elicit effects _can_ illuminate deep theoretical questions (I started out as a high-energy particle physicist!), but what are those questions here, exactly? I half-suspect the problem with social psychology (et al.) isn't bad social/experimental protocols, or bad statistics, but a failure to really theorize. Back to the blackboard!
track_down_references  have_read  replication  psychology  to:blog 
august 2018 by cshalizi
Local causal states and discrete coherent structures (Rupe and Crutchfield, 2018)
"Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rigorous analysis and empirical prediction, with progress being made only recently. As a step in this, we present a formal theory of coherent structures in fully discrete dynamical field theories. It builds on the notion of structure introduced by computational mechanics, generalizing it to a local spatiotemporal setting. The analysis’ main tool employs the local causal states, which are used to uncover a system’s hidden spatiotemporal symmetries and which identify coherent structures as spatially localized deviations from those symmetries. The approach is behavior-driven in the sense that it does not rely on directly analyzing spatiotemporal equations of motion, rather it considers only the spatiotemporal fields a system generates. As such, it offers an unsupervised approach to discover and describe coherent structures. We illustrate the approach by analyzing coherent structures generated by elementary cellular automata, comparing the results with an earlier, dynamic-invariant-set approach that decomposes fields into domains, particles, and particle interactions."

--- *ahem* *cough* https://arxiv.org/abs/nlin/0508001 *ahem*
to:NB  have_read  pattern_formation  complexity  prediction  stochastic_processes  spatio-temporal_statistics  cellular_automata  crutchfield.james_p.  modesty_forbids_further_comment 
august 2018 by cshalizi
Phys. Rev. A 38, 2066 (1988) - Thermally induced escape: The principle of minimum available noise energy
"The average time required for thermally induced escape from a basin of attraction increases exponentially with inverse temperature in proportion to exp(E_A/kT) in the limit of low temperature. A minimum principle states that the activation energy E_A is the minimum available noise energy required to execute a state-space trajectory which takes the system from the attractor of the noise-free system to the boundary of its basin of attraction and that the minimizing trajectory is the most probable low-temperature escape path. This principle is applied to the problem of thermally induced escape from two attractors of the dc-biased Josephson junction, the zero-voltage state and the voltage state, to determine activation energies and most probable escape paths. These two escape problems exemplify the classical case of escape from a potential well and the more general case of escape from an attractor of a nonequilibrium system. Monte Carlo simulations are used to verify the accuracy of the activation energies and most probable escape paths derived from the minimum principle."
in_NB  have_read  large_deviations  stochastic_processes  dynamical_systems  non-equilibrium  statistical_mechanics  re:do-institutions-evolve  re:almost_none 
august 2018 by cshalizi
Activation energy for thermally induced escape from a basin of attraction - ScienceDirect
"In the limit of low temperature the most probable path for escape from a basin of attraction is the path which minimizes the available thermal noise energy required for escape. This minimum energy is the activation energy of escape."
in_NB  have_read  large_deviations  non-equilibrium  statistical_mechanics  dynamical_systems  stochastic_processes  re:do-institutions-evolve  re:almost_none 
august 2018 by cshalizi
Neo-darwinian evolution implies punctuated equilibria | Nature [1985]
"The two central elements of neo-darwinian evolution are small random variations and natural selection. In Wright's view, these lead to random drift of mean population characters in a fixed, multiply peaked ‘adaptive landscape’, with long periods spent near fitness peaks. Using recent theoretical results [5], we show here that transitions between peaks are rapid and unidirectional even though (indeed, because) random variations are small and transitions initially require movement against selection. Thus, punctuated equilibrium, the palaeontological pattern of rapid transitions between morphological equilibria, is a natural manifestation of the standard wrightian evolutionary theory and requires no special developmental, genetic or ecological mechanisms."
in_NB  have_read  evolutionary_biology  large_deviations  stochastic_processes  re:do-institutions-evolve  evolution 
august 2018 by cshalizi

marginalism  marketing  markets_as_collective_calculating_devices  market_failures_in_everything  market_making  market_socialism  markov_models  mars  martin.john_levi  martingales  marx.karl  marxism  mary_sue  masculinity  mason.j.w.  mason.winter  mass_culture  matching  materialism  mathematics  matloff.norman  matthew_effect  maxwells_demon  may.robert_m.  maya_civilization  mcauley.paul  mccloskey.deirdre  mccullagh.peter  mcdonald.daniel  mcfowland.edward_iii  mcwhorter.john  mean-field_theory  measurement  measure_theory  mechanism_design  medici.cosimo_de  medicine  medieval_eurasian_history  meehl.paul  megatherium_americanum  meinshausen.nicolai  memoir  memory  memo_to_self_real_scholars_write  mental_testing  mercier.hugo  merhav.neri  meritocracy  merton.robert.k.  meta-analysis  metalworking  methodological_advice  methodology  method_of_types  meyn.sean  miami  milanovic.branko  millenarianism  mims.christopher  minimax  minimum_description_length  mis-specification_testing  misogyny  missing_data  missing_persons  misspecification  mixing  mixture_models  modeling  model_averaging  model_checking  model_discovery  model_selection  model_theory  modernism  modernity  modern_ruins  modesty_forbids_further_comment  modest_proposals  mohism  mohri.meryar  molecular_biology  money  monopolistic_competition  monopoly  monte_carlo  moocs  moore.cris  moore.cristopher  moral_hazard  moral_panics  moral_philosophy  moral_psychology  moral_responsibility  moretti.franco  morley.james  morphology_of_the_folktale_of_our_time  morris.martina  mortality  movies  moyn.samuel  mucha.peter  mueller.john  multiple_comparisons  multiple_testing  multivariate_distributions  music  nagata.linda  nagle.andrea  naidu.suresh  nanotechnology  narrative  narrative_communities  national_surveillance_state  native_american_civilizations  native_american_history  natural_born_cyborgs  natural_history_of_truthiness  natural_language_processing  neanderthals  nearest_neighbors  
neo-conservatism  neo-liberalism  neoliberalism  nepveu.kate  nerdworld  networked_life  networks  network_data_analysis  network_experiments  network_formation  network_sampling  network_visualization  neural_coding_and_decoding  neural_computation  neural_data_analysis  neural_networks  neurath.otto  neuropsychology  neuroscience  neville.jennifer  newman.mark  neyman.jerzy  nichols.thomas_e.  nichols.tom  nietzsche.friedrich  niezink.nynke  nigeria  nilsson_jacobi.martin  nisbett.richard  niyogi.partha  nobel.andrew  noel.hans  noethers_theorem  non-equilibrium  non-stationarity  nonparametrics  nordhaus.william  norman_cohn_died_for_your_sins  north.douglass  norton.john  norton.john_d.  norton.quinn  nostalgia  not_quite_scooped_exactly  novels  no_free_lunch_theorems  no_really_via:jbdelong  no_really_via:warrenellis  no_such_thing_as_false_positives  nudging  nugent.rebecca  nukes  nussbaum.martha  nyhan.brendan  o'connor.brendan  obama.barack  obesity  obituaries  obvious_to_one_skilled_in_the_art  occams_razor  oconnor.brendan  ogburn.elizabeth  ok_not_quite_scooped  oligopoly  one_effort_more_economists_if_you_would_be_scientists  one_mans_vicious_circle_is_another_mans_successive_approximation  online_learning  oops  operations_research  oppression  optics  optimization  oracle_inequalities  orbanz.peter  order_statistics  organizations  organized_crime_as_state-making  organized_crime_as_state_making  orientalism  ortberg.mallory  or_lack_thereof  our_decrepit_institutions  our_national_shame  out_of_their_depth  owen.art  oxytocin  p-values  padgett.john  page.scott  pagerank  paine.thomas  pakistan  paleontology  palmer.ada  panama_papers  parenting  particle_filters  parzen.emanuel  pasquale.frank  pasta.j  paternalism  path_dependence  pattern_formation  pattern_recognition  pearl.judea  pearson.karl  pedagogy  peer_production  peer_review  peirce.c.s.  pennsylvania  pensions  pentland.alex  perception  percolation  perry.patrick_o.  
phase_transitions  philosophy  philosophy_of_mind  philosophy_of_science  philosophy_of_social_sciience  photography  photos  physics  physics_of_information  pierson.paul  piketty.thomas  pillow.jonathan  pittsburgh  plagues_and_peoples  plato  please_dont_let_me_be_scooped  poetry  point_processes  polanyi.karl  polanyi.michael  poldrack.russell  police  political_economy  political_networks  political_organizing  political_philosophy  political_science  politics  pollard.david  popper.karl_r.  popular_culture  popular_science  popular_social_science  populism  porter.mitchell  post-soviet_politics  potscher.benedict_m.  poverty  power  pr0n  practices_relating_to_the_transmission_of_genetic_information  pragmatism  prediction  prediction_markets  preference  prejudice  presentation_of_self  pretty_pictures  priest.dana  principal_components  principle_of_indifference  principle_of_least_action  privacy  private_property  privatization  probability  productivity  professionalism  professions  programming  progressive_forces  psychiatry  psychoceramica  psychoceramics  psychology  psychometrics  psychotherapy  ptsd  public_health  public_opinion  public_policy  publishing  quantum_mechanics  quiggin.john  R  race  racine.jeffrey  racine.jeffrey_s.  
racism  racist_idiocy  raginsky.maxim  rahimi.ali  rakhlin.sasha  RAND  randomization  random_fields  random_forests  random_graphs  random_projections  random_walks  rants  rapture_for_nerds  rationality  rational_choice  rauchway.eric  ravikumar.pradeep  re:6dfb  re:actually-dr-internet-is-the-name-of-the-monsters-creator  re:ADAfaEPoV  re:almost_none  re:anti-nudge  re:any_p-value_distinguishable_from_zero_is_insufficiently_informative  re:AoS_project  re:bayes_as_evol  re:computational_lens  re:critique_of_diffusion  re:democratic_cognition  re:do-institutions-evolve  re:donor_networks  re:fitness_sampling  re:friday_science_cat_blogging  re:functional_communities  re:growing_ensemble_project  re:g_paper  re:homophily_and_confounding  re:hyperbolic_networks  re:in_soviet_union_optimization_problem_solves_you  re:knightian_uncertainty  re:learning_your_way_around_godel's_theorem  re:model_selection_for_networks  re:network_differences  re:neutral_cultural_networks  re:neutral_model_of_inquiry  re:pac-and-mar  re:phil-of-bayes_paper  re:reading_capital  re:simulating_coupled_markov_chains  re:small-area_estimation_by_smoothing  re:smoothing_adjacency_matrices  re:social_networks_as_sensor_networks  re:sporns_review  re:stacs  re:urban_scaling_what_urban_scaling  re:what_is_a_macrostate  re:what_is_the_right_null_model_for_linear_regression  re:XV_for_mixing  re:XV_for_networks  re:your_favorite_dsge_sucks  re:your_favorite_ergm_sucks  read_the_draft  real_analysis  reception_history  recht.benjamin  recommendation_systems  recurrence_times  recursive_estimation  reductionism  reed  reed.adolf  reefs  regression  regulation  reichenbach.hans  relational_learning  relativity  reliability_vs_validity  religion  renaissance_history  renormalization  renyi.alfred  renyi_entropy  replication  replication_crisis  reproducibility  reputation  reputation_systems  resampling  research_ethics  reutlinger.alexander  revealed_preferences  rhetoric  rhetorical_self-fashioning  
richardson.thomas  richardson.thomas_s.  riedewald.mirek  rigollet.philippe  rinaldo.alessandro  risk  risk_assessment  risk_vs_uncertainty  robert.christian  robin.corey  robins.james  robots_and_robotics  robustness  robust_statistics  rodrik.dani  roeder.kathryn  rogers.john  romance  roman_a_clef  roman_empire  romer.paul  rosenblatt.murray  rosenblueth.arturo  rossman.gabriel  rosvall.martin  roy.nilanjana  rubin.donald  rubin.donald_b.  rubin.jonathan  rubinstein.ariel  running_dogs_of_reaction  rural_decay  rural_decline  russell.bertrand  russia  sabloff.jeremy  said.edward  salakhutdinov.ruslan  salmon.wesley  samii.cyrus  sampling  sandwiches  sarkar.purnamrita  satire  saudi_arabia  savage.leonard_j.  scheines.richard  scholarly_misconstruction_of_reality  schulz.kathryn  schumpeter.joseph  schuster.peter  science  science_as_a_social_process  science_fiction  science_in_society  science_policy  science_studies  scientific_computing  scooped  scooped?  scott.james  secrecy  securitization  self-fulfilling_prophecy  self-promotion  semi-supervised_learning  sensitive_dependence_on_initial_conditions  series_of_footnotes  servants  service_industry  seti  sexism  sexist_idiocy  sfw  shakespeare.william  shalev-shwartz.shai  sharing_economy  shore.jesse_c.  shot_after_a_fair_trial  silly_priors  silverman.bernard  simon.herbert  simulation  single_vision_and_newtons_sleep  slavery  slee.tom  slime_molds  sloths  small-area_estimation  smith.adam  smith.eric  smith.w._eugene  smoothing  snijders.t.a.b.  snijders.tom  snow.c.p.  snowden.edward  sobel.michael_e.  
socialism  socialist_calculation_debate  social_cognition  social_construction  social_contagion  social_engineering  social_influence  social_life_of_the_mind  social_measurement  social_media  social_misconstruction_of_reality  social_movements  social_networks  social_psychology  social_science_methodology  social_theory  sociolinguistics  sociology  sociology_of_science  socrates  software  software_engineering  solidarity  solow.robert  something_about_america  sorokina.daria  space  space_exploration  spanos.aris  sparsity  spatial_statistics  spatio-temporal_statistics  spectral_clustering  spectral_methods  sperber.dan  spiders  spirits_of_places  spirtes.peter  splines  sponges  sprites.peter  st._louis  stability_of_learning  standardized_testing  stanley.h._eugene  stark.philip_b.  state-building  state-space_models  state_estimation  stationary_features  statistical_inference_for_stochastic_processes  statistical_interaction  statistical_mechanics  statistics  sterling.bruce  stigler.stephen  stiglitz.joseph  stochastic_differential_equations  stochastic_processes  stochastic_volatility  stories  strategic_ambiguity  strategic_position_in_networks  strauss.leo  structural_risk_minimization  student_evaluations  stupid_security  su.shi  sufficiency  summers.larry  sunstein.cass  superefficiency  superheroes  suresh.naidu  surveillance  surveys  swartz.aaron  symbiosis  symbolic_dynamics  symmetry  synchronizing_words  systems  tacit_knowledge  tang  taskar.ben  taxes  taxis  taylor.frederick  taylor.g.i.  
teaching  technocracy  technological_change  technological_unemployment  teleology  television  temin.peter  termen.lev_sergeyevich  terrorism  terrorism_fears  testosterone  tewari.ambuj  texas  textual_criticism  text_mining  theoretical_computer_science  theory_of_mind  theory_of_value  thermodynamics  the_american_dilemma  the_continuing_crises  the_corporation_as_command_economy  the_nightmare_from_which_we_are_trying_to_awake  the_present_before_it_was_widely_distributed  the_problem_is_not_the_p-values  the_public_and_its_problems  the_rapture_for_nerds  the_singularity_has_happened  the_violence_inherent_in_the_system  the_wired_ideology  the_work_of_art_in_the_age_of_mechanical_reproduction  thomas.a.c.  tibshirani.robert  tibshirani.ryan  tilly.charles  time_series  tkacik.maureen  to:blog  to:NB  todorova.sonia  tofias.michael  tolkien.j.r.r.  tooze.adam  topic_models  totalitarianism  touchette.hugo  towards_an_algorithmic_criticism  to_read  to_teach  to_teach:baby-nets  to_teach:complexity-and-inference  to_teach:data-mining  to_teach:data_over_space_and_time  to_teach:graphons  to_teach:linear_models  to_teach:statcomp  to_teach:undergrad-ADA  to_teach:undergrad-research  tracked_down_references  track_down_references  transaction_costs  transmission_of_inequality  travelers'_tales  tripathy.shreejoy  trolling  trolls  trolls_and_trolling  true_knowledge  trump.donald  trust  truth  tsallis_statistics  tsingou.mary  tufekci.zeynep  tukey.john_w.  
tuncel.selim  turbulence  turkey  tv_tropes  twilight_of_the_elites  twin_studies  two-sample_tests  two_cultures  typography  uber  ukraine  ulam.stanislaw  uncertainty  under_precisely_controlled_experimental_conditions_the_organism_does_what_it_damn_well_pleases  unions  universal_basic_income  university_of_chicago  urban.nathaniel  urbanism  urban_economics  user_interfaces  uses_of_the_past  ussr  us_civil_war  us_culture_wars  us_military  us_politics  utopia  utter_stupidity  value-added_measurement_in_education  vampires  vanderweele.tyler  van_der_vaart.aad  van_handel.ramon  van_roy.benjamin  variable_selection  variance_estimation  variational_inference  vast_right-wing_conspiracy  vc-dimension  venkatasubramanian.suresh  ventura.sam  ventura.valerie  version_control  ver_steeg.greg  via:?  via:???  via:absfac  via:abumuqawama  via:aeo  via:ariddell  via:arinaldo  via:arsyed  via:arthegall  via:auerbach  via:blyth  via:bob_williamson  via:brendano  via:civilstat  via:clay  via:coates.ta-nehisi  via:cris_moore  via:crooked_timber  via:deaneckles  via:ded-maxim  via:dena  via:djm1107  via:dsquared  via:eaterofsun  via:erindanielson  via:everyone  via:felix_gilman  via:fionajay  via:fred_feinberg  via:gelman  via:georg  via:gptp2016  via:guslacerda  via:henry_farrell  via:io9  via:iqss  via:jbdelong  via:jcgoodwin  via:kass  via:katenepveu  via:ken_macleod  via:kevin_drum  via:kjhealy  via:klk  via:krugman  via:languagelog  via:larry_wasserman  via:martens  via:mathbabe  via:matthew_berryman  via:mejn  via:mindhacks  via:minoli  via:moritz-heene  via:mraginsky  via:ogburn  via:orgtheory  via:perspectivelute  via:phnk  via:rortybomb  via:rvenkat  via:samii  via:shivak  via:simon_d.  
via:simply_statistics  via:slaniel  via:spirtes  via:tealtan  via:tozier  via:tslumley  via:tsuomela  via:unfogged  via:unlikely_worlds  via:vaguery  via:vqv  via:waggish  via:warrenellis  via:whimsley  via:wiggins  via:xmarquez  via:yorksranter  vidal.gore  video_games  vietnam_war  violence  viruses  visual_display_of_quantitative_information  voter_model  vovk.vladimir_g.  wade.nicholas  wager.stefan  wagner.andreas  wainwright.martin_j.  waiting_times  wald.abraham  waldman.steven_randy  war  wasserman.larry  was_on_the_committee  watkins.nicholas  watson.libby  watts.duncan  weak_dependence  weather_prediction  weaver.rhiannon  web  weintraub.jeff  welfare_economics  welfare_state  wells.h.g.  we_are_as_gods_and_might_as_well_get_good_at_it  we_can_look  whats_gone_wrong_with_america  white.halbert  why_corporations_are_messed_up  why_oh_why_cant_we_have_a_better_academic_publishing_system  why_oh_why_cant_we_have_a_better_intelligentsia  why_oh_why_cant_we_have_a_better_press_corps  wiener.norbert  wiesner.karoline  wilkins.jon_f.  wilks.s._s.  willett.rebecca_m.  williamson.robert_c.  wisconsin  wlezien.christopher  wolfe.gene  wolfe.patrick_j.  wolff.robert_paul  wolfowitz.j.  word2vec  world_bank  world_history  wright.erik_olin  writing  writing_advice  wtf?  wu.wei_biao  xing.eric  xkcd  yang.wesley  yarkoni.tal  yglesias.matthew  yoga  your_favorite_deep_neural_network_sucks  you_are_the_product_being_sold  yu.bin  yudkowsky.eliezer  zenker.sven  zhang.tong  ziliak.stephen  zilsel.edgar  zingales.luigi  zuckerman.ethan 
