cshalizi + prediction   326

Evaluating Probabilistic Forecasts with scoringRules | Jordan | Journal of Statistical Software
"Probabilistic forecasts in the form of probability distributions over future events have become popular in several fields including meteorology, hydrology, economics, and demography. In typical applications, many alternative statistical models and data sources can be used to produce probabilistic forecasts. Hence, evaluating and selecting among competing methods is an important task. The scoringRules package for R provides functionality for comparative evaluation of probabilistic models based on proper scoring rules, covering a wide range of situations in applied work. This paper discusses implementation and usage details, presents case studies from meteorology and economics, and points to the relevant background literature."
to:NB  prediction  statistics  to_teach:undergrad-ADA  to_teach:data-mining 
3 days ago by cshalizi
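(A note on what a proper scoring rule looks like in practice: the continuous ranked probability score, CRPS, has a well-known closed form when the forecast distribution is Gaussian. The Python sketch below is only illustrative. It is not the scoringRules R interface, and the function name `crps_gaussian` is made up for this note. Lower scores are better.)

```python
import math


def crps_gaussian(y: float, mu: float, sigma: float) -> float:
    """CRPS of a Normal(mu, sigma^2) forecast against the observed value y.

    Uses the standard closed form
        sigma * ( z * (2*Phi(z) - 1) + 2*phi(z) - 1/sqrt(pi) ),  z = (y - mu) / sigma,
    where phi and Phi are the standard normal density and CDF.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # phi(z)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # Phi(z)
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


if __name__ == "__main__":
    # Compare two hypothetical forecasters on the same observation.
    print(crps_gaussian(y=0.3, mu=0.0, sigma=1.0))
    print(crps_gaussian(y=0.3, mu=2.0, sigma=1.0))
```

Averaging the score over many forecast/observation pairs is the usual way to compare competing forecasters.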
[1908.07204] Forecasting observables with particle filters: Any filter will do!
"We investigate the impact of filter choice on forecast accuracy in state space models. The filters are used both to estimate the posterior distribution of the parameters, via a particle marginal Metropolis-Hastings (PMMH) algorithm, and to produce draws from the filtered distribution of the final state. Multiple filters are entertained, including two new data-driven methods. Simulation exercises are used to document the performance of each PMMH algorithm, in terms of computation time and the efficiency of the chain. We then produce the forecast distributions for the one-step-ahead value of the observed variable, using a fixed number of particles and Markov chain draws. Despite distinct differences in efficiency, the filters yield virtually identical forecasting accuracy, with this result holding under both correct and incorrect specification of the model. This invariance of forecast performance to the specification of the filter also characterizes an empirical analysis of S&P500 daily returns."
to:NB  time_series  particle_filters  prediction  state_estimation  state-space_models  to_teach:data_over_space_and_time  statistics 
3 days ago by cshalizi
The Incompatible Incentives of Private Sector AI by Tom Slee :: SSRN
"Algorithms that sort people into categories are plagued by incompatible incentives. While more accurate algorithms may address problems of statistical bias and unfairness, they cannot solve the ethical challenges that arise from incompatible incentives.
"Subjects of algorithmic decisions seek to optimize their outcomes, but such efforts may degrade the accuracy of the algorithm. To maintain their accuracy, algorithms must be accompanied by supplementary rules: “guardrails” that dictate the limits of acceptable behaviour by subjects. Algorithm owners are drawn into taking on the tasks of governance, managing and validating the behaviour of those who interact with their systems.
"The governance role offers temptations to indulge in regulatory arbitrage. If governance is left to algorithm owners, it may lead to arbitrary and restrictive controls on individual behaviour. The goal of algorithmic governance by automated decision systems, social media recommender systems, and rating systems is a mirage, retreating into the distance whenever we seem to approach it."
to:NB  mechanism_design  prediction  data_mining  slee.tom  to_read  to_teach:data-mining 
4 days ago by cshalizi
[1811.06407] Neural Predictive Belief Representations
"Unsupervised representation learning has succeeded with excellent results in many applications. It is an especially powerful tool to learn a good representation of environments with partial or noisy observations. In partially observable domains it is important for the representation to encode a belief state, a sufficient statistic of the observations seen so far. In this paper, we investigate whether it is possible to learn such a belief representation using modern neural architectures. Specifically, we focus on one-step frame prediction and two variants of contrastive predictive coding (CPC) as the objective functions to learn the representations. To evaluate these learned representations, we test how well they can predict various pieces of information about the underlying state of the environment, e.g., position of the agent in a 3D maze. We show that all three methods are able to learn belief representations of the environment, they encode not only the state information, but also its uncertainty, a crucial aspect of belief states. We also find that for CPC multi-step predictions and action-conditioning are critical for accurate belief representations in visually complex environments. The ability of neural representations to capture the belief information has the potential to spur new advances for learning and planning in partially observable domains, where leveraging uncertainty is essential for optimal decision making."
to:NB  prediction  predictive_representations  inference_to_latent_objects  neural_networks  to_read 
4 days ago by cshalizi
[1908.06729] Autoregressive-Model-Based Methods for Online Time Series Prediction with Missing Values: an Experimental Evaluation
"Time series prediction with missing values is an important problem of time series analysis since complete data is usually hard to obtain in many real-world applications. To model the generation of time series, the autoregressive (AR) model is a basic and widely used one, which assumes that each observation in the time series is a noisy linear combination of some previous observations along with a constant shift. To tackle the problem of prediction with missing values, a number of methods were proposed based on various data models. For real application scenarios, how these methods perform over different types of time series with different levels of data missing remains to be investigated. In this paper, we focus on online methods for AR-model-based time series prediction with missing values. We adapted five mainstream methods to fit in such a scenario. We make detailed discussion on each of them by introducing their core ideas about how to estimate the AR coefficients and their different strategies to deal with missing values. We also present algorithmic implementations for better understanding. In order to comprehensively evaluate these methods and do the comparison, we conduct experiments with various configurations of relative parameters over both synthetic and real data. From the experimental results, we derived several noteworthy conclusions and show that imputation is a simple but reliable strategy to handle missing values in online prediction tasks."
to:NB  time_series  prediction  missing_data  statistics  to_teach:data_over_space_and_time 
4 days ago by cshalizi
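(A toy version of the imputation strategy the abstract mentions, as a Python sketch: one-step-ahead AR prediction where a missing observation is replaced by the model's own prediction before moving on. This is not any of the paper's five methods. The AR coefficients are assumed to be given rather than estimated online, and the function name is hypothetical.)

```python
from typing import List, Optional, Sequence, Tuple


def ar_predict_with_imputation(
    series: Sequence[Optional[float]],
    coefs: Sequence[float],
    shift: float = 0.0,
) -> Tuple[List[Optional[float]], List[float]]:
    """One-step-ahead AR(p) predictions over a series with gaps (None = missing).

    coefs[0] multiplies the most recent value, coefs[1] the one before, etc.
    Returns (predictions, filled_series); predictions are None for the first
    p steps, where there is not yet enough history.
    """
    p = len(coefs)
    filled: List[float] = []
    preds: List[Optional[float]] = []
    for t, x in enumerate(series):
        pred: Optional[float] = None
        if t >= p:
            # Linear combination of the p most recent (possibly imputed) values.
            pred = shift + sum(c * filled[t - 1 - k] for k, c in enumerate(coefs))
        preds.append(pred)
        if x is None:
            if pred is None:
                raise ValueError("cannot impute before p values have been seen")
            filled.append(pred)  # impute the gap with the model's prediction
        else:
            filled.append(x)
    return preds, filled


if __name__ == "__main__":
    preds, filled = ar_predict_with_imputation([2.0, None, 4.0], coefs=[0.5], shift=1.0)
    print(preds, filled)
```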
[1908.06437] Block Nearest Neighboor Gaussian processes for large datasets
"This work develops a valid spatial block-Nearest Neighbor Gaussian process (block-NNGP) for estimation and prediction of location-referenced large spatial datasets. The key idea behind our approach is to subdivide the spatial domain into several blocks which are dependent under some constraints. The cross-blocks capture the large-scale spatial variation, while each block captures the small-scale dependence. The block-NNGP is embedded as a sparsity-inducing prior within a hierarchical modeling framework. Markov chain Monte Carlo (MCMC) algorithms are executed without storing or decomposing large matrices, while the sparse block precision matrix is efficiently computed through parallel computing. We also consider alternate MCMC algorithms through composite sampling for faster computing time, and more reproducible Bayesian inference. The performance of the block-NNGP is illustrated using simulation studies and applications with massive real data, for locations in the order of 10^4."
to:NB  spatial_statistics  prediction  computational_statistics  statistics  to_teach:data_over_space_and_time 
4 days ago by cshalizi
[1908.06936] ExaGeoStatR: A Package for Large-Scale Geostatistics in R
"Parallel computing in Gaussian process calculation becomes a necessity for avoiding computational and memory restrictions associated with Geostatistics applications. The evaluation of the Gaussian log-likelihood function requires O(n^2) storage and O(n^3) operations where n is the number of geographical locations. In this paper, we present ExaGeoStatR, a package for large-scale Geostatistics in R that supports parallel computation of the maximum likelihood function on shared memory, GPU, and distributed systems. The parallelization depends on breaking down the numerical linear algebra operations into a set of tasks and rendering them for a task-based programming model. ExaGeoStatR supports several maximum likelihood computation variants such as exact, Diagonal Super Tile (DST), and Tile Low-Rank (TLR) approximation besides providing a tool to generate large-scale synthetic datasets which can be used to test and compare different approximation methods. The package can be used directly through the R environment without any C, CUDA, or MPI knowledge. Here, we demonstrate the ExaGeoStatR package by illustrating its implementation details, analyzing its performance on various parallel architectures, and assessing its accuracy using both synthetic datasets and a sea surface temperature dataset. The performance evaluation involves spatial datasets with up to 250K observations."
to:NB  spatial_statistics  prediction  computational_statistics  R  statistics  to_teach:data_over_space_and_time 
4 days ago by cshalizi
[1611.04460] Predictive, finite-sample model choice for time series under stationarity and non-stationarity
"In statistical research there usually exists a choice between structurally simpler or more complex models. We argue that, even if a more complex, locally stationary time series model were true, then a simple, stationary time series model may be advantageous to work with under parameter uncertainty. We present a new model choice methodology, where one of two competing approaches is chosen based on its empirical, finite-sample performance with respect to prediction, in a manner that ensures interpretability. A rigorous, theoretical analysis of the procedure is provided. As an important side result we prove, for possibly diverging model order, that the localised Yule-Walker estimator is strongly, uniformly consistent under local stationarity. An R package, forecastSNSTS, is provided and used to apply the methodology to financial and meteorological data in empirical examples. We further provide an extensive simulation study and discuss when it is preferable to base forecasts on the more volatile time-varying estimates and when it is advantageous to forecast as if the data were from a stationary process, even though they might not be."
to:NB  time_series  prediction  model_selection  statistics  non-stationarity 
8 days ago by cshalizi
[1908.02718] A Characterization of Mean Squared Error for Estimator with Bagging
"Bagging can significantly improve the generalization performance of unstable machine learning algorithms such as trees or neural networks. Though bagging is now widely used in practice and many empirical studies have explored its behavior, we still know little about the theoretical properties of bagged predictions. In this paper, we theoretically investigate how the bagging method can reduce the Mean Squared Error (MSE) when applied on a statistical estimator. First, we prove that for any estimator, increasing the number of bagged estimators N in the average can only reduce the MSE. This intuitive result, observed empirically and discussed in the literature, has not yet been rigorously proved. Second, we focus on the standard estimator of variance called unbiased sample variance and we develop an exact analytical expression of the MSE for this estimator with bagging.
"This allows us to rigorously discuss the number of iterations N and the batch size m of the bagging method. From this expression, we state that only if the kurtosis of the distribution is greater than 3/2, the MSE of the variance estimator can be reduced with bagging. This result is important because it demonstrates that for distribution with low kurtosis, bagging can only deteriorate the performance of a statistical prediction. Finally, we propose a novel general-purpose algorithm to estimate with high precision the variance of a sample."
to:NB  ensemble_methods  prediction  regression  statistics 
16 days ago by cshalizi
[1908.02614] The power of dynamic social networks to predict individuals' mental health
"Precision medicine has received attention both in and outside the clinic. We focus on the latter, by exploiting the relationship between individuals' social interactions and their mental health to develop a predictive model of one's likelihood to be depressed or anxious from rich dynamic social network data. To our knowledge, we are the first to do this. Existing studies differ from our work in at least one aspect: they do not model social interaction data as a network; they do so but analyze static network data; they examine "correlation" between social networks and health but without developing a predictive model; or they study other individual traits but not mental health. In a systematic and comprehensive evaluation, we show that our predictive model that uses dynamic social network data is superior to its static network as well as non-network equivalents when run on the same data."
to:NB  social_networks  psychiatry  sociology  prediction  network_data_analysis  lizardo.omar  to_read 
16 days ago by cshalizi
Dimension reduction for the conditional mean and variance functions in time series - Park - Scandinavian Journal of Statistics - Wiley Online Library
"This paper deals with the nonparametric estimation of the mean and variance functions of univariate time series data. We propose a nonparametric dimension reduction technique for both mean and variance functions of time series. This method does not require any model specification and instead we seek directions in both the mean and variance functions such that the conditional distribution of the current observation given the vector of past observations is the same as that of the current observation given a few linear combinations of the past observations without loss of inferential information. The directions of the mean and variance functions are estimated by maximizing the Kullback‐Leibler distance function. The consistency of the proposed estimators is established. A computational procedure is introduced to detect lags of the conditional mean and variance functions in practice. Numerical examples and simulation studies are performed to illustrate and evaluate the performance of the proposed estimators."
to:NB  prediction  time_series  dimension_reduction  statistics  information_theory 
17 days ago by cshalizi
[1810.02909] On the Art and Science of Machine Learning Explanations
"This text discusses several popular explanatory methods that go beyond the error measurements and plots traditionally used to assess machine learning models. Some of the explanatory methods are accepted tools of the trade while others are rigorously derived and backed by long-standing theory. The methods, decision tree surrogate models, individual conditional expectation (ICE) plots, local interpretable model-agnostic explanations (LIME), partial dependence plots, and Shapley explanations, vary in terms of scope, fidelity, and suitable application domain. Along with descriptions of these methods, this text presents real-world usage recommendations supported by a use case and public, in-depth software examples for reproducibility."
to:NB  data_mining  prediction  explanation  to_teach:data-mining 
19 days ago by cshalizi
[1907.08742] Estimating the Algorithmic Variance of Randomized Ensembles via the Bootstrap
"Although the methods of bagging and random forests are some of the most widely used prediction methods, relatively little is known about their algorithmic convergence. In particular, there are not many theoretical guarantees for deciding when an ensemble is "large enough" --- so that its accuracy is close to that of an ideal infinite ensemble. Due to the fact that bagging and random forests are randomized algorithms, the choice of ensemble size is closely related to the notion of "algorithmic variance" (i.e. the variance of prediction error due only to the training algorithm). In the present work, we propose a bootstrap method to estimate this variance for bagging, random forests, and related methods in the context of classification. To be specific, suppose the training dataset is fixed, and let the random variable Err_t denote the prediction error of a randomized ensemble of size t. Working under a "first-order model" for randomized ensembles, we prove that the centered law of Err_t can be consistently approximated via the proposed method as t→∞. Meanwhile, the computational cost of the method is quite modest, by virtue of an extrapolation technique. As a consequence, the method offers a practical guideline for deciding when the algorithmic fluctuations of Err_t are negligible."
to:NB  ensemble_methods  computational_statistics  statistics  prediction  to_teach:data-mining 
4 weeks ago by cshalizi
[1907.09013] Conscientious Classification: A Data Scientist's Guide to Discrimination-Aware Classification
"Recent research has helped to cultivate growing awareness that machine learning systems fueled by big data can create or exacerbate troubling disparities in society. Much of this research comes from outside of the practicing data science community, leaving its members with little concrete guidance to proactively address these concerns. This article introduces issues of discrimination to the data science community on its own terms. In it, we tour the familiar data mining process while providing a taxonomy of common practices that have the potential to produce unintended discrimination. We also survey how discrimination is commonly measured, and suggest how familiar development processes can be augmented to mitigate systems' discriminatory potential. We advocate that data scientists should be intentional about modeling and reducing discriminatory outcomes. Without doing so, their efforts will result in perpetuating any systemic discrimination that may exist, but under a misleading veil of data-driven objectivity."
to:NB  classifiers  algorithmic_fairness  prediction  to_teach:data-mining  o'neil.cathy 
4 weeks ago by cshalizi
[1907.08679] Recommender Systems with Heterogeneous Side Information
"In modern recommender systems, both users and items are associated with rich side information, which can help understand users and items. Such information is typically heterogeneous and can be roughly categorized into flat and hierarchical side information. While side information has been proved to be valuable, the majority of existing systems have exploited either only flat side information or only hierarchical side information due to the challenges brought by the heterogeneity. In this paper, we investigate the problem of exploiting heterogeneous side information for recommendations. Specifically, we propose a novel framework that jointly captures flat and hierarchical side information with mathematical coherence. We demonstrate the effectiveness of the proposed framework via extensive experiments on various real-world datasets. Empirical results show that our approach is able to lead to a significant performance gain over the state-of-the-art methods."
to:NB  recommender_systems  prediction  to_teach:data-mining 
4 weeks ago by cshalizi
[1907.01552] Forecasting high-dimensional dynamics exploiting suboptimal embeddings
"Delay embedding---a method for reconstructing dynamical systems by delay coordinates---is widely used to forecast nonlinear time series as a model-free approach. When multivariate time series are observed, several existing frameworks can be applied to yield a single forecast combining multiple forecasts derived from various embeddings. However, the performance of these frameworks is not always satisfactory because they randomly select embeddings or use brute force and do not consider the diversity of the embeddings to combine. Herein, we develop a forecasting framework that overcomes these existing problems. The framework exploits various "suboptimal embeddings" obtained by minimizing the in-sample error via combinatorial optimization. The framework achieves the best results among existing frameworks for sample toy datasets and a real-world flood dataset. We show that the framework is applicable to a wide range of data lengths and dimensions. Therefore, the framework can be applied to various fields such as neuroscience, ecology, finance, fluid dynamics, weather, and disaster prevention."
to:NB  dynamical_systems  time_series  prediction  geometry_from_a_time_series 
5 weeks ago by cshalizi
[1906.08832] A Flexible Pipeline for Prediction of Tropical Cyclone Paths
"Hurricanes and, more generally, tropical cyclones (TCs) are rare, complex natural phenomena of both scientific and public interest. The importance of understanding TCs in a changing climate has increased as recent TCs have had devastating impacts on human lives and communities. Moreover, good prediction and understanding about the complex nature of TCs can mitigate some of these human and property losses. Though TCs have been studied from many different angles, more work is needed from a statistical approach of providing prediction regions. The current state-of-the-art in TC prediction bands comes from the National Hurricane Center of the National Oceanic and Atmospheric Administration (NOAA), whose proprietary model provides "cones of uncertainty" for TCs through an analysis of historical forecast errors.
"The contribution of this paper is twofold. We introduce a new pipeline that encourages transparent and adaptable prediction band development by streamlining cyclone track simulation and prediction band generation. We also provide updates to existing models and novel statistical methodologies in both areas of the pipeline, respectively."
to:NB  cyclones  prediction  statistics  spatio-temporal_statistics  schafer.chad  dunn.robin  kith_and_kin  to_teach:data_over_space_and_time 
8 weeks ago by cshalizi
[1906.05473] Selective prediction-set models with coverage guarantees
"Though black-box predictors are state-of-the-art for many complex tasks, they often fail to properly quantify predictive uncertainty and may provide inappropriate predictions for unfamiliar data. Instead, we can learn more reliable models by letting them either output a prediction set or abstain when the uncertainty is high. We propose training these selective prediction-set models using an uncertainty-aware loss minimization framework, which unifies ideas from decision theory and robust maximum likelihood. Moreover, since black-box methods are not guaranteed to output well-calibrated prediction sets, we show how to calculate point estimates and confidence intervals for the true coverage of any selective prediction-set model, as well as a uniform mixture of K set models obtained from K-fold sample-splitting. When applied to predicting in-hospital mortality and length-of-stay for ICU patients, our model outperforms existing approaches on both in-sample and out-of-sample age groups, and our recalibration method provides accurate inference for prediction set coverage."
to:NB  prediction  statistics 
10 weeks ago by cshalizi
[1906.04711] ProPublica's COMPAS Data Revisited
"In this paper I re-examine the COMPAS recidivism score and criminal history data collected by ProPublica in 2016, which has fueled intense debate and research in the nascent field of `algorithmic fairness' or `fair machine learning' over the past three years. ProPublica's COMPAS data is used in an ever-increasing number of studies to test various definitions and methodologies of algorithmic fairness. This paper takes a closer look at the actual datasets put together by ProPublica. In particular, I examine the distribution of defendants across COMPAS screening dates and find that ProPublica made an important data processing mistake when it created some of the key datasets most often used by other researchers. Specifically, the datasets built to study the likelihood of recidivism within two years of the original COMPAS screening date. As I show in this paper, ProPublica made a mistake implementing the two-year sample cutoff rule for recidivists in such datasets (whereas it implemented an appropriate two-year sample cutoff rule for non-recidivists). As a result, ProPublica incorrectly kept a disproportionate share of recidivists. This data processing mistake leads to biased two-year recidivism datasets, with artificially high recidivism rates. This also affects the positive and negative predictive values. On the other hand, this data processing mistake does not impact some of the key statistical measures highlighted by ProPublica and other researchers, such as the false positive and false negative rates, nor the overall accuracy."
to:NB  data_sets  crime  prediction  to_teach:data-mining 
10 weeks ago by cshalizi
[1905.12262] Flexible Mining of Prefix Sequences from Time-Series Traces
"Mining temporal assertions from time-series data using information theory to filter real properties from incidental ones is a practically significant challenge. The problem is complex for continuous or hybrid systems because the degrees of influence on a consequent from a timed-sequence of predicates (called its prefix sequence), varies continuously over dense time intervals. We propose a parameterized method that uses interval arithmetic for flexibly learning prefix sequences having influence on a defined consequent over various time scales and predicates over system variables."
to:NB  prediction  time_series  variable-length_markov_chains  re:AoS_project 
12 weeks ago by cshalizi
[1905.11744] Evaluating time series forecasting models: An empirical study on performance estimation methods
"Performance estimation aims at estimating the loss that a predictive model will incur on unseen data. These procedures are part of the pipeline in every machine learning project and are used for assessing the overall generalisation ability of predictive models. In this paper we address the application of these methods to time series forecasting tasks. For independent and identically distributed data the most common approach is cross-validation. However, the dependency among observations in time series raises some caveats about the most appropriate way to estimate performance in this type of data and currently there is no settled way to do so. We compare different variants of cross-validation and of out-of-sample approaches using two case studies: One with 62 real-world time series and another with three synthetic time series. Results show noticeable differences in the performance estimation methods in the two scenarios. In particular, empirical experiments suggest that cross-validation approaches can be applied to stationary time series. However, in real-world scenarios, when different sources of non-stationary variation are at play, the most accurate estimates are produced by out-of-sample methods that preserve the temporal order of observations."
to:NB  time_series  prediction  cross-validation  model_selection  re:XV_for_mixing  statistics 
12 weeks ago by cshalizi
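(A minimal Python sketch of an out-of-sample scheme that preserves temporal order, in the spirit of the abstract's conclusion: an expanding-window, rolling-origin split in which every test block comes strictly after its training data. The function name and the `initial` and `horizon` parameters are assumptions of this note, not taken from the paper.)

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    n: int, initial: int, horizon: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for a series of length n.

    Training always uses observations 0..origin-1; testing uses the next
    `horizon` observations, so no future data leaks into the fit.
    """
    if initial < 1 or horizon < 1:
        raise ValueError("initial and horizon must be at least 1")
    origin = initial
    while origin + horizon <= n:
        yield list(range(origin)), list(range(origin, origin + horizon))
        origin += horizon  # advance the forecast origin past the tested block


if __name__ == "__main__":
    for train, test in rolling_origin_splits(n=6, initial=3, horizon=1):
        print(train, test)
```

Each split would be used to refit the model on `train` and score it on `test`, with the losses averaged across splits.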
[1905.10634] Adaptive, Distribution-Free Prediction Intervals for Deep Neural Networks
"This paper addresses the problem of assessing the variability of predictions from deep neural networks. There is a growing literature on using and improving the predictive accuracy of deep networks, but a concomitant improvement in the quantification of their uncertainty is lacking. We provide a prediction interval network (PI-Network) which is a transparent, tractable modification of the standard predictive loss used to train deep networks. The PI-Network outputs three values instead of a single point estimate and optimizes a loss function inspired by quantile regression. We go beyond merely motivating the construction of these networks and provide two prediction interval methods with provable, finite sample coverage guarantees without any assumptions on the underlying distribution from which our data is drawn. We only require that the observations are independent and identically distributed. Furthermore, our intervals adapt to heteroskedasticity and asymmetry in the conditional distribution of the response given the covariates. The first method leverages the conformal inference framework and provides average coverage. The second method provides a new, stronger guarantee by conditioning on the observed data. Lastly, our loss function does not compromise the predictive accuracy of the network like other prediction interval methods. We demonstrate the ease of use of the PI-Network as well as its improvements over other methods on both simulated and real data. As the PI-Network can be used with a host of deep learning methods with only minor modifications, its use should become standard practice, much like reporting standard errors along with mean estimates."
to:NB  prediction  confidence_sets  neural_networks  regression  leeb.hannes  statistics 
12 weeks ago by cshalizi
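(For orientation on the conformal inference framework the abstract refers to, here is a generic split-conformal interval in Python. It is not the PI-Network itself: it assumes absolute residuals from a held-out calibration set as the conformity score, wraps a symmetric band around any point prediction, and all names are hypothetical. Under exchangeability the band has marginal coverage of at least 1 - alpha.)

```python
import math
from typing import Sequence, Tuple


def split_conformal_interval(
    calibration_residuals: Sequence[float],
    prediction: float,
    alpha: float = 0.1,
) -> Tuple[float, float]:
    """Symmetric prediction interval around `prediction`.

    calibration_residuals are (observed - predicted) on held-out calibration
    data that was not used to fit the model.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one calibration residual")
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only a trivial interval is valid.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (prediction - q, prediction + q)


if __name__ == "__main__":
    print(split_conformal_interval([1.0, -2.0, 3.0, -4.0], prediction=10.0, alpha=0.5))
```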
Keeping Score: Predictive Analytics in Policing | Annual Review of Criminology
"Predictive analytics in policing is a data-driven approach to (a) characterizing crime patterns across time and space and (b) leveraging this knowledge for the prevention of crime and disorder. This article outlines the current state of the field, providing a review of forecasting tools that have been successfully applied by police to the task of crime prediction. We then discuss options for structured design and evaluation of a predictive policing program so that the benefits of proactive intervention efforts are maximized given fixed resource constraints. We highlight examples of predictive policing programs that have been implemented and evaluated by police agencies in the field. Finally, we discuss ethical issues related to predictive analytics in policing and suggest approaches for minimizing potential harm to vulnerable communities while providing an equitable distribution of the benefits of crime prevention across populations within police jurisdiction."
to:NB  police  crime  prediction  data_mining  to_teach:data-mining 
may 2019 by cshalizi
Mapping Sea-Level Change in Time, Space, and Probability | Annual Review of Environment and Resources
"Future sea-level rise generates hazards for coastal populations, economies, infrastructure, and ecosystems around the world. The projection of future sea-level rise relies on an accurate understanding of the mechanisms driving its complex spatio-temporal evolution, which must be founded on an understanding of its history. We review the current methodologies and data sources used to reconstruct the history of sea-level change over geological (Pliocene, Last Interglacial, and Holocene) and instrumental (tide-gauge and satellite altimetry) eras, and the tools used to project the future spatial and temporal evolution of sea level. We summarize the understanding of the future evolution of sea level over the near (through 2050), medium (2100), and long (post-2100) terms. Using case studies from Singapore and New Jersey, we illustrate the ways in which current methodologies and data sources can constrain future projections, and how accurate projections can motivate the development of new sea-level research questions across relevant timescales."

(Last tag unusually tentative)
to:NB  climate_change  prediction  oceanography  to_teach:data_over_space_and_time 
may 2019 by cshalizi
On the Statistical Formalism of Uncertainty Quantification | Annual Review of Statistics and Its Application
"The use of models to try to better understand reality is ubiquitous. Models have proven useful in testing our current understanding of reality; for instance, climate models of the 1980s were built for science discovery, to achieve a better understanding of the general dynamics of climate systems. Scientific insights often take the form of general qualitative predictions (i.e., “under these conditions, the Earth's poles will warm more than the rest of the planet”); such use of models differs from making quantitative forecasts of specific events (i.e. “high winds at noon tomorrow at London's Heathrow Airport”). It is sometimes hoped that, after sufficient model development, any model can be used to make quantitative forecasts for any target system. Even if that were the case, there would always be some uncertainty in the prediction. Uncertainty quantification aims to provide a framework within which that uncertainty can be discussed and, ideally, quantified, in a manner relevant to practitioners using the forecast system. A statistical formalism has developed that claims to be able to accurately assess the uncertainty in prediction. This article is a discussion of if and when this formalism can do so. The article arose from an ongoing discussion between the authors concerning this issue, the second author generally being considerably more skeptical concerning the utility of the formalism in providing quantitative decision-relevant information."
to:NB  to_read  statistics  prediction  risk_vs_uncertainty  smith.leonard  berger.james  foundations_of_statistics 
may 2019 by cshalizi
On Prediction Properties of Kriging: Uniform Error Bounds and Robustness: Journal of the American Statistical Association: Vol 0, No 0
"Kriging based on Gaussian random fields is widely used in reconstructing unknown functions. The kriging method has pointwise predictive distributions which are computationally simple. However, in many applications one would like to predict for a range of untried points simultaneously. In this work, we obtain some error bounds for the simple and universal kriging predictor under the uniform metric. It works for a scattered set of input points in an arbitrary dimension, and also covers the case where the covariance function of the Gaussian process is misspecified. These results lead to a better understanding of the rate of convergence of kriging under the Gaussian or the Matérn correlation functions, the relationship between space-filling designs and kriging models, and the robustness of the Matérn correlation functions. Supplementary materials for this article are available online."

(The last tag is really more "look at this before I teach that course next time, to see if any of it is worth giving a pointer to for the more advanced students")
in_NB  kriging  spatial_statistics  prediction  smoothing  statistics  to_teach:data_over_space_and_time 
may 2019 by cshalizi
[1904.04765] Generic Variance Bounds on Estimation and Prediction Errors in Time Series Analysis: An Entropy Perspective
"In this paper, we obtain generic bounds on the variances of estimation and prediction errors in time series analysis via an information-theoretic approach. It is seen in general that the error bounds are determined by the conditional entropy of the data point to be estimated or predicted given the side information or past observations. Additionally, we discover that in order to achieve the prediction error bounds asymptotically, the necessary and sufficient condition is that the "innovation" is asymptotically white Gaussian. When restricted to Gaussian processes and 1-step prediction, our bounds are shown to reduce to the Kolmogorov-Szegö formula and Wiener-Masani formula known from linear prediction theory."
to:NB  information_theory  prediction  time_series  statistics  to_teach:data_over_space_and_time 
may 2019 by cshalizi
[1904.06019] Conformal Prediction Under Covariate Shift
"We extend conformal prediction methodology beyond the case of exchangeable data. In particular, we show that a weighted version of conformal prediction can be used to compute distribution-free prediction intervals for problems in which the test and training covariate distributions differ, but the likelihood ratio between these two distributions is known---or, in practice, can be estimated accurately with access to a large set of unlabeled data (test covariate points). Our weighted extension of conformal prediction also applies more generally, to settings in which the data satisfies a certain weighted notion of exchangeability. We discuss other potential applications of our new conformal methodology, including latent variable and missing data problems."
to:NB  to_read  statistics  prediction  conformal_prediction  ramdas.aaditya  tibshirani.ryan  kith_and_kin  covariate_shift 
may 2019 by cshalizi
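A minimal sketch of the weighted quantile step described in the conformal-prediction-under-covariate-shift abstract above, in its split-conformal form. The function names are hypothetical, the likelihood-ratio weights are assumed to be supplied by the caller (known or estimated elsewhere), and the scores are assumed to be absolute residuals from some already-fitted point predictor. It is an illustration of the idea, not the authors' code.

```python
import math

def weighted_conformal_quantile(scores, cal_weights, test_weight, alpha):
    """Level-(1 - alpha) quantile of the weighted score distribution.

    scores:      nonconformity scores on the calibration set, e.g. |y - f(x)|
    cal_weights: assumed likelihood ratios w(x_i) = dP_test(x_i) / dP_train(x_i)
    test_weight: the same ratio evaluated at the test covariate
    """
    total = sum(cal_weights) + test_weight
    cumulative = 0.0
    # Walk through the calibration scores in increasing order, accumulating
    # their normalized weights until the target coverage level is reached.
    for score, weight in sorted(zip(scores, cal_weights)):
        cumulative += weight / total
        if cumulative >= 1.0 - alpha:
            return score
    # The remaining mass sits on the test point, whose score is treated as +infinity.
    return math.inf

def prediction_interval(point_prediction, scores, cal_weights, test_weight, alpha=0.1):
    """Symmetric interval around a point prediction, using absolute-residual scores."""
    q = weighted_conformal_quantile(scores, cal_weights, test_weight, alpha)
    return point_prediction - q, point_prediction + q
```

With all weights equal this reduces to ordinary split conformal prediction; a test point whose weight dominates the calibration weights gets an infinite interval, which is the method's way of reporting that the calibration data say little about that region.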
[1903.01048] Early Detection of Influenza outbreaks in the United States
"Public health surveillance systems often fail to detect emerging infectious diseases, particularly in resource limited settings. By integrating relevant clinical and internet-source data, we can close critical gaps in coverage and accelerate outbreak detection. Here, we present a multivariate algorithm that uses freely available online data to provide early warning of emerging influenza epidemics in the US. We evaluated 240 candidate predictors and found that the most predictive combination does \textit{not} include surveillance or electronic health records data, but instead consists of eight Google search and Wikipedia pageview time series reflecting changing levels of interest in influenza-related topics. In cross validation on 2010-2016 data, this algorithm sounds alarms an average of 16.4 weeks prior to influenza activity reaching the Center for Disease Control and Prevention (CDC) threshold for declaring the start of the season. In an out-of-sample test on data from the rapidly-emerging fall wave of the 2009 H1N1 pandemic, it recognized the threat five weeks in advance of this surveillance threshold. Simpler algorithms, including fixed week-of-the-year triggers, lag the optimized alarms by only a few weeks when detecting seasonal influenza, but fail to provide early warning in the 2009 pandemic scenario. This demonstrates a robust method for designing next generation outbreak detection algorithms. By combining scan statistics with machine learning, it identifies tractable combinations of data sources (from among thousands of candidates) that can provide early warning of emerging infectious disease threats worldwide."
to:NB  statistics  prediction  epidemiology  meyers.lauren_ancel 
april 2019 by cshalizi
[1903.02131] A Prediction Tournament Paradox
"In a prediction tournament, contestants "forecast" by asserting a numerical probability for each of (say) 100 future real-world events. The scoring system is designed so that (regardless of the unknown true probabilities) more accurate forecasters will likely score better. This is true for one-on-one comparisons between contestants. But consider a realistic-size tournament with many contestants, with a range of accuracies. It may seem self-evident that the winner will likely be one of the most accurate forecasters. But, in the setting where the range extends to very accurate forecasters, simulations show this is mathematically false, within a somewhat plausible model. Even outside that setting the winner is less likely than intuition suggests to be one of the handful of best forecasters. Though implicit in recent technical papers, this paradox has apparently not been explicitly pointed out before, though is easily explained. It perhaps has implications for the ongoing IARPA-sponsored research programs involving forecasting."
to:NB  prediction  aldous.david 
april 2019 by cshalizi
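A small simulation one could use to explore the prediction-tournament question above. Everything about the model here is an assumption made for illustration (uniform true probabilities, Gaussian forecast noise clipped to [0, 1], Brier scoring, lowest total wins); it is not claimed to match the model in the paper, and the function names are hypothetical.

```python
import random

def run_tournament(noise_levels, n_events=100, rng=None):
    """Return the index of the contestant with the lowest total Brier score."""
    if rng is None:
        rng = random.Random()
    true_probs = [rng.random() for _ in range(n_events)]
    outcomes = [1 if rng.random() < p else 0 for p in true_probs]
    totals = []
    for sigma in noise_levels:
        total = 0.0
        for p, y in zip(true_probs, outcomes):
            # A contestant announces the true probability plus noise, clipped to [0, 1].
            forecast = min(1.0, max(0.0, p + rng.gauss(0.0, sigma)))
            total += (forecast - y) ** 2
        totals.append(total)
    return min(range(len(totals)), key=totals.__getitem__)

def win_frequencies(noise_levels, n_tournaments=200, seed=0):
    """Fraction of simulated tournaments won by each contestant."""
    rng = random.Random(seed)
    wins = [0] * len(noise_levels)
    for _ in range(n_tournaments):
        wins[run_tournament(noise_levels, rng=rng)] += 1
    return [w / n_tournaments for w in wins]
```

Calling `win_frequencies` with one small noise level and many larger ones gives a rough sense of how often the most accurate contestant actually finishes first under these assumptions.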
[1903.08125] Predictive Clustering
"We show how to convert any clustering into a prediction set. This has the effect of converting the clustering into a (possibly overlapping) union of spheres or ellipsoids. The tuning parameters can be chosen to minimize the size of the prediction set. When applied to k-means clustering, this method solves several problems: the method tells us how to choose k, how to merge clusters and how to replace the Voronoi partition with more natural shapes. We show that the same reasoning can be applied to other clustering methods."
to:NB  statistics  kith_and_kin  prediction  clustering  rinaldo.alessandro  wasserman.larry 
april 2019 by cshalizi
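One simple way to turn a clustering into a prediction set, in the spirit of the predictive-clustering abstract above: a split-conformal sketch using spheres only. It assumes the cluster centers were fitted on a separate part of the data from the calibration points, and the function names are hypothetical. The paper's actual construction (ellipsoids, tuning to minimize set size) may differ.

```python
import math

def nearest_center_distance(point, centers):
    """Euclidean distance from a point to the closest cluster center."""
    return min(math.dist(point, c) for c in centers)

def conformal_radius(calibration_points, centers, alpha=0.1):
    """Radius r such that the union of balls of radius r around the centers
    covers a new exchangeable point with probability at least 1 - alpha."""
    residuals = sorted(nearest_center_distance(p, centers) for p in calibration_points)
    k = math.ceil((len(residuals) + 1) * (1.0 - alpha))
    # With too few calibration points the only valid radius is infinite.
    return residuals[k - 1] if k <= len(residuals) else math.inf

def in_prediction_set(point, centers, radius):
    """Membership test for the union of balls."""
    return nearest_center_distance(point, centers) <= radius
```

Because the radius shrinks as the centers fit the data better, comparing the resulting set sizes across different numbers of clusters is one way to read the abstract's remark about choosing k.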
Polar Vortex 2019: Why Forecasts Are So Accurate Now - The Atlantic
Actually teaching this would mean learning a lot about the history & current state of weather forecasting...
prediction  meteorology  to_teach 
february 2019 by cshalizi
[1710.05013] A Case Study Competition Among Methods for Analyzing Large Spatial Data
"The Gaussian process is an indispensable tool for spatial data analysts. The onset of the "big data" era, however, has lead to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to the full Gaussian process that are more amenable to handling big spatial data have been proposed. These modern methods often exploit low rank structures and/or multi-core and multi-threaded computing environments to facilitate computation. This study provides, first, an introductory overview of several methods for analyzing large spatial data. Second, this study describes the results of a predictive competition among the described methods as implemented by different groups with strong expertise in the methodology. Specifically, each research group was provided with two training datasets (one simulated and one observed) along with a set of prediction locations. Each group then wrote their own implementation of their method to produce predictions at the given location and each which was subsequently run on a common computing environment. The methods were then compared in terms of various predictive diagnostics. Supplementary materials regarding implementation details of the methods and code are available for this article online."
in_NB  spatial_statistics  prediction  computational_statistics  statistics  to_teach:data_over_space_and_time 
october 2018 by cshalizi
Prediction Interval for Autoregressive Time Series via Oracally Efficient Estimation of Multi‐Step‐Ahead Innovation Distribution Function - Kong - 2018 - Journal of Time Series Analysis - Wiley Online Library
"A kernel distribution estimator (KDE) is proposed for multi‐step‐ahead prediction error distribution of autoregressive time series, based on prediction residuals. Under general assumptions, the KDE is proved to be oracally efficient as the infeasible KDE and the empirical cumulative distribution function (cdf) based on unobserved prediction errors. Quantile estimator is obtained from the oracally efficient KDE, and prediction interval for multi‐step‐ahead future observation is constructed using the estimated quantiles and shown to achieve asymptotically the nominal confidence levels. Simulation examples corroborate the asymptotic theory."
in_NB  prediction  time_series  statistics  kernel_estimators 
october 2018 by cshalizi
On the Sensitivity of Granger Causality to Errors‐In‐Variables, Linear Transformations and Subsampling - Anderson - - Journal of Time Series Analysis - Wiley Online Library
"This article studies the sensitivity of Granger causality to the addition of noise, the introduction of subsampling, and the application of causal invertible filters to weakly stationary processes. Using canonical spectral factors and Wold decompositions, we give general conditions under which additive noise or filtering distorts Granger‐causal properties by inducing (spurious) Granger causality, as well as conditions under which it does not. For the errors‐in‐variables case, we give a continuity result, which implies that: a ‘small’ noise‐to‐signal ratio entails ‘small’ distortions in Granger causality. On filtering, we give general necessary and sufficient conditions under which ‘spurious’ causal relations between (vector) time series are not induced by linear transformations of the variables involved. This also yields transformations (or filters) which can eliminate Granger causality from one vector to another one. In a number of cases, we clarify results in the existing literature, with a number of calculations streamlining some existing approaches."
to:NB  time_series  prediction  granger_causality  measurement 
september 2018 by cshalizi
Lognormal-de Wijsian Geostatistics for Ore Evaluation
Krige on kriging. I have to admit I hadn't fully realized that the historical context was "keep South Africa going"...
in_NB  have_read  spatial_statistics  prediction  statistics  geology  to_teach:data_over_space_and_time 
september 2018 by cshalizi
Hello World | W. W. Norton & Company
"If you were accused of a crime, who would you rather decide your sentence—a mathematically consistent algorithm incapable of empathy or a compassionate human judge prone to bias and error? What if you want to buy a driverless car and must choose between one programmed to save as many lives as possible and another that prioritizes the lives of its own passengers? And would you agree to share your family’s full medical history if you were told that it would help researchers find a cure for cancer?
"These are just some of the dilemmas that we are beginning to face as we approach the age of the algorithm, when it feels as if the machines reign supreme. Already, these lines of code are telling us what to watch, where to go, whom to date, and even whom to send to jail. But as we rely on algorithms to automate big, important decisions—in crime, justice, healthcare, transportation, and money—they raise questions about what we want our world to look like. What matters most: Helping doctors with diagnosis or preserving privacy? Protecting victims of crime or preventing innocent people being falsely accused?
"Hello World takes us on a tour through the good, the bad, and the downright ugly of the algorithms that surround us on a daily basis. Mathematician Hannah Fry reveals their inner workings, showing us how algorithms are written and implemented, and demonstrates the ways in which human bias can literally be written into the code. By weaving in relatable, real world stories with accessible explanations of the underlying mathematics that power algorithms, Hello World helps us to determine their power, expose their limitations, and examine whether they really are improvement on the human systems they replace."
to:NB  books:noted  data_mining  machine_learning  prediction 
september 2018 by cshalizi
Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning
"Randomized neural networks are immortalized in this AI Koan: In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. What are you doing?'' asked Minsky. I am training a randomly wired neural net to play tic-tac-toe,'' Sussman replied. Why is the net wired randomly?'' asked Minsky. Sussman replied, I do not want it to have any preconceptions of how to play.'' Minsky then shut his eyes. Why do you close your eyes?'' Sussman asked his teacher. So that the room will be empty,'' replied Minsky. At that moment, Sussman was enlightened. We analyze shallow random networks with the help of concentration of measure inequalities. Specifically, we consider architectures that compute a weighted sum of their inputs after passing them through a bank of arbitrary randomized nonlinearities. We identify conditions under which these networks exhibit good classification performance, and bound their test error in terms of the size of the dataset and the number of random nonlinearities."

--- Have I never bookmarked this before?
in_NB  approximation  kernel_methods  random_projections  statistics  prediction  classifiers  rahimi.ali  recht.benjamin  machine_learning  have_read 
september 2018 by cshalizi
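A toy version of the "weighted sum of random nonlinearities" architecture from the random kitchen sinks abstract above, for one-dimensional regression. The choices here are assumptions for illustration: random cosine features, a squared-error loss, and plain gradient descent on the output weights only. The function names are hypothetical, and the paper itself treats classification and a more general class of nonlinearities.

```python
import math
import random

def make_random_features(n_features, scale=1.0, seed=0):
    """Draw random frequencies and phases once; they are never trained."""
    rng = random.Random(seed)
    params = [(rng.gauss(0.0, scale), rng.uniform(0.0, 2.0 * math.pi))
              for _ in range(n_features)]

    def featurize(x):
        return [math.cos(omega * x + phase) for omega, phase in params]

    return featurize

def mean_squared_error(weights, features, ys):
    """Average squared error of the weighted sum of features."""
    return sum((sum(w * f for w, f in zip(weights, row)) - y) ** 2
               for row, y in zip(features, ys)) / len(ys)

def fit_output_weights(features, ys, n_steps=500):
    """Gradient descent on the output weights, starting from zero."""
    n, d = len(ys), len(features[0])
    # Each feature lies in [-1, 1], so a step of 1/d cannot overshoot.
    step = 1.0 / d
    weights = [0.0] * d
    for _ in range(n_steps):
        residuals = [sum(w * f for w, f in zip(weights, row)) - y
                     for row, y in zip(features, ys)]
        grad = [sum(r * row[j] for r, row in zip(residuals, features)) / n
                for j in range(d)]
        weights = [w - step * g for w, g in zip(weights, grad)]
    return weights
```

The point of the construction is that only the linear output layer is optimized, which is a convex problem; the randomness replaces the search over hidden-layer parameters.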
[1205.4591] Forecastable Component Analysis (ForeCA)
"I introduce Forecastable Component Analysis (ForeCA), a novel dimension reduction technique for temporally dependent signals. Based on a new forecastability measure, ForeCA finds an optimal transformation to separate a multivariate time series into a forecastable and an orthogonal white noise space. I present a converging algorithm with a fast eigenvector solution. Applications to financial and macro-economic time series show that ForeCA can successfully discover informative structure, which can be used for forecasting as well as classification. The R package ForeCA (this http URL) accompanies this work and is publicly available on CRAN."
to:NB  have_read  time_series  kith_and_kin  goerg.georg  prediction  statistics  to_teach:data_over_space_and_time 
september 2018 by cshalizi
[1808.00023] The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
"The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes---like race, gender, and their proxies---are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area."

--- ETA: This is a really good and convincing paper.
in_NB  prediction  algorithmic_fairness  goel.sharad  via:rvenkat  have_read  heard_the_talk 
august 2018 by cshalizi
Local causal states and discrete coherent structures (Rupe and Crutchfield, 2018)
"Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rigorous analysis and empirical prediction, with progress being made only recently. As a step in this, we present a formal theory of coherent structures in fully discrete dynamical field theories. It builds on the notion of structure introduced by computational mechanics, generalizing it to a local spatiotemporal setting. The analysis’ main tool employs the local causal states, which are used to uncover a system’s hidden spatiotemporal symmetries and which identify coherent structures as spatially localized deviations from those symmetries. The approach is behavior-driven in the sense that it does not rely on directly analyzing spatiotemporal equations of motion, rather it considers only the spatiotemporal fields a system generates. As such, it offers an unsupervised approach to discover and describe coherent structures. We illustrate the approach by analyzing coherent structures generated by elementary cellular automata, comparing the results with an earlier, dynamic-invariant-set approach that decomposes fields into domains, particles, and particle interactions."

--- *ahem* *cough* https://arxiv.org/abs/nlin/0508001 *ahem*
to:NB  have_read  pattern_formation  complexity  prediction  stochastic_processes  spatio-temporal_statistics  cellular_automata  crutchfield.james_p.  modesty_forbids_further_comment 
august 2018 by cshalizi
[1705.08105] FRK: An R Package for Spatial and Spatio-Temporal Prediction with Large Datasets
"FRK is an R software package for spatial/spatio-temporal modelling and prediction with large datasets. It facilitates optimal spatial prediction (kriging) on the most commonly used manifolds (in Euclidean space and on the surface of the sphere), for both spatial and spatio-temporal fields. It differs from many of the packages for spatial modelling and prediction by avoiding stationary and isotropic covariance and variogram models, instead constructing a spatial random effects (SRE) model on a fine-resolution discretised spatial domain. The discrete element is known as a basic areal unit (BAU), whose introduction in the software leads to several practical advantages. The software can be used to (i) integrate multiple observations with different supports with relative ease; (ii) obtain exact predictions at millions of prediction locations (without conditional simulation); and (iii) distinguish between measurement error and fine-scale variation at the resolution of the BAU, thereby allowing for reliable uncertainty quantification. The temporal component is included by adding another dimension. A key component of the SRE model is the specification of spatial or spatio-temporal basis functions; in the package, they can be generated automatically or by the user. The package also offers automatic BAU construction, an expectation-maximisation (EM) algorithm for parameter estimation, and functionality for prediction over any user-specified polygons or BAUs. Use of the package is illustrated on several spatial and spatio-temporal datasets, and its predictions and the model it implements are extensively compared to others commonly used for spatial prediction and modelling."
in_NB  to_read  R  heard_the_talk  prediction  spatial_statistics  spatio-temporal_statistics  to_teach:data_over_space_and_time 
august 2018 by cshalizi
Indirect inference through prediction
"By recasting indirect inference estimation as a prediction rather than a minimization and by using regularized regressions, we can bypass the three major problems of estimation: selecting the summary statistics, defining the distance function and minimizing it numerically. By substituting regression with classification we can extend this approach to model selection as well. We present three examples: a statistical fit, the parametrization of a simple RBC model and heuristics selection in a fishery agent-based model."
agent-based_models  prediction  statistics  estimation  indirect_inference  simulation  have_read  in_NB 
july 2018 by cshalizi
Material Signals: A Historical Sociology of High-Frequency Trading | American Journal of Sociology: Vol 123, No 6
"Drawing on interviews with 194 market participants (including 54 practitioners of high-frequency trading or HFT), this article first identifies the main classes of “signals” (patterns of data) that influence how HFT algorithms buy and sell shares and interact with each other. Second, it investigates historically the processes that have led to three of the most important categories of these signals, finding that they arise from three features of U.S. share trading that are the result of episodes of meso-level conflict. Third, the article demonstrates the contingency of these features by briefly comparing HFT in share trading to HFT in futures, Treasurys, and foreign exchange. The article thus argues that how HFT algorithms act and interact is a specific, contingent product not just of the current but also of the past interaction of people, organizations, algorithms, and machines."
to:NB  finance  sociology  prediction 
june 2018 by cshalizi
Bootstrap bias corrections for ensemble methods | SpringerLink
"This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods. Accounting for bias is an important obstacle in recent efforts to develop statistical inference for machine learning. We demonstrate empirically that the proposed bootstrap bias correction can lead to substantial improvements in both bias and predictive accuracy. In the context of ensembles of trees, we show that this correction can be approximated at only double the cost of training the original ensemble. Our method is shown to improve test set accuracy over random forests by up to 70% on example problems from the UCI repository."
to:NB  ensemble_methods  prediction  bootstrap  hooker.giles  statistics 
may 2018 by cshalizi
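A generic residual-bootstrap bias correction, sketched to make the idea in the bootstrap-bias-corrections abstract above concrete. This is the textbook recipe (refit on fitted values plus resampled, centered residuals, then subtract the estimated bias), not necessarily the paper's exact procedure, and it does not implement the "only double the cost" approximation the abstract mentions. `fit` stands for any hypothetical learner that returns a prediction function; `shrunken_mean_fit` is a deliberately biased toy learner.

```python
import random

def residual_bootstrap_bias_correction(xs, ys, fit, n_boot=200, seed=0):
    """Return bias-corrected fitted values at the training inputs."""
    rng = random.Random(seed)
    predict = fit(xs, ys)
    fitted = [predict(x) for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    mean_resid = sum(residuals) / len(residuals)
    centered = [r - mean_resid for r in residuals]  # assumption: center before resampling

    boot_sum = [0.0] * len(xs)
    for _ in range(n_boot):
        # Synthetic responses: original fit plus resampled residuals.
        ys_star = [f + rng.choice(centered) for f in fitted]
        predict_star = fit(xs, ys_star)
        for i, x in enumerate(xs):
            boot_sum[i] += predict_star(x)
    boot_mean = [s / n_boot for s in boot_sum]

    # Estimated bias is boot_mean - fitted, so the corrected fit is 2 * fitted - boot_mean.
    return [2.0 * f - b for f, b in zip(fitted, boot_mean)]

def shrunken_mean_fit(xs, ys):
    """Toy learner that predicts half the sample mean, so it is biased toward zero."""
    level = 0.5 * sum(ys) / len(ys)
    return lambda x: level
```

With constant responses of 2.0, the toy learner predicts 1.0 and the correction moves it to 1.5: part of the way back toward the truth, which is typical of a single round of bootstrap bias correction.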
[1706.08576] Invariant Causal Prediction for Nonlinear Models
"An important problem in many domains is to predict how a system will respond to interventions. This task is inherently linked to estimating the system's underlying causal structure. To this end, 'invariant causal prediction' (ICP) (Peters et al., 2016) has been proposed which learns a causal model exploiting the invariance of causal relations using data from different environments. When considering linear models, the implementation of ICP is relatively straight-forward. However, the nonlinear case is more challenging due to the difficulty of performing nonparametric tests for conditional independence. In this work, we present and evaluate an array of methods for nonlinear and nonparametric versions of ICP for learning the causal parents of given target variables. We find that an approach which first fits a nonlinear model with data pooled over all environments and then tests for differences between the residual distributions across environments is quite robust across a large variety of simulation settings. We call this procedure "Invariant residual distribution test". In general, we observe that the performance of all approaches is critically dependent on the true (unknown) causal structure and it becomes challenging to achieve high power if the parental set includes more than two variables. As a real-world example, we consider fertility rate modelling which is central to world population projections. We explore predicting the effect of hypothetical interventions using the accepted models from nonlinear ICP. The results reaffirm the previously observed central causal role of child mortality rates."
to:NB  causal_inference  causal_discovery  statistics  regression  prediction  peters.jonas  meinshausen.nicolai  to_read  heard_the_talk  to_teach:undergrad-ADA  re:ADAfaEPoV 
may 2018 by cshalizi
[1501.01332] Causal inference using invariant prediction: identification and confidence intervals
"What is the difference of a prediction that is made with a causal model and a non-causal model? Suppose we intervene on the predictor variables or change the whole environment. The predictions from a causal model will in general work as well under interventions as for observational data. In contrast, predictions from a non-causal model can potentially be very wrong if we actively intervene on variables. Here, we propose to exploit this invariance of a prediction under a causal model for causal inference: given different experimental settings (for example various interventions) we collect all models that do show invariance in their predictive accuracy across settings and interventions. The causal model will be a member of this set of models with high probability. This approach yields valid confidence intervals for the causal relationships in quite general scenarios. We examine the example of structural equation models in more detail and provide sufficient assumptions under which the set of causal predictors becomes identifiable. We further investigate robustness properties of our approach under model misspecification and discuss possible extensions. The empirical properties are studied for various data sets, including large-scale gene perturbation experiments."
to:NB  to_read  causal_inference  causal_discovery  statistics  prediction  regression  buhlmann.peter  meinshausen.nicolai  peters.jonas  heard_the_talk  re:ADAfaEPoV  to_teach:undergrad-ADA 
may 2018 by cshalizi
[1708.03579] Self-exciting point processes with spatial covariates: modeling the dynamics of crime
"Crime has both varying patterns in space, related to features of the environment, economy, and policing, and patterns in time arising from criminal behavior, such as retaliation. Serious crimes may also be presaged by minor crimes of disorder. We demonstrate that these spatial and temporal patterns are generally confounded, requiring analyses to take both into account, and propose a spatio-temporal self-exciting point process model which incorporates spatial features, near-repeat and retaliation effects, and triggering. We develop inference methods and diagnostic tools, such as residual maps, for this model, and through extensive simulation and crime data obtained from Pittsburgh, Pennsylvania, demonstrate its properties and usefulness."
in_NB  spatio-temporal_statistics  point_processes  prediction  statistics  crime  kith_and_kin  reinhart.alex  greenhouse.joel  on_the_thesis_committee  to_teach:data_over_space_and_time 
may 2018 by cshalizi
Forecasting the spatial transmission of influenza in the United States | PNAS
"Recurrent outbreaks of seasonal and pandemic influenza create a need for forecasts of the geographic spread of this pathogen. Although it is well established that the spatial progression of infection is largely attributable to human mobility, difficulty obtaining real-time information on human movement has limited its incorporation into existing infectious disease forecasting techniques. In this study, we develop and validate an ensemble forecast system for predicting the spatiotemporal spread of influenza that uses readily accessible human mobility data and a metapopulation model. In retrospective state-level forecasts for 35 US states, the system accurately predicts local influenza outbreak onset,—i.e., spatial spread, defined as the week that local incidence increases above a baseline threshold—up to 6 wk in advance of this event. In addition, the metapopulation prediction system forecasts influenza outbreak onset, peak timing, and peak intensity more accurately than isolated location-specific forecasts. The proposed framework could be applied to emergent respiratory viruses and, with appropriate modifications, other infectious diseases."
to:NB  epidemic_models  influenza  contagion  prediction  statistics 
may 2018 by cshalizi
Slowness as a Proxy for Temporal Predictability: An Empirical Comparison | Neural Computation | MIT Press Journals
"The computational principles of slowness and predictability have been proposed to describe aspects of information processing in the visual system. From the perspective of slowness being a limited special case of predictability we investigate the relationship between these two principles empirically. On a collection of real-world data sets we compare the features extracted by slow feature analysis (SFA) to the features of three recently proposed methods for predictable feature extraction: forecastable component analysis, predictable feature analysis, and graph-based predictable feature analysis. Our experiments show that the predictability of the learned features is highly correlated, and, thus, SFA appears to effectively implement a method for extracting predictable features according to different measures of predictability."
to:NB  time_series  prediction  statistics 
may 2018 by cshalizi
Predictive Processing and the Representation Wars | SpringerLink
"Clark has recently suggested that predictive processing advances a theory of neural function with the resources to put an ecumenical end to the “representation wars” of recent cognitive science. In this paper I defend and develop this suggestion. First, I broaden the representation wars to include three foundational challenges to representational cognitive science. Second, I articulate three features of predictive processing’s account of internal representation that distinguish it from more orthodox representationalist frameworks. Specifically, I argue that it posits a resemblance-based representational architecture with organism-relative contents that functions in the service of pragmatic success, not veridical representation. Finally, I argue that internal representation so understood is either impervious to the three anti-representationalist challenges I outline or can actively embrace them."
to:NB  philosophy_of_mind  cognitive_science  representation  prediction 
march 2018 by cshalizi
[1803.04383] Delayed Impact of Fair Machine Learning
"Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect.
"We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not.
"We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably.
"Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs."

--- A _lot_ is going to hinge here on how they model the feedback process.
--- My evil spirit is making me wonder how hard it would be to write a "Rhetoric of Reaction"-esque attack on algorithmic fairness.
to:NB  algorithmic_fairness  prediction  credit_ratings  data_mining  via:whimsley 
march 2018 by cshalizi
[1802.07814] Learning to Explain: An Information-Theoretic Perspective on Model Interpretation
"We introduce instancewise feature selection as a methodology for model interpretation. Our method is based on learning a function to extract a subset of features that are most informative for each given example. This feature selector is trained to maximize the mutual information between selected features and the response variable, where the conditional distribution of the response variable given the input is the model to be explained. We develop an efficient variational approximation to the mutual information, and show that the resulting method compares favorably to other model explanation methods on a variety of synthetic and real data sets using both quantitative metrics and human evaluation."
to:NB  information_theory  explanation  prediction  jordan.michael_i.  wainwright.martin_j.  statistics 
march 2018 by cshalizi
[1702.04690] Simple rules for complex decisions
"From doctors diagnosing patients to judges setting bail, experts often base their decisions on experience and intuition rather than on statistical models. While understandable, relying on intuition over models has often been found to result in inferior outcomes. Here we present a new method, select-regress-and-round, for constructing simple rules that perform well for complex decisions. These rules take the form of a weighted checklist, can be applied mentally, and nonetheless rival the performance of modern machine learning algorithms. Our method for creating these rules is itself simple, and can be carried out by practitioners with basic statistics knowledge. We demonstrate this technique with a detailed case study of judicial decisions to release or detain defendants while they await trial. In this application, as in many policy settings, the effects of proposed decision rules cannot be directly observed from historical data: if a rule recommends releasing a defendant that the judge in reality detained, we do not observe what would have happened under the proposed action. We address this key counterfactual estimation problem by drawing on tools from causal inference. We find that simple rules significantly outperform judges and are on par with decisions derived from random forests trained on all available features. Generalizing to 22 varied decision-making domains, we find this basic result replicates. We conclude with an analytical framework that helps explain why these simple decision rules perform as well as they do."
to:NB  to_read  decision-making  classifiers  fast-and-frugal_heuristics  heuristics  clinical-vs-actuarial_prediction  prediction  crime  bail  via:vaguery 
january 2018 by cshalizi
[1801.02858] Scalable high-resolution forecasting of sparse spatiotemporal events with kernel methods: a winning solution to the NIJ "Real-Time Crime Forecasting Challenge"
"This article describes Team Kernel Glitches' solution to the National Institute of Justice's (NIJ) Real-Time Crime Forecasting Challenge. The goal of the NIJ Real-Time Crime Forecasting Competition was to maximize two different crime hotspot scoring metrics for calls-for-service to the Portland Police Bureau (PPB) in Portland, Oregon during the period from March 1, 2017 to May 31, 2017. Our solution to the challenge is a spatiotemporal forecasting model combining scalable randomized Reproducing Kernel Hilbert Space (RKHS) methods for approximating Gaussian processes with autoregressive smoothing kernels in a regularized supervised learning framework. Our model can be understood as an approximation to the popular log-Gaussian Cox Process model: we discretize the spatiotemporal point pattern and learn a log intensity function using the Poisson likelihood and highly efficient gradient-based optimization methods. Model hyperparameters including quality of RKHS approximation, spatial and temporal kernel lengthscales, number of autoregressive lags, bandwidths for smoothing kernels, as well as cell shape, size, and rotation, were learned using crossvalidation. Resulting predictions exceeded baseline KDE estimates by 0.157. Performance improvement over baseline predictions were particularly large for sparse crimes over short forecasting horizons."

--- There seem to be some substantial improvements here over Seth's Ph.D. thesis...
in_NB  to_read  spatio-temporal_statistics  point_processes  statistics  prediction  crime  flaxman.seth 
january 2018 by cshalizi
Looking Forward: Prediction and Uncertainty in Modern America, Pietruska
"In the decades after the Civil War, the world experienced monumental changes in industry, trade, and governance. As Americans faced this uncertain future, public debate sprang up over the accuracy and value of predictions, asking whether it was possible to look into the future with any degree of certainty. In Looking Forward, Jamie L. Pietruska uncovers a culture of prediction in the modern era, where forecasts became commonplace as crop forecasters, “weather prophets,” business forecasters, utopian novelists, and fortune-tellers produced and sold their visions of the future. Private and government forecasters competed for authority—as well as for an audience—and a single prediction could make or break a forecaster’s reputation. 
"Pietruska argues that this late nineteenth-century quest for future certainty had an especially ironic consequence: it led Americans to accept uncertainty as an inescapable part of both forecasting and twentieth-century economic and cultural life. Drawing together histories of science, technology, capitalism, environment, and culture, Looking Forward explores how forecasts functioned as new forms of knowledge and risk management tools that sometimes mitigated, but at other times exacerbated, the very uncertainties they were designed to conquer. Ultimately Pietruska shows how Americans came to understand the future itself as predictable, yet still uncertain."
to:NB  books:noted  prediction  history_of_ideas  history_of_science  19th_century_history  american_history 
january 2018 by cshalizi
[1706.02744] Avoiding Discrimination through Causal Reasoning
"Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively.
"Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from "What is the right fairness criterion?" to "What do we want to assume about the causal data generating process?" Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them."
to:NB  to_read  causality  algorithmic_fairness  prediction  machine_learning  janzing.dominik  re:ADAfaEPoV  via:arsyed 
november 2017 by cshalizi
Tetlock, P.E.: Expert Political Judgment: How Good Is It? How Can We Know?. (New Edition) (eBook, Paperback and Hardcover)
"Tetlock first discusses arguments about whether the world is too complex for people to find the tools to understand political phenomena, let alone predict the future. He evaluates predictions from experts in different fields, comparing them to predictions by well-informed laity or those based on simple extrapolation from current trends. He goes on to analyze which styles of thinking are more successful in forecasting. Classifying thinking styles using Isaiah Berlin's prototypes of the fox and the hedgehog, Tetlock contends that the fox--the thinker who knows many little things, draws from an eclectic array of traditions, and is better able to improvise in response to changing events--is more successful in predicting the future than the hedgehog, who knows one big thing, toils devotedly within one tradition, and imposes formulaic solutions on ill-defined problems. He notes a perversely inverse relationship between the best scientific indicators of good judgement and the qualities that the media most prizes in pundits--the single-minded determination required to prevail in ideological combat.
"Clearly written and impeccably researched, the book fills a huge void in the literature on evaluating expert opinion. It will appeal across many academic disciplines as well as to corporations seeking to develop standards for judging expert decision-making. Now with a new preface in which Tetlock discusses the latest research in the field, the book explores what constitutes good judgment in predicting future events and looks at why experts are often wrong in their forecasts."
in_NB  books:noted  prediction  expertise  cognitive_science 
september 2017 by cshalizi
[1709.02012v1] On Fairness and Calibration
"The machine learning community has become increasingly concerned with the potential for bias and discrimination in predictive models, and this has motivated a growing line of work on what it means for a classification procedure to be "fair." In particular, we investigate the tension between minimizing error disparity across different population groups while maintaining calibrated probability estimates. We show that calibration is compatible only with a single error constraint (i.e. equal false-negatives rates across groups), and show that any algorithm that satisfies this relaxation is no better than randomizing a percentage of predictions for an existing classifier. These unsettling findings, which extend and generalize existing results, are empirically confirmed on several datasets."
to:NB  to_read  calibration  prediction  classifiers  kleinberg.jon  via:arsyed 
september 2017 by cshalizi
Minding the Weather | The MIT Press
"This book argues that the human cognition system is the least understood, yet probably most important, component of forecasting accuracy. Minding the Weather investigates how people acquire massive and highly organized knowledge and develop the reasoning skills and strategies that enable them to achieve the highest levels of performance.
"The authors consider such topics as the forecasting workplace; atmospheric scientists’ descriptions of their reasoning strategies; the nature of expertise; forecaster knowledge, perceptual skills, and reasoning; and expert systems designed to imitate forecaster reasoning. Drawing on research in cognitive science, meteorology, and computer science, the authors argue that forecasting involves an interdependence of humans and technologies. Human expertise will always be necessary."
to:NB  prediction  meteorology  cognitive_science  books:noted 
september 2017 by cshalizi
Empirical prediction intervals improve energy forecasting
"Hundreds of organizations and analysts use energy projections, such as those contained in the US Energy Information Administration (EIA)’s Annual Energy Outlook (AEO), for investment and policy decisions. Retrospective analyses of past AEO projections have shown that observed values can differ from the projection by several hundred percent, and thus a thorough treatment of uncertainty is essential. We evaluate the out-of-sample forecasting performance of several empirical density forecasting methods, using the continuous ranked probability score (CRPS). The analysis confirms that a Gaussian density, estimated on past forecasting errors, gives comparatively accurate uncertainty estimates over a variety of energy quantities in the AEO, in particular outperforming scenario projections provided in the AEO. We report probabilistic uncertainties for 18 core quantities of the AEO 2016 projections. Our work frames how to produce, evaluate, and rank probabilistic forecasts in this setting. We propose a log transformation of forecast errors for price projections and a modified nonparametric empirical density forecasting method. Our findings give guidance on how to evaluate and communicate uncertainty in future energy outlooks."

--- It's probably presumptuous of me, but I am a bit proud, because the first author learned a lot of these methods from my class...
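--- For reference, a minimal sketch of the kind of evaluation the abstract describes: fit a Gaussian to past forecast errors and score it with the CRPS. It uses the standard closed form for a normal predictive distribution; the error values and the function names (`crps_gaussian`, `gaussian_from_errors`) are made up for illustration and are not taken from the paper.

```python
import math
from statistics import mean, stdev

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a N(mu, sigma^2) forecast against outcome y (lower is better)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def gaussian_from_errors(point_forecast, past_errors):
    """Center a Gaussian on the bias-corrected point forecast, with the spread of past errors.
    Errors are taken as (observed - forecast)."""
    return point_forecast + mean(past_errors), stdev(past_errors)

past_errors = [-1.5, 0.5, 2.0, -0.5, 1.0]  # hypothetical past forecast errors
mu, sigma = gaussian_from_errors(100.0, past_errors)
score = crps_gaussian(mu, sigma, 103.0)  # score against a hypothetical observed value
```

Averaging such scores over many forecast-outcome pairs gives one number per method, which is what makes ranking competing density forecasts possible.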
to:NB  to_read  heard_the_talk  energy  prediction  statistics  to_teach:undergrad-ADA 
august 2017 by cshalizi
What does it mean to ask for an “explainable” algorithm?
"The second type of explainability problem is complexity. Here everything about the algorithm is known, but somebody feels that the algorithm is so complex that they cannot understand it. It will always be possible to answer what-if questions, such as how the algorithm’s result would have been different had the person been one year older, or had an extra $1000 of annual income, or had one fewer prior misdemeanor conviction, or whatever. So complexity can only be a barrier to big-picture understanding, not to understanding which factors might have changed a particular person’s outcome."

--- I am not at all sure about this, because of interactions. If the function changes sufficiently rapidly, with enough interactions between the inputs, knowing these sorts of local perturbations may tell us very little.
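--- A toy illustration of the interaction worry, using a hypothetical decision rule (not one from the linked post): with a pure three-way interaction, every one-at-a-time what-if question gets the answer "no change", even though changing all three inputs together flips the outcome.

```python
def decide(a, b, c):
    """Hypothetical decision rule with a pure three-way interaction."""
    return int(a and b and c)

baseline = (0, 0, 0)

# Ask each single-input what-if question in turn.
single_flips = []
for i in range(3):
    perturbed = list(baseline)
    perturbed[i] = 1
    single_flips.append(decide(*perturbed))

# Change all three inputs at once.
joint_flip = decide(1, 1, 1)
```

Here `single_flips` is all zeros, matching the baseline decision, while `joint_flip` is 1.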
explanation  prediction  have_read 
july 2017 by cshalizi
Phys. Rev. E 95, 042140 (2017) - Thermodynamics of complexity and pattern manipulation
"Many organisms capitalize on their ability to predict the environment to maximize available free energy and reinvest this energy to create new complex structures. This functionality relies on the manipulation of patterns—temporally ordered sequences of data. Here, we propose a framework to describe pattern manipulators—devices that convert thermodynamic work to patterns or vice versa—and use them to build a “pattern engine” that facilitates a thermodynamic cycle of pattern creation and consumption. We show that the least heat dissipation is achieved by the provably simplest devices, the ones that exhibit desired operational behavior while maintaining the least internal memory. We derive the ultimate limits of this heat dissipation and show that it is generally nonzero and connected with the pattern's intrinsic crypticity—a complexity theoretic quantity that captures the puzzling difference between the amount of information the pattern's past behavior reveals about its future and the amount one needs to communicate about this past to optimally predict the future."
to:NB  to_read  complexity  complexity_measures  prediction  thermodynamics  maxwells_demon 
june 2017 by cshalizi
Dietze, M.C.: Ecological Forecasting (eBook and Hardcover).
"An authoritative and accessible introduction to the concepts and tools needed to make ecology a more predictive science
"Ecologists are being asked to respond to unprecedented environmental challenges. How can they provide the best available scientific information about what will happen in the future? Ecological Forecasting is the first book to bring together the concepts and tools needed to make ecology a more predictive science.
"Ecological Forecasting presents a new way of doing ecology. A closer connection between data and models can help us to project our current understanding of ecological processes into new places and times. This accessible and comprehensive book covers a wealth of topics, including Bayesian calibration and the complexities of real-world data; uncertainty quantification, partitioning, propagation, and analysis; feedbacks from models to measurements; state-space models and data fusion; iterative forecasting and the forecast cycle; and decision support."
to:NB  books:noted  ecology  prediction  statistics 
june 2017 by cshalizi
[1604.04173] Distribution-Free Predictive Inference For Regression
"We develop a general framework for distribution-free predictive inference in regression, using conformal inference. The proposed methodology allows construction of prediction bands for the response variable using any estimator of the regression function. The resulting prediction band preserves the consistency properties of the original estimator under standard assumptions, while guaranteeing finite sample marginal coverage even when the assumptions do not hold. We analyze and compare, both empirically and theoretically, two major variants of our conformal procedure: the full conformal inference and split conformal inference, along with a related jackknife method. These methods offer different tradeoffs between statistical accuracy (length of resulting prediction intervals) and computational efficiency. As extensions, we develop a method for constructing valid in-sample prediction intervals called rank-one-out conformal inference, which has essentially the same computational efficiency as split conformal inference. We also describe an extension of our procedures for producing prediction bands with varying local width, in order to adapt to heteroskedascity in the data distribution. Lastly, we propose a model-free notion of variable importance, called leave-one-covariate-out or LOCO inference. Accompanying our paper is an R package conformalInference that implements all of the proposals we have introduced. In the spirit of reproducibility, all empirical results in this paper can be easily (re)generated using this package."
to:NB  to_read  kith_and_kin  regression  prediction  wasserman.larry  tibshirani.ryan  g'sell.max  lei.jing  rinaldo.alessandro 
february 2017 by cshalizi
A material political economy: Automated Trading Desk and price prediction in high-frequency trading - Dec 06, 2016
"This article contains the first detailed historical study of one of the new high-frequency trading (HFT) firms that have transformed many of the world’s financial markets. The study, of Automated Trading Desk (ATD), one of the earliest and most important such firms, focuses on how ATD’s algorithms predicted share price changes. The article argues that political-economic struggles are integral to the existence of some of the ‘pockets’ of predictable structure in the otherwise random movements of prices, to the availability of the data that allow algorithms to identify these pockets, and to the capacity of algorithms to use these predictions to trade profitably. The article also examines the role of HFT algorithms such as ATD’s in the epochal, fiercely contested shift in US share trading from ‘fixed-role’ markets towards ‘all-to-all’ markets."
to:NB  finance  prediction  sociology  mackenzie.donald 
december 2016 by cshalizi
‘Moneyball’ for Professors?
The key paragraph:

"Using a hand-curated data set of 54 scholars who obtained doctorates after 1995 and held assistant professorships at top-10 operations research programs in 2003 or earlier, these statistical models made different decisions than the tenure committees for 16 (30%) of the candidates. Specifically, these new criteria yielded a set of scholars who, in the future, produced more papers published in the top journals and research that was cited more often than the scholars who were actually selected by tenure committees"

--- In other words, "success" here is defined entirely through the worst sort of abuse of citation metrics, i.e., through using them for exactly the things which everyone who has seriously studied citation metrics says you should _not_ use them for. (Cf. https://arxiv.org/abs/0910.3529 .) If the objective was to make academic hiring decisions _even less_ sensitive to actual intellectual quality, one could hardly do better.
I am sure that this idea will, however, be widely adopted and go from strength to strength.
bad_data_analysis  academia  bibliometry  social_networks  network_data_analysis  prediction  utter_stupidity  have_read  via:jbdelong  to:blog 
december 2016 by cshalizi
[1311.4500] Time series prediction via aggregation : an oracle bound including numerical cost
"We address the problem of forecasting a time series meeting the Causal Bernoulli Shift model, using a parametric set of predictors. The aggregation technique provides a predictor with well established and quite satisfying theoretical properties expressed by an oracle inequality for the prediction risk. The numerical computation of the aggregated predictor usually relies on a Markov chain Monte Carlo method whose convergence should be evaluated. In particular, it is crucial to bound the number of simulations needed to achieve a numerical precision of the same order as the prediction risk. In this direction we present a fairly general result which can be seen as an oracle inequality including the numerical cost of the predictor computation. The numerical cost appears by letting the oracle inequality depend on the number of simulations required in the Monte Carlo approximation. Some numerical experiments are then carried out to support our findings."
to:NB  prediction  time_series  statistics  computational_statistics  monte_carlo  ensemble_methods 
december 2016 by cshalizi
Sequential Learning, Predictability, and Optimal Portfolio Returns - JOHANNES - 2014 - The Journal of Finance - Wiley Online Library
"This paper finds statistically and economically significant out-of-sample portfolio benefits for an investor who uses models of return predictability when forming optimal portfolios. Investors must account for estimation risk, and incorporate an ensemble of important features, including time-varying volatility, and time-varying expected returns driven by payout yield measures that include share repurchase and issuance. Prior research documents a lack of benefits to return predictability, and our results suggest that this is largely due to omitting time-varying volatility and estimation risk. We also document the sequential process of investors learning about parameters, state variables, and models as new data arrive."
to:NB  finance  prediction  time_series  online_learning 
december 2016 by cshalizi
[1311.5828] The Splice Bootstrap
"This paper proposes a new bootstrap method to compute predictive intervals for nonlinear autoregressive time series model forecast. This method we call the splice boobstrap as it involves splicing the last p values of a given series to a suitably simulated series. This ensures that each simulated series will have the same set of p time series values in common, a necessary requirement for computing conditional predictive intervals. Using simulation studies we show the methods gives 90% intervals intervals that are similar to those expected from theory for simple linear and SETAR model driven by normal and non-normal noise. Furthermore, we apply the method to some economic data and demonstrate the intervals compare favourably with cross-validation based intervals."
to:NB  bootstrap  time_series  statistics  prediction  to_teach:undergrad-ADA  re:ADAfaEPoV  to_read 
december 2016 by cshalizi
Links Between Multiplicity Automata, Observable Operator Models and Predictive State Representations -- a Unified Learning Framework
"Stochastic multiplicity automata (SMA) are weighted finite automata that generalize probabilistic automata. They have been used in the context of probabilistic grammatical inference. Observable operator models (OOMs) are a generalization of hidden Markov models, which in turn are models for discrete-valued stochastic processes and are used ubiquitously in the context of speech recognition and bio-sequence modeling. Predictive state representations (PSRs) extend OOMs to stochastic input-output systems and are employed in the context of agent modeling and planning.
"We present SMA, OOMs, and PSRs under the common framework of sequential systems, which are an algebraic characterization of multiplicity automata, and examine the precise relationships between them. Furthermore, we establish a unified approach to learning such models from data. Many of the learning algorithms that have been proposed can be understood as variations of this basic learning scheme, and several turn out to be closely related to each other, or even equivalent."
to:NB  re:AoS_project  stochastic_processes  statistics  prediction  state-space_models  automata_theory 
november 2016 by cshalizi
[1210.0103] On convergence rates of Bayesian predictive densities and posterior distributions
"Frequentist-style large-sample properties of Bayesian posterior distributions, such as consistency and convergence rates, are important considerations in nonparametric problems. In this paper we give an analysis of Bayesian asymptotics based primarily on predictive densities. Our analysis is unified in the sense that essentially the same approach can be taken to develop convergence rate results in iid, mis-specified iid, independent non-iid, and dependent data cases."
to:NB  bayesian_consistency  prediction  statistics  nonparametrics  re:bayes_as_evol 
november 2016 by cshalizi
Optimal prediction of the number of unseen species
"Estimating the number of unseen species is an important problem in many scientific endeavors. Its most popular formulation, introduced by Fisher et al. [Fisher RA, Corbet AS, Williams CB (1943) J Animal Ecol 12(1):42−58], uses n samples to predict the number U of hitherto unseen species that would be observed if t⋅n new samples were collected. Of considerable interest is the largest ratio t between the number of new and existing samples for which U can be accurately predicted. In seminal works, Good and Toulmin [Good I, Toulmin G (1956) Biometrika 43(102):45−63] constructed an intriguing estimator that predicts U for all t≤1. Subsequently, Efron and Thisted [Efron B, Thisted R (1976) Biometrika 63(3):435−447] proposed a modification that empirically predicts U even for some t>1, but without provable guarantees. We derive a class of estimators that provably predict U all of the way up to t∝logn. We also show that this range is the best possible and that the estimator’s mean-square error is near optimal for any t. Our approach yields a provable guarantee for the Efron−Thisted estimator and, in addition, a variant with stronger theoretical and experimental performance than existing methodologies on a variety of synthetic and real datasets. The estimators are simple, linear, computationally efficient, and scalable to massive datasets. Their performance guarantees hold uniformly for all distributions, and apply to all four standard sampling models commonly used across various scientific disciplines: multinomial, Poisson, hypergeometric, and Bernoulli product."
to:NB  sampling  prediction  statistics 
november 2016 by cshalizi
Modeling the Heavens: Sphairopoiia and Ptolemy’s Planetary Hypotheses
"This article investigates sphairopoiia, the art of making instruments that display the heavens, in Claudius Ptolemy’s Planetary Hypotheses. It takes up two questions: what kind of instrument does Ptolemy describe? And, could such an instrument have been constructed? I argue that Ptolemy did not propose one specific type of instrument, but instead he offered a range of possible designs, with the details to be worked out by the craftsman. Moreover, in addition to exhibiting his astronomical models and having the ability to estimate predictions, the instrument he proposed would have also shown the physical workings of the heavens. What emerges is both a clearer idea of what Ptolemy wanted the technician to build, and the purpose of such instruments."
to:NB  history_of_science  astronomy  ptolemy  modeling  prediction  history_of_technology 
july 2016 by cshalizi
[1606.08813] EU regulations on algorithmic decision-making and a "right to explanation"
"We summarize the potential impact that the European Union's new General Data Protection Regulation will have on the routine use of machine learning algorithms. Slated to take effect as law across the EU in 2018, it will restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which "significantly affect" users. The law will also create a "right to explanation," whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for machine learning researchers to take the lead in designing algorithms and evaluation frameworks which avoid discrimination."
to:NB  explanation  statistics  prediction  decision-making  flaxman.seth 
july 2016 by cshalizi

