cshalizi + machine_learning   508

[1810.07758] The UCR Time Series Archive
"The UCR Time Series Archive, introduced in 2002, has become an important resource in the time series data mining community, with at least one thousand published papers making use of at least one data set from the archive. The original incarnation of the archive had sixteen data sets but since that time, it has gone through periodic expansions. The last expansion took place in the summer of 2015 when the archive grew from 45 to 85 data sets. This paper introduces and will focus on the new data expansion from 85 to 128 data sets. Beyond expanding this valuable resource, this paper offers pragmatic advice to anyone who may wish to evaluate a new algorithm on the archive. Finally, this paper makes a novel and yet actionable claim: of the hundreds of papers that show an improvement over the standard baseline (1-nearest neighbor classification), a large fraction may be mis-attributing the reasons for their improvement. Moreover, they may have been able to achieve the same improvement with a much simpler modification, requiring just a single line of code."
to:NB  time_series  machine_learning 
4 days ago by cshalizi
[1909.06674] A Step Toward Quantifying Independently Reproducible Machine Learning Research
"What makes a paper independently reproducible? Debates on reproducibility center around intuition or assumptions but lack empirical results. Our field focuses on releasing code, which is important, but is not sufficient for determining reproducibility. We take the first step toward a quantifiable answer by manually attempting to implement 255 papers published from 1984 until 2017, recording features of each paper, and performing statistical analysis of the results. For each paper, we did not look at the authors' code, if released, in order to prevent bias toward discrepancies between code and paper."
to:NB  to_read  machine_learning  reproducibility  sounds_grim 
4 days ago by cshalizi
[1909.04791] Techniques All Classifiers Can Learn from Deep Networks: Models, Optimizations, and Regularization
"Deep neural networks have introduced novel and useful tools to the machine learning community, and other types of classifiers can make use of these tools to improve their performance and generality. This paper reviews the current state of the art for deep learning classifier technologies that are being used outside of deep neural networks. Many components of existing deep neural network architectures can be employed by non-network classifiers. In this paper, we review the feature learning, optimization, and regularization methods that form a core of deep network technologies. We then survey non-neural network learning algorithms that make innovative use of these methods to improve classification. We conclude by discussing directions that can be pursued to expand the area of deep learning for a variety of classification algorithms."
to:NB  classifiers  machine_learning  optimization  computational_statistics  statistics  neural_networks 
7 days ago by cshalizi
[1909.04436] The Prevalence of Errors in Machine Learning Experiments
"Context: Conducting experiments is central to machine learning research to benchmark, evaluate and compare learning algorithms. Consequently it is important we conduct reliable, trustworthy experiments. Objective: We investigate the incidence of errors in a sample of machine learning experiments in the domain of software defect prediction. Our focus is simple arithmetical and statistical errors. Method: We analyse 49 papers describing 2456 individual experimental results from a previously undertaken systematic review comparing supervised and unsupervised defect prediction classifiers. We extract the confusion matrices and test for relevant constraints, e.g., the marginal probabilities must sum to one. We also check for multiple statistical significance testing errors. Results: We find that a total of 22 out of 49 papers contain demonstrable errors. Of these 7 were statistical and 16 related to confusion matrix inconsistency (one paper contained both classes of error). Conclusions: Whilst some errors may be of a relatively trivial nature, e.g., transcription errors, their presence does not engender confidence. We strongly urge researchers to follow open science principles so errors can more easily be detected and corrected, and thus as a community reduce this worryingly high error rate with our computational experiments."
to:NB  bad_data_analysis  machine_learning  to_teach:data-mining 
10 days ago by cshalizi
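The constraint checks that abstract mentions can be sketched roughly as follows. This is a minimal illustration, not the paper's actual procedure: the function name `confusion_matrix_problems`, the choice of precision and recall as the cross-checked metrics, and the rounding tolerance are all assumptions made here.

```python
def confusion_matrix_problems(tp, fp, fn, tn, reported=None, tol=0.005):
    """List inconsistencies in a 2x2 confusion matrix given as proportions.

    tp, fp, fn, tn are cell proportions (fractions of all cases).
    reported optionally maps "precision" / "recall" to published values.
    tol is an assumed allowance for rounding in the printed numbers.
    """
    problems = []
    cells = {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

    # Every cell must itself be a valid proportion.
    for name, value in cells.items():
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} is not a valid proportion")

    # The row marginals (tp + fn) and (fp + tn) add up to the sum of all
    # four cells, so "marginals sum to one" is the same as "cells sum to one".
    total = sum(cells.values())
    if abs(total - 1.0) > tol:
        problems.append(f"cells sum to {total:.3f}, not 1")

    # Metrics implied by the matrix, where they are defined.
    derived = {}
    if tp + fp > 0:
        derived["precision"] = tp / (tp + fp)
    if tp + fn > 0:
        derived["recall"] = tp / (tp + fn)

    # Compare implied metrics against whatever the paper reports.
    for metric, published in (reported or {}).items():
        if metric in derived and abs(derived[metric] - published) > tol:
            problems.append(
                f"reported {metric} {published:.3f} != derived {derived[metric]:.3f}"
            )
    return problems
```

An empty list means no inconsistency was found under these particular checks; it does not certify that the reported results are correct.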
[1908.09257] Normalizing Flows: Introduction and Ideas
"Normalizing Flows are generative models which produce tractable distributions where both sampling and density evaluation can be efficient and exact. The goal of this survey article is to give a coherent and comprehensive review of the literature around the construction and use of Normalizing Flows for distribution learning. We aim to provide context and explanation of the models, review current state-of-the-art literature, and identify open questions and promising future directions."
to:NB  statistics  machine_learning  stochastic_processes 
26 days ago by cshalizi
[1908.09635] A Survey on Bias and Fairness in Machine Learning
"With the widespread use of AI systems and applications in our everyday lives, it is important to take fairness issues into consideration while designing and engineering these types of systems. Such systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that the decisions do not reflect discriminatory behavior toward certain groups or populations. We have recently seen work in machine learning, natural language processing, and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming aware of the biases that these applications can contain and have attempted to address them. In this survey we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined in order to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and how they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields."
to:NB  to_read  algorithmic_fairness  prediction  machine_learning  lerman.kristina  galstyan.aram  to_teach:data-mining 
26 days ago by cshalizi
Machine Learning Methods That Economists Should Know About | Annual Review of Economics
"We discuss the relevance of the recent machine learning (ML) literature for economics and econometrics. First we discuss the differences in goals, methods, and settings between the ML literature and the traditional econometrics and statistics literatures. Then we discuss some specific methods from the ML literature that we view as important for empirical researchers in economics. These include supervised learning methods for regression and classification, unsupervised learning methods, and matrix completion methods. Finally, we highlight newly developed methods at the intersection of ML and econometrics that typically perform better than either off-the-shelf ML or more traditional econometric methods when applied to particular classes of problems, including causal inference for average treatment effects, optimal policy estimation, and estimation of the counterfactual effect of price changes in consumer choice models."
to:NB  machine_learning  data_mining  nonparametrics  statistics  economics  athey.susan 
27 days ago by cshalizi
[1908.06951] Gradient Boosting Machine: A Survey
"In this survey, we discuss several different types of gradient boosting algorithms and illustrate their mathematical frameworks in detail: 1. introduction of gradient boosting leads to 2. objective function optimization, 3. loss function estimations, and 4. model constructions. 5. application of boosting in ranking."
to:NB  ensemble_methods  boosting  statistics  machine_learning 
4 weeks ago by cshalizi
Challenges in Machine Generation of Analytic Products from Multi-Source Data: Proceedings of a Workshop | The National Academies Press
"The Intelligence Community Studies Board of the National Academies of Sciences, Engineering, and Medicine convened a workshop on August 9-10, 2017 to examine challenges in machine generation of analytic products from multi-source data. Workshop speakers and participants discussed research challenges related to machine-based methods for generating analytic products and for automating the evaluation of these products, with special attention to learning from small data, using multi-source data, adversarial learning, and understanding the human-machine relationship. This publication summarizes the presentations and discussions from the workshop."
to:NB  books:noted  intelligence_(spying)  machine_learning 
4 weeks ago by cshalizi
[1907.10597] Green AI
"The computations required for deep learning research have been doubling every few months, resulting in an estimated 300,000x increase from 2012 to 2018 [2]. These computations have a surprisingly large carbon footprint [38]. Ironically, deep learning was inspired by the human brain, which is remarkably energy efficient. Moreover, the financial cost of the computations can make it difficult for academics, students, and researchers, in particular those from emerging economies, to engage in deep learning research.
"This position paper advocates a practical solution by making efficiency an evaluation criterion for research alongside accuracy and related measures. In addition, we propose reporting the financial cost or "price tag" of developing, training, and running models to provide baselines for the investigation of increasingly efficient methods. Our goal is to make AI both greener and more inclusive---enabling any inspired undergraduate with a laptop to write high-quality research papers. Green AI is an emerging focus at the Allen Institute for AI."
to:NB  machine_learning  computational_statistics  smith.noah_a.  energy 
5 weeks ago by cshalizi
[1907.12652] How model accuracy and explanation fidelity influence user trust
"Machine learning systems have become popular in fields such as marketing, financing, or data mining. While they are highly accurate, complex machine learning systems pose challenges for engineers and users. Their inherent complexity makes it impossible to easily judge their fairness and the correctness of statistically learned relations between variables and classes. Explainable AI aims to solve this challenge by modelling explanations alongside with the classifiers, potentially improving user trust and acceptance. However, users should not be fooled by persuasive, yet untruthful explanations. We therefore conduct a user study in which we investigate the effects of model accuracy and explanation fidelity, i.e. how truthfully the explanation represents the underlying model, on user trust. Our findings show that accuracy is more important for user trust than explainability. Adding an explanation for a classification result can potentially harm trust, e.g. when adding nonsensical explanations. We also found that users cannot be tricked by high-fidelity explanations into having trust for a bad classifier. Furthermore, we found a mismatch between observed (implicit) and self-reported (explicit) trust."
to:NB  machine_learning  data_mining  to_teach:data-mining 
6 weeks ago by cshalizi
[1905.05134] What Clinicians Want: Contextualizing Explainable Machine Learning for Clinical End Use
"Translating machine learning (ML) models effectively to clinical practice requires establishing clinicians' trust. Explainability, or the ability of an ML model to justify its outcomes and assist clinicians in rationalizing the model prediction, has been generally understood to be critical to establishing trust. However, the field suffers from the lack of concrete definitions for usable explanations in different settings. To identify specific aspects of explainability that may catalyze building trust in ML models, we surveyed clinicians from two distinct acute care specialties (Intensive Care Unit and Emergency Department). We use their feedback to characterize when explainability helps to improve clinicians' trust in ML models. We further identify the classes of explanations that clinicians identified as most relevant and crucial for effective translation to clinical practice. Finally, we discern concrete metrics for rigorous evaluation of clinical explainability methods. By integrating perceptions of explainability between clinicians and ML researchers we hope to facilitate the endorsement and broader adoption and sustained use of ML systems in healthcare."
to:NB  machine_learning  data_mining  explanation  medicine  goldenberg.anna  to_teach:data-mining 
6 weeks ago by cshalizi
[1908.02723] Advocacy Learning: Learning through Competition and Class-Conditional Representations
"We introduce advocacy learning, a novel supervised training scheme for attention-based classification problems. Advocacy learning relies on a framework consisting of two connected networks: 1) N Advocates (one for each class), each of which outputs an argument in the form of an attention map over the input, and 2) a Judge, which predicts the class label based on these arguments. Each Advocate produces a class-conditional representation with the goal of convincing the Judge that the input example belongs to their class, even when the input belongs to a different class. Applied to several different classification tasks, we show that advocacy learning can lead to small improvements in classification accuracy over an identical supervised baseline. Through a series of follow-up experiments, we analyze when and how such class-conditional representations improve discriminative performance. Though somewhat counter-intuitive, a framework in which subnetworks are trained to competitively provide evidence in support of their class shows promise, in many cases performing on par with standard learning approaches. This provides a foundation for further exploration into competition and class-conditional representations in supervised learning."

--- Drs. Mercier and Sperber, please call your office. (Also Drs. Jordan and Jacobs...)
to:NB  machine_learning  collective_cognition  ensemble_methods  to_read 
6 weeks ago by cshalizi
[1906.07906] Discovery of Physics from Data: Universal Laws and Discrepancy Models
"Machine learning (ML) and artificial intelligence (AI) algorithms are now being used to automate the discovery of physics principles and governing equations from measurement data alone. However, positing a universal physical law from data is challenging without simultaneously proposing an accompanying discrepancy model to account for the inevitable mismatch between theory and measurements. By revisiting the classic problem of modeling falling objects of different size and mass, we highlight a number of subtle and nuanced issues that must be addressed by modern data-driven methods for the automated discovery of physics. Specifically, we show that measurement noise and complex secondary physical mechanisms, such as unsteady fluid drag forces, can obscure the underlying law of gravitation, leading to an erroneous model. Without proposing an appropriate discrepancy model to handle these drag forces, the data supports an Aristotelian, versus a Galilean, theory of gravitation. Using the sparse identification of nonlinear dynamics (SINDy) algorithm, with the additional assumption that each separate falling object is governed by the same physical law, we are able to identify a viable discrepancy model to account for the fluid dynamic forces that explain the mismatch between a posited universal law of gravity and the measurement data. This work highlights the fact that the simple application of ML/AI will generally be insufficient to extract universal physical laws without further modification."
to:NB  physics  philosophy_of_science  machine_learning  statistics 
june 2019 by cshalizi
[1906.05433] Tackling Climate Change with Machine Learning
"Climate change is one of the greatest challenges facing humanity, and we, as machine learning experts, may wonder how we can help. Here we describe how machine learning can be a powerful tool in reducing greenhouse gas emissions and helping society adapt to a changing climate. From smart grids to disaster management, we identify high impact problems where existing gaps can be filled by machine learning, in collaboration with other fields. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the machine learning community to join the global effort against climate change."

--- My gut reaction is that this is well-intentioned but point-missing, but note the final tags.
to:NB  climate_change  machine_learning  to_read  to_be_shot_after_a_fair_trial 
june 2019 by cshalizi
Refining the Concept of Scientific Inference When Working with Big Data: Proceedings of a Workshop | The National Academies Press
"The concept of utilizing big data to enable scientific discovery has generated tremendous excitement and investment from both private and public sectors over the past decade, and expectations continue to grow. Using big data analytics to identify complex patterns hidden inside volumes of data that have never been combined could accelerate the rate of scientific discovery and lead to the development of beneficial technologies and products. However, producing actionable scientific knowledge from such large, complex data sets requires statistical models that produce reliable inferences (NRC, 2013). Without careful consideration of the suitability of both available data and the statistical models applied, analysis of big data may result in misleading correlations and false discoveries, which can potentially undermine confidence in scientific research if the results are not reproducible. In June 2016 the National Academies of Sciences, Engineering, and Medicine convened a workshop to examine critical challenges and opportunities in performing scientific inference reliably when working with big data. Participants explored new methodologic developments that hold significant promise and potential research program areas for the future. This publication summarizes the presentations and discussions from the workshop."

--- I am happy with my contributions.
to:NB  books:recommended  books:contributed-to  statistics  computational_statistics  machine_learning  data_mining 
may 2019 by cshalizi
[1905.10854] All Neural Networks are Created Equal
"One of the unresolved questions in the context of deep learning is the triumph of GD based optimization, which is guaranteed to converge to one of many local minima. To shed light on the nature of the solutions that are thus being discovered, we investigate the ensemble of solutions reached by the same network architecture, with different random initialization of weights and random mini-batches. Surprisingly, we observe that these solutions are in fact very similar - more often than not, each train and test example is either classified correctly by all the networks, or by none at all. Moreover, all the networks seem to share the same learning dynamics, whereby initially the same train and test examples are incorporated into the learnt model, followed by other examples which are learnt in roughly the same order. When different neural network architectures are compared, the same learning dynamics is observed even when one architecture is significantly stronger than the other and achieves higher accuracy. Finally, when investigating other methods that involve the gradual refinement of a solution, such as boosting, once again we see the same learning pattern. In all cases, it appears as if all the classifiers start by learning to classify correctly the same train and test examples, while the more powerful classifiers continue to learn to classify correctly additional examples. These results are incredibly robust, observed for a large variety of architectures, hyperparameters and different datasets of images. Thus we observe that different classification solutions may be discovered by different means, but typically they evolve in roughly the same manner and demonstrate a similar success and failure behavior. For a given dataset, such behavior seems to be strongly correlated with effective generalization, while the induced ranking of examples may reflect inherent structure in the data."

to:NB  to_read  optimization  machine_learning  neural_networks  your_favorite_deep_neural_network_sucks 
may 2019 by cshalizi
[1905.10887] Classification Accuracy Score for Conditional Generative Models
"Deep generative models (DGMs) of images are now sufficiently mature that they produce nearly photorealistic samples and obtain scores similar to the data distribution on heuristics such as Frechet Inception Distance. These results, especially on large-scale datasets such as ImageNet, suggest that DGMs are learning the data distribution in a perceptually meaningful space, and can be used in downstream tasks. To test this latter hypothesis, we use class-conditional generative models from a number of model classes---variational autoencoder, autoregressive models, and generative adversarial networks---to infer the class labels of real data. We perform this inference by training the image classifier using only synthetic data, and using the classifier to predict labels on real data. The performance on this task, which we call Classification Accuracy Score (CAS), highlights some surprising results not captured by traditional metrics and comprise our contributions. First, when using a state-of-the-art GAN (BigGAN), Top-5 accuracy decreases by 41.6% compared to the original data and conditional generative models from other model classes, such as high-resolution VQ-VAE and Hierarchical Autoregressive Models, substantially outperform GANs on this benchmark. Second, CAS automatically surfaces particular classes for which generative models failed to capture the data distribution, and were previously unknown in the literature. Third, we find traditional GAN metrics such as Frechet Inception Distance neither predictive of CAS nor useful when evaluating non-GAN models. Finally, we introduce Naive Augmentation Score, a variant of CAS where the image classifier is trained on both real and synthetic data, to demonstrate that naive augmentation improves classification performance in limited circumstances. In order to facilitate better diagnoses of generative models, we open-source the proposed metric."
to:NB  machine_learning  your_favorite_deep_neural_network_sucks 
may 2019 by cshalizi
[1905.11075] Machine Learning for Fluid Mechanics
"The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from experiments, field measurements, and large-scale simulations at multiple spatiotemporal scales. Machine learning presents us with a wealth of techniques to extract information from data that can be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of past history, current developments, and emerging opportunities of machine learning for fluid mechanics. We outline fundamental machine learning methodologies and discuss their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that links data with modeling, experiments, and simulations. Machine learning provides a powerful information processing framework that can augment, and possibly even transform, current lines of fluid mechanics research and industrial applications."
to:NB  machine_learning  hydrodynamics  statistics 
may 2019 by cshalizi
Artificial Intelligence: The Ambiguous Labor Market Impact of Automating Prediction
"Recent advances in artificial intelligence are primarily driven by machine learning, a prediction technology. Prediction is useful because it is an input into decision-making. In order to appreciate the impact of artificial intelligence on jobs, it is important to understand the relative roles of prediction and decision tasks. We describe and provide examples of how artificial intelligence will affect labor, emphasizing differences between when the automation of prediction leads to automating decisions versus enhancing decision-making by humans."
to:NB  economics  machine_learning 
may 2019 by cshalizi
A Spline Theory of Deep Learning
"We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input. This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization. Going further, we propose a simple penalty term that can be added to the cost function of any DN learning algorithm to force the templates to be orthogonal with each other; this leads to significantly improved classification performance and reduced overfitting with no change to the DN architecture. The spline partition of the input signal space opens up a new geometric avenue to study how DNs organize signals in a hierarchical fashion. As an application, we develop and validate a new distance metric for signals that quantifies the difference between their partition encodings."
to:NB  to_read  approximation  splines  neural_networks  machine_learning  your_favorite_deep_neural_network_sucks  via:csantos 
april 2019 by cshalizi
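The claim in that abstract that, conditioned on the input, the network's output is an affine function of the input can be checked by hand on a toy case. The sketch below assumes a one-hidden-layer ReLU network (one member of the class the abstract describes); the helper names and the weights used in the usage note are made up for illustration and are not from the paper.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def relu_net(x, w1, b1, w2, b2):
    """Output of a one-hidden-layer ReLU network: W2 relu(W1 x + b1) + b2."""
    hidden = [max(0.0, h + b) for h, b in zip(matvec(w1, x), b1)]
    return [o + b for o, b in zip(matvec(w2, hidden), b2)]


def local_affine_map(x, w1, b1, w2, b2):
    """Return (A, c) such that the network equals A z + c for every input z
    that switches on the same hidden units as x."""
    pre = [h + b for h, b in zip(matvec(w1, x), b1)]
    mask = [1.0 if p > 0 else 0.0 for p in pre]  # which ReLUs are active at x
    units = range(len(mask))
    # A = W2 diag(mask) W1
    a = [[sum(w2[i][k] * mask[k] * w1[k][j] for k in units)
          for j in range(len(x))]
         for i in range(len(w2))]
    # c = W2 diag(mask) b1 + b2
    c = [sum(w2[i][k] * mask[k] * b1[k] for k in units) + b2[i]
         for i in range(len(w2))]
    return a, c
```

For example, with `w1 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 0.5]]`, `b1 = [0.0, -1.0, 0.5]`, `w2 = [[1.0, -1.0, 2.0]]`, `b2 = [0.1]` and `x = [1.0, 2.0]`, applying the returned `A` and `c` to `x` reproduces `relu_net(x, ...)` up to floating-point error. The rows of `A` play the role of the input-dependent templates the abstract mentions.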
[1902.03515] Multi-Domain Translation by Learning Uncoupled Autoencoders
"Multi-domain translation seeks to learn a probabilistic coupling between marginal distributions that reflects the correspondence between different domains. We assume that data from different domains are generated from a shared latent representation based on a structural equation model. Under this assumption, we show that the problem of computing a probabilistic coupling between marginals is equivalent to learning multiple uncoupled autoencoders that embed to a given shared latent distribution. In addition, we propose a new framework and algorithm for multi-domain translation based on learning the shared latent distribution and training autoencoders under distributional constraints. A key practical advantage of our framework is that new autoencoders (i.e., new domains) can be added sequentially to the model without retraining on the other domains, which we demonstrate experimentally on image as well as genomics datasets."

--- Last tag is tentative.
to:NB  machine_learning  to_read  inference_to_latent_objects  uhler.caroline  factor_analysis 
april 2019 by cshalizi
Emergence of analogy from relation learning | PNAS
"By middle childhood, humans are able to learn abstract semantic relations (e.g., antonym, synonym, category membership) and use them to reason by analogy. A deep theoretical challenge is to show how such abstract relations can arise from nonrelational inputs, thereby providing key elements of a protosymbolic representation system. We have developed a computational model that exploits the potential synergy between deep learning from “big data” (to create semantic features for individual words) and supervised learning from “small data” (to create representations of semantic relations between words). Given as inputs labeled pairs of lexical representations extracted by deep learning, the model creates augmented representations by remapping features according to the rank of differences between values for the two words in each pair. These augmented representations aid in coping with the feature alignment problem (e.g., matching those features that make “love-hate” an antonym with the different features that make “rich-poor” an antonym). The model extracts weight distributions that are used to estimate the probabilities that new word pairs instantiate each relation, capturing the pattern of human typicality judgments for a broad range of abstract semantic relations. A measure of relational similarity can be derived and used to solve simple verbal analogies with human-level accuracy. Because each acquired relation has a modular representation, basic symbolic operations are enabled (notably, the converse of any learned relation can be formed without additional training). Abstract semantic relations can be induced by bootstrapping from nonrelational inputs, thereby enabling relational generalization and analogical reasoning."
to:NB  machine_learning  analogy  artificial_intelligence 
march 2019 by cshalizi
Comparing continual task learning in minds and machines | PNAS
"Humans can learn to perform multiple tasks in succession over the lifespan (“continual” learning), whereas current machine learning systems fail. Here, we investigated the cognitive mechanisms that permit successful continual learning in humans and harnessed our behavioral findings for neural network design. Humans categorized naturalistic images of trees according to one of two orthogonal task rules that were learned by trial and error. Training regimes that focused on individual rules for prolonged periods (blocked training) improved human performance on a later test involving randomly interleaved rules, compared with control regimes that trained in an interleaved fashion. Analysis of human error patterns suggested that blocked training encouraged humans to form “factorized” representation that optimally segregated the tasks, especially for those individuals with a strong prior bias to represent the stimulus space in a well-structured way. By contrast, standard supervised deep neural networks trained on the same tasks suffered catastrophic forgetting under blocked training, due to representational interference in the deeper layers. However, augmenting deep networks with an unsupervised generative model that allowed it to first learn a good embedding of the stimulus space (similar to that observed in humans) reduced catastrophic forgetting under blocked training. Building artificial agents that first learn a model of the world may be one promising route to solving continual task performance in artificial intelligence research."
to:NB  machine_learning  neural_networks  cognitive_science  experimental_psychology 
october 2018 by cshalizi
[1709.05862] Recognizing Objects In-the-wild: Where Do We Stand?
"The ability to recognize objects is an essential skill for a robotic system acting in human-populated environments. Despite decades of effort from the robotic and vision research communities, robots are still missing good visual perceptual systems, preventing the use of autonomous agents for real-world applications. The progress is slowed down by the lack of a testbed able to accurately represent the world perceived by the robot in-the-wild. In order to fill this gap, we introduce a large-scale, multi-view object dataset collected with an RGB-D camera mounted on a mobile robot. The dataset embeds the challenges faced by a robot in a real-life application and provides a useful tool for validating object recognition algorithms. Besides describing the characteristics of the dataset, the paper evaluates the performance of a collection of well-established deep convolutional networks on the new dataset and analyzes the transferability of deep representations from Web images to robotic data. Despite the promising results obtained with such representations, the experiments demonstrate that object classification with real-life robotic data is far from being solved. Finally, we provide a comparative study to analyze and highlight the open challenges in robot vision, explaining the discrepancies in the performance."
to:NB  machine_learning  neural_networks  your_favorite_deep_neural_network_sucks  classifiers  to_read  via:melanie_mitchell 
october 2018 by cshalizi
Hello World | W. W. Norton & Company
"If you were accused of a crime, who would you rather decide your sentence—a mathematically consistent algorithm incapable of empathy or a compassionate human judge prone to bias and error? What if you want to buy a driverless car and must choose between one programmed to save as many lives as possible and another that prioritizes the lives of its own passengers? And would you agree to share your family’s full medical history if you were told that it would help researchers find a cure for cancer?
"These are just some of the dilemmas that we are beginning to face as we approach the age of the algorithm, when it feels as if the machines reign supreme. Already, these lines of code are telling us what to watch, where to go, whom to date, and even whom to send to jail. But as we rely on algorithms to automate big, important decisions—in crime, justice, healthcare, transportation, and money—they raise questions about what we want our world to look like. What matters most: Helping doctors with diagnosis or preserving privacy? Protecting victims of crime or preventing innocent people being falsely accused?
"Hello World takes us on a tour through the good, the bad, and the downright ugly of the algorithms that surround us on a daily basis. Mathematician Hannah Fry reveals their inner workings, showing us how algorithms are written and implemented, and demonstrates the ways in which human bias can literally be written into the code. By weaving in relatable, real world stories with accessible explanations of the underlying mathematics that power algorithms, Hello World helps us to determine their power, expose their limitations, and examine whether they really are an improvement on the human systems they replace."
to:NB  books:noted  data_mining  machine_learning  prediction 
september 2018 by cshalizi
Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning
"Randomized neural networks are immortalized in this AI Koan: In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. “What are you doing?” asked Minsky. “I am training a randomly wired neural net to play tic-tac-toe,” Sussman replied. “Why is the net wired randomly?” asked Minsky. Sussman replied, “I do not want it to have any preconceptions of how to play.” Minsky then shut his eyes. “Why do you close your eyes?” Sussman asked his teacher. “So that the room will be empty,” replied Minsky. At that moment, Sussman was enlightened. We analyze shallow random networks with the help of concentration of measure inequalities. Specifically, we consider architectures that compute a weighted sum of their inputs after passing them through a bank of arbitrary randomized nonlinearities. We identify conditions under which these networks exhibit good classification performance, and bound their test error in terms of the size of the dataset and the number of random nonlinearities."

--- Have I never bookmarked this before?
in_NB  approximation  kernel_methods  random_projections  statistics  prediction  classifiers  rahimi.ali  recht.benjamin  machine_learning  have_read 
september 2018 by cshalizi
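
The "weighted sum of randomized nonlinearities" in the Rahimi and Recht abstract above can be sketched in a few lines. This is a toy illustration and not the paper's procedure: the cosine features, the Gaussian frequency scale, the ridge-regularised least-squares fit of the output weights, and the made-up one-dimensional data are all my assumptions. The one thing it shares with the abstract is that the hidden layer is drawn at random and never trained.

```python
import numpy as np

def random_features(X, W, b):
    """Pass inputs through a bank of random cosine nonlinearities."""
    return np.cos(X @ W + b)

def fit_random_kitchen_sinks(X, y, n_features=200, scale=2.0, ridge=1e-3, seed=0):
    """Draw the hidden layer at random; fit only the output weights."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Random hidden-layer parameters: drawn once, never optimised.
    W = rng.normal(0.0, scale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    Z = random_features(X, W, b)
    # Ridge-regularised least squares for the output weights (an assumed
    # stand-in for whatever loss one would really minimise).
    A = Z.T @ Z + ridge * np.eye(n_features)
    alpha = np.linalg.solve(A, Z.T @ y)
    return W, b, alpha

def predict(X, W, b, alpha):
    """Weighted sum of the random features."""
    return random_features(X, W, b) @ alpha

# Made-up toy problem: classify points on a line by the sign of sin(2x).
rng = np.random.default_rng(1)
X_train = rng.uniform(-3.0, 3.0, size=(400, 1))
y_train = np.where(np.sin(2.0 * X_train[:, 0]) > 0, 1.0, -1.0)
X_test = rng.uniform(-3.0, 3.0, size=(200, 1))
y_test = np.where(np.sin(2.0 * X_test[:, 0]) > 0, 1.0, -1.0)

W, b, alpha = fit_random_kitchen_sinks(X_train, y_train)
accuracy = float(np.mean(np.sign(predict(X_test, W, b, alpha)) == y_test))
print(f"test accuracy: {accuracy:.3f}")
```

Randomization replaces minimization only for the hidden layer; the output weights still come from solving a (convex) fitting problem.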
Archive ouverte HAL - The Great Regression. Machine Learning, Econometrics, and the Future of Quantitative Social Sciences
"What can machine learning do for (social) scientific analysis, and what can it do to it? A contribution to the emerging debate on the role of machine learning for the social sciences, this article offers an introduction to this class of statistical techniques. It details its premises, logic, and the challenges it faces. This is done by comparing machine learning to more classical approaches to quantification – most notably parametric regression– both at a general level and in practice. The article is thus an intervention in the contentious debates about the role and possible consequences of adopting statistical learning in science. We claim that the revolution announced by many and feared by others will not happen any time soon, at least not in the terms that both proponents and critics of the technique have spelled out. The growing use of machine learning is not so much ushering in a radically new quantitative era as it is fostering an increased competition between the newly termed classic method and the learning approach. This, in turn, results in more uncertainty with respect to quantified results. Surprisingly enough, this may be good news for knowledge overall."

--- The correct line here is that 90%+ of "machine learning" is rebranded non-parametric regression, which is what the social sciences should have been doing all along anyway, because they have no good theories which suggest particular parametric forms. (Partial exceptions: demography and epidemiology.) If the resulting confidence sets are bigger than they'd like, that's still the actual range of uncertainty they need to live with, until they can reduce it with more and better empirical information, or additional constraints from well-supported theories. (Arguably, this was all in Haavelmo.) I look forward to seeing whether this paper grasps these obvious truths.
to:NB  to_read  regression  social_science_methodology  machine_learning  via:phnk  econometrics  to_be_shot_after_a_fair_trial 
august 2018 by cshalizi
[1406.2661] Generative Adversarial Networks
"We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples."

--- Kind of astonishing that the phrases "actor-critic" or "co-evolution" do not appear anywhere in the paper. The bit about KL divergence is nice, though. (But even then, it doesn't involve multi-layer perceptrons at all, it's all at the level of probability distributions and could apply to any learning system.)
to:NB  have_read  neural_networks  machine_learning  computational_statistics  statistics 
june 2018 by cshalizi
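
For reference, the minimax game and the divergence result mentioned in the note on the GAN paper above, as I recall them from the paper (here p_g is the distribution of G's samples and p_z the noise distribution fed to G):

```latex
% Two-player minimax objective
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z}\left[\log\bigl(1 - D(G(z))\bigr)\right]

% Optimal discriminator for a fixed generator
D^*_G(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% Generator's criterion once D is optimal
C(G) = \max_D V(D, G)
  = -\log 4
  + \mathrm{KL}\!\left(p_{\mathrm{data}} \,\middle\|\, \tfrac{p_{\mathrm{data}} + p_g}{2}\right)
  + \mathrm{KL}\!\left(p_g \,\middle\|\, \tfrac{p_{\mathrm{data}} + p_g}{2}\right)
```

Both KL terms vanish exactly when p_g equals the data distribution, at which point D^*_G is 1/2 everywhere, matching the abstract. Nothing in these three lines refers to how G or D is parameterized.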
[1805.10204] Adversarial examples from computational constraints
"Why are classifiers in high dimension vulnerable to "adversarial" perturbations? We show that it is likely not due to information theoretic limitations, but rather it could be due to computational constraints.
"First we prove that, for a broad set of classification tasks, the mere existence of a robust classifier implies that it can be found by a possibly exponential-time algorithm with relatively few training examples. Then we give a particular classification task where learning a robust classifier is computationally intractable. More precisely we construct a binary classification task in high dimensional space which is (i) information theoretically easy to learn robustly for large perturbations, (ii) efficiently learnable (non-robustly) by a simple linear separator, (iii) yet is not efficiently robustly learnable, even for small perturbations, by any algorithm in the statistical query (SQ) model. This example gives an exponential separation between classical learning and robust learning in the statistical query model. It suggests that adversarial examples may be an unavoidable byproduct of computational limitations of learning algorithms."
in_NB  adversarial_examples  computational_complexity  machine_learning  classifiers  have_read  bubeck.sebastien 
may 2018 by cshalizi
Artificial Intelligence — The Revolution Hasn’t Happened Yet
Unsurprisingly, Michael Jordan talks sense.

(Trivial and unrelated rant: What on Earth is the point of using Medium? It takes a post which is about 24k of text and actual formatting, and bloats it to over 150k, to do, so far as I can see, absolutely nothing of value to readers.)
artificial_intelligence  debunking  machine_learning  jordan.michael_i. 
april 2018 by cshalizi
[1802.08232] The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets
"Machine learning models based on neural networks and deep learning are being rapidly adopted for many purposes. What those models learn, and what they may share, is a significant concern when the training data may contain secrets and the models are public -- e.g., when a model helps users compose text messages using models trained on all users' messages.
"This paper presents exposure: a simple-to-compute metric that can be applied to any deep learning model for measuring the memorization of secrets. Using this metric, we show how to extract those secrets efficiently using black-box API access. Further, we show that unintended memorization occurs early, is not due to over-fitting, and is a persistent issue across different types of models, hyperparameters, and training strategies. We experiment with both real-world models (e.g., a state-of-the-art translation model) and datasets (e.g., the Enron email dataset, which contains users' credit card numbers) to demonstrate both the utility of measuring exposure and the ability to extract secrets.
"Finally, we consider many defenses, finding some ineffective (like regularization), and others to lack guarantees. However, by instantiating our own differentially-private recurrent model, we validate that by appropriately investing in the use of state-of-the-art techniques, the problem can be resolved, with high utility."

--- I wonder how hard this would be to replicate with a support-vector machine?
machine_learning  statistics  privacy  neural_networks  via:kjhealy  your_favorite_deep_neural_network_sucks 
march 2018 by cshalizi
Asking the Right Questions About AI – Yonatan Zunger – Medium
Pretty good, reading "machine learning", or even "statistical modeling", for "artificial intelligence" throughout (as he more or less admits up front). Worth teaching in particular for the black-faces-as-gorillas disaster.
machine_learning  data_mining  to_teach:data-mining 
february 2018 by cshalizi
[1801.00631] Deep Learning: A Critical Appraisal
"Although deep learning has historical roots going back decades, neither the term "deep learning" nor the approach was popular just over five years ago, when the field was reignited by papers such as Krizhevsky, Sutskever and Hinton's now classic (2012) deep network model of Imagenet. What has the field discovered in the five subsequent years? Against a background of considerable progress in areas such as speech recognition, image recognition, and game playing, and considerable enthusiasm in the popular press, I present ten concerns for deep learning, and suggest that deep learning must be supplemented by other techniques if we are to reach artificial general intelligence."

--- Nothing startling, but sound.
in_NB  artificial_intelligence  neural_networks  marcus.gary  have_read  machine_learning  statistics 
january 2018 by cshalizi
Artificial Unintelligence: How Computers Misunderstand the World | The MIT Press
"In Artificial Unintelligence, Meredith Broussard argues that our collective enthusiasm for applying computer technology to every aspect of life has resulted in a tremendous amount of poorly designed systems. We are so eager to do everything digitally—hiring, driving, paying bills, even choosing romantic partners—that we have stopped demanding that our technology actually work. Broussard, a software developer and journalist, reminds us that there are fundamental limits to what we can (and should) do with technology. With this book, she offers a guide to understanding the inner workings and outer limits of technology—and issues a warning that we should never assume that computers always get things right.
"Making a case against technochauvinism—the belief that technology is always the solution—Broussard argues that it’s just not true that social problems would inevitably retreat before a digitally enabled Utopia. To prove her point, she undertakes a series of adventures in computer programming. She goes for an alarming ride in a driverless car, concluding “the cyborg future is not coming any time soon”; uses artificial intelligence to investigate why students can’t pass standardized tests; deploys machine learning to predict which passengers survived the Titanic disaster; and attempts to repair the U.S. campaign finance system by building AI software. If we understand the limits of what we can do with technology, Broussard tells us, we can make better choices about what we should do with it to make the world better for everyone."
in_NB  books:noted  machine_learning  artificial_intelligence  computers  data_analysis  to_teach:data-mining  from_library 
december 2017 by cshalizi
Machine learning, social learning and the governance of self-driving cars ---Social Studies of Science - Jack Stilgoe, 2017
"Self-driving cars, a quintessentially ‘smart’ technology, are not born smart. The algorithms that control their movements are learning as the technology emerges. Self-driving cars represent a high-stakes test of the powers of machine learning, as well as a test case for social learning in technology governance. Society is learning about the technology while the technology learns about society. Understanding and governing the politics of this technology means asking ‘Who is learning, what are they learning and how are they learning?’ Focusing on the successes and failures of social learning around the much-publicized crash of a Tesla Model S in 2016, I argue that trajectories and rhetorics of machine learning in transport pose a substantial governance challenge. ‘Self-driving’ or ‘autonomous’ cars are misnamed. As with other technologies, they are shaped by assumptions about social needs, solvable problems, and economic opportunities. Governing these technologies in the public interest means improving social learning by constructively engaging with the contingencies of machine learning."

--- The fact that I could almost have written the abstract from just the journal and the title suggests there is little new here, but the last tag applies.
to:NB  machine_learning  robots_and_robotics  sociology  to_be_shot_after_a_fair_trial 
december 2017 by cshalizi
Optimized Pre-Processing for Discrimination Prevention
"Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the impact of limited sample size in accomplishing this objective. Two instances of the proposed optimization are applied to datasets, including one on real-world criminal recidivism. Results show that discrimination can be greatly reduced at a small cost in classification accuracy."
to:NB  classifiers  machine_learning  optimization  re:prediction-without-prejudice 
november 2017 by cshalizi
[1711.10337] Are GANs Created Equal? A Large-Scale Study
"Generative adversarial networks (GAN) are a powerful subclass of generative models. Despite a very rich research activity leading to numerous interesting GAN algorithms, it is still very hard to assess which algorithm(s) perform better than others. We conduct a neutral, multi-faceted large-scale empirical study on state-of-the-art models and evaluation measures. We find that most models can reach similar scores with enough hyperparameter optimization and random restarts. This suggests that improvements can arise from a higher computational budget and tuning more than fundamental algorithmic changes. To overcome some limitations of the current metrics, we also propose several data sets on which precision and recall can be computed. Our experimental results suggest that future GAN research should be based on more systematic and objective evaluation procedures. Finally, we did not find evidence that any of the tested algorithms consistently outperforms the original one."
to:NB  neural_networks  machine_learning  optimization  your_favorite_deep_neural_network_sucks 
november 2017 by cshalizi
[1707.05589] On the State of the Art of Evaluation in Neural Language Models
"Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks. However, these have been evaluated using differing code bases and limited computational resources, which represent uncontrolled sources of experimental variation. We reevaluate several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrive at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models. We establish a new state of the art on the Penn Treebank and Wikitext-2 corpora, as well as strong baselines on the Hutter Prize dataset."
to:NB  natural_language_processing  statistics  machine_learning  neural_networks  your_favorite_deep_neural_network_sucks 
november 2017 by cshalizi
[1706.02744] Avoiding Discrimination through Causal Reasoning
"Recent work on fairness in machine learning has focused on various statistical discrimination criteria and how they trade off. Most of these criteria are observational: They depend only on the joint distribution of predictor, protected attribute, features, and outcome. While convenient to work with, observational criteria have severe inherent limitations that prevent them from resolving matters of fairness conclusively.
"Going beyond observational criteria, we frame the problem of discrimination based on protected attributes in the language of causal reasoning. This viewpoint shifts attention from "What is the right fairness criterion?" to "What do we want to assume about the causal data generating process?" Through the lens of causality, we make several contributions. First, we crisply articulate why and when observational criteria fail, thus formalizing what was before a matter of opinion. Second, our approach exposes previously ignored subtleties and why they are fundamental to the problem. Finally, we put forward natural causal non-discrimination criteria and develop algorithms that satisfy them."
to:NB  to_read  causality  algorithmic_fairness  prediction  machine_learning  janzing.dominik  re:ADAfaEPoV  via:arsyed 
november 2017 by cshalizi
[1711.00867] The (Un)reliability of saliency methods
"Saliency methods aim to explain the predictions of deep neural networks. These methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction. We use a simple and common pre-processing step ---adding a constant shift to the input data--- to show that a transformation with no effect on the model can cause numerous methods to incorrectly attribute. In order to guarantee reliability, we posit that methods should fulfill input invariance, the requirement that a saliency method mirror the sensitivity of the model with respect to transformations of the input. We show, through several examples, that saliency methods that do not satisfy input invariance result in misleading attribution."
to:NB  neural_networks  machine_learning  credit_attribution  via:?  your_favorite_deep_neural_network_sucks 
november 2017 by cshalizi
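
The constant-shift argument in the saliency abstract above can be checked by hand on a linear model. A minimal sketch, under my own assumptions: the weights, the shift, and the choice of gradient-times-input as the attribution that breaks are illustrative, not taken from the paper's experiments. Two models are built to give identical outputs, one on raw inputs and one on shifted inputs with a compensating bias.

```python
def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical linear model on raw inputs.
w = [2.0, -1.0, 0.5]
b = 0.1

# Pre-processing step: add the same constant to every input coordinate.
shift = [10.0, 10.0, 10.0]

# Second model sees shifted inputs; its bias absorbs the shift,
# so it computes exactly the same function of the underlying data.
b_shifted = b - dot(w, shift)

def model(x):
    return dot(w, x) + b

def model_shifted(x_s):
    return dot(w, x_s) + b_shifted

x = [1.0, 2.0, 3.0]
x_s = [xi + si for xi, si in zip(x, shift)]

out_a = model(x)
out_b = model_shifted(x_s)

# Plain gradient saliency: d(output)/d(input) is w for both models.
grad_a = list(w)
grad_b = list(w)

# Gradient-times-input attribution: depends on the input's offset.
gxi_a = [g * xi for g, xi in zip(grad_a, x)]
gxi_b = [g * xi for g, xi in zip(grad_b, x_s)]

print("outputs:", out_a, out_b)
print("gradient:", grad_a, grad_b)
print("gradient * input:", gxi_a, gxi_b)
```

The two models agree on every prediction and on the plain gradient, yet gradient-times-input hands back different attributions, which is the failure of input invariance the abstract describes.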
Machine Learners | The MIT Press
"Machine learning—programming computers to learn from data—has spread across scientific disciplines, media, entertainment, and government. Medical research, autonomous vehicles, credit transaction processing, computer gaming, recommendation systems, finance, surveillance, and robotics use machine learning. Machine learning devices (sometimes understood as scientific models, sometimes as operational algorithms) anchor the field of data science. They have also become mundane mechanisms deeply embedded in a variety of systems and gadgets. In contexts from the everyday to the esoteric, machine learning is said to transform the nature of knowledge. In this book, Adrian Mackenzie investigates whether machine learning also transforms the practice of critical thinking.
"Mackenzie focuses on machine learners—either humans and machines or human-machine relations—situated among settings, data, and devices. The settings range from fMRI to Facebook; the data anything from cat images to DNA sequences; the devices include neural networks, support vector machines, and decision trees. He examines specific learning algorithms—writing code and writing about code—and develops an archaeology of operations that, following Foucault, views machine learning as a form of knowledge production and a strategy of power. Exploring layers of abstraction, data infrastructures, coding practices, diagrams, mathematical formalisms, and the social organization of machine learning, Mackenzie traces the mostly invisible architecture of one of the central zones of contemporary technological cultures.
"Mackenzie’s account of machine learning locates places in which a sense of agency can take root. His archaeology of the operational formation of machine learning does not unearth the footprint of a strategic monolith but reveals the local tributaries of force that feed into the generalization and plurality of the field."

--- We really need good histories, and critical studies, of ML, but the mere style of rhetoric here makes me suspicious of whether the author is up to the job. (Which is a prejudice...) Last tag applies.
to:NB  books:noted  machine_learning  philosophy_of_science  in_wishlist  to_be_shot_after_a_fair_trial 
september 2017 by cshalizi
Deep Convolutional Neural Networks for Image Classification: A Comprehensive Review | Neural Computation | MIT Press Journals
"Convolutional neural networks (CNNs) have been applied to visual tasks since the late 1980s. However, despite a few scattered applications, they were dormant until the mid-2000s when developments in computing power and the advent of large amounts of labeled data, supplemented by improved algorithms, contributed to their advancement and brought them to the forefront of a neural network renaissance that has seen rapid progression since 2012. In this review, which focuses on the application of CNNs to image classification tasks, we cover their development, from their predecessors up to recent state-of-the-art deep learning systems. Along the way, we analyze (1) their early successes, (2) their role in the deep learning renaissance, (3) selected symbolic works that have contributed to their recent popularity, and (4) several improvement attempts by reviewing contributions and challenges of over 300 publications. We also introduce some of their current trends and remaining challenges."
to:NB  neural_networks  classifiers  machine_learning  to_read 
august 2017 by cshalizi
The Humans Working Behind the AI Curtain
This is _not_ what we mean when we talk about using computers to expand human capacities.
artificial_intelligence  machine_learning  networked_life  to_teach:data-mining 
january 2017 by cshalizi
Deep Learning | The MIT Press
"Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning.
"The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models.
"Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors."

--- Bengio is worth listening to.
in_NB  books:noted  neural_networks  machine_learning  computational_statistics  bengio.yoshua  books:owned 
december 2016 by cshalizi
Turing learning: a metric-free approach to inferring behavior and its application to swarms | SpringerLink
"We propose Turing Learning, a novel system identification method for inferring the behavior of natural or artificial systems. Turing Learning simultaneously optimizes two populations of computer programs, one representing models of the behavior of the system under investigation, and the other representing classifiers. By observing the behavior of the system as well as the behaviors produced by the models, two sets of data samples are obtained. The classifiers are rewarded for discriminating between these two sets, that is, for correctly categorizing data samples as either genuine or counterfeit. Conversely, the models are rewarded for ‘tricking’ the classifiers into categorizing their data samples as genuine. Unlike other methods for system identification, Turing Learning does not require predefined metrics to quantify the difference between the system and its models. We present two case studies with swarms of simulated robots and prove that the underlying behaviors cannot be inferred by a metric-based system identification method. By contrast, Turing Learning infers the behaviors with high accuracy. It also produces a useful by-product—the classifiers—that can be used to detect abnormal behavior in the swarm. Moreover, we show that Turing Learning also successfully infers the behavior of physical robot swarms. The results show that collective behaviors can be directly inferred from motion trajectories of individuals in the swarm, which may have significant implications for the study of animal collectives. Furthermore, Turing Learning could prove useful whenever a behavior is not easily characterizable using metrics, making it suitable for a wide range of applications."

--- Oh FFS. Co-evolutionary learning of classifiers and hard instances was an old idea when I encountered it in graduate school 20+ years ago. (See, e.g., the discussion of Hillis's work in the 1980s in ch. 1 of Mitchell's _Introduction to Genetic Algorithms_ [1996].) I suppose it's possible that the paper acknowledges this is a new implementation of an ancient idea, while the abstract (and the publicity: http://www.defenseone.com/technology/2016/09/new-ai-learns-through-observation-alone-what-means-drone-surveillance/131322/ ) is breathless. It's _possible_.

(Also: anyone who thinks that using classification accuracy means they're doing "metric-free systems identification" fully deserves what will happen to them.)
machine_learning  reinventing_the_wheel_and_putting_out_a_press_release  to_be_shot_after_a_fair_trial  why_oh_why_cant_we_have_a_better_academic_publishing_system 
september 2016 by cshalizi
[1412.1897] Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
"Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects, which we call "fooling images" (more generally, fooling examples). Our results shed light on interesting differences between human vision and current DNNs, and raise questions about the generality of DNN computer vision."
in_NB  neural_networks  classifiers  machine_learning  have_read  to:blog  via:gptp2016  adversarial_examples 
may 2016 by cshalizi
[1604.00289] Building Machines That Learn and Think Like People
"Recent progress in artificial intelligence (AI) has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn, and how they learn it. Specifically, we argue that these machines should (a) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (b) ground learning in intuitive theories of physics and psychology, to support and enrich the knowledge that is learned; and (c) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes towards these goals that can combine the strengths of recent neural network advances with more structured cognitive models."
in_NB  cognitive_science  artificial_intelligence  neural_networks  machine_learning  via:arthegall  gershman.samuel  tenenbaum.joshua 
may 2016 by cshalizi
[1602.04938] "Why Should I Trust You?": Explaining the Predictions of Any Classifier
"Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust in a model. Trust is fundamental if one plans to take action based on a prediction, or when choosing whether or not to deploy a new model. Such understanding further provides insights into the model, which can be used to turn an untrustworthy model or prediction into a trustworthy one.
"In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We further propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). The usefulness of explanations is shown via novel experiments, both simulated and with human subjects. Our explanations empower users in various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and detecting why a classifier should not be trusted."
to:NB  machine_learning  classifiers  explanation  guestrin.carlos  to_read  model_checking 
march 2016 by cshalizi
Cat Basis Pursuit
Will the statistical machine learning reading group meet on 1 April?
machine_learning  cats  funny:geeky  principal_components  sparsity 
february 2016 by cshalizi
Sequential Testing for Large Scale Learning
"We argue that when faced with big data sets, learning and inference algorithms should compute updates using only subsets of data items. We introduce algorithms that use sequential hypothesis tests to adaptively select such a subset of data points. The statistical properties of this subsampling process can be used to control the efficiency and accuracy of learning or inference. In the context of learning by optimization, we test for the probability that the update direction is no more than 90 degrees in the wrong direction. In the context of posterior inference using Markov chain Monte Carlo, we test for the probability that our decision to accept or reject a sample is wrong. We experimentally evaluate our algorithms on a number of models and data sets."
to:NB  computational_statistics  hypothesis_testing  machine_learning  optimization  monte_carlo 
december 2015 by cshalizi
Learning the Structure of Causal Models with Relational and Temporal Dependence
"Many real-world domains are inherently rela- tional and temporal—they consist of heteroge- neous entities that interact with each other over time. Effective reasoning about causality in such domains requires representations that explicitly model relational and temporal dependence. In this work, we provide a formalization of tem- poral relational models. We define temporal ex- tensions to abstract ground graphs—a lifted rep- resentation that abstracts paths of dependence over all possible ground graphs. Temporal ab- stract ground graphs enable a sound and com- plete method for answering d-separation queries on temporal relational models. These methods provide the foundation for a constraint-based al- gorithm, TRCD, that learns causal models from temporal relational data. We provide experimen- tal evidence that demonstrates the need to explic- itly represent time when inferring causal depen- dence. We also demonstrate the expressive gain of TRCD compared to earlier algorithms that do not explicitly represent time."
in_NB  causal_discovery  relational_learning  machine_learning  statistics  jensen.david  heard_the_talk 
july 2015 by cshalizi
Random Features for Large-Scale Kernel Machines
"To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. The features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shift- invariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning al- gorithms applied to these features outperform state-of-the-art large-scale kernel machines."
in_NB  to_read  machine_learning  computational_statistics  approximation  random_projections  fourier_analysis  kernel_methods  statistics  re:hyperbolic_networks 
july 2015 by cshalizi
Training generative neural networks via maximum mean discrepancy optimization
"We consider training a deep neural network to generate samples from an unknown distribu- tion given i.i.d. data. We frame learning as an optimization minimizing a two-sample test statistic—informally speaking, a good genera- tor network produces samples that cause a two- sample test to fail to reject the null hypothesis. As our two-sample test statistic, we use an un- biased estimate of the maximum mean discrep- ancy, which is the centerpiece of the nonpara- metric kernel two-sample test proposed by Gret- ton et al. [2]. We compare to the adversar- ial nets framework introduced by Goodfellow et al. [1], in which learning is a two-player game between a generator network and an adversarial discriminator network, both trained to outwit the other. From this perspective, the MMD statistic plays the role of the discriminator. In addition to empirical comparisons, we prove bounds on the generalization error incurred by optimizing the empirical MMD."

--- On first glance, there's no obvious limitation to neural networks, and indeed it's rather suggestive of indirect inference (to me)
to:NB  simulation  stochastic_models  neural_networks  machine_learning  two-sample_tests  hypothesis_testing  nonparametrics  kernel_methods  statistics  computational_statistics  ghahramani.zoubin 
july 2015 by cshalizi
[1506.05900] Representation Learning for Clustering: A Statistical Framework
"We address the problem of communicating domain knowledge from a user to the designer of a clustering algorithm. We propose a protocol in which the user provides a clustering of a relatively small random sample of a data set. The algorithm designer then uses that sample to come up with a data representation under which k-means clustering results in a clustering (of the full data set) that is aligned with the user's clustering. We provide a formal statistical model for analyzing the sample complexity of learning a clustering representation with this paradigm. We then introduce a notion of capacity of a class of possible representations, in the spirit of the VC-dimension, showing that classes of representations that have finite such dimension can be successfully learned with sample size error bounds, and end our discussion with an analysis of that dimension for classes of representations induced by linear embeddings."
to:NB  machine_learning  representation  learning_theory  clustering  vc-dimension 
july 2015 by cshalizi
Suddenly, a leopard print sofa appears
Actually, there is a bit of an evolutionary-ecological puzzle here: given that our ancestral environments would have contained many more big cats than sofas, why _doesn't_ our perceptual system just leap from leopard-spots to leopard? (My suspicion is that this must have to do with environments, plural, but that's just a hand-wave.) Also, Eberhardt's "causal classifiers" might in principle be able to help with this...
via:?  machine_learning  classifiers  neural_networks  pattern_recognition  to:blog 
june 2015 by cshalizi
[1412.6572] Explaining and Harnessing Adversarial Examples
"Several machine learning models, including neural networks, consistently misclassify adversarial examples---inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence. Early attempts at explaining this phenomenon focused on nonlinearity and overfitting. We argue instead that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature. This explanation is supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets. Moreover, this view yields a simple and fast method of generating adversarial examples. Using this approach to provide examples for adversarial training, we reduce the test set error of a maxout network on the MNIST dataset."
in_NB  machine_learning  neural_networks  classifiers  to:blog  adversarial_examples 
march 2015 by cshalizi
[1412.6621] Why does Deep Learning work? - A perspective from Group Theory
"Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning.
"One factor behind the recent resurgence of the subject is a key algorithmic step called pre-training: first search for a good generative model for the input samples, and repeat the process one layer at a time. We show deeper implications of this simple principle, by establishing a connection with the interplay of orbits and stabilizers of group actions. Although the neural networks themselves may not form groups, we show the existence of {\em shadow} groups whose elements serve as close approximations.
"Over the shadow groups, the pre-training step, originally introduced as a mechanism to better initialize a network, becomes equivalent to a search for features with minimal orbits. Intuitively, these features are in a way the {\em simplest}. Which explains why a deep learning network learns simple features first. Next, we show how the same principle, when repeated in the deeper layers, can capture higher order representations, and why representation complexity increases as the layers get deeper."

--- Slightly dubious, but S.V. is always worth attending to.
to:NB  neural_networks  algebra  machine_learning 
january 2015 by cshalizi
[1412.2309] Visual Causal Feature Learning
"We react to what we see, but what exactly is it that we react to? What are the visual causes of be- havior? Can we identify such causes from raw image data? If the visual features are causes, how can we manipulate them? Here we provide a rigorous definition of the visual cause of a behavior that is broadly applicable to the visually driven behavior in humans, animals, neurons, robots and other per- ceiving systems. Our framework generalizes standard accounts of causal learning to settings in which the causal variables need to be constructed from microvariables (raw image pixels in this case). We prove the Causal Coarsening Theorem, which allows us to gain causal knowledge from observational data with minimal experimental effort. The theorem provides a connection to standard inference techniques in machine learning that identify features of an image that correlate with, but may not cause, the target behavior. Finally, we propose an active learning scheme to learn a manipulator function that performs optimal manipulations on the image to automatically identify the visual cause of a target behavior. We illustrate our inference and learning algorithms in experiments based on both synthetic and real data. To our knowledge, our account is the first demonstration of true causal feature learning in the literature."

--- Where by "heard the talk" I mean Frederick explained it while he was visiting...
in_NB  heard_the_talk  classifiers  machine_learning  causal_inference  kith_and_kin  eberhardt.frederick  have_read  to:blog  adversarial_examples 
december 2014 by cshalizi
Advanced Structured Prediction | The MIT Press
"The goal of structured prediction is to build machine learning models that predict relational information that itself has structure, such as being composed of multiple interrelated parts. These models, which reflect prior knowledge, task-specific relations, and constraints, are used in fields including computer vision, speech recognition, natural language processing, and computational biology. They can carry out such tasks as predicting a natural language sentence, or segmenting an image into meaningful components.
"These models are expressive and powerful, but exact computation is often intractable. A broad research effort in recent years has aimed at designing structured prediction models and approximate inference and learning procedures that are computationally efficient. This volume offers an overview of this recent research in order to make the work accessible to a broader research community. The chapters, by leading researchers in the field, cover a range of topics, including research trends, the linear programming relaxation approach, innovations in probabilistic modeling, recent theoretical progress, and resource-aware learning."
to:NB  books:noted  relational_learning  machine_learning  structured_data  statistics  computational_statistics  coveted  in_wishlist 
december 2014 by cshalizi
[1412.1897] Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
"Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects. Our results shed light on interesting differences between human vision and current DNNs, and raise questions about the generality of DNN computer vision."

--- The pictures really have to be seen to be believed.
in_NB  have_read  machine_learning  classifiers  neural_networks  to:blog  adversarial_examples 
december 2014 by cshalizi
[1211.4246] What Regularized Auto-Encoders Learn from the Data Generating Distribution
"What do auto-encoders learn about the underlying data generating distribution? Recent work suggests that some auto-encoder variants do a good job of capturing the local manifold structure of data. This paper clarifies some of these previous observations by showing that minimizing a particular form of regularized reconstruction error yields a reconstruction function that locally characterizes the shape of the data generating density. We show that the auto-encoder captures the score (derivative of the log-density with respect to the input). It contradicts previous interpretations of reconstruction error as an energy function. Unlike previous results, the theorems provided here are completely generic and do not depend on the parametrization of the auto-encoder: they show what the auto-encoder would tend to if given enough capacity and examples. These results are for a contractive training criterion we show to be similar to the denoising auto-encoder training criterion with small corruption noise, but with contraction applied on the whole reconstruction function rather than just encoder. Similarly to score matching, one can consider the proposed training criterion as a convenient alternative to maximum likelihood because it does not involve a partition function. Finally, we show how an approximate Metropolis-Hastings MCMC can be setup to recover samples from the estimated distribution, and this is confirmed in sampling experiments."
to:NB  neural_networks  machine_learning  to_read 
september 2014 by cshalizi
Spectral learning of weighted automata - Springer
"In recent years we have seen the development of efficient provably correct algorithms for learning Weighted Finite Automata (WFA). Most of these algorithms avoid the known hardness results by defining parameters beyond the number of states that can be used to quantify the complexity of learning automata under a particular distribution. One such class of methods are the so-called spectral algorithms that measure learning complexity in terms of the smallest singular value of some Hankel matrix. However, despite their simplicity and wide applicability to real problems, their impact in application domains remains marginal to this date. One of the goals of this paper is to remedy this situation by presenting a derivation of the spectral method for learning WFA that—without sacrificing rigor and mathematical elegance—puts emphasis on providing intuitions on the inner workings of the method and does not assume a strong background in formal algebraic methods. In addition, our algorithm overcomes some of the shortcomings of previous work and is able to learn from statistics of substrings. To illustrate the approach we present experiments on a real application of the method to natural language parsing."
to:NB  to_read  re:AoS_project  spectral_methods  automata_theory  grammar_induction  markov_models  state-space_models  learning_theory  machine_learning  statistics 
july 2014 by cshalizi
Adaptively learning probabilistic deterministic automata from data streams - Springer
"Markovian models with hidden state are widely-used formalisms for modeling sequential phenomena. Learnability of these models has been well studied when the sample is given in batch mode, and algorithms with PAC-like learning guarantees exist for specific classes of models such as Probabilistic Deterministic Finite Automata (PDFA). Here we focus on PDFA and give an algorithm for inferring models in this class in the restrictive data stream scenario: Unlike existing methods, our algorithm works incrementally and in one pass, uses memory sublinear in the stream length, and processes input items in amortized constant time. We also present extensions of the algorithm that (1) reduce to a minimum the need for guessing parameters of the target distribution and (2) are able to adapt to changes in the input distribution, relearning new models when needed. We provide rigorous PAC-like bounds for all of the above. Our algorithm makes a key usage of stream sketching techniques for reducing memory and processing time, and is modular in that it can use different tests for state equivalence and for change detection in the stream."
to:NB  to_read  re:AoS_project  grammar_induction  markov_models  state-space_models  automata_theory  machine_learning  statistics  learning_theory 
july 2014 by cshalizi
PAutomaC: a probabilistic automata and hidden Markov models learning competition - Springer
"Approximating distributions over strings is a hard learning problem. Typical techniques involve using finite state machines as models and attempting to learn these; these machines can either be hand built and then have their weights estimated, or built by grammatical inference techniques: the structure and the weights are then learned simultaneously. The Probabilistic Automata learning Competition (PAutomaC), run in 2012, was the first grammatical inference challenge that allowed the comparison between these methods and algorithms. Its main goal was to provide an overview of the state-of-the-art techniques for this hard learning problem. Both artificial data and real data were presented and contestants were to try to estimate the probabilities of strings. The purpose of this paper is to describe some of the technical and intrinsic challenges such a competition has to face, to give a broad state of the art concerning both the problems dealing with learning grammars and finite state machines and the relevant literature. This paper also provides the results of the competition and a brief description and analysis of the different approaches the main participants used."
to:NB  to_read  re:AoS_project  grammar_induction  markov_models  state-space_models  machine_learning 
july 2014 by cshalizi
[1312.6199] Intriguing properties of neural networks
"Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties.
"First, we find that there is no distinction between individual high level units and random linear combinations of high level units, according to various methods of unit analysis. It suggests that it is the space, rather than the individual units, that contains of the semantic information in the high layers of neural networks.
"Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extend. We can cause the network to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network's prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input."
in_NB  neural_networks  machine_learning  deep_learning  to:blog  via:?  have_read  adversarial_examples 
june 2014 by cshalizi
extrapolated art - Cambridge Machine Learning Group | Yarin Gal
"New techniques in machine learning and image processing allow us to extrapolate the scene of a painting to see what the full scenery might have looked like. Click on a painting to extrapolate it – new paintings added every week. "

--- This sounds very like an idea Bill "Vaguery" Tozier had, in regards to http://bactra.org/weblog/000025.html back in 2003...
art  machine_learning  spatial_statistics  via:arsyed  to_teach:data-mining 
april 2014 by cshalizi
How Do Humans Teach: On Curriculum Learning and Teaching Dimension
"We study the empirical strategies that humans follow as they teach a target concept with a simple 1D threshold to a robot. Previous studies of computational teaching, particularly the teaching dimension model and the curriculum learning principle, offer contradictory predictions on what optimal strategy the teacher should follow in this teaching task. We show through behavioral studies that humans employ three distinct teaching strategies, one of which is consistent with the curriculum learning principle, and propose a novel theoretical framework as a potential explanation for this strategy. This framework, which assumes a teaching goal of minimizing the learner's expected generalization error at each iteration, extends the standard teaching dimension model and offers a theoretical justification for curriculum learning."
to:NB  learning_theory  machine_learning  robotics 
march 2014 by cshalizi