**le-sigh** 22

Kiwi Hellenist: The citation problem

october 2018 by Vaguery

Let me re-state the problem. It didn’t occur to anyone, at any stage, that a research paper ought to look at research on the thing that the article is about. Why not?

science-and-humanities-sittin-in-a-tree
annexation-by-physics
digital-humanities
network-theory
le-sigh
academic-culture

IASC: The Hedgehog Review - Volume 20, No. 1 (Spring 2018) - Digital Metaphysics: The Cybernetic Idealism of Warren McCulloch

june 2018 by Vaguery

Kant famously stated that he had “awoken from his dogmatic slumber” by reading the Scottish Enlightenment philosopher David Hume. Hume maintained a bright line between “matters of fact” and “relations of ideas.” This meant that mental habit was central. If one wanted to form a meaningful sentence about the world (“this causes that”), then one would have to habituate the mind by noticing common correlations and regularly drawing the conclusion that one thing “caused” another. Kant disagreed. Cause, he reasoned, could not just be a mental habit, because it had a hidden premise: not that one thing followed another in time, but that it necessarily did so. To conceive of a necessity in the world was to add something more than habit to observation: to contribute a law to nature.

philosophy-of-science
philosophy
neural-networks
representation
cultural-assumptions
machine-learning
le-sigh
familiar-refrains-again-again

SocArXiv Papers | Exposure to Opposing Views can Increase Political Polarization: Evidence from a Large-Scale Field Experiment on Social Media

april 2018 by Vaguery

There is mounting concern that social media sites contribute to political polarization by creating "echo chambers" that insulate people from opposing views about current events. We surveyed a large sample of Democrats and Republicans who visit Twitter at least three times each week about a range of social policy issues. One week later, we randomly assigned respondents to a treatment condition in which they were offered financial incentives to follow a Twitter bot for one month that exposed them to messages produced by elected officials, organizations, and other opinion leaders with opposing political ideologies. Respondents were re-surveyed at the end of the month to measure the effect of this treatment, and at regular intervals throughout the study period to monitor treatment compliance. We find that Republicans who followed a liberal Twitter bot became substantially more conservative post-treatment, and Democrats who followed a conservative Twitter bot became slightly more liberal post-treatment. These findings have important implications for the interdisciplinary literature on political polarization as well as the emerging field of computational social science.

sociology
the-madness-of-crowds
polarization
social-norms
social-media
opinion
le-sigh

n-gate.com. we can't both be right.

july 2017 by Vaguery

An internet lectures passersby about webshit. The lectures are sprinkled with advertisements for an HTTP server that runs as root. We are expected to take security advice from this person seriously.

We do not.

web-applications
to-do
best-practices
devops
le-sigh

Uber Menschen — Crooked Timber

october 2016 by Vaguery

To belabor something that I hope is obvious to most readers but may not be obvious to all – the point is not that surge pricing is necessarily evil. It is that Econ 101 does not pre-empt the politics – people with different perspectives, interests and values may reasonably support or oppose surge pricing, depending. We do not know what the right answer is, and the best way we have discovered to even arrive at a rough approximation of the best answer that we can live with is to have people with different perspectives argue it out.

Brennan’s ideal epistocracy would strain a lot of those diverse perspectives out (he admits that e.g. African Americans in the US would likely be under-represented). His tweet furthermore suggests that, to take a few examples, people like Astra Taylor, the late E.P. Thompson, James Scott and Tom Slee (who, in fairness, has been denounced as an economic dunce by a prominent libertarian economist) should be barred from voting. I hope that it’s not too offensive to say that I would expect the contribution of any of these people to democratic debate to likely be more valuable than that of Jason Brennan.

libertarianism
ethics
among-the-Uberii
economics
corporatism
le-sigh

[1607.06146] Supervised quantum gate "teaching" for quantum hardware design

august 2016 by Vaguery

This seems like somebody who should read Lee Spector's book "Automatic Quantum Computer Programming"

machine-learning
system-of-professions
reinvention
quantums
le-sigh

[1403.4630] Penalising model component complexity: A principled, practical approach to constructing priors

march 2016 by Vaguery

In this paper, we introduce a new concept for constructing prior distributions. We exploit the natural nested structure inherent to many model components, which defines the model component to be a flexible extension of a base model. Proper priors are defined to penalise the complexity induced by deviating from the simpler base model and are formulated after the input of a user-defined scaling parameter for that model component, both in the univariate and the multivariate case. These priors are invariant to reparameterisations, have a natural connection to Jeffreys' priors, are designed to support Occam's razor and seem to have excellent robustness properties, all which are highly desirable and allow us to use this approach to define default prior distributions. Through examples and theoretical results, we demonstrate the appropriateness of this approach and how it can be applied in various situations.

they're-never-going-to-learn-about-multiple-objectives-are-they?
statistics
methodologies
system-of-professions
le-sigh

[1506.03229] A cognitive neural architecture able to learn and communicate through natural language

december 2015 by Vaguery

Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together to a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, meaning of words, role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of a child of about 4 years, and produced 521 output sentences, expressing a broad range of language processing functionalities.

neural-networks
party-like-it's-1991
natural-language-processing
le-sigh

[1509.02548] Initial Analysis of a Simple Numerical Model that Exhibits Antifragile Behavior

september 2015 by Vaguery

I present a simple numerical model based on iteratively updating subgroups of a population, individually modeled by nonnegative real numbers, by a constant decay factor; however, at each iteration, one group is selected to instead be updated by a constant growth factor. I discover a relationship between these variables and their respective probabilities for a given subgroup, summarized as the variable c. When c>1, the subgroup is found to tend towards behaviors reminiscent of antifragility; when at least one subgroup of the population has c≥1, the population as a whole tends towards significantly higher probabilities of "living forever," although it may first suffer a drop in population size as less robust, fragile subgroups "die off." In concluding, I discuss the limitations and ethics of such a model, notably the implications of when an upper limit is placed on the growth constant, requiring a population to facilitate an increase in the decay factor to lessen the impact of periods of failure.
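
A rough sketch of the update rule as the abstract words it, for anyone who wants to poke at it. The function name `simulate`, the `weights` argument, and the default of uniform selection are assumptions made here for illustration; the abstract does not define the variable c, so nothing below tries to compute it.

```python
import random

def simulate(sizes, decay, growth, steps, weights=None, seed=0):
    """Iterate the rule from the abstract: at each step one subgroup is
    multiplied by `growth`, every other subgroup by `decay`.

    `weights` (hypothetical) sets each subgroup's selection probability;
    None means uniform selection, which is an assumption, not the paper's.
    """
    rng = random.Random(seed)
    sizes = list(sizes)
    indices = list(range(len(sizes)))
    for _ in range(steps):
        # Pick the single subgroup that grows on this iteration.
        lucky = rng.choices(indices, weights=weights)[0]
        sizes = [s * (growth if i == lucky else decay)
                 for i, s in enumerate(sizes)]
    return sizes
```

Calling `simulate([1.0] * 5, decay=0.9, growth=1.5, steps=200)` with a few different seeds is enough to watch some subgroups shrink toward zero while others persist.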

self-organization
not-one-Bak-reference
system-of-professions
silo-of-knowledge
le-sigh

John Locke Against Freedom | Jacobin

june 2015 by Vaguery

Received ideas change only slowly, and the standard view of Locke as a defender of liberty is likely to persist for years to come. Still, the reassessment is underway, and the outcome is inevitable. Locke was a theoretical advocate of, and a personal participant in, expropriation and enslavement. His classical liberalism offers no guarantee of freedom to anyone except owners of capitalist private property.

philosophy
libertarianism
actual-history-as-opposed-to-received
said-this-for-years
American-cultural-assumptions
le-sigh

[1502.04585] The Ladder: A Reliable Leaderboard for Machine Learning Competitions

march 2015 by Vaguery

The organizer of a machine learning competition faces the problem of maintaining an accurate leaderboard that faithfully represents the quality of the best submission of each competing team. What makes this estimation problem particularly challenging is its sequential and adaptive nature. As participants are allowed to repeatedly evaluate their submissions on the leaderboard, they may begin to overfit to the holdout data that supports the leaderboard. Few theoretical results give actionable advice on how to design a reliable leaderboard. Existing approaches therefore often resort to poorly understood heuristics such as limiting the bit precision of answers and the rate of re-submission.

In this work, we introduce a notion of "leaderboard accuracy" tailored to the format of a competition. We introduce a natural algorithm called "the Ladder" and demonstrate that it simultaneously supports strong theoretical guarantees in a fully adaptive model of estimation, withstands practical adversarial attacks, and achieves high utility on real submission files from an actual competition hosted by Kaggle.

Notably, we are able to sidestep a powerful recent hardness result for adaptive risk estimation that rules out algorithms such as ours under a seemingly very similar notion of accuracy. On a practical note, we provide a completely parameter-free variant of our algorithm that can be deployed in a real competition with no tuning required whatsoever.
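
The abstract does not spell out the mechanism, so the following is only a sketch of the kind of rule it alludes to, assuming the published score changes only when a submission beats the previous best by more than a step size, and is then rounded to a multiple of that step. The names `ladder` and `eta` are assumptions for illustration, and this is the fixed-step idea, not the parameter-free variant.

```python
def ladder(losses, eta):
    """Return the score a leaderboard would publish after each submission.

    `losses` are holdout losses in submission order (lower is better);
    `eta` is a hypothetical step size. A submission moves the published
    score only if it improves on the current best by more than `eta`.
    """
    best = float("inf")
    published = []
    for loss in losses:
        if loss < best - eta:
            # Round to the nearest multiple of eta before publishing,
            # so small differences in the holdout loss are not revealed.
            best = round(loss / eta) * eta
        published.append(best)
    return published
```

With `ladder([0.50, 0.495, 0.40, 0.41, 0.39], eta=0.05)` the second submission's 0.005 improvement is never reported, which is the point: repeated small probes of the holdout set learn nothing.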

via:arthegall
horse-races
machine-learning
statistics
what-gets-measured-gets-fudged
le-sigh
not-multiobjective
bad-habits-of-engineers

[1105.1475] Pivotal estimation via square-root Lasso in nonparametric regression

march 2015 by Vaguery

We propose a self-tuning $\sqrt{\mathrm{Lasso}}$ method that simultaneously resolves three important practical problems in high-dimensional regression analysis, namely it handles the unknown scale, heteroscedasticity and (drastic) non-Gaussianity of the noise. In addition, our analysis allows for badly behaved designs, for example, perfectly collinear regressors, and generates sharp bounds even in extreme cases, such as the infinite variance case and the noiseless case, in contrast to Lasso. We establish various nonasymptotic bounds for $\sqrt{\mathrm{Lasso}}$ including prediction norm rate and sparsity. Our analysis is based on new impact factors that are tailored for bounding prediction norm. In order to cover heteroscedastic non-Gaussian noise, we rely on moderate deviation theory for self-normalized sums to achieve Gaussian-like results under weak conditions. Moreover, we derive bounds on the performance of ordinary least square (ols) applied to the model selected by $\sqrt{\mathrm{Lasso}}$ accounting for possible misspecification of the selected model. Under mild conditions, the rate of convergence of ols post $\sqrt{\mathrm{Lasso}}$ is as good as $\sqrt{\mathrm{Lasso}}$'s rate. As an application, we consider the use of $\sqrt{\mathrm{Lasso}}$ and ols post $\sqrt{\mathrm{Lasso}}$ as estimators of nuisance parameters in a generic semiparametric problem (nonlinear moment condition or Z-problem), resulting in a construction of $\sqrt{n}$-consistent and asymptotically normal estimators of the main parameters.
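
To see why taking the square root of the fit term makes the unknown noise scale drop out, here is a small sketch comparing the two objectives. It assumes the plain forms (mean squared residual plus an unweighted l1 penalty scaled by lambda/n, and the same with a square root on the fit term); the function names are made up for illustration, and the penalty loadings a heteroscedastic treatment would need are left out.

```python
import math

def mean_sq_residual(X, y, beta):
    """Average squared residual of the linear fit X @ beta."""
    n = len(y)
    return sum(
        (yi - sum(xij * bj for xij, bj in zip(xi, beta))) ** 2
        for xi, yi in zip(X, y)
    ) / n

def l1_penalty(beta, lam, n):
    """Unweighted l1 penalty, scaled by lam / n (an assumed convention)."""
    return (lam / n) * sum(abs(b) for b in beta)

def lasso_objective(X, y, beta, lam):
    # Fit term scales with the square of the noise level.
    return mean_sq_residual(X, y, beta) + l1_penalty(beta, lam, len(y))

def sqrt_lasso_objective(X, y, beta, lam):
    # Fit term scales linearly with the noise level, like the penalty.
    return math.sqrt(mean_sq_residual(X, y, beta)) + l1_penalty(beta, lam, len(y))
```

Multiplying `y` and `beta` by a constant multiplies the whole square-root objective by that constant, so the same `lam` stays appropriate at any scale; the ordinary Lasso objective does not have this property, because its fit term grows quadratically while its penalty grows linearly.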

statistics
multiobjective-optimization
done-wrong
le-sigh
system-of-professions
hurts-to-watch
to-understand
and-to-forgive
rituals-of-discipline
operant-conditioning
consider:engineering-criticism

[1501.04242] The Information-theoretic and Algorithmic Approach to Human, Animal and Artificial Cognition

march 2015 by Vaguery

We survey concepts at the frontier of research connecting artificial, animal and human cognition to computation and information processing, from the Turing test to Searle's Chinese Room argument, from Integrated Information Theory to computational and algorithmic complexity. We start by arguing that passing the Turing test is a trivial computational problem and that its pragmatic difficulty sheds light on the computational nature of the human mind more than it does on the challenge of artificial intelligence. We then review our proposed algorithmic information-theoretic measures for quantifying and characterizing cognition in various forms. These are capable of accounting for known biases in human behavior, thus vindicating a computational algorithmic view of cognition as first suggested by Turing, but this time rooted in the concept of algorithmic probability, which in turn is based on computational universality while being independent of computational model, and which has the virtue of being predictive and testable as a model theory of cognitive behavior.

artificial-intelligence
formalization
review
oh-science
le-sigh
philosophy-of-science
when-your-find-yourself-standing-in-a-broken-metaphor-stop-digging

[1502.02599] Adaptive Random SubSpace Learning (RSSL) Algorithm for Prediction

march 2015 by Vaguery

We present a novel adaptive random subspace learning algorithm (RSSL) for prediction purpose. This new framework is flexible where it can be adapted with any learning technique. In this paper, we tested the algorithm for regression and classification problems. In addition, we provide a variety of weighting schemes to increase the robustness of the developed algorithm. These different weighting flavors were evaluated on simulated as well as on real-world data sets considering the cases where the ratio between features (attributes) and instances (samples) is large and vice versa. The framework of the new algorithm consists of many stages: first, calculate the weights of all features on the data set using the correlation coefficient and F-statistic statistical measurements. Second, randomly draw n samples with replacement from the data set. Third, perform regular bootstrap sampling (bagging). Fourth, draw without replacement the indices of the chosen variables. The decision was taken based on the heuristic subspacing scheme. Fifth, call base learners and build the model. Sixth, use the model for prediction purpose on test set of the data. The results show the advancement of the adaptive RSSL algorithm in most of the cases compared with the synonym (conventional) machine learning algorithms.
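
The six stages read roughly like the sketch below, for the regression case only. Everything named here is an assumption for illustration: `fit_rssl`, `predict_rssl`, `weighted_draw`, the absolute Pearson correlation as the feature weight, the 1-nearest-neighbour base learner, and the merging of the "draw n samples with replacement" and "bootstrap" stages into a single resampling step.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if degenerate)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def weighted_draw(weights, k, rng):
    """Draw k distinct indices, each pick proportional to its weight."""
    pool = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        pick = rng.choices(pool, weights=[weights[i] for i in pool])[0]
        chosen.append(pick)
        pool.remove(pick)  # without replacement
    return chosen

def fit_rssl(X, y, n_learners=25, subspace=2, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    # Stage 1: feature weights (small constant keeps every weight positive).
    weights = [abs(pearson([row[j] for row in X], y)) + 1e-9 for j in range(p)]
    ensemble = []
    for _ in range(n_learners):
        rows = [rng.randrange(n) for _ in range(n)]    # stages 2-3: bootstrap
        feats = weighted_draw(weights, subspace, rng)  # stage 4: subspace
        # Stage 5: the "model" of a 1-NN learner is just its training sample.
        ensemble.append((feats, [X[i] for i in rows], [y[i] for i in rows]))
    return ensemble

def predict_rssl(ensemble, x):
    """Stage 6: average the base learners' predictions for one point x."""
    preds = []
    for feats, Xb, yb in ensemble:
        nearest = min(
            range(len(Xb)),
            key=lambda i: sum((Xb[i][j] - x[j]) ** 2 for j in feats),
        )
        preds.append(yb[nearest])
    return sum(preds) / len(preds)
```

Any base learner could stand in for the nearest-neighbour lookup; it was chosen here only because it needs no fitting code.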

system-of-professions
le-sigh

[1502.02362] Counterfactual Risk Minimization: Learning from Logged Bandit Feedback

march 2015 by Vaguery

We develop a learning principle and an efficient algorithm for batch learning from logged bandit feedback. This learning setting is ubiquitous in online systems (e.g., ad placement, web search, recommendation), where an algorithm makes a prediction (e.g., ad ranking) for a given input (e.g., query) and observes bandit feedback (e.g., user clicks on presented ads). We first address the counterfactual nature of the learning problem through propensity scoring. Next, we prove generalization error bounds that account for the variance of the propensity-weighted empirical risk estimator. These constructive bounds give rise to the Counterfactual Risk Minimization (CRM) principle. We show how CRM can be used to derive a new learning method, called Policy Optimizer for Exponential Models (POEM), for learning stochastic linear rules for structured output prediction. We present a decomposition of the POEM objective that enables efficient stochastic gradient optimization. POEM is evaluated on several multi-label classification problems showing substantially improved robustness and generalization performance compared to the state-of-the-art.
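
A minimal sketch of what "propensity scoring" plus a variance term amounts to, assuming the usual importance-weighted estimate of a candidate policy's risk from logged data and a penalty proportional to the estimated standard error. The function name, the argument names, and the optional weight clipping are assumptions for illustration; this is not the POEM optimizer.

```python
import math

def crm_objective(losses, new_probs, logged_probs, lam, clip=None):
    """Propensity-weighted risk estimate plus a variance penalty.

    losses[i]       observed loss for the action that was actually logged
    new_probs[i]    probability the candidate policy gives that action
    logged_probs[i] propensity of the logging policy for that action
    lam             hypothetical weight on the variance term
    clip            optional cap on the importance weight (an assumption)
    Returns (risk_estimate, penalized_objective).
    """
    n = len(losses)
    terms = []
    for loss, q, p in zip(losses, new_probs, logged_probs):
        weight = q / p  # reweight logged data toward the candidate policy
        if clip is not None:
            weight = min(weight, clip)
        terms.append(loss * weight)
    risk = sum(terms) / n
    # Sample variance of the weighted terms; zero if there is one sample.
    var = sum((t - risk) ** 2 for t in terms) / (n - 1) if n > 1 else 0.0
    return risk, risk + lam * math.sqrt(var / n)
```

Two candidate policies with the same estimated risk can then be told apart: the one whose estimate rests on a few huge importance weights pays a larger penalty.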

statistics
oh-statistics
learning-by-watching
machine-learning
genetic-programming
which-it-basically-is-but-done-lazily
le-sigh
system-of-professions

[1405.1796] Comparisons of penalized least squares methods by simulations

february 2015 by Vaguery

Penalized least squares methods are commonly used for simultaneous estimation and variable selection in high-dimensional linear models. In this paper we compare several prevailing methods including the lasso, nonnegative garrote, and SCAD in this area through Monte Carlo simulations. Criteria for evaluating these methods in terms of variable selection and estimation are presented. This paper focuses on the traditional n > p cases. For larger p, our results are still helpful to practitioners after the dimensionality is reduced by a screening method.

variable-selection
statistics
dimension-reduction
horse-races
system-of-professions
multiobjective-optimization
habits-of-formalization-as-a-trap
le-sigh

[1003.0516] Model Selection with the Loss Rank Principle

february 2015 by Vaguery

A key issue in statistics and machine learning is to automatically select the "right" model complexity, e.g., the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. We suggest a novel principle - the Loss Rank Principle (LoRP) - for model selection in regression and classification. It is based on the loss rank, which counts how many other (fictitious) data would be fitted better. LoRP selects the model that has minimal loss rank. Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP depends only on the regression functions and the loss function. It works without a stochastic noise model, and is directly applicable to any non-parametric regressor, like kNN.

via:cshalizi
system-of-professions
model-selection
multiobjective-it-ain't
le-sigh
nonoverlapping-bibliographies
data-balancing
nudge-targets
consider:performance-measures

[1411.1045] Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet

february 2015 by Vaguery

Recent results suggest that state-of-the-art saliency models perform far from optimal in predicting fixations. This lack in performance has been attributed to an inability to model the influence of high-level image features such as objects. Recent seminal advances in applying deep neural networks to tasks like object recognition suggest that they are able to capture this kind of structure. However, the enormous amount of training data necessary to train these networks makes them difficult to apply directly to saliency prediction. We present a novel way of reusing existing neural networks that have been pretrained on the task of object recognition in models of fixation prediction. Using the well-known network of Krizhevsky et al., 2012, we come up with a new saliency model that significantly outperforms all state-of-the-art models on the MIT Saliency Benchmark. We show that the structure of this network allows new insights in the psychophysics of fixation selection and potentially their neural implementation. To train our network, we build on recent work on the modeling of saliency as point processes.

deep-learning
saliency
image-processing
blah-blah
horse-races
oh-right-"novel"
nudge-targets
consider:revisiting-GP-pretraining-work-from-20-fucking-years-ago
le-sigh

[1205.3767] Universal Algorithm for Online Trading Based on the Method of Calibration

november 2014 by Vaguery

We present a universal algorithm for online trading in the stock market which performs asymptotically at least as well as any stationary trading strategy that computes the investment at each step using a fixed function of the side information that belongs to a given RKHS (Reproducing Kernel Hilbert Space). Using a universal kernel, we extend this result for any continuous stationary strategy. In this learning process, a trader rationally chooses his gambles using predictions made by a randomized well-calibrated algorithm. Our strategy is based on Dawid's notion of calibration with more general checking rules and on some modification of Kakade and Foster's randomized rounding algorithm for computing the well-calibrated forecasts. We combine the method of randomized calibration with Vovk's method of defensive forecasting in RKHS. Unlike the statistical theory, no stochastic assumptions are made about the stock prices. Our empirical results on historical markets provide strong evidence that this type of technical trading can "beat the market" if transaction costs are ignored.

technical-analysis
financial-engineering
SMH
le-sigh
formalization
if-transaction-costs-are-ignored
statistics
models
nudge-targets
consider:doing-it-right-so-the-leaks-show