**cshalizi + to_teach:statcomp + regression** (5 bookmarks)

Fast Generalized Linear Models by Database Sampling and One-Step Polishing (Journal of Computational and Graphical Statistics)

june 2019 by cshalizi

"In this article, I show how to fit a generalized linear model to N observations on p variables stored in a relational database, using one sampling query and one aggregation query, as long as N^{1/2+δ} observations can be stored in memory, for some δ>0. The resulting estimator is fully efficient and asymptotically equivalent to the maximum likelihood estimator, and so its variance can be estimated from the Fisher information in the usual way. A proof-of-concept implementation uses R with MonetDB and with SQLite, and could easily be adapted to other popular databases. I illustrate the approach with examples of taxi-trip data in New York City and factors related to car color in New Zealand. "

to:NB
computational_statistics
linear_regression
regression
databases
lumley.thomas
to_teach:statcomp
june 2019 by cshalizi

A Primer on Regression Splines

may 2014 by cshalizi

"B-splines constitute an appealing method for the nonparametric estimation of a range of statis- tical objects of interest. In this primer we focus our attention on the estimation of a conditional mean, i.e. the ‘regression function’."

in_NB
splines
nonparametrics
regression
approximation
statistics
computational_statistics
racine.jeffrey_s.
to_teach:statcomp
to_teach:undergrad-ADA
have_read
may 2014 by cshalizi

[1306.3574] Early stopping and non-parametric regression: An optimal data-dependent stopping rule

june 2013 by cshalizi

"The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert space, we analyze the early stopping strategy for a form of gradient-descent applied to the least-squares loss function. We propose a data-dependent stopping rule that does not involve hold-out or cross-validation data, and we prove upper bounds on the squared error of the resulting function estimate, measured in either the $L^2(P)$ and $L^2(P_n)$ norm. These upper bounds lead to minimax-optimal rates for various kernel classes, including Sobolev smoothness classes and other forms of reproducing kernel Hilbert spaces. We show through simulation that our stopping rule compares favorably to two other stopping rules, one based on hold-out data and the other based on Stein's unbiased risk estimate. We also establish a tight connection between our early stopping strategy and the solution path of a kernel ridge regression estimator."

in_NB
optimization
kernel_estimators
hilbert_space
nonparametrics
regression
minimax
yu.bin
wainwright.martin_j.
to_teach:statcomp
have_read
june 2013 by cshalizi

[1212.4174] Feature Clustering for Accelerating Parallel Coordinate Descent

december 2012 by cshalizi

"Large-scale L1-regularized loss minimization problems arise in high-dimensional applications such as compressed sensing and high-dimensional supervised learning, including classification and regression problems. High-performance algorithms and implementations are critical to efficiently solving these problems. Building upon previous work on coordinate descent algorithms for L1-regularized problems, we introduce a novel family of algorithms called block-greedy coordinate descent that includes, as special cases, several existing algorithms such as SCD, Greedy CD, Shotgun, and Thread-Greedy. We give a unified convergence analysis for the family of block-greedy algorithms. The analysis suggests that block-greedy coordinate descent can better exploit parallelism if features are clustered so that the maximum inner product between features in different blocks is small. Our theoretical convergence analysis is supported with experimental re- sults using data from diverse real-world applications. We hope that algorithmic approaches and convergence analysis we provide will not only advance the field, but will also encourage researchers to systematically explore the design space of algorithms for solving large-scale L1-regularization problems."

to:NB
optimization
lasso
regression
machine_learning
to_teach:statcomp
december 2012 by cshalizi

Introduction to Online Optimization (Bubeck)

december 2011 by cshalizi

"to_teach" tag a sudden brainstorm for how to make next year's statistical computing class either unbeatably awesome or an absolute disaster

in_NB
online_learning
regression
individual_sequence_prediction
optimization
machine_learning
learning_theory
via:mraginsky
to_read
to_teach:statcomp
re:freshman_seminar_on_optimization
december 2011 by cshalizi
