cshalizi + to_teach:statcomp + databases   2

Fast Generalized Linear Models by Database Sampling and One-Step Polishing: Journal of Computational and Graphical Statistics: Vol 0, No 0
"In this article, I show how to fit a generalized linear model to N observations on p variables stored in a relational database, using one sampling query and one aggregation query, as long as N^{1/2+δ} observations can be stored in memory, for some δ>0. The resulting estimator is fully efficient and asymptotically equivalent to the maximum likelihood estimator, and so its variance can be estimated from the Fisher information in the usual way. A proof-of-concept implementation uses R with MonetDB and with SQLite, and could easily be adapted to other popular databases. I illustrate the approach with examples of taxi-trip data in New York City and factors related to car color in New Zealand. "
to:NB  computational_statistics  linear_regression  regression  databases  lumley.thomas  to_teach:statcomp 
june 2019 by cshalizi
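
The abstract's recipe (one sampling query, a fit on the sample held in memory, then one aggregation query feeding a single Newton step) can be roughed out with Python's standard library. This is only a sketch under stated assumptions, not the paper's R implementation: the table `obs`, its columns `x` and `y`, the helper names, and the use of `ORDER BY RANDOM()` as the sampling query are all hypothetical choices for illustration, and the model is limited to logistic regression with an intercept and one slope.

```python
import math
import random
import sqlite3


def expit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def newton_step(beta, s0, s1, i00, i01, i11):
    """One Newton/Fisher-scoring update: beta + I^{-1} s, for a 2x2 information matrix."""
    det = i00 * i11 - i01 * i01
    d0 = (i11 * s0 - i01 * s1) / det
    d1 = (i00 * s1 - i01 * s0) / det
    return (beta[0] + d0, beta[1] + d1)


def fit_sample(rows, iters=25):
    """Ordinary in-memory logistic fit by repeated Newton steps on (x, y) rows."""
    beta = (0.0, 0.0)
    for _ in range(iters):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for x, y in rows:
            mu = expit(beta[0] + beta[1] * x)
            w = mu * (1.0 - mu)
            s0 += y - mu
            s1 += x * (y - mu)
            i00 += w
            i01 += w * x
            i11 += w * x * x
        beta = newton_step(beta, s0, s1, i00, i01, i11)
    return beta


def one_step_fit(conn, n_sample):
    """Sample, fit on the sample, then polish with one step on the full table."""
    # SQLite builds do not always ship exp(), so register the link function.
    conn.create_function("expit", 1, expit)
    # Query 1 (sampling): a simple stand-in for the paper's sampling query.
    rows = conn.execute(
        "SELECT x, y FROM obs ORDER BY RANDOM() LIMIT ?", (n_sample,)
    ).fetchall()
    b0, b1 = fit_sample(rows)
    # Query 2 (aggregation): score and information summed over all N rows,
    # evaluated at the sample estimate, computed inside the database.
    s0, s1, i00, i01, i11 = conn.execute(
        """
        SELECT SUM(y - mu), SUM(x * (y - mu)),
               SUM(mu * (1 - mu)), SUM(x * mu * (1 - mu)), SUM(x * x * mu * (1 - mu))
        FROM (SELECT x, y, expit(? + ? * x) AS mu FROM obs)
        """,
        (b0, b1),
    ).fetchone()
    return newton_step((b0, b1), s0, s1, i00, i01, i11)


def make_demo_db(n, seed=0):
    """Build an in-memory table of simulated data with true coefficients (-0.5, 1.0)."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obs (x REAL, y INTEGER)")
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < expit(-0.5 + 1.0 * x) else 0
        rows.append((x, y))
    conn.executemany("INSERT INTO obs VALUES (?, ?)", rows)
    return conn
```

Calling `one_step_fit(make_demo_db(20000), 2000)` returns an (intercept, slope) pair. Only the sampled rows and five sums ever leave the database, which is the point of the approach; the exact values vary from run to run because SQLite's `RANDOM()` is not seeded from Python.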
10 Easy Steps to a Complete Understanding of SQL - Tech.Pro
And by "to_teach", I mean "to mention".

ETA: arthegall calls item #2 somewhere between incoherent and wrong, and he'd know better than I...
databases  programming  to_teach:statcomp  via:kjhealy 
september 2013 by cshalizi
