[1905.00175] Boosting: Why You Can Use the HP Filter

27 days ago by cshalizi

"The Hodrick-Prescott (HP) filter is one of the most widely used econometric methods in applied macroeconomic research. Like all nonparametric methods, the HP filter depends critically on a tuning parameter that controls the degree of smoothing. Yet in contrast to modern nonparametric methods and applied work with these procedures, empirical practice with the HP filter almost universally relies on standard settings for the tuning parameter that have been suggested largely by experimentation with macroeconomic data and heuristic reasoning. As recent research (Phillips and Jin, 2015) has shown, standard settings may not be adequate in removing trends, particularly stochastic trends, in economic data.

"This paper proposes an easy-to-implement practical procedure of iterating the HP smoother that is intended to make the filter a smarter smoothing device for trend estimation and trend elimination. We call this iterated HP technique the boosted HP filter in view of its connection to L2-boosting in machine learning. The paper develops limit theory to show that the boosted HP (bHP) filter asymptotically recovers trend mechanisms that involve unit root processes, deterministic polynomial drifts, and polynomial drifts with structural breaks. A stopping criterion is used to automate the iterative HP algorithm, making it a data-determined method that is ready for modern data-rich environments in economic research. The methodology is illustrated using three real data examples that highlight the differences between simple HP filtering, the data-determined boosted filter, and an alternative autoregressive approach. These examples show that the bHP filter is helpful in analyzing a large collection of heterogeneous macroeconomic time series that manifest various degrees of persistence, trend behavior, and volatility."

to:NB
boosting
time_series
splines
statistics
"This paper proposes an easy-to-implement practical procedure of iterating the HP smoother that is intended to make the filter a smarter smoothing device for trend estimation and trend elimination. We call this iterated HP technique the boosted HP filter in view of its connection to L2-boosting in machine learning. The paper develops limit theory to show that the boosted HP (bHP) filter asymptotically recovers trend mechanisms that involve unit root processes, deterministic polynomial drifts, and polynomial drifts with structural breaks. A stopping criterion is used to automate the iterative HP algorithm, making it a data-determined method that is ready for modern data-rich environments in economic research. The methodology is illustrated using three real data examples that highlight the differences between simple HP filtering, the data-determined boosted filter, and an alternative autoregressive approach. These examples show that the bHP filter is helpful in analyzing a large collection of heterogeneous macroeconomic time series that manifest various degrees of persistence, trend behavior, and volatility."

27 days ago by cshalizi

[1909.11161] Selecting a Scale for Spatial Confounding Adjustment

10 weeks ago by cshalizi

"Unmeasured, spatially-structured factors can confound associations between spatial environmental exposures and health outcomes. Adding flexible splines to a regression model is a simple approach for spatial confounding adjustment, but the spline degrees of freedom do not provide an easily interpretable spatial scale. We describe a method for quantifying the extent of spatial confounding adjustment in terms of the Euclidean distance at which variation is removed. We develop this approach for confounding adjustment with splines and using Fourier and wavelet filtering. We demonstrate differences in the spatial scales these bases can represent and provide a comparison of methods for selecting the amount of confounding adjustment. We find the best performance for selecting the amount of adjustment using an information criterion evaluated on an outcome model without exposure. We apply this method to spatial adjustment in an analysis of particulate matter and blood pressure in a cohort of United States women."

to:NB
causal_inference
splines
smoothing
spatial_statistics
statistics
fourier_analysis
to_teach:data_over_space_and_time
10 weeks ago by cshalizi

Xiao : Asymptotic theory of penalized splines

july 2019 by cshalizi

"The paper gives a unified study of the large sample asymptotic theory of penalized splines including the O-splines using B-splines and an integrated squared derivative penalty [22], the P-splines which use B-splines and a discrete difference penalty [13], and the T-splines which use truncated polynomials and a ridge penalty [24]. Extending existing results for O-splines [7], it is shown that, depending on the number of knots and appropriate smoothing parameters, the L2L2 risk bounds of penalized spline estimators are rate-wise similar to either those of regression splines or to those of smoothing splines and could each attain the optimal minimax rate of convergence [32]. In addition, convergence rate of the L∞L∞ risk bound, and local asymptotic bias and variance are derived for all three types of penalized splines."

to:NB
splines
statistics
july 2019 by cshalizi

A Spline Theory of Deep Learning

april 2019 by cshalizi

"We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input. This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization. Going further, we propose a simple penalty term that can be added to the cost function of any DN learning algorithm to force the templates to be orthogonal with each other; this leads to significantly improved classification performance and reduced overfitting with no change to the DN architecture. The spline partition of the input signal space opens up a new geometric avenue to study how DNs organize signals in a hierarchical fashion. As an application, we develop and validate a new distance metric for signals that quantifies the difference between their partition encodings."

to:NB
to_read
approximation
splines
neural_networks
machine_learning
your_favorite_deep_neural_network_sucks
via:csantos
april 2019 by cshalizi

Kriging and Splines: Theoretical Approach to Linking Spatial Prediction Methods | SpringerLink

january 2019 by cshalizi

To rip off shamelessly, if/when I re-teach the course.

in_NB
to_read
splines
prediction
spatial_statistics
to_teach:data_over_space_and_time
january 2019 by cshalizi

Breaking ground with Grace

october 2018 by cshalizi

Some amusing stories, but also a very strong impression that the writer didn't actually understand Wahba's work at all.

wahba.grace
statistics
splines
lives_of_the_scientists
october 2018 by cshalizi

[1212.6788] Local and global asymptotic inference in smoothing spline models

december 2016 by cshalizi

"This article studies local and global inference for smoothing spline estimation in a unified asymptotic framework. We first introduce a new technical tool called functional Bahadur representation, which significantly generalizes the traditional Bahadur representation in parametric models, that is, Bahadur [Ann. Inst. Statist. Math. 37 (1966) 577-580]. Equipped with this tool, we develop four interconnected procedures for inference: (i) pointwise confidence interval; (ii) local likelihood ratio testing; (iii) simultaneous confidence band; (iv) global likelihood ratio testing. In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat. Soc. Ser. B Stat. Methodol. 45 (1983) 133-150] and Nychka [J. Amer. Statist. Assoc. 83 (1988) 1134-1143]. We also discuss a version of the Wilks phenomenon arising from local/global likelihood ratio testing. It is also worth noting that our simultaneous confidence bands are the first ones applicable to general quasi-likelihood models. Furthermore, issues relating to optimality and efficiency are carefully addressed. As a by-product, we discover a surprising relationship between periodic and nonperiodic smoothing splines in terms of inference."

to:NB
nonparametrics
splines
regression
confidence_sets
statistics
december 2016 by cshalizi

Thomas-Agnan : Spline Functions and Stochastic Filtering

april 2016 by cshalizi

"Some relationships have been established between unbiased linear predictors of processes, in signal and noise models, minimizing the predictive mean square error and some smoothing spline functions. We construct a new family of multidimensional splines adapted to the prediction of locally homogeneous random fields, whose "m-spectral measure" (to be defined) is absolutely continuous with respect to Lebesgue measure and satisfies some minor assumptions. By considering partial splines, one may include an arbitrary drift in the signal. This type of correspondence underlines the potentialities of cross-fertilization between statistics and the numerical techniques in approximation theory."

to:NB
splines
prediction
filtering
statistics
hilbert_space
fourier_analysis
random_fields
have_read
april 2016 by cshalizi

Boosting With the L2 Loss - Journal of the American Statistical Association

april 2016 by cshalizi

"This article investigates a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function. Like other boosting algorithms, L2Boost uses many times in an iterative fashion a prechosen fitting method, called the learner. Based on the explicit expression of refitting of residuals of L2Boost, the case with (symmetric) linear learners is studied in detail in both regression and classification. In particular, with the boosting iteration m working as the smoothing or regularization parameter, a new exponential bias-variance trade-off is found with the variance (complexity) term increasing very slowly as m tends to infinity. When the learner is a smoothing spline, an optimal rate of convergence result holds for both regression and classification and the boosted smoothing spline even adapts to higher-order, unknown smoothness. Moreover, a simple expansion of a (smoothed) 0–1 loss function is derived to reveal the importance of the decision boundary, bias reduction, and impossibility of an additive bias-variance decomposition in classification. Finally, simulation and real dataset results are obtained to demonstrate the attractiveness of L2Boost. In particular, we demonstrate that L2Boosting with a novel component-wise cubic smoothing spline is both practical and effective in the presence of high-dimensional predictors."

to:NB
statistics
regression
splines
smoothing
classifiers
ensemble_methods
have_read
via:djm1107
buhlmann.peter
yu.bin
april 2016 by cshalizi

An Introduction to Sparse Stochastic Processes | Communications and Signal Processing | Cambridge University Press

december 2015 by cshalizi

"Providing a novel approach to sparsity, this comprehensive book presents the theory of stochastic processes that are ruled by linear stochastic differential equations, and that admit a parsimonious representation in a matched wavelet-like basis. Two key themes are the statistical property of infinite divisibility, which leads to two distinct types of behaviour - Gaussian and sparse - and the structural link between linear stochastic processes and spline functions, which is exploited to simplify the mathematical analysis. The core of the book is devoted to investigating sparse processes, including a complete description of their transform-domain statistics. The final part develops practical signal-processing algorithms that are based on these models, with special emphasis on biomedical image reconstruction. This is an ideal reference for graduate students and researchers with an interest in signal/image processing, compressed sensing, approximation theory, machine learning, or statistics."

to:NB
books:noted
sparsity
stochastic_processes
compressed_sensing
splines
statistics
levy_processes
december 2015 by cshalizi

Silverman : Spline Smoothing: The Equivalent Variable Kernel Method

february 2015 by cshalizi

"The spline smoothing approach to nonparametric regression and curve estimation is considered. It is shown that, in a certain sense, spline smoothing corresponds approximately to smoothing by a kernel method with bandwidth depending on the local density of design points. Some exact calculations demonstrate that the approximation is extremely close in practice. Consideration of kernel smoothing methods demonstrates that the way in which the effective local bandwidth behaves in spline smoothing has desirable properties. Finally, the main result of the paper is applied to the related topic of penalized maximum likelihood probability density estimates; a heuristic discussion shows that these estimates should adapt well in the tails of the distribution."

have_read
splines
kernel_estimators
nonparametrics
regression
density_estimation
statistics
silverman.bernard
in_NB
february 2015 by cshalizi

[1410.7690] Trend Filtering on Graphs

december 2014 by cshalizi

"We introduce a family of adaptive estimators on graphs, based on penalizing the ℓ1 norm of discrete graph differences. This generalizes the idea of trend filtering [Kim et al. (2009), Tibshirani (2014)], used for univariate nonparametric regression, to graphs. Analogous to the univariate case, graph trend filtering exhibits a level of local adaptivity unmatched by the usual ℓ2-based graph smoothers. It is also defined by a convex minimization problem that is readily solved (e.g., by fast ADMM or Newton algorithms). We demonstrate the merits of graph trend filtering through examples and theory."

to:NB
network_data_analysis
smoothing
statistics
nonparametrics
splines
kith_and_kin
sharpnack.james
tibshirani.ryan
to_teach:baby-nets
december 2014 by cshalizi

[1405.0558] The Falling Factorial Basis and Its Statistical Applications

december 2014 by cshalizi

"We study a novel spline-like basis, which we name the "falling factorial basis", bearing many similarities to the classic truncated power basis. The advantage of the falling factorial basis is that it enables rapid, linear-time computations in basis matrix multiplication and basis matrix inversion. The falling factorial functions are not actually splines, but are close enough to splines that they provably retain some of the favorable properties of the latter functions. We examine their application in two problems: trend filtering over arbitrary input points, and a higher-order variant of the two-sample Kolmogorov-Smirnov test."

to:NB
have_read
splines
nonparametrics
statistics
two-sample_tests
kith_and_kin
tibshirani.ryan
december 2014 by cshalizi

IEEE Xplore Abstract - Control theoretic smoothing splines

september 2014 by cshalizi

"Some of the relationships between optimal control and statistics are examined. We produce generalized, smoothing splines by solving an optimal control problem for linear control systems, minimizing the L2-norm of the control signal, while driving the scalar output of the control system close to given, prespecified interpolation points. We then prove a convergence result for the smoothing splines, using results from the theory of numerical quadrature. Finally, we show, in simulations, that our approach works in practice as well as in theory"

to:NB
splines
control_theory_and_control_engineering
smoothing
statistics
via:arsyed
september 2014 by cshalizi

A Primer on Regression Splines

may 2014 by cshalizi

"B-splines constitute an appealing method for the nonparametric estimation of a range of statis- tical objects of interest. In this primer we focus our attention on the estimation of a conditional mean, i.e. the ‘regression function’."

in_NB
splines
nonparametrics
regression
approximation
statistics
computational_statistics
racine.jeffrey_s.
to_teach:statcomp
to_teach:undergrad-ADA
have_read
may 2014 by cshalizi

[1403.7118] A Unified Framework of Constrained Regression

april 2014 by cshalizi

"Generalized additive models (GAMs) play an important role in modeling and understanding complex relationships in modern applied statistics. They allow for flexible, data-driven estimation of covariate effects. Yet researchers often have a priori knowledge of certain effects, which might be monotonic or periodic (cyclic) or should fulfill boundary conditions. We propose a unified framework to incorporate these constraints for both univariate and bivariate effect estimates and for varying coefficients. As the framework is based on (functional gradient descent) boosting methods, variables can be selected intrinsically, and effects can be estimated for a wide range of different distributional assumptions. We present three case studies from environmental sciences. The first on air pollution illustrates the use of monotonic and periodic effects in the context of an additive Poisson model. The second case study highlights the use of bivariate cyclic splines to model activity profiles of roe deer. The third case study demonstrates how to estimate the complete conditional distribution function of deer-vehicle collisions with the help of monotonicity constraints, and a cyclic constraint is considered for the seasonal variation of collision numbers. All discussed constrained effect estimates are implemented in the comprehensive R package mboost for model-based boosting."

to:NB
additive_models
regression
nonparametrics
statistics
to_read
boosting
splines
april 2014 by cshalizi

[1403.7001] Spaghetti prediction: A robust method for forecasting short time series

april 2014 by cshalizi

"A novel method for predicting time series is described and demonstrated. This method inputs time series data points and outputs multiple "spaghetti" functions from which predictions can be made. Spaghetti prediction has desirable properties that are not realized by classic autoregression, moving average, spline, Gaussian process, and other methods. It is particularly appropriate for short time series because it allows asymmetric prediction distributions and produces prediction functions which are robust in that they use multiple independent models."

prediction
time_series
statistics
have_read
shot_after_a_fair_trial
splines
april 2014 by cshalizi

Likelihood Ratio Tests for Dependent Data with Applications to Longitudinal and Functional Data Analysis - Staicu - 2014 - Scandinavian Journal of Statistics - Wiley Online Library

march 2014 by cshalizi

"This paper introduces a general framework for testing hypotheses about the structure of the mean function of complex functional processes. Important particular cases of the proposed framework are as follows: (1) testing the null hypothesis that the mean of a functional process is parametric against a general alternative modelled by penalized splines; and (2) testing the null hypothesis that the means of two possibly correlated functional processes are equal or differ by only a simple parametric function. A global pseudo-likelihood ratio test is proposed, and its asymptotic distribution is derived. The size and power properties of the test are confirmed in realistic simulation scenarios. Finite-sample power results indicate that the proposed test is much more powerful than competing alternatives. Methods are applied to testing the equality between the means of normalized δ-power of sleep electroencephalograms of subjects with sleep-disordered breathing and matched controls."

to:NB
likelihood
hypothesis_testing
splines
nonparametrics
misspecification
statistics
to_teach:undergrad-ADA
march 2014 by cshalizi

Kim , Huo : Asymptotic optimality of a multivariate version of the generalized cross validation in adaptive smoothing splines

february 2014 by cshalizi

"We consider an adaptive smoothing spline with a piecewise-constant penalty function λ(x), in which a univariate smoothing parameter λ in the classic smoothing spline is converted into an adaptive multivariate parameter λ. Choosing the optimal value of λ is critical for obtaining desirable estimates. We propose to choose λ by minimizing a multivariate version of the generalized cross validation function; the resulting estimator is shown to be consistent and asymptotically optimal under some general conditions—i.e., the counterparts of the nice asymptotic properties of the generalized cross validation in the ordinary smoothing spline are still provable. This provides theoretical justification of adopting the multivariate version of the generalized cross validation principle in adaptive smoothing splines."

in_NB
splines
smoothing
nonparametrics
cross-validation
february 2014 by cshalizi

Taylor & Francis Online :: Shape-Constrained Estimation Using Nonnegative Splines - Journal of Computational and Graphical Statistics - Volume 23, Issue 1

february 2014 by cshalizi

"We consider the problem of nonparametric estimation of unknown smooth functions in the presence of restrictions on the shape of the estimator and on its support using polynomial splines. We provide a general computational framework that treats these estimation problems in a unified manner, without the limitations of the existing methods. Applications of our approach include computing optimal spline estimators for regression, density estimation, and arrival rate estimation problems in the presence of various shape constraints. Our approach can also handle multiple simultaneous shape constraints. The approach is based on a characterization of nonnegative polynomials that leads to semidefinite programming (SDP) and second-order cone programming (SOCP) formulations of the problems. These formulations extend and generalize a number of previous approaches in the literature, including those with piecewise linear and B-spline estimators. We also consider a simpler approach in which nonnegative splines are approximated by splines whose pieces are polynomials with nonnegative coefficients in a nonnegative basis. A condition is presented to test whether a given nonnegative basis gives rise to a spline cone that is dense in the space of nonnegative continuous functions. The optimization models formulated in the article are solvable with minimal running time using off-the-shelf software. We provide numerical illustrations for density estimation and regression problems. These examples show that the proposed approach requires minimal computational time, and that the estimators obtained using our approach often match and frequently outperform kernel methods and spline smoothing without shape constraints. Supplementary materials for this article are provided online."

to:NB
splines
regression
nonparametrics
optimization
statistics
february 2014 by cshalizi

Shang , Cheng : Local and global asymptotic inference in smoothing spline models

november 2013 by cshalizi

"This article studies local and global inference for smoothing spline estimation in a unified asymptotic framework. We first introduce a new technical tool called functional Bahadur representation, which significantly generalizes the traditional Bahadur representation in parametric models, that is, Bahadur [Ann. Inst. Statist. Math. 37 (1966) 577–580]. Equipped with this tool, we develop four interconnected procedures for inference: (i) pointwise confidence interval; (ii) local likelihood ratio testing; (iii) simultaneous confidence band; (iv) global likelihood ratio testing. In particular, our confidence intervals are proved to be asymptotically valid at any point in the support, and they are shorter on average than the Bayesian confidence intervals proposed by Wahba [J. R. Stat. Soc. Ser. B Stat. Methodol. 45 (1983) 133–150] and Nychka [J. Amer. Statist. Assoc. 83 (1988) 1134–1143]. We also discuss a version of the Wilks phenomenon arising from local/global likelihood ratio testing. It is also worth noting that our simultaneous confidence bands are the first ones applicable to general quasi-likelihood models. Furthermore, issues relating to optimality and efficiency are carefully addressed. As a by-product, we discover a surprising relationship between periodic and nonperiodic smoothing splines in terms of inference."

in_NB
splines
nonparametrics
confidence_sets
hypothesis_testing
to_read
november 2013 by cshalizi

Discretized Laplacian Smoothing by Fourier Methods - Journal of the American Statistical Association - Volume 86, Issue 415

september 2013 by cshalizi

"An approach to multidimensional smoothing is introduced that is based on a penalized likelihood with a modified discretized Laplacian penalty term. The choice of penalty simplifies computational difficulties associated with standard multidimensional Laplacian smoothing methods yet without compromising mean squared error characteristics, at least on the interior of the region of interest. For linear smoothing in hyper-rectangular domains, which has wide application in image reconstruction and restoration problems, computations are carried out using fast Fourier transforms. Nonlinear smoothing is accomplished by iterative application of the linear smoothing technique. The iterative procedure is shown to be convergent under general conditions. Adaptive choice of the amount of smoothing is based on approximate cross-validation type scores. An importance sampling technique is used to estimate the degrees of freedom of the smooth. The methods are implemented in one- and two-dimensional settings. Some illustrations are given relating to scatterplot smoothing, estimation of a logistic regression surface, and density estimation. The asymptotic mean squared error characteristics of the linear smoother are derived from first principles and shown to match those of standard Laplacian smoothing splines in the case where the target function is locally linear at the boundary. A one-dimensional Monte Carlo simulation indicates that the mean squared error properties of the linear smoother largely match those of smoothing splines even when these boundary conditions are not satisfied."

to:NB
have_read
smoothing
splines
regression
statistics
september 2013 by cshalizi

Some Aspects of the Spline Smoothing Approach to Non-Parametric Regression Curve Fitting

september 2013 by cshalizi

"Non-parametric regression using cubic splines is an attractive, flexible and widely-applicable approach to curve estimation. Although the basic idea was formulated many years ago, the method is not as widely known or adopted as perhaps it should be. The topics and examples discussed in this paper are intended to promote the understanding and extend the practicability of the spline smoothing methodology. Particular subjects covered include the basic principles of the method; the relation with moving average and other smoothing methods; the automatic choice of the amount of smoothing; and the use of residuals for diagnostic checking and model adaptation. The question of providing inference regions for curves-and for relevant properties of curves--is approached via a finite-dimensional Bayesian formulation."

- Ungated copy, including discussion: http://www-personal.umich.edu/~jizhu/jizhu/wuke/Silverman-JRSSB85.pdf

to:NB
splines
regression
smoothing
have_read
statistics
silverman.bernard
nonparametrics
september 2013 by cshalizi

Bayesian Smoothing and Regression Splines for Measurement Error Problems

september 2013 by cshalizi

"In the presence of covariate measurement error, estimating a regression function nonparametrically is extremely dif cult, the problem being related to deconvolution. Various frequentist approaches exist for this problem, but to date there has been no Bayesian treatment. In this article we describe Bayesian approaches to modeling a exible regression function when the predictor variable is measured with error. The regression function is modeled with smoothing splines and regression P-splines. Two methods are described for exploration of the posterior. The rst, called the iterative conditional modes ( ICM), is only partially Bayesian. ICM uses a componentwise maximization routine to nd the mode of the posterior. It also serves to create starting values for the second method, which is fully Bayesian and uses Markov chain Monte Carlo (MCMC) techniques to generate observations from the joint posterior distribution. Use of the MCMC approach has the advantage that interval estimates that directly model and adjust for the measurement error are easily calculated. We provide simulations with several nonlinear regression functions and provide an illustrative example. Our simulations indicate that the frequentist mean squared error properties of the fully Bayesian method are better than those of ICM and also of previously proposed frequentist methods, at least in the examples that we have studied."

to:NB
error-in-variables
regression
nonparametrics
splines
statistics
re:smoothing_adjacency_matrices
to_read
september 2013 by cshalizi

[1306.3014] Mixtures of Spatial Spline Regressions

june 2013 by cshalizi

"We present an extension of the functional data analysis framework for univariate functions to the analysis of surfaces: functions of two variables. The spatial spline regression (SSR) approach developed can be used to model surfaces that are sampled over a rectangular domain. Furthermore, combining SSR with linear mixed effects models (LMM) allows for the analysis of populations of surfaces, and combining the joint SSR-LMM method with finite mixture models allows for the analysis of populations of surfaces with sub-family structures. Through the mixtures of spatial splines regressions (MSSR) approach developed, we present methodologies for clustering surfaces into sub-families, and for performing surface-based discriminant analysis. The effectiveness of our methodologies, as well as the modeling capabilities of the SSR model are assessed through an application to handwritten character recognition."

in_NB
splines
spatial_statistics
mixture_models
june 2013 by cshalizi

[1306.1868] Smoothing splines with varying smoothing parameter

june 2013 by cshalizi

"This paper considers the development of spatially adaptive smoothing splines for the estimation of a regression function with non-homogeneous smoothness across the domain. Two challenging issues that arise in this context are the evaluation of the equivalent kernel and the determination of a local penalty. The roughness penalty is a function of the design points in order to accommodate local behavior of the regression function. It is shown that the spatially adaptive smoothing spline estimator is approximately a kernel estimator. The resulting equivalent kernel is spatially dependent. The equivalent kernels for traditional smoothing splines are a special case of this general solution. With the aid of the Green's function for a two-point boundary value problem, the explicit forms of the asymptotic mean and variance are obtained for any interior point. Thus, the optimal roughness penalty function is obtained by approximately minimizing the asymptotic integrated mean square error. Simulation results and an application illustrate the performance of the proposed estimator."

to:NB
splines
kernel_estimators
nonparametrics
regression
statistics
june 2013 by cshalizi

[1306.1866] Minimax Optimal Estimation of Convex Functions in the Supreme Norm

june 2013 by cshalizi

"Estimation of convex functions finds broad applications in engineering and science, while convex shape constraint gives rise to numerous challenges in asymptotic performance analysis. This paper is devoted to minimax optimal estimation of univariate convex functions from the Hölder class in the framework of shape constrained nonparametric estimation. Particularly, the paper establishes the optimal rate of convergence in two steps for the minimax sup-norm risk of convex functions with the Hölder order between one and two. In the first step, by applying information theoretical results on probability measure distance, we establish the minimax lower bound under the supreme norm by constructing a novel family of piecewise quadratic convex functions in the Hölder class. In the second step, we develop a penalized convex spline estimator and establish the minimax upper bound under the supreme norm. Due to the convex shape constraint, the optimality conditions of penalized convex splines are characterized by nonsmooth complementarity conditions. By exploiting complementarity methods, a critical uniform Lipschitz property of optimal spline coefficients in the infinity norm is established. This property, along with asymptotic estimation techniques, leads to uniform bounds for bias and stochastic errors on the entire interval of interest. This further yields the optimal rate of convergence by choosing the suitable number of knots and penalty value. The present paper provides the first rigorous justification of the optimal minimax risk for convex estimation under the supreme norm."

to:NB
optimization
splines
nonparametrics
statistics
june 2013 by cshalizi

Flexible Copula Density Estimation with Penalized Hierarchical B-splines - Kauermann - 2013 - Scandinavian Journal of Statistics - Wiley Online Library

june 2013 by cshalizi

"The paper introduces a new method for flexible spline fitting for copula density estimation. Spline coefficients are penalized to achieve a smooth fit. To weaken the curse of dimensionality, instead of a full tensor spline basis, a reduced tensor product based on so called sparse grids (Notes Numer. Fluid Mech. Multidiscip. Des., 31, 1991, 241-251) is used. To achieve uniform margins of the copula density, linear constraints are placed on the spline coefficients, and quadratic programming is used to fit the model. Simulations and practical examples accompany the presentation."

to:NB
density_estimation
copulas
splines
statistics
june 2013 by cshalizi

Heckman, Lockhart, Nielsen: Penalized regression, mixed effects models and appropriate modelling

may 2013 by cshalizi

"Linear mixed effects methods for the analysis of longitudinal data provide a convenient framework for modelling within-individual correlation across time. Using spline functions allows for flexible modelling of the response as a smooth function of time. A computational connection between linear mixed effects modelling and spline smoothing has resulted in a cross-fertilization of these two fields. The connection has popularized the use of spline functions in longitudinal data analysis and the use of mixed effects software in smoothing analyses. However, care must be taken in exploiting this connection, as resulting estimates of the underlying population mean might not track the data well and associated standard errors might not reflect the true variability in the data. We discuss these shortcomings and suggest some easy-to-compute methods to eliminate them."

to:NB
hierarchical_statistical_models
smoothing
splines
time_series
statistics
regression
may 2013 by cshalizi

[1304.3347] Spline regression for zero-inflated models

april 2013 by cshalizi

"We propose a regression model for count data when the classical generalized linear model approach is too rigid due to a high outcome of zero counts and a nonlinear influence of continuous covariates. Zero-Inflation is applied to take into account the presence of excess zeros with separate link functions for the zero and the nonzero component. Nonlinearity in covariates is captured by spline functions based on B-splines. Our algorithm relies on maximum-likelihood estimation and allows for adaptive box-constrained knots, thus improving the goodness of the spline fit and allowing for detection of sensitivity changepoints. A simulation study substantiates the numerical stability of the algorithm to infer such models. The AIC criterion is shown to serve well for model selection, in particular if nonlinearities are weak such that BIC tends to overly simplistic models. We fit the introduced models to real data of children's dental sanity, linking caries counts with the so-called Body-Mass-Index (BMI) and other socioeconomic factors. This reveals a puzzling nonmonotonic influence of BMI on caries counts which is yet to be explained by clinical experts."
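The zero-inflated likelihood the abstract builds on can be written down in a few lines. This sketch is the plain zero-inflated Poisson with scalar parameters, whereas the paper links both the zero-inflation probability and the Poisson mean to covariates through B-spline expansions; the function name and interface are illustrative, not from the paper.

```python
import math

def zip_loglik(y, pi, lam):
    """Log-likelihood of a zero-inflated Poisson sample: a zero can come
    either from the inflation component (probability pi) or from the
    Poisson itself.  In the paper's model, pi and lam would each be
    spline functions of covariates rather than constants."""
    ll = 0.0
    for yi in y:
        if yi == 0:
            # mixture of structural zeros and Poisson zeros
            ll += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            # nonzero counts can only come from the Poisson component
            ll += math.log(1 - pi) - lam + yi * math.log(lam) - math.lgamma(yi + 1)
    return ll
```

Setting `pi = 0` recovers the ordinary Poisson log-likelihood, which is a convenient sanity check.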

to:NB
splines
regression
nonparametrics
to_teach:undergrad-ADA
april 2013 by cshalizi

[1304.2986] Adaptive Piecewise Polynomial Estimation via Trend Filtering

april 2013 by cshalizi

"We study trend filtering, a recently proposed tool of Kim et al. (2009) for nonparametric regression. The trend filtering estimate is defined as the minimizer of a penalized least squares criterion, in which the penalty term sums the absolute kth order discrete derivatives over the input points. Perhaps not surprisingly, trend filtering estimates appear to have the structure of kth degree spline functions, with adaptively chosen knot points (we say "appear" here as trend filtering estimates are not really functions over continuous domains, and are only defined over the discrete set of inputs). This brings to mind comparisons to other nonparametric regression tools that also produce adaptive splines; in particular, we compare trend filtering to smoothing splines, which penalize the sum of squared derivatives across input points, and to locally adaptive regression splines (Mammen & van de Geer 1997), which penalize the total variation of the kth derivative. Empirically, we discover that trend filtering estimates adapt to the local level of smoothness much better than smoothing splines, and further, they exhibit a remarkable similarity to locally adaptive regression splines. We also provide theoretical support for these empirical findings; most notably, we prove that (with the right choice of tuning parameter) the trend filtering estimate converges to the true underlying function at the minimax rate for functions whose kth derivative is of bounded variation. This is done via an asymptotic pairing of trend filtering and locally adaptive regression splines, which have already been shown to converge at the minimax rate (Mammen & van de Geer 1997). At the core of this argument is a new result tying together the fitted values of two lasso problems that share the same outcome vector, but have different predictor matrices."
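The penalized criterion the abstract describes, squared loss plus an L1 penalty on discrete derivatives of the fitted values, is easy to write down even though solving it requires an L1 solver. A minimal sketch of the objective only (the names `diff_matrix` and `tf_objective` are mine, not from the paper):

```python
import numpy as np

def diff_matrix(n, order):
    """Discrete difference operator of the given order, built by
    iterating first differences on the identity matrix."""
    D = np.eye(n)
    for _ in range(order):
        D = D[1:] - D[:-1]   # first difference of the previous operator
    return D

def tf_objective(y, beta, lam, order=2):
    """Trend filtering criterion: squared loss plus an L1 penalty on
    discrete derivatives of the fitted values beta.  Second differences
    (order=2) correspond to piecewise-linear fits with adaptive knots."""
    D = diff_matrix(len(y), order)
    return 0.5 * np.sum((y - beta) ** 2) + lam * np.sum(np.abs(D @ beta))
```

Because the penalty is an L1 norm, minimizing this objective is a lasso problem in disguise, which is exactly the connection the abstract's final sentence exploits.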

to:NB
filtering
regression
statistics
splines
sparsity
lasso
kith_and_kin
tibshirani.ryan
have_read
april 2013 by cshalizi

Spatial spline regression models - Sangalli - 2013 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library

march 2013 by cshalizi

"We describe a model for the analysis of data distributed over irregularly shaped spatial domains with complex boundaries, strong concavities and interior holes. Adopting an approach that is typical of functional data analysis, we propose a spatial spline regression model that is computationally efficient, allows for spatially distributed covariate information and can impose various conditions over the boundaries of the domain. Accurate surface estimation is achieved by the use of piecewise linear and quadratic finite elements."

statistics
splines
spatial_statistics
functional_data_analysis
smoothing
re:small-area_estimation_by_smoothing
in_NB
march 2013 by cshalizi

Smoothing parameter selection in two frameworks for penalized splines - Krivobokova - 2013 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library

march 2013 by cshalizi

"There are two popular smoothing parameter selection methods for spline smoothing. First, smoothing parameters can be estimated by minimizing criteria that approximate the average mean-squared error of the regression function estimator. Second, the maximum likelihood paradigm can be employed, under the assumption that the regression function is a realization of some stochastic process. The asymptotic properties of both smoothing parameter estimators for penalized splines are studied and compared. A simulation study and a real data example illustrate the theoretical findings."
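The first class of selectors the abstract mentions, criteria approximating the average mean-squared error, can be illustrated with generalized cross-validation for a simple second-difference smoother. The smoother here is a crude stand-in for a penalized spline, and the setup is my assumption, not the paper's:

```python
import numpy as np

def gcv(y, lam):
    """Generalized cross-validation score for a second-difference
    penalized smoother with hat matrix S = (I + lam * D2'D2)^{-1}:
    GCV(lam) = n * ||(I - S) y||^2 / tr(I - S)^2.
    Minimizing over lam approximates minimizing average MSE."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)             # second-difference matrix
    S = np.linalg.inv(np.eye(n) + lam * D2.T @ D2)   # smoother (hat) matrix
    resid = y - S @ y
    return n * np.sum(resid ** 2) / np.trace(np.eye(n) - S) ** 2
```

In practice one evaluates `gcv` over a grid of `lam` values and keeps the minimizer; the maximum-likelihood route the abstract contrasts with this would instead treat the fit as a mixed model.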

in_NB
splines
statistics
regression
smoothing
march 2013 by cshalizi

[1303.3365] Adaptive Priors based on Splines with Random Knots

march 2013 by cshalizi

"Splines are useful building blocks when constructing priors on nonparametric models indexed by functions. Recently it has been established in the literature that hierarchical priors based on splines with a random number of equally spaced knots and random coefficients in the B-spline basis corresponding to those knots lead, under certain conditions, to adaptive posterior contraction rates, over certain smoothness functional classes. In this paper we extend these results for when the location of the knots is also endowed with a prior. This has already been a common practice in MCMC applications, where the resulting posterior is expected to be more "spatially adaptive", but a theoretical basis in terms of adaptive contraction rates was missing. Under some mild assumptions, we establish a result that provides sufficient conditions for adaptive contraction rates in a range of models."

to:NB
bayesianism
splines
nonparametrics
statistics
march 2013 by cshalizi

Balek, Mizera: Mechanical models in nonparametric regression

march 2013 by cshalizi

"what would happen if the metaphor underlying splines were replaced by a metaphor of plastic energy"

to:NB
splines
regression
nonparametrics
statistics
march 2013 by cshalizi

Fast bivariate P-splines: the sandwich smoother - Xiao - 2013 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) - Wiley Online Library

march 2013 by cshalizi

"We propose a fast penalized spline method for bivariate smoothing. Univariate P-spline smoothers are applied simultaneously along both co-ordinates. The new smoother has a sandwich form which suggested the name ‘sandwich smoother’ to a referee. The sandwich smoother has a tensor product structure that simplifies an asymptotic analysis and it can be computed quickly. We derive a local central limit theorem for the sandwich smoother, with simple expressions for the asymptotic bias and variance, by showing that the sandwich smoother is asymptotically equivalent to a bivariate kernel regression estimator with a product kernel. As far as we are aware, this is the first central limit theorem for a bivariate spline estimator of any type. Our simulation study shows that the sandwich smoother is orders of magnitude faster to compute than other bivariate spline smoothers, even when the latter are computed by using a fast generalized linear array model algorithm, and comparable with them in terms of mean integrated squared errors. We extend the sandwich smoother to array data of higher dimensions, where a generalized linear array model algorithm improves the computational speed of the sandwich smoother. One important application of the sandwich smoother is to estimate covariance functions in functional data analysis. In this application, our numerical results show that the sandwich smoother is orders of magnitude faster than local linear regression. The speed of the sandwich formula is important because functional data sets are becoming quite large."
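The "sandwich" structure is simple to state in code: one univariate smoother matrix applied along each coordinate of the data matrix, giving the fit S1 Y S2'. The sketch below uses a crude second-difference smoother in place of the paper's univariate P-spline smoothers; the function names are mine:

```python
import numpy as np

def smoother_matrix(n, lam):
    """Hat matrix of a second-difference penalized smoother, standing in
    for the paper's univariate P-spline smoother."""
    D2 = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.inv(np.eye(n) + lam * D2.T @ D2)

def sandwich_smooth(Y, lam_row, lam_col):
    """Bivariate smoothing in sandwich form: smooth down the columns
    with S1, then along the rows with S2, i.e. S1 @ Y @ S2.T."""
    S1 = smoother_matrix(Y.shape[0], lam_row)
    S2 = smoother_matrix(Y.shape[1], lam_col)
    return S1 @ Y @ S2.T
```

The tensor-product structure is what makes this fast: two small univariate solves replace one large bivariate one.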

to:NB
smoothing
splines
regression
to_teach:undergrad-ADA
march 2013 by cshalizi

Smoothing Spline ANOVA Models

february 2013 by cshalizi

"Nonparametric function estimation with stochastic data, otherwise known as smoothing, has been studied by several generations of statisticians. Assisted by the ample computing power in today's servers, desktops, and laptops, smoothing methods have been finding their ways into everyday data analysis by practitioners. While scores of methods have proved successful for univariate smoothing, ones practical in multivariate settings number far less. Smoothing spline ANOVA models are a versatile family of smoothing methods derived through roughness penalties, that are suitable for both univariate and multivariate problems.

"In this book, the author presents a treatise on penalty smoothing under a unified framework. Methods are developed for (i) regression with Gaussian and non-Gaussian responses as well as with censored lifetime data; (ii) density and conditional density estimation under a variety of sampling schemes; and (iii) hazard rate estimation with censored life time data and covariates. The unifying themes are the general penalized likelihood method and the construction of multivariate models with built-in ANOVA decompositions. Extensive discussions are devoted to model construction, smoothing parameter selection, computation, and asymptotic convergence."

in_NB
books:noted
smoothing
splines
regression
density_estimation
statistics
nonparametrics
additive_models
february 2013 by cshalizi

Heckman: The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy

october 2012 by cshalizi

"The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $\sum_j (Y_j - \mu(t_j))^2 + \lambda \int_a^b [\mu''(t)]^2 \, dt$, where the data are $t_j, Y_j$, $j = 1, \dots, n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite-dimensional space to minimizing over a finite-dimensional space occurs for more general objective functions: the data may be related to the function $\mu$ in another way, the sum of squares may be replaced by a more suitable expression, or the penalty, $\int_a^b [\mu''(t)]^2 \, dt$, might take a different form. This paper reviews the Reproducing Kernel Hilbert Space structure that provides a finite-dimensional solution for a general minimization problem. Particular attention is paid to the construction and study of the Reproducing Kernel Hilbert Space corresponding to a penalty based on a linear differential operator. In this case, one can often calculate the minimizer explicitly, using Green's functions."

to:NB
statistics
optimization
splines
hilbert_space
october 2012 by cshalizi

[1208.3920] Asymptotics for penalized splines in generalized additive models

august 2012 by cshalizi

"This paper discusses asymptotic theory for penalized spline estimators in generalized additive models. The purpose of this paper is to establish the asymptotic bias and variance as well as the asymptotic normality of the penalized spline estimators proposed by Marx and Eilers (1998). Furthermore, the asymptotics for the penalized quasi likelihood fit in mixed models are also discussed."

in_NB
additive_models
regression
splines
statistics
august 2012 by cshalizi

Filters and Full Employment (Not Wonkish, Really) - NYTimes.com

july 2012 by cshalizi

Dear God, are real grown-up economists confusing _the trend of realized output_ with _potential output_? That's insane (but apparently the case - at the Fed, no less). Well, at least explaining what's going wrong here can make a useful homework problem for undergraduate data analysis.

bad_data_analysis
macroeconomics
filtering
time_series
utter_stupidity
to_teach:undergrad-ADA
splines
july 2012 by cshalizi

[1205.5314] A general spline representation for nonparametric and semiparametric density estimates using diffeomorphisms

june 2012 by cshalizi

"A theorem of McCann shows that for any two absolutely continuous probability measures on R^d there exists a monotone transformation sending one probability measure to the other. A consequence of this theorem, relevant to statistics, is that density estimation can be recast in terms of transformations. In particular, one can fix any absolutely continuous probability measure, call it P, and then reparameterize the whole class of absolutely continuous probability measures as monotone transformations from P. In this paper we utilize this reparameterization of densities, as monotone transformations from some P, to construct semiparametric and nonparametric density estimates. We focus our attention on classes of transformations, developed in the image processing and computational anatomy literature, which are smooth, invertible and which have attractive computational properties. The techniques developed for this class of transformations allow us to show that a penalized maximum likelihood estimate (PMLE) of a smooth transformation from P exists and has a finite dimensional characterization, similar to those results found in the spline literature. These results are derived utilizing an Euler-Lagrange characterization of the PMLE which also establishes a surprising connection to a generalization of Stein's lemma for characterizing the normal distribution."
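The reparameterization of densities as monotone transformations of a fixed reference measure rests on the change-of-variables formula, which a short stdlib-only sketch makes concrete. This is an illustration of the formula, not the paper's penalized MLE; the function name is mine:

```python
def pushforward_density(p, T_inv, T_inv_deriv):
    """Density of the image of a base density p under a monotone map T,
    by the change-of-variables formula: q(x) = p(T^{-1}(x)) |(T^{-1})'(x)|.
    Estimating a density then amounts to estimating the transformation T."""
    def q(x):
        return p(T_inv(x)) * abs(T_inv_deriv(x))
    return q
```

For example, pushing a standard normal through the affine map T(z) = 2z + 1 yields the N(1, 4) density, which is an easy check on the formula.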

in_NB
density_estimation
statistics
splines
june 2012 by cshalizi

[0805.1404] Adaptive estimation of a distribution function and its density in sup-norm loss by wavelet and spline projections

may 2012 by cshalizi

"Given an i.i.d. sample from a distribution $F$ on $\mathbb{R}$ with uniformly continuous density $p_0$, purely data-driven estimators are constructed that efficiently estimate $F$ in sup-norm loss and simultaneously estimate $p_0$ at the best possible rate of convergence over Hölder balls, also in sup-norm loss. The estimators are obtained by applying a model selection procedure close to Lepski's method with random thresholds to projections of the empirical measure onto spaces spanned by wavelets or $B$-splines. The random thresholds are based on suprema of Rademacher processes indexed by wavelet or spline projection kernels. This requires Bernstein-type analogs of the inequalities in Koltchinskii [Ann. Statist. 34 (2006) 2593-2656] for the deviation of suprema of empirical processes from their Rademacher symmetrizations."

in_NB
density_estimation
wavelets
splines
statistics
empirical_processes
may 2012 by cshalizi

[math/0612776] Uniform error bounds for smoothing splines

april 2012 by cshalizi

"Almost sure bounds are established on the uniform error of smoothing spline estimators in nonparametric regression with random designs. Some results of Einmahl and Mason (2005) are used to derive uniform error bounds for the approximation of the spline smoother by an "equivalent" reproducing kernel regression estimator, as well as for proving uniform error bounds on the reproducing kernel regression estimator itself, uniformly in the smoothing parameter over a wide range. This admits data-driven choices of the smoothing parameter."

in_NB
splines
regression
nonparametrics
statistics
learning_theory
april 2012 by cshalizi

On a New Method of Graduation

january 2012 by cshalizi

Whittaker introduces spline smoothing in 1922, complete with the Bayesian derivation. Does not use the word "spline", however --- when did that come in?

in_NB
to_teach:undergrad-ADA
splines
smoothing
regression
statistics
have_read
january 2012 by cshalizi

[1111.1915] The theory and application of penalized methods or Reproducing Kernel Hilbert Spaces made easy

november 2011 by cshalizi

"The popular cubic smoothing spline estimate of a regression function arises as the minimizer of the penalized sum of squares $\sum_j (Y_j - \mu(t_j))^2 + \lambda \int_a^b [\mu''(t)]^2 \, dt$, where the data are $t_j, Y_j$, $j = 1, \dots, n$. The minimization is taken over an infinite-dimensional function space, the space of all functions with square integrable second derivatives. But the calculations can be carried out in a finite-dimensional space. The reduction from minimizing over an infinite dimensional space to minimizing over a finite dimensional space occurs for more general objective functions: the data may be related to the function $\mu$ in another way, the sum of squares may be replaced by a more suitable expression, or the penalty, $\int_a^b [\mu''(t)]^2 \, dt$, might take a different form. This paper reviews the Reproducing Kernel Hilbert Space structure that provides a finite-dimensional solution for a general minimization problem. Particular attention is paid to penalties based on linear differential operators. In this case, one can sometimes easily calculate the minimizer explicitly, using Green's functions."

in_NB
statistics
hilbert_space
splines
november 2011 by cshalizi

"A Locally Adaptive Penalty for Estimation of Functions of Varying Roughness"

august 2010 by cshalizi

"We propose a new regularization method called Loco-Spline for nonparametric function estimation. Loco-Spline uses a penalty which is data driven and locally adaptive. This allows for more flexible estimation of the function in regions of the domain where it has more curvature, without overfitting in regions that have little curvature. This methodology is also transferred into higher dimensions via the Smoothing Spline ANOVA framework. General conditions for optimal MSE rate of convergence are given and the Loco-Spline is shown to achieve this rate. In our simulation study, the Loco-Spline substantially outperforms the traditional smoothing spline and the locally adaptive kernel smoother. Code to fit Loco-Spline models is included with the Supplemental Materials for this article which are available online." Teach? But I'd need to explain more about splines.
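The idea of a locally adaptive penalty can be sketched with a discrete second-difference smoother whose penalty weight varies along the domain: small weights permit curvature locally, large weights enforce smoothness. This illustrates the mechanism only, not the Loco-Spline estimator itself; the function name is mine:

```python
import numpy as np

def varying_penalty_smoother(y, lam):
    """Second-difference smoother with a locally varying penalty:
    minimize ||y - mu||^2 + sum_i lam_i * (D2 mu)_i^2, so regions where
    lam_i is small are allowed more curvature without forcing the whole
    fit to be wiggly.  lam has one weight per interior point (length n-2)."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)
    W = np.diag(np.asarray(lam, float))
    return np.linalg.solve(np.eye(n) + D2.T @ W @ D2, np.asarray(y, float))
```

With a constant weight vector this collapses back to the ordinary (global-penalty) smoother, which is the baseline the Loco-Spline is compared against.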

splines
curve_fitting
smoothing
regression
statistics
to_teach:data-mining
to_read
to_teach:undergrad-ADA
august 2010 by cshalizi

"Simultaneous Confidence Bands for Penalized Spline Estimators" - Journal of the American Statistical Association - 105(490):852

july 2010 by cshalizi

"In this article we construct simultaneous confidence bands for a smooth curve using penalized spline estimators. We consider three types of estimation methods: (a) as a standard (fixed effect) nonparametric model, (b) using the mixed-model framework with the spline coefficients as random effects, and (c) a full Bayesian approach. The volume-of-tube formula is applied for the first two methods and compared with Bayesian simultaneous confidence bands from a frequentist perspective. We show that the mixed-model formulation of penalized splines can help obtain, at least approximately, confidence bands with either Bayesian or frequentist properties. Simulations and data analysis support the proposed methods. The R package ConfBands accompanies the article."

splines
confidence_sets
statistics
july 2010 by cshalizi

Kohler, Krzyzak, Schäfer: Application of structural risk minimization to multivariate smoothing spline regression estimates

march 2010 by cshalizi

Found clearing out my office; so old it's now open access!

splines
regression
structural_risk_minimization
learning_theory
estimation
statistics
have_read
march 2010 by cshalizi

"On the Equivalence of Two Stochastic Approaches to Spline Smoothing" (Ansley and Kohn, 1986)

february 2010 by cshalizi

On the connection between linear state-space models and reproducing-kernel Hilbert spaces. (Time to re-read Wahba?)

splines
smoothing
statistics
hilbert_space
state-space_models
nonparametrics
re:your_favorite_dsge_sucks
february 2010 by cshalizi

Qi, Zhao: Asymptotic efficiency and finite-sample properties of the generalized profiling estimation of parameters in ordinary differential equations

december 2009 by cshalizi

Results on the Hooker/Ramsey method for estimating ODEs.

time_series
estimation_of_dynamical_systems
statistics
estimation
splines
re:stacs
december 2009 by cshalizi

Hodrick-Prescott filter - Wikipedia, the free encyclopedia

september 2009 by cshalizi

I think you mis-spelled "smoothing spline". HTH. HAND.
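The identification is exact: the HP trend solves the same discrete penalized least squares problem as a Whittaker-type smoothing spline, with the smoothing parameter conventionally fixed at 1600 for quarterly data. A sketch, including the iterated ("boosted") variant that re-applies the smoother to residuals; here a fixed iteration count stands in for a data-driven stopping rule:

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott trend: argmin ||y - tau||^2 + lam * ||D2 tau||^2,
    i.e. exactly the Whittaker / discrete smoothing-spline problem with a
    conventional fixed smoothing parameter (1600 for quarterly data)."""
    n = len(y)
    D2 = np.diff(np.eye(n), n=2, axis=0)   # second-difference matrix
    return np.linalg.solve(np.eye(n) + lam * D2.T @ D2, np.asarray(y, float))

def boosted_hp(y, lam=1600.0, iters=5):
    """Iterated HP filter: repeatedly smooth the residuals and accumulate
    the fitted trends, so the trend estimate absorbs persistence the
    single pass misses."""
    trend = np.zeros(len(y))
    resid = np.array(y, dtype=float)       # copy, so the input is not mutated
    for _ in range(iters):
        step = hp_filter(resid, lam)
        trend += step
        resid -= step
    return trend
```

A purely linear series passes through both filters unchanged, since its second differences vanish; the point of the boosted version is what happens on stochastic trends, where one pass under-smooths the residual.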

time_series
macroeconomics
filtering
splines
wheels:reinvention_of
statistics
econometrics
re:your_favorite_dsge_sucks
september 2009 by cshalizi

Splines for Financial Volatility

june 2009 by cshalizi

"We propose a flexible generalized auto-regressive conditional heteroscedasticity type of model for the prediction of volatility in financial time series. The approach relies on the idea of using multivariate B-splines of lagged observations and volatilities. Estimation of such a B-spline basis expansion is constructed within the likelihood framework for non-Gaussian observations. As the dimension of the B-spline basis is large, i.e. many parameters, we use regularized and sparse model fitting with a boosting algorithm."

splines
statistics
finance
stochastic_volatility
in_NB
ensemble_methods
boosting
buhlmann.peter
have_read
time_series
june 2009 by cshalizi
