lahiri.s.n.   4

[1906.08843] On Statistical Properties of A Veracity Scoring Method for Spatial Data
"Measuring veracity or reliability of noisy data is of utmost importance, especially in the scenarios where the information are gathered through automated systems. In a recent paper, Chakraborty et. al. (2019) have introduced a veracity scoring technique for geostatistical data. The authors have used a high-quality `reference' data to measure the veracity of the varying-quality observations and incorporated the veracity scores in their analysis of mobile-sensor generated noisy weather data to generate efficient predictions of the ambient temperature process. In this paper, we consider the scenario when no reference data is available and hence, the veracity scores (referred as VS) are defined based on `local' summaries of the observations. We develop a VS-based estimation method for parameters of a spatial regression model. Under a non-stationary noise structure and fairly general assumptions on the underlying spatial process, we show that the VS-based estimators of the regression parameters are consistent. Moreover, we establish the advantage of the VS-based estimators as compared to the ordinary least squares (OLS) estimator by analyzing their asymptotic mean squared errors. We illustrate the merits of the VS-based technique through simulations and apply the methodology to a real data set on mass percentages of ash in coal seams in Pennsylvania."

--- I don't quite understand this abstract...
to:NB  spatial_statistics  statistics  lahiri.s.n.  to_teach:data_over_space_and_time 
8 weeks ago by cshalizi
Chatterjee, Lahiri: Rates of convergence of the Adaptive LASSO estimators to the Oracle distribution and higher order refinements by the bootstrap
"Zou [J. Amer. Statist. Assoc. 101 (2006) 1418–1429] proposed the Adaptive LASSO (ALASSO) method for simultaneous variable selection and estimation of the regression parameters, and established its oracle property. In this paper, we investigate the rate of convergence of the ALASSO estimator to the oracle distribution when the dimension of the regression parameters may grow to infinity with the sample size. It is shown that the rate critically depends on the choices of the penalty parameter and the initial estimator, among other factors, and that confidence intervals (CIs) based on the oracle limit law often have poor coverage accuracy. As an alternative, we consider the residual bootstrap method for the ALASSO estimators that has been recently shown to be consistent; cf. Chatterjee and Lahiri [J. Amer. Statist. Assoc. 106 (2011a) 608–625]. We show that the bootstrap applied to a suitable studentized version of the ALASSO estimator achieves second-order correctness, even when the dimension of the regression parameters is unbounded. Results from a moderately large simulation study show marked improvement in coverage accuracy for the bootstrap CIs over the oracle based CIs."
to:NB  lasso  high-dimensional_statistics  confidence_sets  regression  statistics  lahiri.s.n. 
june 2013 by cshalizi
[1302.3071] A penalized empirical likelihood method in high dimensions
"This paper formulates a penalized empirical likelihood (PEL) method for inference on the population mean when the dimension of the observations may grow faster than the sample size. Asymptotic distributions of the PEL ratio statistic is derived under different component-wise dependence structures of the observations, namely, (i) non-Ergodic, (ii) long-range dependence and (iii) short-range dependence. It follows that the limit distribution of the proposed PEL ratio statistic can vary widely depending on the correlation structure, and it is typically different from the usual chi-squared limit of the empirical likelihood ratio statistic in the fixed and finite dimensional case. A unified subsampling based calibration is proposed, and its validity is established in all three cases, (i)-(iii). Finite sample properties of the method are investigated through a simulation study."
to:NB  likelihood  statistics  lahiri.s.n.  to_read  high-dimensional_statistics 
february 2013 by cshalizi
Lahiri, Spiegelman, Appiah, Rilett: Gap bootstrap methods for massive data sets with an application to transportation engineering
"In this paper we describe two bootstrap methods for massive data sets. Naive applications of common resampling methodology are often impractical for massive data sets due to computational burden and due to complex patterns of inhomogeneity. In contrast, the proposed methods exploit certain structural properties of a large class of massive data sets to break up the original problem into a set of simpler subproblems, solve each subproblem separately where the data exhibit approximate uniformity and where computational complexity can be reduced to a manageable level, and then combine the results through certain analytical considerations. The validity of the proposed methods is proved and their finite sample properties are studied through a moderately large simulation study. The methodology is illustrated with a real data example from Transportation Engineering, which motivated the development of the proposed methods."

--- The to_teach for uADA is really more of a "to mention".
in_NB  statistics  bootstrap  to_read  lahiri.s.n.  to_teach:statcomp  to_teach:undergrad-ADA 
december 2012 by cshalizi

