statistics   205105

Covariance Matrix
This lesson explains how to use matrix methods to generate a variance-covariance matrix from a matrix of raw data.
matrix  statistics 
14 hours ago by lena
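
A minimal sketch of the matrix approach that the Covariance Matrix bookmark describes, assuming NumPy. The function name `covariance_matrix` and the sample scores are illustrative and not taken from the linked lesson. It also assumes the population convention (divide by n); divide by n - 1 instead for the sample estimate.

```python
import numpy as np


def covariance_matrix(X):
    """Variance-covariance matrix of the columns of X (rows are observations).

    Assumption: population convention, i.e. division by n rather than n - 1.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    ones = np.ones((n, 1))
    # Deviation scores: subtract each column mean, written as a matrix product.
    x = X - ones @ ones.T @ X / n
    # Deviation sums of squares and cross products, scaled by n.
    return x.T @ x / n


# Illustrative raw data: 5 observations (rows) of 3 variables (columns).
scores = np.array([
    [90, 60, 90],
    [90, 90, 30],
    [60, 60, 60],
    [60, 60, 90],
    [30, 30, 30],
])

V = covariance_matrix(scores)
print(V)
```

The diagonal of `V` holds the variance of each column and the off-diagonal entries hold the covariances between columns. The result should agree with `np.cov(scores, rowvar=False, bias=True)`.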
[1709.10030] Sparse Hierarchical Regression with Polynomials
We present a novel method for exact hierarchical sparse polynomial regression. Our regressor is that degree r polynomial which depends on at most k inputs, counting at most ℓ monomial terms, which minimizes the sum of the squares of its prediction errors. The previous hierarchical sparse specification aligns well with modern big data settings where many inputs are not relevant for prediction purposes and the functional complexity of the regressor needs to be controlled so as to avoid overfitting. We present a two-step approach to this hierarchical sparse regression problem. First, we discard irrelevant inputs using an extremely fast input ranking heuristic. Second, we take advantage of modern cutting plane methods for integer optimization to solve our resulting reduced hierarchical (k,ℓ)-sparse problem exactly. The ability of our method to identify all k relevant inputs and all ℓ monomial terms is shown empirically to experience a phase transition. Crucially, the same transition also presents itself in our ability to reject all irrelevant features and monomials as well. In the regime where our method is statistically powerful, its computational complexity is interestingly on par with Lasso-based heuristics. The presented work fills a void in terms of a lack of powerful disciplined nonlinear sparse regression methods in high-dimensional settings. Our method is shown empirically to scale to regression problems with n≈10,000 observations for input dimension p≈1,000.
statistics  regression  algorithms  approximation  performance-measure  to-understand  nudge-targets  consider:looking-to-see 
15 hours ago by Vaguery
