approximation   636


Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning
"Randomized neural networks are immortalized in this AI Koan: In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. What are you doing?'' asked Minsky. I am training a randomly wired neural net to play tic-tac-toe,'' Sussman replied. Why is the net wired randomly?'' asked Minsky. Sussman replied, I do not want it to have any preconceptions of how to play.'' Minsky then shut his eyes. Why do you close your eyes?'' Sussman asked his teacher. So that the room will be empty,'' replied Minsky. At that moment, Sussman was enlightened. We analyze shallow random networks with the help of concentration of measure inequalities. Specifically, we consider architectures that compute a weighted sum of their inputs after passing them through a bank of arbitrary randomized nonlinearities. We identify conditions under which these networks exhibit good classification performance, and bound their test error in terms of the size of the dataset and the number of random nonlinearities."

--- Have I never bookmarked this before?
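--- A minimal numpy sketch of the scheme described above: draw a bank of random (here, random Fourier) features once, never train them, and fit only the linear readout. The feature count, bandwidth, ridge penalty, and toy target are my own illustrative choices, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_kitchen_sinks(X_train, y_train, X_test, n_features=500, bandwidth=1.0, ridge=1e-3):
    """Fit only the output weights over a bank of *random* nonlinearities.

    Features are random Fourier features cos(w.x + b) with w ~ N(0, 1/bandwidth^2);
    the nonlinearities are never trained, only the linear readout is.
    """
    d = X_train.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)

    def phi(X):
        return np.cos(X @ W + b) * np.sqrt(2.0 / n_features)

    Z = phi(X_train)
    # Ridge-regularized least squares for the readout weights alpha.
    alpha = np.linalg.solve(Z.T @ Z + ridge * np.eye(n_features), Z.T @ y_train)
    return phi(X_test) @ alpha

# Toy usage: learn sign(sin(3x)) on [-3, 3].
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sign(np.sin(3 * X[:, 0]))
X_new = np.linspace(-3, 3, 7).reshape(-1, 1)
print(np.sign(random_kitchen_sinks(X, y, X_new)))
```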
in_NB  approximation  kernel_methods  random_projections  statistics  prediction  classifiers  rahimi.ali  recht.benjamin  machine_learning  have_read 
4 weeks ago by cshalizi
[1808.05587] Deep Convolutional Networks as shallow Gaussian Processes
We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters, extending similar results for dense networks. For a CNN, the equivalent kernel can be computed exactly and, unlike "deep kernels", has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters.
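--- A minimal sketch of the infinite-width kernel recursion for the fully connected ReLU case that this paper extends to convolutions (this is the "similar results for dense networks" the abstract mentions, not the CNN kernel itself); the depth and weight/bias variances below are illustrative assumptions.

```python
import numpy as np

def nngp_relu_kernel(X, depth=8, sigma_w2=2.0, sigma_b2=0.1):
    """Kernel of the GP limit of a deep fully connected ReLU network.

    X: (n, d) array of inputs. Returns the (n, n) kernel after `depth` layers,
    using the standard arc-cosine recursion for the ReLU expectation.
    """
    n, d = X.shape
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d          # layer-0 (input) covariance
    for _ in range(depth):
        diag = np.sqrt(np.diag(K))
        norm = np.outer(diag, diag)
        cos_theta = np.clip(K / norm, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # E[relu(u) relu(v)] for zero-mean Gaussians (u, v) with covariance K:
        expectation = norm * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
        K = sigma_b2 + sigma_w2 * expectation
    return K

# Toy usage: this kernel could then be plugged into ordinary GP regression.
X = np.random.default_rng(0).normal(size=(5, 3))
print(nngp_relu_kernel(X).round(3))
```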
via:cshalizi  neural-networks  approximation  rather-interesting  representation  deep-learning  algorithms  to-write-about  equivalences-of-motion-over-different-paths 
7 weeks ago by Vaguery
Lattice Boltzmann methods - Wikipedia
"Lattice Boltzmann methods (LBM) ... is a class of computational fluid dynamics (CFD) methods for fluid simulation. " Successor to lattice gas automaton methods.
article  algorithm  approximation  fluid  mechanics 
10 weeks ago by schahn
[1806.06850] Polynomial Regression As an Alternative to Neural Nets
"Despite the success of neural networks (NNs), there is still a concern among many over their "black box" nature. Why do they work? Here we present a simple analytic argument that NNs are in fact essentially polynomial regression models. This view will have various implications for NNs, e.g. providing an explanation for why convergence problems arise in NNs, and it gives rough guidance on avoiding overfitting. In addition, we use this phenomenon to predict and confirm a multicollinearity property of NNs not previously reported in the literature. Most importantly, given this loose correspondence, one may choose to routinely use polynomial models instead of NNs, thus avoiding some major problems of the latter, such as having to set many tuning parameters and dealing with convergence issues. We present a number of empirical results; in each case, the accuracy of the polynomial approach matches or exceeds that of NN approaches. A many-featured, open-source software package, polyreg, is available."

--- Matloff is the author of my favorite "R programming for n00bs" textbook...
--- ETA after reading: the argument that multi-layer neural networks "are essentially" polynomial regression is a bit weak. It would be exactly true if the activation functions were exactly polynomial, which they rarely are in practice. If non-polynomial activations happen to be implemented, in computational practice, by polynomials (e.g., Taylor approximations), well, the resulting polynomial is an artifact of the implementation: different hardware gives a different polynomial, and we can crank the degree of the approximation up as high as we like. (Said a little differently: if you buy this line of argument, you should also buy that _every_ smooth statistical model "is essentially" polynomial regression, which seems a bit much.) It is also an argument about the function-approximation properties of the model classes, and not about the fitting processes, despite the explicit disclaimers.
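--- A quick way to look for yourself, assuming scikit-learn; the polynomial degree, network architecture, and synthetic target below are arbitrary choices, not Matloff's setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(2000, 3))
y = np.sin(X[:, 0]) * X[:, 1] + 0.5 * X[:, 2]**2 + rng.normal(scale=0.1, size=2000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Degree-4 polynomial regression...
poly = make_pipeline(PolynomialFeatures(degree=4), LinearRegression()).fit(X_tr, y_tr)
# ...versus a small MLP with tanh activations.
nn = make_pipeline(StandardScaler(),
                   MLPRegressor(hidden_layer_sizes=(50, 50), activation='tanh',
                                max_iter=5000, random_state=0)).fit(X_tr, y_tr)

print("polynomial R^2:", round(poly.score(X_te, y_te), 3))
print("MLP        R^2:", round(nn.score(X_te, y_te), 3))
```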
to:NB  your_favorite_deep_neural_network_sucks  regression  neural_networks  statistics  matloff.norman  approximation  computational_statistics  have_read 
july 2018 by cshalizi
[1806.10909] ResNet with one-neuron hidden layers is a Universal Approximator
We demonstrate that a very deep ResNet with stacked modules having one neuron per hidden layer and ReLU activation functions can uniformly approximate any Lebesgue integrable function in d dimensions, i.e. ℓ¹(ℝᵈ). Because of the identity mapping inherent to ResNets, our network has alternating layers of dimension one and d. This stands in sharp contrast to fully connected networks, which are not universal approximators if their width is the input dimension d [Lu et al., 2017]. Hence, our result implies an increase in representational power for narrow deep networks by the ResNet architecture.
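--- A structural sketch (random weights, no training, not the paper's construction) of the architecture in question: residual blocks whose hidden layer is a single ReLU unit, so the computation alternates between dimension d and dimension 1.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

def one_neuron_resnet(x, n_blocks=100):
    """Forward pass of a ResNet whose every residual branch is d -> 1 -> d.

    Each block computes x + V * relu(w . x + b): the hidden layer is a single
    ReLU neuron, and the identity skip keeps the representation d-dimensional.
    Weights here are random; the paper's point is what such stacks can express.
    """
    d = x.shape[0]
    for _ in range(n_blocks):
        w = rng.normal(size=d)          # d -> 1
        b = rng.normal()
        V = rng.normal(size=d) / d      # 1 -> d
        x = x + V * relu(w @ x + b)     # residual connection
    return x

print(one_neuron_resnet(np.array([0.3, -1.2, 0.7, 2.0])))
```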
representation  neural-networks  approximation  rather-interesting  to-write-about  to-do 
june 2018 by Vaguery
[1806.06323] Approximate Submodular Functions and Performance Guarantees
We consider the problem of maximizing non-negative non-decreasing set functions. Although most of the recent work focuses on exploiting submodularity, several objectives we encounter in practice turn out not to be submodular. Nonetheless, we often apply the greedy algorithms used for submodular functions to these non-submodular objectives as well. Here we propose to address the original problem by approximating the non-submodular function, and we analyze the incurred error as well as the performance trade-offs. To quantify the approximation error, we introduce the novel concept of a δ-approximation of a function, which we use to define the space of submodular functions that lie within a given approximation error. We provide necessary conditions for the existence of such δ-approximation functions, which need not be unique. Consequently, we characterize this subspace, which we refer to as the region of submodularity. Furthermore, since submodular functions are known to lead to different sub-optimality guarantees, we generalize those dependencies upon a δ-approximation into the notion of greedy curvature. Finally, we use this latter notion to simplify some of the existing results and to efficiently (i.e., in linear time) determine tightened bounds on the sub-optimality guarantees for objective functions commonly used in practical setups, validating them on real data.
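--- A minimal sketch of the greedy heuristic the abstract leverages, here for maximizing a monotone set function under a cardinality constraint; the weighted-coverage toy objective is my own example (it actually is submodular), and results like the paper's quantify how the usual greedy guarantee degrades when the objective is only approximately so.

```python
def greedy_max(f, ground_set, k):
    """Greedily pick k elements, each time adding the largest marginal gain."""
    S = set()
    for _ in range(k):
        gains = {e: f(S | {e}) - f(S) for e in ground_set - S}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:            # monotone objectives never hit this, but be safe
            break
        S.add(best)
    return S

# Toy usage: coverage (which is submodular) over a few sets.
sets = {'a': {1, 2, 3}, 'b': {3, 4}, 'c': {4, 5, 6, 7}, 'd': {1, 7}}
coverage = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
print(greedy_max(coverage, set(sets), k=2))   # picks {'c', 'a'}
```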
submodularity  approximation 
june 2018 by arsyed
Multivariate approximation | Numerical analysis | Cambridge University Press
"This self-contained, systematic treatment of multivariate approximation begins with classical linear approximation, and moves on to contemporary nonlinear approximation. It covers substantial new developments in the linear approximation theory of classes with mixed smoothness, and shows how it is directly related to deep problems in other areas of mathematics. For example, numerical integration of these classes is closely related to discrepancy theory and to nonlinear approximation with respect to special redundant dictionaries, and estimates of the entropy numbers of classes with mixed smoothness are closely related to (in some cases equivalent to) the Small Ball Problem from probability theory. The useful background material included in the book makes it accessible to graduate students. Researchers will find that the many open problems in the theory outlined in the book provide helpful directions and guidance for their own research in this exciting and active area."
to:NB  approximation  mathematics  books:noted 
june 2018 by cshalizi
One Parameter Is Always Enough
This is very cute, and deserves some attention.
Being me, I'll grump a bit. As he says at the end, this is just an elaboration of the point that we get a class of _infinite_ VC dimension by thresholding sin(kx), i.e., we can correctly classify any arbitrarily large set of points by tweaking k appropriately. Since this was known in the 1970s (proved by V&C themselves, if memory serves), it's been kind of insane that people continue to count parameters and pat themselves on the back about Occam. (See http://bactra.org/weblog/921.html et seq. for more.) Still, the elephant is nice.
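--- The construction is easy to play with, assuming the classic choice of points x_i = 10^(-i); double-precision rounding limits how many points you can shatter this way before sin() loses the signal, so keep n modest.

```python
import numpy as np

def one_param_classifier(x, omega):
    """The one-parameter model class: label 1 iff sin(omega * x) > 0."""
    return (np.sin(omega * x) > 0).astype(int)

def shattering_omega(labels):
    """Pick omega so that sign(sin(omega * 10^-i)) reproduces any 0/1 labels."""
    i = np.arange(1, len(labels) + 1)
    return np.pi * (1 + np.sum((1 - np.asarray(labels)) * 10.0**i))

labels = [1, 0, 0, 1, 1, 0]                  # any labels you like (n = 6 here)
x = 10.0 ** -np.arange(1, len(labels) + 1)   # the points being "classified"
omega = shattering_omega(labels)
print(one_param_classifier(x, omega))        # recovers [1, 0, 0, 1, 1, 0]
```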
model_selection  approximation  dynamical_systems  chaos  have_read  to:blog 
may 2018 by cshalizi
[1712.08373] Notes on complexity of packing coloring
A packing k-coloring, for some integer k, of a graph G=(V,E) is a mapping φ: V → {1,…,k} such that any two vertices u, v of the same color φ(u)=φ(v) are at distance at least φ(u)+1. This concept is motivated by frequency assignment problems. The packing chromatic number of G is the smallest k such that there exists a packing k-coloring of G.
Fiala and Golovach showed that determining the packing chromatic number for chordal graphs is NP-complete for diameter exactly 5. While the problem is easy to solve for diameter 2, we show NP-completeness for any diameter at least 3. Our reduction also shows that the packing chromatic number is hard to approximate within n^(1/2−ε) for any ε>0.
In addition, we design an FPT algorithm for interval graphs of bounded diameter. This leads us to explore the problem of finding a partial coloring that maximizes the number of colored vertices.
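--- A literal transcription of the definition, assuming networkx; handy as a brute-force sanity check on small graphs (the path-graph example is mine).

```python
import networkx as nx

def is_packing_coloring(G, phi):
    """Check that vertices sharing color c are pairwise at distance >= c + 1."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    nodes = list(G.nodes)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            c = phi[u]
            if c == phi[v] and dist[u].get(v, float('inf')) < c + 1:
                return False
    return True

# Toy usage: the path on 4 vertices has packing chromatic number 3.
P4 = nx.path_graph(4)                                       # vertices 0-1-2-3
print(is_packing_coloring(P4, {0: 1, 1: 2, 2: 1, 3: 3}))    # True
print(is_packing_coloring(P4, {0: 1, 1: 2, 2: 1, 3: 2}))    # False: the two 2s are at distance 2
```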
graph-theory  algorithms  combinatorics  proof  approximation  nudge-targets  consider:looking-to-see  consider:feature-discovery 
march 2018 by Vaguery
