approximation   651


[1612.07545] A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search
"Approximate Nearest Neighbor Search (ANNS) is a fundamental problem in many areas of machine learning and data mining. During the past decade, numerous hashing algorithms are proposed to solve this problem. Every proposed algorithm claims outperform other state-of-the-art hashing methods. However, the evaluation of these hashing papers was not thorough enough, and those claims should be re-examined. The ultimate goal of an ANNS method is returning the most accurate answers (nearest neighbors) in the shortest time. If implemented correctly, almost all the hashing methods will have their performance improved as the code length increases. However, many existing hashing papers only report the performance with the code length shorter than 128. In this paper, we carefully revisit the problem of search with a hash index, and analyze the pros and cons of two popular hash index search procedures. Then we proposed a very simple but effective two level index structures and make a thorough comparison of eleven popular hashing algorithms. Surprisingly, the random-projection-based Locality Sensitive Hashing (LSH) is the best performed algorithm, which is in contradiction to the claims in all the other ten hashing papers. Despite the extreme simplicity of random-projection-based LSH, our results show that the capability of this algorithm has been far underestimated. For the sake of reproducibility, all the codes used in the paper are released on GitHub, which can be used as a testing platform for a fair comparison between various hashing algorithms."
to:NB  data_mining  approximation  nearest_neighbors  locality-sensitive_hashing  hashing  have_read  via:vaguery  random_projections  k-means  databases 
11 days ago by cshalizi
[1612.07545] A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search
Approximate Nearest Neighbor Search (ANNS) is a fundamental problem in many areas of machine learning and data mining. During the past decade, numerous hashing algorithms have been proposed to solve this problem. Every proposed algorithm claims to outperform other state-of-the-art hashing methods. However, the evaluation in these hashing papers was not thorough enough, and those claims should be re-examined. The ultimate goal of an ANNS method is to return the most accurate answers (nearest neighbors) in the shortest time. If implemented correctly, almost all the hashing methods will have their performance improve as the code length increases. However, many existing hashing papers only report performance with code lengths shorter than 128. In this paper, we carefully revisit the problem of search with a hash index and analyze the pros and cons of two popular hash-index search procedures. We then propose a very simple but effective two-level index structure and make a thorough comparison of eleven popular hashing algorithms. Surprisingly, random-projection-based Locality Sensitive Hashing (LSH) is the best-performing algorithm, which contradicts the claims in all ten of the other hashing papers. Despite the extreme simplicity of random-projection-based LSH, our results show that the capability of this algorithm has been far underestimated. For the sake of reproducibility, all the code used in the paper is released on GitHub, which can be used as a testing platform for a fair comparison between various hashing algorithms.
hashing  algorithms  approximation  dimension-reduction  representation  data-analysis  feature-extraction  nudge-targets  consider:looking-to-see  to-write-about 
11 days ago by Vaguery
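--- Since the paper's headline finding is that plain random-projection LSH wins, here is a minimal sketch of that idea, assuming NumPy: hash each vector by the signs of a handful of random projections, bucket points by their hash code, and answer a query by ranking the candidates in its bucket by true distance. Dataset size, dimension, and code length below are made up for illustration; a real index would use several tables (or something like the paper's two-level structure) to boost recall.

import numpy as np
from collections import defaultdict

def build_lsh(X, n_bits=12, seed=0):
    """Sign-of-random-projection hashing: one table of n_bits-bit codes."""
    rng = np.random.default_rng(seed)
    R = rng.standard_normal((X.shape[1], n_bits))          # random hyperplane normals
    bits = (X @ R > 0).astype(np.int64)                    # (n, n_bits) sign bits
    powers = 2 ** np.arange(n_bits, dtype=np.int64)
    table = defaultdict(list)
    for i, key in enumerate(bits @ powers):                # integer code per point
        table[int(key)].append(i)
    return R, powers, table

def query_lsh(q, X, R, powers, table, k=5):
    """Hash the query, then rank its bucket's candidates by true distance."""
    key = int((q @ R > 0).astype(np.int64) @ powers)
    cand = table.get(key, []) or list(range(len(X)))       # empty bucket: brute force
    cand = np.asarray(cand)
    d = np.linalg.norm(X[cand] - q, axis=1)
    return cand[np.argsort(d)[:k]]

X = np.random.default_rng(1).standard_normal((10_000, 64))
R, powers, table = build_lsh(X)
print(query_lsh(X[0], X, R, powers, table, k=3))           # nearest should be index 0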
[1712.08911] Largest and Smallest Area Triangles on Imprecise Points
Assume we are given a set of parallel line segments in the plane, and we wish to place a point on each line segment such that the resulting point set maximizes or minimizes the area of the largest or smallest triangle in the set. We analyze the complexity of the four resulting computational problems, and we show that three of them admit polynomial-time algorithms, while the fourth is NP-hard. Specifically, we show that maximizing the largest triangle can be done in O(n²) time (or in O(n log n) time for unit segments); minimizing the largest triangle can be done in O(n² log n) time; maximizing the smallest triangle is NP-hard; but minimizing the smallest triangle can be done in O(n²) time. We also discuss to what extent our results can be generalized to polygons with k > 3 sides.
computational-geometry  rather-interesting  approximation  plane-geometry  optimization  to-write-about  nudge-targets  consider:looking-to-see 
27 days ago by Vaguery
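--- Not the paper's algorithms; just a brute-force illustration of the problem statement, assuming NumPy: place each point at one of a few sampled positions on its (parallel) segment and record the attainable extremes of the largest and smallest triangle areas. Because it only tries a grid of candidate placements, it brackets rather than solves the continuous optimization, and it is exponential in the number of segments.

import numpy as np
from itertools import combinations, product

def triangle_area(p, q, r):
    """Area of triangle pqr via the cross product."""
    return 0.5 * abs((q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0]))

def extremes_brute_force(segments, samples=3):
    """segments: list of ((x0, y0), (x1, y1)) parallel segments.
    Try `samples` evenly spaced placements per segment (endpoints included)
    and report the extremes of the largest/smallest triangle areas seen."""
    candidates = []
    for a, b in segments:
        a, b = np.array(a, float), np.array(b, float)
        candidates.append([a + t * (b - a) for t in np.linspace(0.0, 1.0, samples)])
    best = {"max_largest": -np.inf, "min_largest": np.inf,
            "max_smallest": -np.inf, "min_smallest": np.inf}
    for placement in product(*candidates):                 # exponential in n -- toy only
        areas = [triangle_area(p, q, r) for p, q, r in combinations(placement, 3)]
        largest, smallest = max(areas), min(areas)
        best["max_largest"] = max(best["max_largest"], largest)
        best["min_largest"] = min(best["min_largest"], largest)
        best["max_smallest"] = max(best["max_smallest"], smallest)
        best["min_smallest"] = min(best["min_smallest"], smallest)
    return best

segs = [((x, 0.0), (x, 1.0)) for x in range(4)]            # four vertical unit segments
print(extremes_brute_force(segs))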
[1807.09885] A Polynomial Time Constant Approximation For Minimizing Total Weighted Flow-time
We consider the classic scheduling problem of minimizing the total weighted flow-time on a single machine (min-WPFT), when preemption is allowed. In this problem, we are given a set of n jobs, each job having a release time r_j, a processing time p_j, and a weight w_j. The flow-time of a job is defined as the amount of time the job spends in the system before it completes; that is, F_j = C_j − r_j, where C_j is the completion time of job j. The objective is to minimize the total weighted flow-time of jobs.
This NP-hard problem has been studied quite extensively for decades. In a recent breakthrough, Batra, Garg, and Kumar presented a pseudo-polynomial time algorithm that has an O(1) approximation ratio. The design of a truly polynomial time algorithm, however, remained an open problem. In this paper, we show a transformation from pseudo-polynomial time algorithms to polynomial time algorithms in the context of min-WPFT. Our result, combined with the result of Batra, Garg, and Kumar, settles the long-standing conjecture that there is a polynomial time algorithm with O(1)-approximation for min-WPFT.
approximation  mathematical-programming  optimization  computational-complexity  NP-hard  heuristics  rather-interesting  to-write-about 
4 weeks ago by Vaguery
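--- A sketch of the objective being approximated, not of Batra-Garg-Kumar or of this paper's transformation: simulate a preemptive single machine under a simple highest-density-first rule (a common heuristic, chosen here only for concreteness) and evaluate the total weighted flow-time sum_j w_j (C_j − r_j). The job data and time step are made up.

def weighted_flow_time(jobs, horizon_steps=10_000, dt=1.0):
    """jobs: list of dicts with keys r (release), p (processing), w (weight).
    Preemptive single machine; at each step run the released, unfinished job
    with the largest weight / remaining-work ratio.  Returns sum_j w_j * (C_j - r_j)."""
    remaining = [j["p"] for j in jobs]
    completion = [None] * len(jobs)
    t = 0.0
    for _ in range(horizon_steps):                          # crude time discretization
        if all(c is not None for c in completion):
            break
        avail = [i for i in range(len(jobs))
                 if jobs[i]["r"] <= t and completion[i] is None]
        if avail:
            i = max(avail, key=lambda i: jobs[i]["w"] / remaining[i])
            remaining[i] -= dt
            if remaining[i] <= 1e-9:
                completion[i] = t + dt
        t += dt
    return sum(j["w"] * (completion[i] - j["r"]) for i, j in enumerate(jobs))

jobs = [{"r": 0, "p": 3, "w": 1}, {"r": 1, "p": 1, "w": 5}, {"r": 2, "p": 2, "w": 2}]
print(weighted_flow_time(jobs))                             # 15.0 for this tiny instance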
[1806.04484] A Fourier-Analytic Approach for the Discrepancy of Random Set Systems
One of the prominent open problems in combinatorics is the discrepancy of set systems where each element lies in at most t sets. The Beck-Fiala conjecture suggests that the right bound is O(√t), but for three decades the only known bound not depending on the size of the set system has been O(t). Arguably we currently lack techniques for breaking that barrier.
In this paper we introduce discrepancy bounds based on Fourier analysis. We demonstrate our method on random set systems. Suppose one has n elements and m sets containing each element independently with probability p. We prove that in the regime n ≥ Θ(m² log m), the discrepancy is at most 1 with high probability. Previously, a result of Ezra and Lovett gave a bound of O(1) under the stricter assumption that n ≫ m^t.
combinatorics  open-questions  probability-theory  approximation  limits  hypergraphs  to-understand  consider:looking-to-see 
4 weeks ago by Vaguery
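--- The quantity being bounded, computed the dumb way: build a random set system (each of n elements joins each of m sets independently with probability p) and take the minimum over all ±1 colorings of the maximum set imbalance. Only feasible for tiny n; assumes NumPy, and the n, m, p below are arbitrary.

import numpy as np
from itertools import product

def random_set_system(n, m, p, seed=0):
    """m x n incidence matrix: each element joins each set independently w.p. p."""
    rng = np.random.default_rng(seed)
    return (rng.random((m, n)) < p).astype(int)

def discrepancy(A):
    """min over colorings x in {-1,+1}^n of max_j |sum_{i in S_j} x_i|."""
    n = A.shape[1]
    best = np.inf
    for signs in product((-1, 1), repeat=n):                # 2^n colorings -- toy sizes only
        best = min(best, np.max(np.abs(A @ np.array(signs))))
    return int(best)

A = random_set_system(n=12, m=6, p=0.3)
print(discrepancy(A))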
Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks
"The performance of neural networks on high-dimensional data distributions suggests that it may be possible to parameterize a representation of a given high-dimensional function with controllably small errors, potentially outperforming standard interpolation methods. We demonstrate, both theoretically and numerically, that this is indeed the case. We map the parameters of a neural network to a system of particles relaxing with an interaction potential determined by the loss function. We show that in the limit that the number of parameters n is large, the landscape of the mean-squared error becomes convex and the representation error in the function scales as O(n−1). In this limit, we prove a dynamical variant of the universal approximation theorem showing that the optimal representation can be attained by stochastic gradient descent, the algorithm ubiquitously used for parameter optimization in machine learning. In the asymptotic regime, we study the fluctuations around the optimal representation and show that they arise at a scale O(n−1). These fluctuations in the landscape identify the natural scale for the noise in stochastic gradient descent. Our results apply to both single and multi-layer neural networks, as well as standard kernel methods like radial basis functions."

!!!
to:NB  to_read  neural_networks  approximation  learning_theory  via:rvenkat 
5 weeks ago by cshalizi
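--- A toy version of the setup rather than anything from the paper's proofs: a shallow network in the mean-field scaling f(x) = (1/n) Σ_i c_i tanh(a_i x + b_i), trained by mini-batch SGD on a one-dimensional target, so you can watch how the fitted error behaves as the number of units n grows. The target function, step size (including the factor of n that keeps per-particle updates O(1)), and iteration counts are illustrative choices; assumes NumPy.

import numpy as np

rng = np.random.default_rng(0)
target = lambda x: np.sin(3 * x)                      # 1-d function to represent

def fit_mean_field_net(n, steps=5000, lr=0.5, batch=64):
    """f(x) = (1/n) * sum_i c_i * tanh(a_i * x + b_i), trained by mini-batch SGD.
    Returns the mean-squared representation error on a test grid."""
    a, b, c = rng.standard_normal(n), rng.standard_normal(n), rng.standard_normal(n)
    for _ in range(steps):
        x = rng.uniform(-1, 1, size=batch)
        h = np.tanh(np.outer(x, a) + b)               # (batch, n) hidden activations
        err = h @ c / n - target(x)                   # residuals on the mini-batch
        # gradients of 0.5 * mean(err^2) with respect to each particle (a_i, b_i, c_i)
        gc = h.T @ err / (n * batch)
        gz = np.outer(err, c / n) * (1 - h ** 2)      # back-prop through tanh
        ga = gz.T @ x / batch
        gb = gz.sum(axis=0) / batch
        # extra factor of n keeps per-particle step sizes O(1) in this scaling
        a, b, c = a - lr * n * ga, b - lr * n * gb, c - lr * n * gc
    xs = np.linspace(-1, 1, 400)
    return np.mean((np.tanh(np.outer(xs, a) + b) @ c / n - target(xs)) ** 2)

for n in (10, 100, 1000):
    print(n, fit_mean_field_net(n))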
[1807.10489] Randomized residual-based error estimators for parametrized equations
We propose a randomized a posteriori error estimator for reduced-order approximations of parametrized (partial) differential equations. The error estimator has several important properties: the effectivity is close to unity with prescribed lower and upper bounds at specified high probability; the estimator does not require the calculation of stability (coercivity, or inf-sup) constants; the online cost to evaluate the a posteriori error estimator is commensurate with the cost to find the reduced-order approximation; the probabilistic bounds extend to many queries with only modest increase in cost. To build this estimator, we first estimate the norm of the error with a Monte-Carlo estimator using Gaussian random vectors whose covariance is chosen according to the desired error measure, e.g., user-defined norms or quantities of interest. Then, we introduce a dual problem with random right-hand side, the solution of which allows us to rewrite the error estimator in terms of the residual of the original equation. In order to have a fast-to-evaluate estimator, model order reduction methods can be used to approximate the random dual solutions. Here, we propose a greedy algorithm that is guided by a scalar quantity of interest depending on the error estimator. Numerical experiments on a multi-parametric Helmholtz problem demonstrate that this strategy yields rather low-dimensional reduced dual spaces.
numerical-methods  approximation  rather-interesting  to-understand  modeling  algorithms  to-write-about 
10 weeks ago by Vaguery
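--- One ingredient of the paper, sketched in isolation: estimating the norm of an error vector by a Monte-Carlo average of squared Gaussian projections, using the identity E[(z·e)²] = eᵀΣe for z ~ N(0, Σ) (so Σ = I gives ‖e‖²). The "error vector" below is just a random stand-in, not an actual reduced-order-model residual; assumes NumPy.

import numpy as np

rng = np.random.default_rng(0)

def randomized_norm_estimate(e, n_samples=100, cov=None, seed=1):
    """Estimate ||e||^2 (or e^T Sigma e) by averaging (z_k . e)^2 over Gaussian
    vectors z_k ~ N(0, Sigma); unbiased, and concentrates as n_samples grows."""
    rng = np.random.default_rng(seed)
    d = e.shape[0]
    if cov is None:
        Z = rng.standard_normal((n_samples, d))
    else:
        Z = rng.multivariate_normal(np.zeros(d), cov, size=n_samples)
    return np.mean((Z @ e) ** 2)

e = rng.standard_normal(500)
print(np.dot(e, e),
      randomized_norm_estimate(e, n_samples=50),
      randomized_norm_estimate(e, n_samples=5000))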
Cell Biology by the Numbers
"Vignettes that reveal how numbers serve as a sixth sense to understanding our cells"
biology  numbers  scale  approximation 
12 weeks ago by arsyed
Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning
"Randomized neural networks are immortalized in this AI Koan: In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6. What are you doing?'' asked Minsky. I am training a randomly wired neural net to play tic-tac-toe,'' Sussman replied. Why is the net wired randomly?'' asked Minsky. Sussman replied, I do not want it to have any preconceptions of how to play.'' Minsky then shut his eyes. Why do you close your eyes?'' Sussman asked his teacher. So that the room will be empty,'' replied Minsky. At that moment, Sussman was enlightened. We analyze shallow random networks with the help of concentration of measure inequalities. Specifically, we consider architectures that compute a weighted sum of their inputs after passing them through a bank of arbitrary randomized nonlinearities. We identify conditions under which these networks exhibit good classification performance, and bound their test error in terms of the size of the dataset and the number of random nonlinearities."

--- Have I never bookmarked this before?
in_NB  approximation  kernel_methods  random_projections  statistics  prediction  classifiers  rahimi.ali  recht.benjamin  machine_learning  have_read 
september 2018 by cshalizi
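--- A minimal random-kitchen-sinks sketch in the spirit of the paper: push inputs through a bank of randomly drawn nonlinearities (random cosine features here) and train only the linear output weights, by ridge regression. Assumes NumPy and scikit-learn; the synthetic dataset, number of features, and bandwidth are illustrative.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split

def random_cosine_features(X, n_features=500, gamma=1.0, seed=0):
    """phi(x) = cos(W x + b) with W ~ N(0, 2*gamma*I), b ~ Unif[0, 2*pi];
    the random weights define the nonlinearities and are never trained."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.cos(X @ W + b)

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

Phi_tr = random_cosine_features(X_tr)                    # same seed -> same feature map
Phi_te = random_cosine_features(X_te)
clf = RidgeClassifier(alpha=1.0).fit(Phi_tr, y_tr)       # learn only the output weights
print("test accuracy:", clf.score(Phi_te, y_te))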
[1808.05587] Deep Convolutional Networks as shallow Gaussian Processes
We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters, extending similar results for dense networks. For a CNN, the equivalent kernel can be computed exactly and, unlike "deep kernels", has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters.
via:cshalizi  neural-networks  approximation  rather-interesting  representation  deep-learning  algorithms  to-write-about  equivalences-of-motion-over-different-paths 
august 2018 by Vaguery
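--- Not the paper's kernel recursion for CNNs; just a Monte-Carlo illustration of the limit it rests on: for a random fully-connected ReLU layer, the (normalized) inner product of post-activation features concentrates around a deterministic value as the width grows, and that deterministic kernel is what the equivalent GP covariance is built from. Assumes NumPy; widths and weight/bias scales are arbitrary.

import numpy as np

rng = np.random.default_rng(0)

def empirical_kernel(x1, x2, width, sigma_w=1.0, sigma_b=0.1, n_trials=200):
    """Mean and spread of <relu(W x1 + b), relu(W x2 + b)> / width over random layers.
    As width grows, the value concentrates on the deterministic 'NNGP' covariance."""
    d = x1.shape[0]
    vals = []
    for _ in range(n_trials):
        W = rng.normal(scale=sigma_w / np.sqrt(d), size=(width, d))
        b = rng.normal(scale=sigma_b, size=width)
        h1 = np.maximum(W @ x1 + b, 0.0)
        h2 = np.maximum(W @ x2 + b, 0.0)
        vals.append(h1 @ h2 / width)
    return np.mean(vals), np.std(vals)

x1, x2 = rng.standard_normal(10), rng.standard_normal(10)
for width in (10, 100, 1000, 10000):
    mean, spread = empirical_kernel(x1, x2, width)
    print(width, round(mean, 4), round(spread, 4))       # spread shrinks ~ 1/sqrt(width)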
Lattice Boltzmann methods - Wikipedia
"Lattice Boltzmann methods (LBM) ... is a class of computational fluid dynamics (CFD) methods for fluid simulation. " Successor to lattice gas automaton methods.
article  algorithm  approximation  fluid  mechanics 
august 2018 by schahn
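--- A bare-bones D2Q9 lattice-Boltzmann sketch (periodic box, single-relaxation-time BGK collision), just to show the stream-then-collide structure the article describes; no boundary conditions, forcing, or validation, and the relaxation time and initial condition are arbitrary. Assumes NumPy.

import numpy as np

# D2Q9 lattice: 9 discrete velocities and their weights
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
tau = 0.6                                                # relaxation time (sets viscosity)

def equilibrium(rho, ux, uy):
    """Second-order Maxwell-Boltzmann expansion on the D2Q9 lattice."""
    cu = 3 * (c[:, 0, None, None] * ux + c[:, 1, None, None] * uy)
    usq = 1.5 * (ux**2 + uy**2)
    return w[:, None, None] * rho * (1 + cu + 0.5 * cu**2 - usq)

def step(f):
    """One LBM update: streaming along lattice velocities, then BGK collision."""
    for i, (cx, cy) in enumerate(c):                     # streaming (periodic)
        f[i] = np.roll(np.roll(f[i], cx, axis=0), cy, axis=1)
    rho = f.sum(axis=0)                                  # macroscopic density
    ux = (f * c[:, 0, None, None]).sum(axis=0) / rho     # macroscopic velocity
    uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
    f += -(f - equilibrium(rho, ux, uy)) / tau           # relax toward equilibrium
    return f

nx = ny = 64                                             # tiny periodic domain, density bump
rho0 = 1 + 0.1 * np.exp(-(((np.arange(nx)[:, None] - 32)**2 +
                            (np.arange(ny)[None, :] - 32)**2) / 50))
f = equilibrium(rho0, np.zeros((nx, ny)), np.zeros((nx, ny)))
for _ in range(100):
    f = step(f)
print("mass conserved:", np.isclose(f.sum(), rho0.sum()))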
[1806.06850] Polynomial Regression As an Alternative to Neural Nets
"Despite the success of neural networks (NNs), there is still a concern among many over their "black box" nature. Why do they work? Here we present a simple analytic argument that NNs are in fact essentially polynomial regression models. This view will have various implications for NNs, e.g. providing an explanation for why convergence problems arise in NNs, and it gives rough guidance on avoiding overfitting. In addition, we use this phenomenon to predict and confirm a multicollinearity property of NNs not previously reported in the literature. Most importantly, given this loose correspondence, one may choose to routinely use polynomial models instead of NNs, thus avoiding some major problems of the latter, such as having to set many tuning parameters and dealing with convergence issues. We present a number of empirical results; in each case, the accuracy of the polynomial approach matches or exceeds that of NN approaches. A many-featured, open-source software package, polyreg, is available."

--- Matloff is the author of my favorite "R programming for n00bs" textbook...
--- ETA after reading: the argument that multi-layer neural networks "are essentially" polynomial regression is a bit weak. It would be true, exactly, if activation functions were exactly polynomial, which however they rarely are in practice. If non-polynomial activations happen to be implemented in computational practice by polynomials (e.g., Taylor approximations), well, either we get different hardware or we crank up the degree of approximation as much as we like. (Said a little differently, if you buy this line of argument, you should buy that _every_ smooth statistical model "is essentially" polynomial regression, which seems a bit much.) It is, also, an argument about the function-approximation properties of the model classes, and not the fitting processes, despite the explicit disclaimers.
to:NB  your_favorite_deep_neural_network_sucks  regression  neural_networks  statistics  matloff.norman  approximation  computational_statistics  have_read 
july 2018 by cshalizi
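--- A quick way to poke at the claim yourself, using scikit-learn rather than the authors' R package polyreg: fit a degree-3 polynomial regression and a small MLP on the same synthetic data and compare held-out mean-squared error. The data-generating function and hyperparameters are arbitrary, so this is a sanity check, not a replication.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(2000, 3))
y = np.sin(X[:, 0]) + X[:, 1] * X[:, 2] + 0.1 * rng.standard_normal(2000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

poly = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
mlp = make_pipeline(StandardScaler(),
                    MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0))

for name, model in [("poly deg 3", poly), ("MLP 64x64", mlp)]:
    model.fit(X_tr, y_tr)
    print(name, mean_squared_error(y_te, model.predict(X_te)))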


