approximation   632
Lattice Boltzmann methods - Wikipedia
"Lattice Boltzmann methods (LBM) ... is a class of computational fluid dynamics (CFD) methods for fluid simulation. " Successor to lattice gas automaton methods.
article  algorithm  approximation  fluid  mechanics
9 days ago by schahn
[1806.06850] Polynomial Regression As an Alternative to Neural Nets
"Despite the success of neural networks (NNs), there is still a concern among many over their "black box" nature. Why do they work? Here we present a simple analytic argument that NNs are in fact essentially polynomial regression models. This view will have various implications for NNs, e.g. providing an explanation for why convergence problems arise in NNs, and it gives rough guidance on avoiding overfitting. In addition, we use this phenomenon to predict and confirm a multicollinearity property of NNs not previously reported in the literature. Most importantly, given this loose correspondence, one may choose to routinely use polynomial models instead of NNs, thus avoiding some major problems of the latter, such as having to set many tuning parameters and dealing with convergence issues. We present a number of empirical results; in each case, the accuracy of the polynomial approach matches or exceeds that of NN approaches. A many-featured, open-source software package, polyreg, is available."

--- Matloff is the author of my favorite "R programming for n00bs" textbook...
--- ETA after reading: the argument that multi-layer neural networks "are essentially" polynomial regression is a bit weak. It would be true, exactly, if activation functions were exactly polynomial, which however they rarely are in practice. If non-polynomial activations happen to be implemented in computational practice by polynomials (e.g., Taylor approximations), well, either we get different hardware or we crank up the degree of approximation as much as we like. (Said a little differently, if you buy this line of argument, you should buy that _every_ smooth statistical model "is essentially" polynomial regression, which seems a bit much.) It is, also, an argument about the function-approximation properties of the model classes, and not the fitting processes, despite the explicit disclaimers.
to:NB  your_favorite_deep_neural_network_sucks  regression  neural_networks  statistics  matloff.norman  approximation  computational_statistics  have_read
5 weeks ago by cshalizi
[1806.10909] ResNet with one-neuron hidden layers is a Universal Approximator
We demonstrate that a very deep ResNet with stacked modules with one neuron per hidden layer and ReLU activation functions can uniformly approximate any Lebesgue integrable function in d dimensions, i.e. ℓ1(ℝd). Because of the identity mapping inherent to ResNets, our network has alternating layers of dimension one and d. This stands in sharp contrast to fully connected networks, which are not universal approximators if their width is the input dimension d [Lu et al, 2017]. Hence, our result implies an increase in representational power for narrow deep networks by the ResNet architecture.
representation  neural-networks  approximation  rather-interesting  to-write-about  to-do
6 weeks ago by Vaguery
[1806.06323] Approximate Submodular Functions and Performance Guarantees
We consider the problem of maximizing non-negative non-decreasing set functions. Although most of the recent work focus on exploiting submodularity, it turns out that several objectives we encounter in practice are not submodular. Nonetheless, often we leverage the greedy algorithms used in submodular functions to determine a solution to the non-submodular functions. Hereafter, we propose to address the original problem by \emph{approximating} the non-submodular function and analyze the incurred error, as well as the performance trade-offs. To quantify the approximation error, we introduce a novel concept of δ-approximation of a function, which we used to define the space of submodular functions that lie within an approximation error. We provide necessary conditions on the existence of such δ-approximation functions, which might not be unique. Consequently, we characterize this subspace which we refer to as \emph{region of submodularity}. Furthermore, submodular functions are known to lead to different sub-optimality guarantees, so we generalize those dependencies upon a δ-approximation into the notion of \emph{greedy curvature}. Finally, we used this latter notion to simplify some of the existing results and efficiently (i.e., linear complexity) determine tightened bounds on the sub-optimality guarantees using objective functions commonly used in practical setups and validate them using real data.
submodularity  approximation
6 weeks ago by arsyed
Multivariate approximation | Numerical analysis | Cambridge University Press
"This self-contained, systematic treatment of multivariate approximation begins with classical linear approximation, and moves on to contemporary nonlinear approximation. It covers substantial new developments in the linear approximation theory of classes with mixed smoothness, and shows how it is directly related to deep problems in other areas of mathematics. For example, numerical integration of these classes is closely related to discrepancy theory and to nonlinear approximation with respect to special redundant dictionaries, and estimates of the entropy numbers of classes with mixed smoothness are closely related to (in some cases equivalent to) the Small Ball Problem from probability theory. The useful background material included in the book makes it accessible to graduate students. Researchers will find that the many open problems in the theory outlined in the book provide helpful directions and guidance for their own research in this exciting and active area."
to:NB  approximation  mathematics  books:noted
7 weeks ago by cshalizi
One Parameter Is Always Enough
This is very cute, and deserves some attention.
Being me, I'll grump a bit. As he says at the end, this is just an elaboration of the point that we get a class of _infinite_ VC dimension by thresholding sin(kx), i.e., we can correctly classify any arbitrarily large set of points by tweaking k appropriately. Since this was known in the 1970s (proved by V&C themselves, if memory serves), it's been kind of insane that people continue to count parameters and pat themselves on the back about Occam. (See http://bactra.org/weblog/921.html et seq. for more.) Still, the elephant is nice.
model_selection  approximation  dynamical_systems  chaos  have_read  to:blog
11 weeks ago by cshalizi
[1712.08373] Notes on complexity of packing coloring
A packing k-coloring for some integer k of a graph G=(V,E) is a mapping
φ:V→{1,…,k} such that any two vertices u,v of color φ(u)=φ(v) are in distance at least φ(u)+1. This concept is motivated by frequency assignment problems. The \emph{packing chromatic number} of G is the smallest k such that there exists a packing k-coloring of G.
Fiala and Golovach showed that determining the packing chromatic number for chordal graphs is \NP-complete for diameter exactly 5. While the problem is easy to solve for diameter 2, we show \NP-completeness for any diameter at least 3. Our reduction also shows that the packing chromatic number is hard to approximate within n1/2−ε for any ε>0.
In addition, we design an \FPT algorithm for interval graphs of bounded diameter. This leads us to exploring the problem of finding a partial coloring that maximizes the number of colored vertices.
graph-theory  algorithms  combinatorics  proof  approximation  nudge-targets  consider:looking-to-see  consider:feature-discovery
march 2018 by Vaguery
[1801.00548] A Machine Learning Approach to Adaptive Covariance Localization
Data assimilation plays a key role in large-scale atmospheric weather forecasting, where the state of the physical system is estimated from model outputs and observations, and is then used as initial condition to produce accurate future forecasts. The Ensemble Kalman Filter (EnKF) provides a practical implementation of the statistical solution of the data assimilation problem and has gained wide popularity as. This success can be attributed to its simple formulation and ease of implementation. EnKF is a Monte-Carlo algorithm that solves the data assimilation problem by sampling the probability distributions involved in Bayes theorem. Because of this, all flavors of EnKF are fundamentally prone to sampling errors when the ensemble size is small. In typical weather forecasting applications, the model state space has dimension 109−1012, while the ensemble size typically ranges between 30−100 members. Sampling errors manifest themselves as long-range spurious correlations and have been shown to cause filter divergence. To alleviate this effect covariance localization dampens spurious correlations between state variables located at a large distance in the physical space, via an empirical distance-dependent function. The quality of the resulting analysis and forecast is greatly influenced by the choice of the localization function parameters, e.g., the radius of influence. The localization radius is generally tuned empirically to yield desirable results.This work, proposes two adaptive algorithms for covariance localization in the EnKF framework, both based on a machine learning approach. The first algorithm adapts the localization radius in time, while the second algorithm tunes the localization radius in both time and space. Numerical experiments carried out with the Lorenz-96 model, and a quasi-geostrophic model, reveal the potential of the proposed machine learning approaches.
modeling  machine-learning  prediction  rather-interesting  looking-to-see  approximation  algorithms  to-write-about
march 2018 by Vaguery
[1709.08004] Slow-scale split-step tau-leap method for stiff stochastic chemical systems
Tau-leaping is a family of algorithms for the approximate simulation of discrete state continuous time Markov chains. The motivation for the development of such methods can be found, for instance, in the fields of chemical kinetics and systems biology. It is well known that the dynamical behavior of biochemical systems is often intrinsically stiff representing a serious challenge for their numerical approximation. The naive extension of stiff deterministic solvers to stochastic integration usually yields numerical solutions with either impractically large relaxation times or incorrectly resolved covariance. In this paper, we propose a novel splitting heuristic which allows to resolve these issues. The proposed numerical integrator takes advantage of the special structure of the linear systems with explicitly available equations for the mean and the covariance which we use to calibrate the parameters of the scheme. It is shown that the method is able to reproduce the exact mean and variance of the linear scalar test equation and has very good accuracy for the arbitrarily stiff systems at least in linear case. The numerical examples for both linear and nonlinear systems are also provided and the obtained results confirm the efficiency of the considered splitting approach.
numerical-methods  diffy-Qs  approximation  systems-biology  algorithms  to-write-about  consider:representation  rather-interesting  nudge
february 2018 by Vaguery
The topic of this snapshot is interpolation. In the
ordinary sense, interpolation means to insert something
of a different nature into something else. In
mathematics, interpolation means constructing new
data points from given data points. The new points
usually lie in between the already-known points. The
purpose of this snapshot is to introduce a particular
type of interpolation, namely, polynomial interpolation.
This will be explained starting from basic ideas
that go back to the ancient Babylonians and Greeks,
and will arrive at subjects of current research activity.
interpolation  rather-interesting  history-of-science  algorithms  approximation  to-write-about
february 2018 by Vaguery
[1709.10030] Sparse Hierarchical Regression with Polynomials
We present a novel method for exact hierarchical sparse polynomial regression. Our regressor is that degree r polynomial which depends on at most k inputs, counting at most ℓ monomial terms, which minimizes the sum of the squares of its prediction errors. The previous hierarchical sparse specification aligns well with modern big data settings where many inputs are not relevant for prediction purposes and the functional complexity of the regressor needs to be controlled as to avoid overfitting. We present a two-step approach to this hierarchical sparse regression problem. First, we discard irrelevant inputs using an extremely fast input ranking heuristic. Secondly, we take advantage of modern cutting plane methods for integer optimization to solve our resulting reduced hierarchical (k,ℓ)-sparse problem exactly. The ability of our method to identify all k relevant inputs and all ℓ monomial terms is shown empirically to experience a phase transition. Crucially, the same transition also presents itself in our ability to reject all irrelevant features and monomials as well. In the regime where our method is statistically powerful, its computational complexity is interestingly on par with Lasso based heuristics. The presented work fills a void in terms of a lack of powerful disciplined nonlinear sparse regression methods in high-dimensional settings. Our method is shown empirically to scale to regression problems with n≈10,000 observations for input dimension p≈1,000.
statistics  regression  algorithms  approximation  performance-measure  to-understand  nudge-targets  consider:looking-to-see
february 2018 by Vaguery

Copy this bookmark:

description:

tags: