consider:representation   365
[1712.08558] Lattice-based Locality Sensitive Hashing is Optimal
Locality sensitive hashing (LSH) was introduced by Indyk and Motwani (STOC 98) to give the first sublinear time algorithm for the c-approximate nearest neighbor (ANN) problem using only polynomial space. At a high level, an LSH family hashes "nearby" points to the same bucket and "far away" points to different buckets. The quality of measure of an LSH family is its LSH exponent, which helps determine both query time and space usage.
In a seminal work, Andoni and Indyk (FOCS 06) constructed an LSH family based on random ball partitioning of space that achieves an LSH exponent of 1/c^2 for the l_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA 07) and O'Donnell, Wu and Zhou (TOCT 14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. So, in the same work, Andoni and Indyk proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, thus leaving open the question of finding the exponent of lattice based LSH.
In this work, we resolve this question by showing the existence of lattices achieving the optimal LSH exponent of 1/c^2 using techniques from the geometry of numbers. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g. to yield low space and query complexity, remains an important open problem.
algorithms  numerical-methods  rather-interesting  approximation  computational-complexity  nudge-targets  consider:looking-to-see  consider:representation
2 days ago by Vaguery
[1711.03172] Curve Reconstruction via the Global Statistics of Natural Curves
Reconstructing the missing parts of a curve has been the subject of much computational research, with applications in image inpainting, object synthesis, etc. Different approaches for solving that problem are typically based on processes that seek visually pleasing or perceptually plausible completions. In this work we focus on reconstructing the underlying physically likely shape by utilizing the global statistics of natural curves. More specifically, we develop a reconstruction model that seeks the mean physical curve for a given inducer configuration. This simple model is both straightforward to compute and it is receptive to diverse additional information, but it requires enough samples for all curve configurations, a practical requirement that limits its effective utilization. To address this practical issue we explore and exploit statistical geometrical properties of natural curves, and in particular, we show that in many cases the mean curve is scale invariant and often times it is extensible. This, in turn, allows to boost the number of examples and thus the robustness of the statistics and its applicability. The reconstruction results are not only more physically plausible but they also lead to important insights on the reconstruction problem, including an elegant explanation why certain inducer configurations are more likely to yield consistent perceptual completions than others.
inference  computer-vision  rather-interesting  algorithms  nudge-targets  consider:looking-to-see  consider:representation  performance-measure  to-write-about
9 weeks ago by Vaguery
[1705.01887] Streaming for Aibohphobes: Longest Palindrome with Mismatches
A palindrome is a string that reads the same as its reverse, such as "aibohphobia" (fear of palindromes). Given an integer d>0, a d-near-palindrome is a string of Hamming distance at most d from its reverse. We study the natural problem of identifying a longest d-near-palindrome in data streams. The problem is relevant to the analysis of DNA databases, and to the task of repairing recursive structures in documents such as XML and JSON. We present an algorithm that returns a d-near-palindrome whose length is within a multiplicative (1+ϵ)-factor of the longest d-near-palindrome. Our algorithm also returns the set of mismatched indices of the d-near-palindrome, using (dlog7nϵlog(1+ϵ)) bits of space, and (dlog6nϵlog(1+ϵ)) update time per arriving symbol. We show that Ω(dlogn) space is necessary for estimating the length of longest d-near-palindromes with high probability. We further obtain an additive-error approximation algorithm and a comparable lower bound, as well as an exact two-pass algorithm that solves the longest d-near-palindrome problem using (d2n‾√log6n) bits of space.
strings  computational-complexity  algorithms  probability-theory  inference  online-learning  rather-interesting  to-write-about  nudge-targets  consider:looking-to-see  consider:representation
9 weeks ago by Vaguery
Proofs are not only often beautiful but also necessary | Dan McQuillan
I hope you get the idea. When mathematicians seem obsessed with rigor, it is at least in part due to our history of making mistakes, seemingly every time we tried to jump to a conclusion. But perhaps there’s something more. So for that, we end with a quote from William Thurston:

what we are doing is finding ways for people to understand and think about mathematics… what they really want is usually not some collection of of ‘answers’-what they want is understanding.
10 weeks ago by Vaguery
[1710.01410] Learning Registered Point Processes from Idiosyncratic Observations
A parametric point process model is developed, with modeling based on the assumption that sequential observations often share latent phenomena, while also possessing idiosyncratic effects. An alternating optimization method is proposed to learn a "registered" point process that accounts for shared structure, as well as "warping" functions that characterize idiosyncratic aspects of each observed sequence. Under reasonable constraints, in each iteration we update the sample-specific warping functions by solving a set of constrained nonlinear programming problems in parallel, and update the model by maximum likelihood estimation. The justifiability, complexity and robustness of the proposed method are investigated in detail. Experiments on both synthetic and real-world data demonstrate that the method yields explainable point process models, achieving encouraging results compared to state-of-the-art methods.
modeling-is-not-mathematics  rather-interesting  representation  machine-learning  algorithms  nudge  nudge-targets  consider:representation  consider:performance-measures
10 weeks ago by Vaguery
[1212.4771] Necklaces, Convolutions, and X+Y
We give subquadratic algorithms that, given two necklaces each with n beads at arbitrary positions, compute the optimal rotation of the necklaces to best align the beads. Here alignment is measured according to the p norm of the vector of distances between pairs of beads from opposite necklaces in the best perfect matching. We show surprisingly different results for p = 1, p even, and p = \infty. For p even, we reduce the problem to standard convolution, while for p = \infty and p = 1, we reduce the problem to (min, +) convolution and (median, +) convolution. Then we solve the latter two convolution problems in subquadratic time, which are interesting results in their own right. These results shed some light on the classic sorting X + Y problem, because the convolutions can be viewed as computing order statistics on the antidiagonals of the X + Y matrix. All of our algorithms run in o(n^2) time, whereas the obvious algorithms for these problems run in \Theta(n^2) time.
optimization  operations-research  combinatorics  distance  nudge-targets  consider:looking-to-see  consider:representation
11 weeks ago by Vaguery
[1601.01298] Visibility Graphs, Dismantlability, and the Cops and Robbers Game
We study versions of cop and robber pursuit-evasion games on the visibility graphs of polygons, and inside polygons with straight and curved sides. Each player has full information about the other player's location, players take turns, and the robber is captured when the cop arrives at the same point as the robber. In visibility graphs we show the cop can always win because visibility graphs are dismantlable, which is interesting as one of the few results relating visibility graphs to other known graph classes. We extend this to show that the cop wins games in which players move along straight line segments inside any polygon and, more generally, inside any simply connected planar region with a reasonable boundary. Essentially, our problem is a type of pursuit-evasion using the link metric rather than the Euclidean metric, and our result provides an interesting class of infinite cop-win graphs.
computational-geometry  game-theory  rather-interesting  to-write-about  nudge-targets  consider:representation  consider:splinegons
11 weeks ago by Vaguery
[1702.02017] Pushing the Bounds for Matrix-Matrix Multiplication
A tight lower bound for required I/O when computing a matrix-matrix multiplication on a processor with two layers of memory is established. Prior work obtained weaker lower bounds by reasoning about the number of \textit{phases} needed to perform C:=AB, where each phase is a series of operations involving S reads and writes to and from fast memory, and S is the size of fast memory. A lower bound on the number of phases was then determined by obtaining an upper bound on the number of scalar multiplications performed per phase. This paper follows the same high level approach, but improves the lower bound by considering C:=AB+C instead of C:=AB, and obtains the maximum number of scalar fused multiply-adds (FMAs) per phase instead of scalar additions. Key to obtaining the new result is the decoupling of the per-phase I/O from the size of fast memory. The new lower bound is 2mnk/S‾√−2S. The constant for the leading term is an improvement of a factor 42‾√. A theoretical algorithm that attains the lower bound is given, and how the state-of-the-art Goto's algorithm also in some sense meets the lower bound is discussed.
computational-complexity  algorithms  matrices  rather-interesting  nudge-targets  consider:looking-to-see  consider:representation  consider:performance-measures
11 weeks ago by Vaguery
[1604.08797] Ortho-polygon Visibility Representations of Embedded Graphs
An ortho-polygon visibility representation of an n-vertex embedded graph G (OPVR of G) is an embedding-preserving drawing of G that maps every vertex to a distinct orthogonal polygon and each edge to a vertical or horizontal visibility between its end-vertices. The vertex complexity of an OPVR of G is the minimum k such that every polygon has at most k reflex corners. We present polynomial time algorithms that test whether G has an OPVR and, if so, compute one of minimum vertex complexity. We argue that the existence and the vertex complexity of an OPVR of G are related to its number of crossings per edge and to its connectivity. More precisely, we prove that if G has at most one crossing per edge (i.e., G is a 1-plane graph), an OPVR of G always exists while this may not be the case if two crossings per edge are allowed. Also, if G is a 3-connected 1-plane graph, we can compute an OPVR of G whose vertex complexity is bounded by a constant in O(n) time. However, if G is a 2-connected 1-plane graph, the vertex complexity of any OPVR of G may be Ω(n). In contrast, we describe a family of 2-connected 1-plane graphs for which an embedding that guarantees constant vertex complexity can be computed in O(n) time. Finally, we present the results of an experimental study on the vertex complexity of ortho-polygon visibility representations of 1-plane graphs.
graph-layout  computational-geometry  optimization  rather-interesting  to-write-about  nudge-targets  consider:representation  consider:feature-discovery  algorithms  computational-complexity
11 weeks ago by Vaguery
[1710.01992] Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks
Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve the bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize the recursive layers to share parameters across as well as within pyramid levels, and thus drastically reduce the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.
superresolution  neural-networks  representation  algorithms  image-processing  generative-models  to-write-about  nudge-targets  consider:representation  consider:performance-measures
11 weeks ago by Vaguery
[1707.05425] Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network
We propose a highly efficient and faster Single Image Super-Resolution (SISR) model with Deep Convolutional neural networks (Deep CNN). Deep CNN have recently shown that they have a significant reconstruction performance on single-image super-resolution. Current trend is using deeper CNN layers to improve performance. However, deep models demand larger computation resources and is not suitable for network edge devices like mobile, tablet and IoT devices. Our model achieves state of the art reconstruction performance with at least 10 times lower calculation cost by Deep CNN with Residual Net, Skip Connection and Network in Network (DCSCN). A combination of Deep CNNs and Skip connection layers is used as a feature extractor for image features on both local and global area. Parallelized 1x1 CNNs, like the one called Network in Network, is also used for image reconstruction. That structure reduces the dimensions of the previous layer's output for faster computation with less information loss, and make it possible to process original images directly. Also we optimize the number of layers and filters of each CNN to significantly reduce the calculation cost. Thus, the proposed algorithm not only achieves the state of the art performance but also achieves faster and efficient computation. Code is available at this https URL
superresolution  image-processing  deep-learning  neural-networks  generative-models  nudge-targets  consider:representation
11 weeks ago by Vaguery
[1710.00228] Physics-based Motion Planning: Evaluation Criteria and Benchmarking
Motion planning has evolved from coping with simply geometric problems to physics-based ones that incorporate the kinodynamic and the physical constraints imposed by the robot and the physical world. Therefore, the criteria for evaluating physics-based motion planners goes beyond the computational complexity (e.g. in terms of planning time) usually used as a measure for evaluating geometrical planners, in order to consider also the quality of the solution in terms of dynamical parameters. This study proposes an evaluation criteria and analyzes the performance of several kinodynamic planners, which are at the core of physics-based motion planning, using different scenarios with fixed and manipulatable objects. RRT, EST, KPIECE and SyCLoP are used for the benchmarking. The results show that KPIECE computes the time-optimal solution with heighest success rate, whereas, SyCLoP compute the most power-optimal solution among the planners used.
simulation  planning  algorithms  rather-interesting  feature-construction  approximation  nudge-targets  consider:representation
october 2017 by Vaguery
[1706.10206] Sums of Palindromes: an Approach via Automata
Recently, Cilleruelo, Luca, & Baxter proved, for all bases b >= 5, that every natural number is the sum of at most 3 natural numbers whose base-b representation is a palindrome. However, the cases b = 2, 3, 4 were left unresolved.
We prove, using a decision procedure based on automata, that every natural number is the sum of at most 4 natural numbers whose base-2 representation is a palindrome. Here the constant 4 is optimal. We obtain similar results for bases 3 and 4, thus completely resolving the problem.
We consider some other variations on this problem, and prove similar results. We argue that heavily case-based proofs are a good signal that a decision procedure may help to automate the proof.
number-theory  rather-interesting  lovely  algorithms  proof  nudge-targets  consider:rediscovery  consider:representation  consider:looking-to-see
october 2017 by Vaguery
[1305.6596] On the Coloring of Pseudoknots
Pseudodiagrams are diagrams of knots where some information about which strand goes over/under at certain crossings may be missing. Pseudoknots are equivalence classes of pseudodiagrams, with equivalence defined by a class of Reidemeister-type moves. In this paper, we introduce two natural extensions of classical knot colorability to this broader class of knot-like objects. We use these definitions to define the determinant of a pseudoknot (i.e. the pseudodeterminant) that agrees with the classical determinant for classical knots. Moreover, we extend Conway notation to pseudoknots to facilitate the investigation of families of pseudoknots and links. The general formulae for pseudodeterminants of pseudoknot families may then be used as a criterion for p-colorability of pseudoknots.
knot-theory  representation  topology  to-understand  nudge-targets  consider:representation  consider:looking-to-see  consider:algorithms
october 2017 by Vaguery
[1603.00051] Algebraic Method in Tilings
In this paper we introduce a new algebraic method in tilings. Combining this method with Hilbert's Nullstellensatz we obtain a necessary condition for tiling n-space by translates of a cluster of cubes. Further, the polynomial method will enable us to show that if there exists a tiling of n-space by translates of a cluster V of prime size then there is a lattice tiling by V as well. Finally, we provide supporting evidence for a conjecture that each tiling by translates of a prime size cluster V is lattice if V generates n-space.
polyominoes  algebra  representation  rather-interesting  nudge-targets  consider:representation
october 2017 by Vaguery
[math/0204106] The Second Hull of a Knotted Curve
The convex hull of a set K in space consists of points which are, in a certain sense, "surrounded" by K. When K is a closed curve, we define its higher hulls, consisting of points which are "multiply surrounded" by the curve. Our main theorem shows that if a curve is knotted then it has a nonempty second hull. This provides a new proof of the Fary/Milnor theorem that every knotted curve has total curvature at least 4pi.
knot-theory  geometry  topology  rather-interesting  feature-construction  proof  nudge-targets  consider:representation  consider:looking-to-see
october 2017 by Vaguery
[1505.00667] An Unusual Continued Fraction
We consider the real number σ with continued fraction expansion [a0,a1,a2,…]=[1,2,1,4,1,2,1,8,1,2,1,4,1,2,1,16,…], where ai is the largest power of 2 dividing i+1. We compute the irrationality measure of σ2 and demonstrate that σ2 (and σ) are both transcendental numbers. We also show that certain partial quotients of σ2 grow doubly exponentially, thus confirming a conjecture of Hanna and Wilson.
continued-fractions  strings  number-theory  rather-interesting  to-understand  nudge-targets  consider:representation
october 2017 by Vaguery
[1606.07163] Interpretable Machine Learning Models for the Digital Clock Drawing Test
The Clock Drawing Test (CDT) is a rapid, inexpensive, and popular neuropsychological screening tool for cognitive conditions. The Digital Clock Drawing Test (dCDT) uses novel software to analyze data from a digitizing ballpoint pen that reports its position with considerable spatial and temporal precision, making possible the analysis of both the drawing process and final product. We developed methodology to analyze pen stroke data from these drawings, and computed a large collection of features which were then analyzed with a variety of machine learning techniques. The resulting scoring systems were designed to be more accurate than the systems currently used by clinicians, but just as interpretable and easy to use. The systems also allow us to quantify the tradeoff between accuracy and interpretability. We created automated versions of the CDT scoring systems currently used by clinicians, allowing us to benchmark our models, which indicated that our machine learning models substantially outperformed the existing scoring systems.
machine-learning  benchmarking  rather-interesting  nudge-targets  consider:representation  consider:looking-to-see
october 2017 by Vaguery
[1602.01241] Using separable non-negative matrix factorization techniques for the analysis of time-resolved Raman spectra
The key challenge of time-resolved Raman spectroscopy is the identification of the constituent species and the analysis of the kinetics of the underlying reaction network. In this work we present an integral approach that allows for determining both the component spectra and the rate constants simultaneously from a series of vibrational spectra. It is based on an algorithm for non-negative matrix factorization which is applied to the experimental data set following a few pre-processing steps. As a prerequisite for physically unambiguous solutions, each component spectrum must include one vibrational band that does not significantly interfere with vibrational bands of other species. The approach is applied to synthetic "experimental" spectra derived from model systems comprising a set of species with component spectra differing with respect to their degree of spectral interferences and signal-to-noise ratios. In each case, the species involved are connected via monomolecular reaction pathways. The potential and limitations of the approach for recovering the respective rate constants and component spectra are discussed.
spectroscopy  data-analysis  inference  numerical-methods  modeling  statistics  rather-interesting  nudge-targets  consider:representation
october 2017 by Vaguery
[1708.08377] Two-Dimensional Indirect Binary Search for the Positive One-in-Three Satisfiability Problem
In this paper, we propose an algorithm for the positive one-in-three satisfiability problem (Pos1in3SAT). The proposed algorithm can efficiently decide the existence of a satisfying assignment in all assignments for a given formula by using a 2-dimensional binary search method without constructing an exponential number of assignments.
satisfiability  algorithms  computational-complexity  machine-learning  nudge-targets  consider:looking-to-see  consider:representation  matrices
october 2017 by Vaguery

Copy this bookmark:

description:

tags: