Vaguery + consider:looking-to-see   721

 « earlier
[1902.04312] Irrationality and transcendence of continued fractions with algebraic integers
We extend a result of Hančl, Kolouch and Nair on the irrationality and transcendence of continued fractions. We show that for a sequence {αn} of algebraic integers of bounded degree, each attaining the maximum absolute value among their conjugates and satisfying certain growth conditions, the condition
limsupn→∞|αn|1Ddn−1∏n−2i=1(Ddi+1)=∞
implies that the continued fraction α=[0;α1,α2,…] is not an algebraic number of degree less than or equal to D.
number-theory  what-if  continued-fractions  complex-numbers  out-of-the-box  looking-to-see  proof  nudge-targets  consider:looking-to-see
yesterday by Vaguery
[1812.01382] A new Bound for the Maker-Breaker Triangle Game
The triangle game introduced by Chvátal and Erdős (1978) is one of the most famous combinatorial games. For n,q∈ℕ, the (n,q)-triangle game is played by two players, called Maker and Breaker, on the complete graph Kn. Alternately Maker claims one edge and thereafter Breaker claims q edges of the graph. Maker wins the game if he can claim all three edges of a triangle, otherwise Breaker wins. Chvátal and Erdős (1978) proved that for q<2n+2‾‾‾‾‾‾√−5/2≈1.414n‾√ Maker has a winning strategy, and for q≥2n‾√ Breaker has a winning strategy. Since then, the problem of finding the exact leading constant for the threshold bias of the triangle game has been one of the famous open problems in combinatorial game theory. In fact, the constant is not known for any graph with a cycle and we do not even know if such a constant exists. Balogh and Samotij (2011) slightly improved the Chvátal-Erdős constant for Breaker's winning strategy from 2 to 1.935 with a randomized approach. Since then no progress was made. In this work, we present a new deterministic strategy for Breaker's win whenever n is sufficiently large and q≥(8/3+o(1))n‾‾‾‾‾‾‾‾‾‾‾‾√≈1.633n‾√, significantly reducing the gap towards the lower bound. In previous strategies Breaker chooses his edges such that one node is part of the last edge chosen by Maker, whereas the remaining node is chosen more or less arbitrarily. In contrast, we introduce a suitable potential function on the set of nodes. This allows Breaker to pick edges that connect the most dangerous' nodes. The total potential of the game may still increase, even for several turns, but finally Breaker's strategy prevents the total potential of the game from exceeding a critical level and leads to Breaker's win.
combinatorics  game-theory  rather-interesting  feature-construction  open-questions  nudge-targets  consider:looking-to-see
7 days ago by Vaguery
[1709.01173] Tight paths in convex geometric hypergraphs
In this paper, we prove a theorem on tight paths in convex geometric hypergraphs, which is asymptotically sharp in infinitely many cases. Our geometric theorem is a common generalization of early results of Hopf and Pannwitz, Sutherland, Kupitz and Perles for convex geometric graphs, as well as the classical Erdős-Gallai Theorem for graphs. As a consequence, we obtain the first substantial improvement on the Turán problem for tight paths in uniform hypergraphs.
hypergraphs  geometric-graphs  graph-theory  combinatorics  consider:extension-of-convexity  to-write-about  consider:looking-to-see
15 days ago by Vaguery
[1808.06423] What Stands-in for a Missing Tool? A Prototypical Grounded Knowledge-based Approach to Tool Substitution
When a robot is operating in a dynamic environment, it cannot be assumed that a tool required to solve a given task will always be available. In case of a missing tool, an ideal response would be to find a substitute to complete the task. In this paper, we present a proof of concept of a grounded knowledge-based approach to tool substitution. In order to validate the suitability of a substitute, we conducted experiments involving 22 substitution scenarios. The substitutes computed by the proposed approach were validated on the basis of the experts' choices for each scenario. Our evaluation showed, in 20 out of 22 scenarios (91%), the approach identified the same substitutes as experts.
artificial-intelligence  feature-construction  learning-from-data  rather-interesting  to-write-about  consider:looking-to-see
15 days ago by Vaguery
[1704.08679] Age-Minimal Transmission in Energy Harvesting Two-hop Networks
We consider an energy harvesting two-hop network where a source is communicating to a destination through a relay. During a given communication session time, the source collects measurement updates from a physical phenomenon and sends them to the relay, which then forwards them to the destination. The objective is to send these updates to the destination as timely as possible; namely, such that the total age of information is minimized by the end of the communication session, subject to energy causality constraints at the source and the relay, and data causality constraints at the relay. Both the source and the relay use fixed, yet possibly different, transmission rates. Hence, each update packet incurs fixed non-zero transmission delays. We first solve the single-hop version of this problem, and then show that the two-hop problem is solved by treating the source and relay nodes as one combined node, with some parameter transformations, and solving a single-hop problem between that combined node and the destination.
queueing-theory  information-theory  multiobjective-optimization  rather-interesting  to-write-about  to-simulate  consider:looking-to-see
15 days ago by Vaguery
[1804.09869] Learning for Video Compression
One key challenge to learning-based video compression is that motion predictive coding, a very effective tool for video compression, can hardly be trained into a neural network. In this paper we propose the concept of PixelMotionCNN (PMCNN) which includes motion extension and hybrid prediction networks. PMCNN can model spatiotemporal coherence to effectively perform predictive coding inside the learning network. On the basis of PMCNN, we further explore a learning-based framework for video compression with additional components of iterative analysis/synthesis, binarization, etc. Experimental results demonstrate the effectiveness of the proposed scheme. Although entropy coding and complex configurations are not employed in this paper, we still demonstrate superior performance compared with MPEG-2 and achieve comparable results with H.264 codec. The proposed learning-based scheme provides a possible new direction to further improve compression efficiency and functionalities of future video coding.
video  compression  algorithms  neural-networks  models-as-compressed-forms  representation  consider:looking-to-see
15 days ago by Vaguery
[1807.09977] Colored range closest-pair problem under general distance functions
The range closest-pair (RCP) problem is the range-search version of the classical closest-pair problem, which aims to store a given dataset of points in some data structure such that when a query range X is specified, the closest pair of points contained in X can be reported efficiently. A natural generalization of the RCP problem is the {colored range closest-pair} (CRCP) problem in which the given data points are colored and the goal is to find the closest {bichromatic} pair contained in the query range. All the previous work on the RCP problem was restricted to the uncolored version and the Euclidean distance function. In this paper, we make the first progress on the CRCP problem. We investigate the problem under a general distance function induced by a monotone norm; in particular, this covers all the Lp-metrics for p>0 and the L∞-metric. We design efficient (1+ε)-approximate CRCP data structures for orthogonal queries in ℝ2, where ε>0 is a pre-specified parameter. The highlights are two data structures for answering rectangle queries, one of which uses O(ε−1nlog4n) space and O(log4n+ε−1log3n+ε−2logn) query time while the other uses O(ε−1nlog3n) space and O(log5n+ε−1log4n+ε−2log2n) query time. In addition, we also apply our techniques to the CRCP problem in higher dimensions, obtaining efficient data structures for slab, 2-box, and 3D dominance queries. Before this paper, almost all the existing results for the RCP problem were achieved in ℝ2.
computational-complexity  computational-geometry  algorithms  horse-races  to-write-about  nudge-targets  consider:looking-to-see  consider:relaxations  consider:representation
6 weeks ago by Vaguery
[1612.07545] A Revisit of Hashing Algorithms for Approximate Nearest Neighbor Search
Approximate Nearest Neighbor Search (ANNS) is a fundamental problem in many areas of machine learning and data mining. During the past decade, numerous hashing algorithms are proposed to solve this problem. Every proposed algorithm claims outperform other state-of-the-art hashing methods. However, the evaluation of these hashing papers was not thorough enough, and those claims should be re-examined. The ultimate goal of an ANNS method is returning the most accurate answers (nearest neighbors) in the shortest time. If implemented correctly, almost all the hashing methods will have their performance improved as the code length increases. However, many existing hashing papers only report the performance with the code length shorter than 128. In this paper, we carefully revisit the problem of search with a hash index, and analyze the pros and cons of two popular hash index search procedures. Then we proposed a very simple but effective two level index structures and make a thorough comparison of eleven popular hashing algorithms. Surprisingly, the random-projection-based Locality Sensitive Hashing (LSH) is the best performed algorithm, which is in contradiction to the claims in all the other ten hashing papers. Despite the extreme simplicity of random-projection-based LSH, our results show that the capability of this algorithm has been far underestimated. For the sake of reproducibility, all the codes used in the paper are released on GitHub, which can be used as a testing platform for a fair comparison between various hashing algorithms.
hashing  algorithms  approximation  dimension-reduction  representation  data-analysis  feature-extraction  nudge-targets  consider:looking-to-see  to-write-about
6 weeks ago by Vaguery
The Fermat primality test and the GCD test | The Math Less Traveled
In my previous post we proved that if shares a nontrivial common factor with , then , and this in turn proves that is not prime (by Fermat’s Little Theorem).

But wait a minute, this is silly: if shares a common factor with , then we don’t need anything as complex as Fermat’s Little Theorem to figure out that is not prime! All we have to do is compute using the Euclidean Algorithm, and when we find out that the result isn’t , we can immediately conclude that isn’t prime since it has a nontrivial divisor.
algorithms  number-theory  rather-interesting  nudge-targets  consider:looking-to-see
6 weeks ago by Vaguery
CTK Insights » Blog Archive A pizza with a hole - CTK Insights
Now we are in a position to tackle the problem from the Crux. Any line through the center of the rectangular pizza divides it into equal parts. One of these lines stands out. It's the one that divides into equal parts the circular hole, that's the line that passes through the hole's center. Thus to solve the problem we cut through the centers of the rectangle and that of the hole.

But that's not the only solution. In fact there are infinitely many more, although to find any of these in a constructive manner is rather difficult, if not impossible.
mathematical-recreations  plane-geometry  rather-interesting  open-questions  nudge-targets  consider:looking-to-see  compass-and-straightedge
7 weeks ago by Vaguery
[1802.06223] The Geodesic Farthest-point Voronoi Diagram in a Simple Polygon
Given a set of point sites in a simple polygon, the geodesic farthest-point Voronoi diagram partitions the polygon into cells, at most one cell per site, such that every point in a cell has the same farthest site with respect to the geodesic metric. We present an O(nloglogn+mlogm)- time algorithm to compute the geodesic farthest-point Voronoi diagram of m point sites in a simple n-gon. This improves the previously best known algorithm by Aronov et al. [Discrete Comput. Geom. 9(3):217-255, 1993]. In the case that all point sites are on the boundary of the simple polygon, we can compute the geodesic farthest-point Voronoi diagram in O((n+m)loglogn) time.
computational-geometry  algorithms  computational-complexity  to-understand  to-simulate  nudge-targets  consider:looking-to-see
8 weeks ago by Vaguery
[1707.02643] The Junta Method for Hypergraphs and the ErdH{o}s-Chv'{a}tal Simplex Conjecture
Numerous problems in extremal hypergraph theory ask to determine the maximal size of a k-uniform hypergraph on n vertices that does not contain an enlarged' copy H+ of a fixed hypergraph H. These include well-known problems such as the Erdős-Sós forbidding one intersection' problem and the Frankl-Füredi special simplex' problem.
We present a general approach to such problems, using a junta approximation method' that originates from analysis of Boolean functions. We prove that any H+-free hypergraph is essentially contained in a junta' -- a hypergraph determined by a small number of vertices -- that is also H+-free, which effectively reduces the extremal problem to an easier problem on juntas. Using this approach, we obtain, for all C<k<n/C, a complete solution of the extremal problem for a large class of H's, which includes the aforementioned problems, and solves them for a large new set of parameters.
We apply our method also to the 1974 Erdős-Chvátal simplex conjecture, which asserts that for any d<k≤dd+1n, the maximal size of a k-uniform family that does not contain a d-simplex (i.e., d+1 sets with empty intersection such that any d of them intersect) is (n−1k−1). We prove the conjecture for all d and k, provided n>n0(d).
hypergraphs  set-theory  optimization  conjecture  proof  combinatorics  nudge-targets  consider:looking-to-see  consider:representation
8 weeks ago by Vaguery
[1807.10492] Limits with Signed Digit Streams
We work with the signed digit representation of abstract real numbers, which roughly is the binary representation enriched by the additional digit -1. The main objective of this paper is an algorithm which takes a sequence of signed digit representations of reals and returns the signed digit representation of their limit, if the sequence converges. As a first application we use this algorithm together with Heron's method to build up an algorithm which converts the signed digit representation of a non-negative real number into the signed digit representation of its square root. Instead of writing the algorithms first and proving their correctness afterwards, we work the other way round, in the tradition of program extraction from proofs. In fact we first give constructive proofs, and from these proofs we then compute the extracted terms, which is the desired algorithm. The correctness of the extracted term follows directly by the Soundness Theorem of program extraction. In order to get the extracted term from some proofs which are often quite long, we use the proof assistant Minlog. However, to apply the extracted terms, the programming language Haskell is useful. Therefore after each proof we show a notation of the extracted term, which can be easily rewritten as a definition in Haskell.
algebra  representation  algorithms  rather-interesting  to-write-about  consider:looking-to-see  nudge-targets  consider:simulation
8 weeks ago by Vaguery
[1712.08911] Largest and Smallest Area Triangles on Imprecise Points
Assume we are given a set of parallel line segments in the plane, and we wish to place a point on each line segment such that the resulting point set maximizes or minimizes the area of the largest or smallest triangle in the set. We analyze the complexity of the four resulting computational problems, and we show that three of them admit polynomial-time algorithms, while the fourth is NP-hard. Specifically, we show that maximizing the largest triangle can be done in O(n2) time (or in O(nlogn) time for unit segments); minimizing the largest triangle can be done in O(n2logn) time; maximizing the smallest triangle is NP-hard; but minimizing the smallest triangle can be done in O(n2) time. We also discuss to what extent our results can be generalized to polygons with k>3 sides.
computational-geometry  rather-interesting  approximation  plane-geometry  optimization  to-write-about  nudge-targets  consider:looking-to-see
9 weeks ago by Vaguery
[1712.05081] Linear time Minimum Area All-flush Triangles Circumscribing a Convex Polygon
We study the problem of computing the minimum area triangle that circumscribes a given n-sided convex polygon touching edge-to-edge. In other words, we compute the minimum area triangle that is the intersection of 3 half-planes out of n half-planes defined by a given convex polygon. Building on the Rotate-and-Kill technique {arXiv:1707.04071}, we propose an algorithm that solves the problem in O(n) time, improving the best-known O(nlogn) time algorithms given in [A. Aggarwal et. al. DCG94; B. Schieber. SODA95}. Our algorithm computes all the locally minimal area circumscribing triangles touching edge-to-edge.
computational-geometry  algorithms  plane-geometry  rather-interesting  to-write-about  nudge-targets  consider:looking-to-see
9 weeks ago by Vaguery
[1705.11035] Maximum-Area Triangle in a Convex Polygon, Revisited
We revisit the following problem: Given a convex polygon P, find the largest-area inscribed triangle. We show by example that the linear-time algorithm presented in 1979 by Dobkin and Snyder for solving this problem fails. We then proceed to show that with a small adaptation, their approach does lead to a quadratic-time algorithm. We also present a more involved O(nlogn) time divide-and-conquer algorithm. Also we show by example that the algorithm presented in 1979 by Dobkin and Snyder for finding the largest-area k-gon that is inscribed in a convex polygon fails to find the optimal solution for k=4. Finally, we discuss the implications of our discoveries on the literature.
computational-geometry  optimization  plane-geometry  rather-interesting  to-write-about  oopsie  nudge-targets  consider:looking-to-see
9 weeks ago by Vaguery
[1707.04071] Maximal Area Triangles in a Convex Polygon
The widely known linear time algorithm for computing the maximum area triangle in a convex polygon was found incorrect recently by Keikha et. al.(arXiv:1705.11035). We present an alternative algorithm in this paper. Comparing to the only previously known correct solution, ours is much simpler and more efficient. More importantly, our new approach is powerful in solving related problems.
computational-geometry  algorithms  plane-geometry  rather-interesting  nudge-targets  consider:looking-to-see  to-write-about
9 weeks ago by Vaguery
[1809.06451] A new lower bound on Hadwiger-Debrunner numbers in the plane
A family of sets F is said to satisfy the (p,q) property if among any p sets in F, some q have a non-empty intersection. Hadwiger and Debrunner (1957) conjectured that for any p≥q≥d+1 there exists c=cd(p,q), such that any family of compact convex sets in ℝd that satisfies the (p,q) property, can be pierced by at most c points. In a celebrated result from 1992, Alon and Kleitman proved the conjecture. However, obtaining sharp bounds on cd(p,q), called the Hadwiger-Debrunner numbers', is still a major open problem in discrete and computational geometry. The best currently known lower bound on the Hadwiger-Debrunner numbers in the plane is c2(p,q)=Ω(pqlog(pq)) while the best known upper bound is O(p(1.5+δ)(1+1q−2)).
In this paper we improve the lower bound significantly by showing that c2(p,q)≥p1+Ω(1/q). Furthermore, the bound is obtained by a family of lines, and is tight for all families that have a bounded VC-dimension. Unlike previous bounds on the Hadwiger-Debrunner numbers which mainly used the weak epsilon-net theorem, our bound stems from a surprising connection of the (p,q) problem to an old problem of Erdős on points in general position in the plane. We use a novel construction for the Erdős' problem, obtained recently by Balogh and Solymosi using the hypergraph container method, to get the lower bound on c2(p,3). We then generalize the bound to c2(p,q) for any q≥3.
computational-geometry  plane-geometry  to-understand  graph-theory  proof  nudge-targets  consider:looking-to-see  also:a-fucking-drawing-or-two-might-be-nice-people
9 weeks ago by Vaguery
[1806.04484] A Fourier-Analytic Approach for the Discrepancy of Random Set Systems
One of the prominent open problems in combinatorics is the discrepancy of set systems where each element lies in at most t sets. The Beck-Fiala conjecture suggests that the right bound is O(t√), but for three decades the only known bound not depending on the size of the set system has been O(t). Arguably we currently lack techniques for breaking that barrier.
In this paper we introduce discrepancy bounds based on Fourier analysis. We demonstrate our method on random set systems. Suppose one has n elements and m sets containing each element independently with probability p. We prove that in the regime of n≥Θ(m2log(m)), the discrepancy is at most 1 with high probability. Previously, a result of Ezra and Lovett gave a bound of O(1) under the stricter assumption that n≫mt.
combinatorics  open-questions  probability-theory  approximation  limits  hypergraphs  to-understand  consider:looking-to-see
9 weeks ago by Vaguery
[1706.04290] A general method for lower bounds on fluctuations of random variables
There are many ways of establishing upper bounds on fluctuations of random variables, but there is no systematic approach for lower bounds. As a result, lower bounds are unknown in many important problems. This paper introduces a general method for lower bounds on fluctuations. The method is used to obtain new results for the stochastic traveling salesman problem, the stochastic minimal matching problem, the random assignment problem, the Sherrington-Kirkpatrick model of spin glasses, first-passage percolation and random matrices. A long list of open problems is provided at the end.
open-questions  extreme-values  probability-theory  representation  rather-odd  nudge-targets  consider:looking-to-see  to-write-about
10 weeks ago by Vaguery
[1802.06668] Sequentializing cellular automata
We study the problem of sequentializing a cellular automaton without introducing any intermediate states, and only performing reversible permutations on the tape. We give a decidable characterization of cellular automata which can be written as a single left-to-right sweep of a bijective rule from left to right over an infinite tape.
cellular-automata  representation  rather-interesting  a-bit-backwards-though  complexology  computational-complexity  to-write-about  consider:looking-to-see
november 2018 by Vaguery
The nxnxn Dots Problem Optimal Solution, viXra.org e-Print archive, viXra:1707.0298
We provide an optimal strategy to solve the n X n X n points problem inside the box, considering only 90° turns, and at the same time a pattern able to drastically lower down the known upper bound. We use a very simple spiral frame, especially if compared to the previous plane by plane approach, that significantly reduces the number of straight lines connected at their end-points necessary to join all the n^3 dots. In the end, we combine the square spiral frame with the rectangular spiral pattern in the most profitable way, in order to minimize the difference between the upper and the lower bound, proving that it is ≤ 0.5 ∙ n ∙ (n + 3), for any n > 1.
mathematical-recreations  problem-solving  optimization  nudge-targets  consider:looking-to-see  to-write-about
october 2018 by Vaguery
[1711.05793] On the proximity of large primes
By a sphere-packing argument, we show that there are infinitely many pairs of primes that are close to each other for some metrics on the integers. In particular, for any numeration basis q, we show that there are infinitely many pairs of primes the base q expansion of which differ in at most two digits. Likewise, for any fixed integer t, there are infinitely many pairs of primes, the first t digits of which are the same. In another direction, we show that, there is a constant c depending on q such that for infinitely many integers m there are at least cloglogm primes which differ from m by at most one base q digit.
number-theory  Hamming-distance  proof  consider:looking-to-see  consider:generative-algorithm  nudge-targets
october 2018 by Vaguery
[1708.03563] On the discriminator of Lucas sequences
"Thus the situation is more weird than Shallit expected and this is confirmed by Theorem 1."

We consider the family of Lucas sequences uniquely determined by Un+2(k)=(4k+2)Un+1(k)−Un(k), with initial values U0(k)=0 and U1(k)=1 and k≥1 an arbitrary integer. For any integer n≥1 the discriminator function k(n) of Un(k) is defined as the smallest integer m such that U0(k),U1(k),…,Un−1(k) are pairwise incongruent modulo m. Numerical work of Shallit on k(n) suggests that it has a relatively simple characterization. In this paper we will prove that this is indeed the case by showing that for every k≥1 there is a constant nk such that k(n) has a simple characterization for every n≥nk. The case k=1 turns out to be fundamentally different from the case k>1.
number-theory  sequences  conjecture  rather-interesting  nudge-targets  pattern-discovery  to-write-about  consider:looking-to-see
october 2018 by Vaguery
Quickly recognizing primes less than 100 | The Math Less Traveled
Recently, Mark Dominus wrote about trying to memorize all the prime numbers under . This is a cool idea, but it made me start thinking about alternative: instead of memorizing primes, could we memorize a procedure for determining whether a number under is prime or composite? And can we make it clever enough to be done relatively quickly? This does tie into my other recent posts about primality testing, but to be clear, it’s also quite different: I am not talking about a general method for determining primality, but the fastest method we can devise for mentally determining, by hook or by crook, whether a given number under is prime. Hopefully there are rules we can come up with which are valid for numbers less than —and thus make them easier to test—even though they aren’t valid for bigger numbers in general.
mathematical-recreations  number-theory  heuristics  nudge-targets  consider:looking-to-see  to-write-about
october 2018 by Vaguery
The Kolakoski Sequence - Futility Closet
This is a fractal, a mathematical object that encodes its own representation. It was described by William Kolakoski in 1965, and before him by Rufus Oldenburger in 1939. University of Evansville mathematician Clark Kimberling is offering a reward of $200 for the solution to five problems associated with the sequence: Is there a formula for the nth term? If a string occurs in the sequence, must it occur again? If a string occurs, must its reversal also occur? If a string occurs, and all its 1s and 2s are swapped, must the new string occur? Does the limiting frequency of 1s exist, and is it 1/2? So far, no one has found the answers. mathematical-recreations self-reference fractals open-questions nudge-targets consider:looking-to-see october 2018 by Vaguery Efficiency of repeated squaring | The Math Less Traveled Claim: the binary algorithm is the most efficient way to build using only doubling and incrementing steps. That is, any other way to build by doubling and incrementing uses an equal or greater number of steps than the binary algorithm. algorithms mathematical-recreations nudge-targets consider:looking-to-see to-write-about october 2018 by Vaguery [1602.06208] Every positive integer is a sum of three palindromes For integer g≥5, we prove that any positive integer can be written as a sum of three palindromes in base g. number-theory proof nudge-targets consider:algorithms consider:looking-to-see september 2018 by Vaguery [1808.05845] Popular Products and Continued Fractions We prove bounds for the popularity of products of sets with weak additive structure, and use these bounds to prove results about continued fractions. Namely, we obtain a nearly sharp upper bound for the cardinality of Zaremba's set modulo p. continued-fractions number-theory to-write-about nudge-targets consider:looking-to-see september 2018 by Vaguery SMT solutions | The Math Less Traveled In my last post I described the general approach I used to draw orthogons using an SMT solver, but left some of the details as exercises. In this post I’ll explain the solutions I came up with. geometry programming rather-interesting representation testing to-write-about mathematical-recreations nudge-targets consider:looking-to-see august 2018 by Vaguery Custom Baking - Futility Closet Is it possible to bake a cake that can be divided into four parts by a single straight cut? mathematical-recreations simplicity nudge-targets consider:looking-to-see consider:novelty-search august 2018 by Vaguery Dembo , Peres : A Topological Criterion for Hypothesis Testing A simple topological criterion is given for the existence of a sequence of tests for composite hypothesis testing problems, such that almost surely only finitely many errors are made. via:cshalizi statistics learning-from-data algorithms existence-proof consider:looking-to-see to-write-about nudge-targets august 2018 by Vaguery Theorem of the Day The list is presented here in reverse chronological order, so that new additions will appear at the top. This is not the order in which the theorem of the day is picked which is more designed to mix up the different areas of mathematics and the level of abstractness or technicality involved. The way that the list of theorems is indexed is described here. mathematics proof lists rather-interesting nudge-targets consider:looking-to-see july 2018 by Vaguery [1710.10964] At the Roots of Dictionary Compression: String Attractors A well-known fact in the field of lossless text compression is that high-order entropy is a weak model when the input contains long repetitions. Motivated by this fact, decades of research have generated myriads of so-called dictionary compressors: algorithms able to reduce the text's size by exploiting its repetitiveness. Lempel-Ziv 77 is probably one of the most successful and known tools of this kind, followed by straight-line programs, run-length Burrows-Wheeler transform, and other less-known schemes. In this paper, we show that these techniques are different solutions to the same, elegant, combinatorial problem: to find a small set of positions capturing all distinct text's substrings. We call such a set a string attractor. We first show reductions between dictionary compressors and string attractors. This gives us the approximation ratios of dictionary compressors with respect to the smallest string attractor and allows us to solve several open problems related to the asymptotic relations between the output sizes of different dictionary compressors. We then show that the k-attractor problem - that is, deciding whether a text has a size-t set of positions capturing all substrings of length at most k - is NP-complete for k >= 3. We provide approximation techniques for the smallest k-attractor, show that the problem is APX-complete for constant k, and give strong inapproximability results. To conclude, we provide matching lower- and upper- bounds for the random access problem on string attractors. Our optimal data structure is universal: by our reductions to string attractors, it supports random access on any dictionary-compression scheme. In particular, our solution matches the lower bound also on LZ77, straight-line programs, collage systems, and macro schemes, and therefore essentially closes (at once) the random access problem for all these compressors. compression strings feature-extraction representation algorithms computational-complexity to-understand nudge-targets consider:looking-to-see july 2018 by Vaguery [1805.07980] Collisionless periodic orbits in the free-fall three-body problem Although the free-fall three-body problem have been investigated for more than one century, however, only four collisionless periodic orbits have been found. In this paper, we report 234 collisionless periodic orbits of the free-fall three-body system with some mass ratios, including three known collisionless periodic orbits. Thus, 231 collisionless free-fall periodic orbits among them are entirely new. In theory, we can gain periodic orbits of the free-fall three-body system in arbitrary ratio of mass. Besides, it is found that, for a given ratio of masses of two bodies, there exists a generalized Kepler's third law for the periodic three-body system. All of these would enrich our knowledge and deepen our understanding about the famous three-body problem as a whole. nonlinear-dynamics stamp-collecting rather-interesting nudge-targets consider:looking-to-see consider:performance-measures may 2018 by Vaguery [1805.02436] Combining Tools for Optimization and Analysis of Floating-Point Computations Recent renewed interest in optimizing and analyzing floating-point programs has lead to a diverse array of new tools for numerical programs. These tools are often complementary, each focusing on a distinct aspect of numerical programming. Building reliable floating point applications typically requires addressing several of these aspects, which makes easy composition essential. This paper describes the composition of two recent floating-point tools: Herbie, which performs accuracy optimization, and Daisy, which performs accuracy verification. We find that the combination provides numerous benefits to users, such as being able to use Daisy to check whether Herbie's unsound optimizations improved the worst-case roundoff error, as well as benefits to tool authors, including uncovering a number of bugs in both tools. The combination also allowed us to compare the different program rewriting techniques implemented by these tools for the first time. The paper lays out a road map for combining other floating-point tools and for surmounting common challenges. numerical-methods algorithms optimization rather-interesting floating-point representation to-understand nudge-targets consider:looking-to-see may 2018 by Vaguery [1805.02274] On the$f-Matrices of Pascal-like Triangles Defined by Riordan Arrays We define and characterize the f-matrices associated to Pascal-like matrices that are defined by ordinary and exponential Riordan arrays. These generalize the face matrices of simplices and hypercubes. Their generating functions can be expressed simply in terms of continued fractions, which are shown to be transformations of the generating functions of the corresponding γ- and h-matrices. combinatorics continued-fractions topology to-understand to-write-about nudge-targets consider:looking-to-see may 2018 by Vaguery [1607.01117] Anagram-free Graph Colouring An anagram is a word of the form WP where W is a non-empty word and P is a permutation of W. We study anagram-free graph colouring and give bounds on the chromatic number. Alon et al. (2002) asked whether anagram-free chromatic number is bounded by a function of the maximum degree. We answer this question in the negative by constructing graphs with maximum degree 3 and unbounded anagram-free chromatic number. We also prove upper and lower bounds on the anagram-free chromatic number of trees in terms of their radius and pathwidth. Finally, we explore extensions to edge colouring and k-anagram-free colouring. graph-theory algorithms graph-coloring feature-construction nudge-targets consider:looking-to-see computational-complexity may 2018 by Vaguery [1612.09443] Transversals in Latin arrays with many distinct symbols An array is row-Latin if no symbol is repeated within any row. An array is Latin if it and its transpose are both row-Latin. A transversal in an n×n array is a selection of n different symbols from different rows and different columns. We prove that every n×n Latin array containing at least (2−2‾√)n2 distinct symbols has a transversal. Also, every n×n row-Latin array containing at least 14(5−5‾√)n2 distinct symbols has a transversal. Finally, we show by computation that every Latin array of order 7 has a transversal, and we describe all smaller Latin arrays that have no transversal. combinatorics existence-proof rather-interesting nudge-targets consider:looking-to-see may 2018 by Vaguery [1708.09571] Anagram-free colourings of graph subdivisions An anagram is a word of the form WP where W is a non-empty word and P is a permutation of W. A vertex colouring of a graph is anagram-free if no subpath of the graph is an anagram. Anagram-free graph colouring was independently introduced by Kam\v{c}ev, {\L}uczak and Sudakov and ourselves. In this paper we introduce the study of anagram-free colourings of graph subdivisions. We show that every graph has an anagram-free 8-colourable subdivision. The number of division vertices per edge is exponential in the number of edges. For trees, we construct anagram-free 10-colourable subdivisions with fewer division vertices per edge. Conversely, we prove lower bounds, in terms of division vertices per edge, on the anagram-free chromatic number for subdivisions of the complete graph and subdivisions of complete trees of bounded degree. combinatorics graph-theory graph-coloring computational-complexity rather-interesting nudge-targets consider:looking-to-see may 2018 by Vaguery [1801.07155] Continued fractions and orderings on the Markov numbers Markov numbers are integers that appear in the solution triples of the Diophantine equation, x2+y2+z2=3xyz, called the Markov equation. A classical topic in number theory, these numbers are related to many areas of mathematics such as combinatorics, hyperbolic geometry, approximation theory and cluster algebras. There is a natural map from the rational numbers between zero and one to the Markov numbers. In this paper, we prove two conjectures seen in Martin Aigner's book, Markov's theorem and 100 years of the uniqueness conjecture, that determine an ordering on subsets of the Markov numbers based on their corresponding rational. The proof uses the cluster algebra of the torus with one puncture and a resulting reformulation of the conjectures in terms of continued fractions. The key step is to analyze the difference in the numerator of a continued fraction when an operation is applied to its entries. number-theory continued-fractions rather-interesting to-write-about consider:looking-to-see consider:generalizations may 2018 by Vaguery [1803.01422] DAGs with NO TEARS: Smooth Optimization for Structure Learning Estimating the structure of directed acyclic graphs (DAGs, also known as Bayesian networks) is a challenging problem since the search space of DAGs is combinatorial and scales superexponentially with the number of nodes. Existing approaches rely on various local heuristics for enforcing the acyclicity constraint and are not well-suited to general purpose optimization packages for their solution. In this paper, we introduce a fundamentally different strategy: We formulate the structure learning problem as a smooth, constrained optimization problem over real matrices that avoids this combinatorial constraint entirely. This is achieved by a novel characterization of acyclicity that is not only smooth but also exact. The resulting nonconvex, constrained program involves smooth functions whose gradients are easy to compute and only involve elementary matrix operations. By using existing black-box optimization routines, our method uses global search to find an optimal DAG and can be implemented in about 50 lines of Python and outperforms existing methods without imposing any structural constraints. graph-theory combinatorics algorithms rather-interesting computational-complexity to-understand to-write-about nudge-targets consider:looking-to-see representation april 2018 by Vaguery [1801.00663] A Sharp Estimate for Probability Distributions We consider absolutely continuous probability distributions f(x)dx on ℝ≥0. A result of Feldheim and Feldheim shows, among other things, that if the distribution is not compactly supported, then there exist z>0 such that most events in {X+Y≥2z} are comprised of a 'small' term satisfying min(X,Y)≤z and a 'large' term satisfying max(X,Y)≥z (as opposed to two 'large' terms that are both larger than z) limsupz→∞ ℙ(min(X,Y)≤z|X+Y≥2z)=1. The result fails if the distribution is compactly supported. We prove supz>0 ℙ(min(X,Y)≤z|X+Y≥2z)≥124+8log2(med(X)‖f‖L∞), where med(X) denotes the median. Interestingly, the logarithm is necessary and the result is sharp up to constants; we also discuss some open problems. probability-theory statistics rather-interesting nudge-targets consider:looking-to-see march 2018 by Vaguery A hybrid placement strategy for the three-dimensional strip packing problem - ScienceDirect This paper presents a hybrid placement strategy for the three-dimensional strip packing problem which involves packing a set of cuboids (‘boxes’) into a three-dimensional bin (parallelepiped) of fixed width and height but unconstrained length (the ‘container’). The goal is to pack all of the boxes into the container, minimising its resulting length. This problem has potential industry application in stock cutting (wood, polystyrene, etc. – minimising wastage) and also cargo loading, as well as other applications in areas such as multi-dimensional resource scheduling. In addition to the proposed strategy a number of test results on available literature benchmark problems are presented and analysed. The results of empirical testing of the algorithm show that it out-performs other methods from the literature, consistently in terms of speed and solution quality-producing 28 best known results from 35 test cases. operations-research hyperheuristics metaheuristics optimization constraint-satisfaction nudge-targets consider:representation consider:looking-to-see to-write-about march 2018 by Vaguery [1712.08373] Notes on complexity of packing coloring A packing k-coloring for some integer k of a graph G=(V,E) is a mapping φ:V→{1,…,k} such that any two vertices u,v of color φ(u)=φ(v) are in distance at least φ(u)+1. This concept is motivated by frequency assignment problems. The \emph{packing chromatic number} of G is the smallest k such that there exists a packing k-coloring of G. Fiala and Golovach showed that determining the packing chromatic number for chordal graphs is \NP-complete for diameter exactly 5. While the problem is easy to solve for diameter 2, we show \NP-completeness for any diameter at least 3. Our reduction also shows that the packing chromatic number is hard to approximate within n1/2−ε for any ε>0. In addition, we design an \FPT algorithm for interval graphs of bounded diameter. This leads us to exploring the problem of finding a partial coloring that maximizes the number of colored vertices. graph-theory algorithms combinatorics proof approximation nudge-targets consider:looking-to-see consider:feature-discovery march 2018 by Vaguery [1711.03532] Co-Optimization Generation and Distribution Planning in Microgrids This paper proposes a co-optimization generation and distribution planning model in microgrids in which simultaneous investment in generation, i.e., distributed generation (DG) and distributed energy storage (DES), and distribution, i.e., upgrading the existing distribution network, is considered. The objective of the proposed model is to minimize the microgrid total planning cost which comprises the investment cost of installed generation assets and lines, the microgrid operation cost, and the cost of unserved energy. The microgrid planning solution determines the optimal generation size, location, and mix, as well as required network upgrade. To consider line flow and voltage limits, a linearized power flow model is proposed and used, allowing further application of mixed integer linear programming (MILP) in problem modeling. The proposed model is applied to the IEEE 33-bus standard test system to demonstrate the acceptable performance and the effectiveness of the proposed model. optimization network-theory mathematical-programming multiobjective-optimization rather-interesting operations-research utilities nudge-targets consider:looking-to-see march 2018 by Vaguery [1801.01922] Vectorization of Line Drawings via PolyVector Fields Image tracing is a foundational component of the workflow in graphic design, engineering, and computer animation, linking hand-drawn concept images to collections of smooth curves needed for geometry processing and editing. Even for clean line drawings, modern algorithms often fail to faithfully vectorize junctions, or points at which curves meet; this produces vector drawings with incorrect connectivity. This subtle issue undermines the practical application of vectorization tools and accounts for hesitance among artists and engineers to use automatic vectorization software. To address this issue, we propose a novel image vectorization method based on state-of-the-art mathematical algorithms for frame field processing. Our algorithm is tailored specifically to disambiguate junctions without sacrificing quality. graphics algorithms inference rather-interesting feature-construction nudge-targets consider:representation consider:looking-to-see consider:performance-measures march 2018 by Vaguery [1709.06217] Deterministic meeting of sniffing agents in the plane Two mobile agents, starting at arbitrary, possibly different times from arbitrary locations in the plane, have to meet. Agents are modeled as discs of diameter 1, and meeting occurs when these discs touch. Agents have different labels which are integers from the set of 0 to L-1. Each agent knows L and knows its own label, but not the label of the other agent. Agents are equipped with compasses and have synchronized clocks. They make a series of moves. Each move specifies the direction and the duration of moving. This includes a null move which consists in staying inert for some time, or forever. In a non-null move agents travel at the same constant speed, normalized to 1. We assume that agents have sensors enabling them to estimate the distance from the other agent (defined as the distance between centers of discs), but not the direction towards it. We consider two models of estimation. In both models an agent reads its sensor at the moment of its appearance in the plane and then at the end of each move. This reading (together with the previous ones) determines the decision concerning the next move. In both models the reading of the sensor tells the agent if the other agent is already present. Moreover, in the monotone model, each agent can find out, for any two readings in moments t1 and t2, whether the distance from the other agent at time t1 was smaller, equal or larger than at time t2. In the weaker binary model, each agent can find out, at any reading, whether it is at distance less than \r{ho} or at distance at least \r{ho} from the other agent, for some real \r{ho} > 1 unknown to them. Such distance estimation mechanism can be implemented, e.g., using chemical sensors. Each agent emits some chemical substance (scent), and the sensor of the other agent detects it, i.e., sniffs. The intensity of the scent decreases with the distance. agent-based rather-interesting random-processes sensors emergent-design proof to-write-about consider:looking-to-see march 2018 by Vaguery [1611.05896] Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic In this work, we explore a genre of puzzles ("image riddles") which involves a set of images and a question. Answering these puzzles require both capabilities involving visual detection (including object, activity recognition) and, knowledge-based or commonsense reasoning. We compile a dataset of over 3k riddles where each riddle consists of 4 images and a groundtruth answer. The annotations are validated using crowd-sourced evaluation. We also define an automatic evaluation metric to track future progress. Our task bears similarity with the commonly known IQ tasks such as analogy solving, sequence filling that are often used to test intelligence. We develop a Probabilistic Reasoning-based approach that utilizes probabilistic commonsense knowledge to answer these riddles with a reasonable accuracy. We demonstrate the results of our approach using both automatic and human evaluations. Our approach achieves some promising results for these riddles and provides a strong baseline for future attempts. We make the entire dataset and related materials publicly available to the community in ImageRiddle Website (this http URL). machine-learning image-processing natural-language-processing deep-learning puzzles rather-interesting to-write-about nudge-targets consider:looking-to-see consider:integrating-NLP march 2018 by Vaguery [1710.01802] Automatic Structural Scene Digitalization In this paper, we present an automatic system for the analysis and labeling of structural scenes, floor plan drawings in Computer-aided Design (CAD) format. The proposed system applies a fusion strategy to detect and recognize various components of CAD floor plans, such as walls, doors, windows and other ambiguous assets. Technically, a general rule-based filter parsing method is fist adopted to extract effective information from the original floor plan. Then, an image-processing based recovery method is employed to correct information extracted in the first step. Our proposed method is fully automatic and real-time. Such analysis system provides high accuracy and is also evaluated on a public website that, on average, archives more than ten thousands effective uses per day and reaches a relatively high satisfaction rate. rather-interesting digitization feature-extraction CAD abstraction machine-learning to-write-about consider:looking-to-see february 2018 by Vaguery [1802.08535] Can Neural Networks Understand Logical Entailment? We introduce a new dataset of logical entailments for the purpose of measuring models' ability to capture and exploit the structure of logical expressions against an entailment prediction task. We use this task to compare a series of architectures which are ubiquitous in the sequence-processing literature, in addition to a new model class---PossibleWorldNets---which computes entailment as a "convolution over possible worlds". Results show that convolutional networks present the wrong inductive bias for this class of problems relative to LSTM RNNs, tree-structured neural networks outperform LSTM RNNs due to their enhanced ability to exploit the syntax of logic, and PossibleWorldNets outperform all benchmarks. Betteridge's-Law neural-networks representation nudge-targets consider:looking-to-see low-hanging-fruit right-tool-for-the-job february 2018 by Vaguery [1510.02875] Exploring mod2 n-queens games We introduce a two player game on an n x n chessboard where queens are placed by alternating turns on a chessboard square whose availability is determined by the number of queens already on the board which can attack that square modulo two. The game is explored along with some variations and its complexity. mathematical-recreations chess enumeration game-theory rather-interesting nudge-targets consider:looking-to-see february 2018 by Vaguery [1801.02262] Magic Polygons and Their Properties Magic squares are arrangements of natural numbers into square arrays, where the sum of each row, each column, and both diagonals is the same. In this paper, the concept of a magic square with 3 rows and 3 columns is generalized to define magic polygons. Furthermore, this paper will examine the existence of magic polygons, along with several other properties inherent to magic polygons. mathematical-recreations magic-squares combinatorics number-theory rather-interesting to-write-about nudge-targets consider:looking-to-see february 2018 by Vaguery [1710.00194] Where computer vision can aid physics: dynamic cloud motion forecasting from satellite images This paper describes a new algorithm for solar energy forecasting from a sequence of Cloud Optical Depth (COD) images. The algorithm is based on the following simple observation: the dynamics of clouds represented by COD images resembles the motion (transport) of a density in a fluid flow. This suggests that, to forecast the motion of COD images, it is sufficient to forecast the flow. The latter, in turn, can be accomplished by fitting a parametric model of the fluid flow to the COD images observed in the past. Namely, the learning phase of the algorithm is composed of the following steps: (i) given a sequence of COD images, the snapshots of the optical flow are estimated from two consecutive COD images; (ii) these snapshots are then assimilated into a Navier-Stokes Equation (NSE), i.e. an initial velocity field for NSE is selected so that the corresponding NSE' solution is as close as possible to the optical flow snapshots. The prediction phase consists of utilizing a linear transport equation, which describes the propagation of COD images in the fluid flow predicted by NSE, to estimate the future motion of the COD images. The algorithm has been tested on COD images provided by two geostationary operational environmental satellites from NOAA serving the west-hemisphere. image-processing prediction nudge-targets consider:looking-to-see representation low-hanging-fruit february 2018 by Vaguery [1709.10030] Sparse Hierarchical Regression with Polynomials We present a novel method for exact hierarchical sparse polynomial regression. Our regressor is that degree r polynomial which depends on at most k inputs, counting at most ℓ monomial terms, which minimizes the sum of the squares of its prediction errors. The previous hierarchical sparse specification aligns well with modern big data settings where many inputs are not relevant for prediction purposes and the functional complexity of the regressor needs to be controlled as to avoid overfitting. We present a two-step approach to this hierarchical sparse regression problem. First, we discard irrelevant inputs using an extremely fast input ranking heuristic. Secondly, we take advantage of modern cutting plane methods for integer optimization to solve our resulting reduced hierarchical (k,ℓ)-sparse problem exactly. The ability of our method to identify all k relevant inputs and all ℓ monomial terms is shown empirically to experience a phase transition. Crucially, the same transition also presents itself in our ability to reject all irrelevant features and monomials as well. In the regime where our method is statistically powerful, its computational complexity is interestingly on par with Lasso based heuristics. The presented work fills a void in terms of a lack of powerful disciplined nonlinear sparse regression methods in high-dimensional settings. Our method is shown empirically to scale to regression problems with n≈10,000 observations for input dimension p≈1,000. statistics regression algorithms approximation performance-measure to-understand nudge-targets consider:looking-to-see february 2018 by Vaguery [1710.06647] Image Restoration by Iterative Denoising and Backward Projections Inverse problems appear in many applications such as image deblurring and inpainting. The common approach to address them is to design a specific algorithm for each problem. The Plug-and-Play (P&P) framework, which has been recently introduced, allows solving general inverse problems by leveraging the impressive capabilities of existing denoising algorithms. While this fresh strategy has found many applications, a burdensome parameter tuning is often required in order to obtain high-quality results. In this work, we propose an alternative method for solving inverse problems using denoising algorithms, which requires less parameter tuning. We provide a theoretical analysis of the method, and empirically demonstrate that it is competitive with task-specific techniques and the P&P approach for image inpainting and deblurring. inverse-problems image-processing superresolution algorithms performance-measure to-write-about nudge-targets consider:looking-to-see february 2018 by Vaguery [1707.06340] On Unlimited Sampling Shannon's sampling theorem provides a link between the continuous and the discrete realms stating that bandlimited signals are uniquely determined by its values on a discrete set. This theorem is realized in practice using so called analog--to--digital converters (ADCs). Unlike Shannon's sampling theorem, the ADCs are limited in dynamic range. Whenever a signal exceeds some preset threshold, the ADC saturates, resulting in aliasing due to clipping. The goal of this paper is to analyze an alternative approach that does not suffer from these problems. Our work is based on recent developments in ADC design, which allow for ADCs that reset rather than to saturate, thus producing modulo samples. An open problem that remains is: Given such modulo samples of a bandlimited function as well as the dynamic range of the ADC, how can the original signal be recovered and what are the sufficient conditions that guarantee perfect recovery? In this paper, we prove such sufficiency conditions and complement them with a stable recovery algorithm. Our results are not limited to certain amplitude ranges, in fact even the same circuit architecture allows for the recovery of arbitrary large amplitudes as long as some estimate of the signal norm is available when recovering. Numerical experiments that corroborate our theory indeed show that it is possible to perfectly recover function that takes values that are orders of magnitude higher than the ADC's threshold. information-theory rather-interesting signal-processing the-mangle-in-practice hardware-changes-models to-write-about nudge-targets consider:looking-to-see february 2018 by Vaguery [1710.00709] Synthesising Evolutionarily Stable Normative Systems Within the area of multi-agent systems, normative systems are a widely used framework for the coordination of interdependent activities. A crucial problem associated with normative systems is that of synthesising norms that effectively accomplish a coordination task and whose compliance forms a rational choice for the agents within the system. In this work, we introduce a framework for the synthesis of normative systems that effectively coordinate a multi-agent system and whose norms are likely to be adopted by rational agents. Our approach roots in evolutionary game theory. Our framework considers multi-agent systems in which evolutionary forces lead successful norms to prosper and spread within the agent population, while unsuccessful norms are discarded. The outputs of this evolutionary norm synthesis process are normative systems whose compliance forms a rational choice for the agents. We empirically show the effectiveness of our approach through empirical evaluation in a simulated traffic domain. evolutionary-dynamics game-theory rather-interesting meta-simulation performance-measure inverse-problems to-write-about consider:looking-to-see february 2018 by Vaguery [1611.03144] Therapeutic target discovery using Boolean network attractors: improvements of kali In a previous article, an algorithm for identifying therapeutic targets in Boolean networks modeling pathological mechanisms was introduced. In the present article, the improvements made on this algorithm, named kali, are described. These improvements are i) the possibility to work on asynchronous Boolean networks, ii) a finer assessment of therapeutic targets and iii) the possibility to use multivalued logic. kali assumes that the attractors of a dynamical system, such as a Boolean network, are associated with the phenotypes of the modeled biological system. Given a logic-based model of pathological mechanisms, kali searches for therapeutic targets able to reduce the reachability of the attractors associated with pathological phenotypes, thus reducing their likeliness. kali is illustrated on an example network and used on a biological case study. The case study is a published logic-based model of bladder tumorigenesis from which kali returns consistent results. However, like any computational tool, kali can predict but can not replace human expertise: it is a supporting tool for coping with the complexity of biological systems in the field of drug discovery. reaction-networks boolean-networks Kauffmania bioinformatics systems-biology rather-interesting nudge-targets consider:looking-to-see february 2018 by Vaguery Maverick Solitaire An early form of Poker solitaire is actually a puzzle of sorts, one which has been called Maverick solitaire (after its appearance on the 1950's/1960's Western T.V. show Maverick, in the first season episode Rope of Cards). Twenty five cards are dealt from a shuffled 52-card deck. The object is to divide the 25 cards into five groups of five cards so that each is a pat hand in draw poker (a hand which does not need to be drawn to). Maverick Solitaire is well-known as a sucker bet, as the probability of success with a random group of 25 cards would seem low, but is actually quite high: the eponymous author of Maverick's Guide to Poker estimates the odds to be at least 98 percent (he disallows four of a kind). This is remarkably accurate: Mark Masten's computer solver, allowing four of a kind, solved about 98.1 percent of a random set of 1000 deals. Deals with unique solutions are even less common than impossible ones: the sample above had 19 impossible deals and only 8 with unique solutions. mathematical-recreations cards nudge-targets consider:representation consider:looking-to-see february 2018 by Vaguery [1709.10061] Active Information Acquisition for Linear Optimization We consider partially-specified optimization problems where the goal is to actively, but efficiently, acquire missing information about the problem in order to solve it. An algorithm designer wishes to solve a linear program (LP), maxcTx s.t. Ax≤b,x≥0, but does not initially know some of the parameters. The algorithm can iteratively choose an unknown parameter and gather information in the form of a noisy sample centered at the parameter's (unknown) value. The goal is to find an approximately feasible and optimal solution to the underlying LP with high probability while drawing a small number of samples. We focus on two cases. (1) When the parameters c of the objective are initially unknown, we take an information-theoretic approach and give roughly matching upper and lower sample complexity bounds, with an (inefficient) successive-elimination algorithm. (2) When the parameters b of the constraints are initially unknown, we propose an efficient algorithm combining techniques from the ellipsoid method for LP and confidence-bound approaches from bandit algorithms. The algorithm adaptively gathers information about constraints only as needed in order to make progress. We give sample complexity bounds for the algorithm and demonstrate its improvement over a naive approach via simulation. operations-research game-theory experimental-design information-theory rather-interesting nudge-targets consider:looking-to-see consider:planning-versions january 2018 by Vaguery [1510.06890] Generalized Transformation Design: metrics, speeds, and diffusion We show that a unified and maximally generalized approach to spatial transformation design is possible, one that encompasses all second order waves, rays, and diffusion processes in anisotropic media. Until the final step, it is unnecessary to specify the physical process for which a specific transformation design is to be implemented. The principal approximation is the neglect of wave impedance, an attribute that plays no role in ray propagation, and is therefore irrelevant for pure ray devices; another constraint is that for waves the spatial variation in material parameters needs to be sufficiently small compared with the wavelength. The key link between our general formulation and a specific implementation is how the spatial metric relates to the speed of disturbance in a given medium, whether it is electromagnetic, acoustic, or diffusive. Notably, we show that our generalised ray theory, in allowing for anisotropic indexes (speeds), generates the same predictions as does a wave theory, and the results are closely related to those for diffusion processes. engineering-design rather-interesting indistinguishable-from-magic performance-measure representation nudge-targets consider:looking-to-see consider:symbolic-regression to-write-about january 2018 by Vaguery [1705.00441] Learning Topic-Sensitive Word Representations Distributed word representations are widely used for modeling words in NLP tasks. Most of the existing models generate one representation per word and do not consider different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements for the lexical substitution task indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task. natural-language-processing feature-construction neural-networks algorithms nudge-targets consider:looking-to-see consider:performance-measures january 2018 by Vaguery [1711.03525] Improving the Redundancy of the Knuth Balancing Scheme for Packet Transmission Systems simple scheme was proposed by Knuth to generate balanced codewords from a random binary in- formation sequence. However, this method presents a redundancy which is twice that of the full sets of bal- anced codewords, that is the minimal achievable redun- dancy. The gap between the Knuth's algorithm gen- erated redundancy and the minimal one is significant and can be reduced. This paper attempts to achieve this goal through a method based on information se- quence candidates. The proposed scheme is suitable for various communication systems as it generates very ef- ?cient and less redundant balanced codes. information-theory Knuth algorithms performance-measure nudge-targets consider:looking-to-see january 2018 by Vaguery [1709.05725] FlashProfile: Interactive Synthesis of Syntactic Profiles We address the problem of learning comprehensive syntactic profiles for a set of strings. Real-world datasets, typically curated from multiple sources, often contain data in various formats. Thus any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify various formats is infeasible in standard big-data scenarios. We present a technique for generating comprehensive syntactic profiles in terms of user-defined patterns that also allows for interactive refinement. We define a syntactic profile as a set of succinct patterns that describe the entire dataset. Our approach efficiently learns such profiles, and allows refinement by exposing a desired number of patterns. Our implementation, FlashProfile, shows a median profiling time of 0.7s over 142 tasks on 74 real datasets. We also show that access to the generated data profiles allow for more accurate synthesis of programs, using fewer examples in programming-by-example workflows. pattern-discovery rather-interesting strings data-mining statistics nudge-targets consider:looking-to-see consider:representation consider:performance-measures january 2018 by Vaguery [1709.02180] Structurally Parameterizedd$-Scattered Set In d-Scattered Set we are given an (edge-weighted) graph and are asked to select at least k vertices, so that the distance between any pair is at least d, thus generalizing Independent Set. We provide upper and lower bounds on the complexity of this problem with respect to various standard graph parameters. In particular, we show the following: - For any d≥2, an O∗(dtw)-time algorithm, where tw is the treewidth of the input graph. - A tight SETH-based lower bound matching this algorithm's performance. These generalize known results for Independent Set. - d-Scattered Set is W[1]-hard parameterized by vertex cover (for edge-weighted graphs), or feedback vertex set (for unweighted graphs), even if k is an additional parameter. - A single-exponential algorithm parameterized by vertex cover for unweighted graphs, complementing the above-mentioned hardness. - A 2O(td2)-time algorithm parameterized by tree-depth (td), as well as a matching ETH-based lower bound, both for unweighted graphs. We complement these mostly negative results by providing an FPT approximation scheme parameterized by treewidth. In particular, we give an algorithm which, for any error parameter ϵ>0, runs in time O∗((tw/ϵ)O(tw)) and returns a d/(1+ϵ)-scattered set of size k, if a d-scattered set of the same size exists. computational-complexity graph-theory algorithms rather-interesting operations-research nudge-targets to-write-about consider:looking-to-see  january 2018 by Vaguery [1512.00337] Absolutely abnormal, continued fraction normal numbers In this short note, we give a proof, conditional on the Generalized Riemann Hypothesis, that there exist numbers x which are normal with respect to the continued fraction expansion but not to any base b expansion. This partially answers a question of Bugeaud. number-theory continued-fractions rather-interesting lovely-title to-write-about consider:looking-to-see nudge-targets january 2018 by Vaguery [1710.06384] Efficient Neighbor-Finding on Space-Filling Curves Space-filling curves (SFC, also known as FASS-curves) are a useful tool in scientific computing and other areas of computer science to sequentialize multidimensional grids in a cache-efficient and parallelization-friendly way for storage in an array. Many algorithms, for example grid-based numerical PDE solvers, have to access all neighbor cells of each grid cell during a grid traversal. While the array indices of neighbors can be stored in a cell, they still have to be computed for initialization or when the grid is adaptively refined. A fast neighbor-finding algorithm can thus significantly improve the runtime of computations on multidimensional grids. In this thesis, we show how neighbors on many regular grids ordered by space-filling curves can be found in an average-case time complexity of O(1). In general, this assumes that the local orientation (i.e. a variable of a describing grammar) of the SFC inside the grid cell is known in advance, which can be efficiently realized during traversals. Supported SFCs include Hilbert, Peano and Sierpinski curves in arbitrary dimensions. We assume that integer arithmetic operations can be performed in O(1), i.e. independent of the size of the integer. We do not deal with the case of adaptively refined grids here. However, it appears that a generalization of the algorithm to suitable adaptive grids is possible. To formulate the neighbor-finding algorithm and prove its correctness and runtime properties, a modeling framework is introduced. This framework extends the idea of vertex-labeling to a description using grammars and matrices. With the sfcpp library, we provide a C++ implementation to render SFCs generated by such models and automatically compute all lookup tables needed for the neighbor-finding algorithm. Furthermore, optimized neighbor-finding implementations for various SFCs are included for which we provide runtime measurements. mathematics rather-interesting metrics nudge-targets consider:looking-to-see to-write-about algorithms january 2018 by Vaguery [1705.00318] An Order-based Algorithm for Minimum Dominating Set with Application in Graph Mining Dominating set is a set of vertices of a graph such that all other vertices have a neighbour in the dominating set. We propose a new order-based randomised local search (RLSo) algorithm to solve minimum dominating set problem in large graphs. Experimental evaluation is presented for multiple types of problem instances. These instances include unit disk graphs, which represent a model of wireless networks, random scale-free networks, as well as samples from two social networks and real-world graphs studied in network science. Our experiments indicate that RLSo performs better than both a classical greedy approximation algorithm and two metaheuristic algorithms based on ant colony optimisation and local search. The order-based algorithm is able to find small dominating sets for graphs with tens of thousands of vertices. In addition, we propose a multi-start variant of RLSo that is suitable for solving the minimum weight dominating set problem. The application of RLSo in graph mining is also briefly demonstrated. graph-theory algorithms combinatorics horse-races nudge-targets consider:looking-to-see january 2018 by Vaguery [1712.08558] Lattice-based Locality Sensitive Hashing is Optimal Locality sensitive hashing (LSH) was introduced by Indyk and Motwani (STOC 98) to give the first sublinear time algorithm for the c-approximate nearest neighbor (ANN) problem using only polynomial space. At a high level, an LSH family hashes "nearby" points to the same bucket and "far away" points to different buckets. The quality of measure of an LSH family is its LSH exponent, which helps determine both query time and space usage. In a seminal work, Andoni and Indyk (FOCS 06) constructed an LSH family based on random ball partitioning of space that achieves an LSH exponent of 1/c^2 for the l_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA 07) and O'Donnell, Wu and Zhou (TOCT 14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. So, in the same work, Andoni and Indyk proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, thus leaving open the question of finding the exponent of lattice based LSH. In this work, we resolve this question by showing the existence of lattices achieving the optimal LSH exponent of 1/c^2 using techniques from the geometry of numbers. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g. to yield low space and query complexity, remains an important open problem. algorithms numerical-methods rather-interesting approximation computational-complexity nudge-targets consider:looking-to-see consider:representation january 2018 by Vaguery [1712.08573] Longest common substring with approximately$k\$ mismatches
In the longest common substring problem we are given two strings of length n and must find a substring of maximal length that occurs in both strings. It is well-known that the problem can be solved in linear time, but the solution is not robust and can vary greatly when the input strings are changed even by one letter. To circumvent this, Leimeister and Morgenstern introduced the problem of the longest common substring with k mismatches. Lately, this problem has received a lot of attention in the literature. In this paper we first show a conditional lower bound based on the SETH hypothesis implying that there is little hope to improve existing solutions. We then introduce a new but closely related problem of the longest common substring with approximately k mismatches and use computational geometry techniques to show that it admits a solution with strongly subquadratic running time. We also apply these results to obtain a strongly subquadratic approximation algorithm for the longest common substring with k mismatches problem and show conditional hardness of improving its approximation ratio.
strings  computational-complexity  algorithms  rather-interesting  nudge-targets  consider:looking-to-see
january 2018 by Vaguery
[1711.08903] Tilings of convex sets by mutually incongruent equilateral triangles contain arbitrarily small tiles
We show that every tiling of a convex set in the Euclidean plane ℝ2 by equilateral triangles of mutually different sizes contains arbitrarily small tiles. The proof is purely elementary up to the discussion of one family of tilings of the full plane ℝ2, which is based on a surprising connection to a random walk on a directed graph.
combinatorics  tiling  constraint-satisfaction  rather-interesting  consider:looking-to-see  to-write-about
january 2018 by Vaguery
[1710.07145] Reaching a Target in the Plane with no Information
A mobile agent has to reach a target in the Euclidean plane. Both the agent and the target are modeled as points. In the beginning, the agent is at distance at most D>0 from the target. Reaching the target means that the agent gets at a {\em sensing distance} at most r>0 from it. The agent has a measure of length and a compass. We consider two scenarios: in the {\em static} scenario the target is inert, and in the {\em dynamic} scenario it may move arbitrarily at any (possibly varying) speed bounded by v. The agent has no information about the parameters of the problem, in particular it does not know D, r or v. The goal is to reach the target at lowest possible cost, measured by the total length of the trajectory of the agent.
Our main result is establishing the minimum cost (up to multiplicative constants) of reaching the target under both scenarios, and providing the optimal algorithm for the agent. For the static scenario the minimum cost is Θ((logD+log1r)D2/r), and for the dynamic scenario it is Θ((logM+log1r)M2/r), where M=max(D,v). Under the latter scenario, the speed of the agent in our algorithm grows exponentially with time, and we prove that for an agent whose speed grows only polynomially with time, this cost is impossible to achieve.
agent-based  optimization  mathematical-recreations  planning  algorithms  nudge-targets  consider:looking-to-see
january 2018 by Vaguery
[1709.04380] Neural Network Based Nonlinear Weighted Finite Automata
Weighted finite automata (WFA) can expressively model functions defined over strings but are inherently linear models. Given the recent successes of nonlinear models in machine learning, it is natural to wonder whether ex-tending WFA to the nonlinear setting would be beneficial. In this paper, we propose a novel model of neural network based nonlinearWFA model (NL-WFA) along with a learning algorithm. Our learning algorithm is inspired by the spectral learning algorithm for WFAand relies on a nonlinear decomposition of the so-called Hankel matrix, by means of an auto-encoder network. The expressive power of NL-WFA and the proposed learning algorithm are assessed on both synthetic and real-world data, showing that NL-WFA can lead to smaller model sizes and infer complex grammatical structures from data.
formal-languages  automata  rather-interesting  neural-networks  representation  statistics  modeling  to-write-about  consider:looking-to-see
january 2018 by Vaguery
Chopping a square into 7 triangles with almost the same area It's impossible...
It's impossible to divide a square into an odd number of triangles with the same area. This amazing result was proved by Paul Monsky in 1970, but I only heard about it today, from +David Eppstein, here on G+.

So it's impossible. But you can still try your best! These are the two best tries with 7 triangles and at most 8 vertices. These were discovered last year by Jean-Philippe Labbé, Günter Rote, and Günter Ziegler. They used a lot of clever math and programming to do this
mathematical-recreations  plane-geometry  nudge-targets  consider:looking-to-see
january 2018 by Vaguery
[1801.03534] Mechanical Computing Systems Using Only Links and Rotary Joints
A new paradigm for mechanical computing is demonstrated that requires only two basic parts, links and rotary joints. These basic parts are combined into two main higher level structures, locks and balances, and suffice to create all necessary combinatorial and sequential logic required for a Turing-complete computational system. While working systems have yet to be implemented using this new paradigm, the mechanical simplicity of the systems described may lend themselves better to, e.g., microfabrication, than previous mechanical computing designs. Additionally, simulations indicate that if molecular-scale implementations could be realized, they would be far more energy-efficient than conventional electronic computers.
out-of-the-box  computation  representation  not-quite-far-enough  nudge-targets  consider:looking-to-see
january 2018 by Vaguery
[1609.06349] A review of matrix scaling and Sinkhorn's normal form for matrices and positive maps
Given a nonnegative matrix A, can you find diagonal matrices D1, D2 such that D1AD2 is doubly stochastic? The answer to this question is known as Sinkhorn's theorem. It has been proved with a wide variety of methods, each presenting a variety of possible generalisations. Recently, generalisations such as to positive maps between matrix algebras have become more and more interesting for applications. This text gives a review of over 70 years of matrix scaling. The focus lies on the mathematical landscape surrounding the problem and its solution as well as the generalisation to positive maps and contains hardly any nontrivial unpublished results.
matrices  algorithms  to-write-about  nudge-targets  consider:looking-to-see  easily-checked
december 2017 by Vaguery
[1712.05390] Partisan gerrymandering with geographically compact districts
Bizarrely shaped voting districts are frequently lambasted as likely instances of gerrymandering. In order to systematically identify such instances, researchers have devised several tests for so-called geographic compactness (i.e., shape niceness). We demonstrate that under certain conditions, a party can gerrymander a competitive state into geographically compact districts to win an average of over 70% of the districts. Our results suggest that geometric features alone may fail to adequately combat partisan gerrymandering.
computational-geometry  politics  multiobjective-optimization  rather-interesting  to-write-about  consider:looking-to-see  nudge-targets  consider:stochastic-versions
december 2017 by Vaguery
per page:    204080120160

Copy this bookmark:

description:

tags: