**to-write-about**822

[1708.08691] Extremal solutions to some art gallery and terminal-pairability problems

5 days ago by Vaguery

The chosen tool of this thesis is an extremal type approach. The lesson drawn by the theorems proved in the thesis is that surprisingly small compromise is necessary on the efficacy of the solutions to make the approach work. The problems studied have several connections to other subjects and practical applications.

The first part of the thesis is concerned with orthogonal art galleries. A sharp extremal bound is proved on partitioning orthogonal polygons into at most 8-vertex polygons using established techniques in the field of art gallery problems. This fills in the gap between already known results for partitioning into at most 6- and 10-vertex orthogonal polygons.

Next, these techniques are further developed to prove a new type of extremal art gallery result. The novelty provided by this approach is that it establishes a connection between mobile and stationary guards. This theorem has strong computational consequences, in fact, it provides the basis for an 83-approximation algorithm for guarding orthogonal polygons with rectangular vision.

In the second part, the graph theoretical concept of terminal-pairability is studied in complete and complete grid graphs. Once again, the extremal approach is conductive to discovering efficient methods to solve the problem.

In the case of a complete base graph, the new demonstrated lower bound on the maximum degree of realizable demand graphs is 4 times higher than previous best results. The techniques developed are then used to solve the classical extremal edge number problem for the terminal-pairability problem in complete base graphs.

The complete grid base graph lies on the other end of the spectrum in terms density amongst path-pairable graphs. It is shown that complete grid graphs are relatively efficient in routing edge-disjoint paths.

computational-geometry
combinatorics
plane-geometry
extreme-values
rather-interesting
to-write-about
nudge-targets
consider:approximation
consider:looking-to-see
The first part of the thesis is concerned with orthogonal art galleries. A sharp extremal bound is proved on partitioning orthogonal polygons into at most 8-vertex polygons using established techniques in the field of art gallery problems. This fills in the gap between already known results for partitioning into at most 6- and 10-vertex orthogonal polygons.

Next, these techniques are further developed to prove a new type of extremal art gallery result. The novelty provided by this approach is that it establishes a connection between mobile and stationary guards. This theorem has strong computational consequences, in fact, it provides the basis for an 83-approximation algorithm for guarding orthogonal polygons with rectangular vision.

In the second part, the graph theoretical concept of terminal-pairability is studied in complete and complete grid graphs. Once again, the extremal approach is conductive to discovering efficient methods to solve the problem.

In the case of a complete base graph, the new demonstrated lower bound on the maximum degree of realizable demand graphs is 4 times higher than previous best results. The techniques developed are then used to solve the classical extremal edge number problem for the terminal-pairability problem in complete base graphs.

The complete grid base graph lies on the other end of the spectrum in terms density amongst path-pairable graphs. It is shown that complete grid graphs are relatively efficient in routing edge-disjoint paths.

5 days ago by Vaguery

[1710.03395] Efficient Dynamic Dictionary Matching with DAWGs and AC-automata

5 days ago by Vaguery

The dictionary matching is a task to find all occurrences of pattern strings in a set D (called a dictionary) on a text string T. The Aho-Corasick-automaton (AC-automaton) is a data structure which enables us to solve the dictionary matching problem in O(dlogσ) preprocessing time and O(nlogσ+occ) matching time, where d is the total length of the patterns in D, n is the length of the text, σ is the alphabet size, and occ is the total number of occurrences of all the patterns in the text. The dynamic dictionary matching is a variant where patterns may dynamically be inserted into and deleted from D. This problem is called semi-dynamic dictionary matching if only insertions are allowed. In this paper, we propose two efficient algorithms. For a pattern of length m, our first algorithm supports insertions in O(mlogσ+logd/loglogd) time and pattern matching in O(nlogσ+occ) time for the semi-dynamic setting and supports both insertions and deletions in O(σm+logd/loglogd) time and pattern matching in O(n(logd/loglogd+logσ)+occ(logd/loglogd)) time for the dynamic setting by some modifications. This algorithm is based on the directed acyclic word graph. Our second algorithm, which is based on the AC-automaton, supports insertions in O(mlogσ+uf+uo) time for the semi-dynamic setting and supports both insertions and deletions in O(σm+uf+uo) time for the dynamic setting, where uf and uo respectively denote the numbers of states of which the failure function and the output function need to be updated. This algorithm performs pattern matching in O(nlogσ+occ) time for both settings. Our algorithm achieves optimal update time for AC-automaton based methods, since any algorithm which explicitly maintains the AC-automaton requires Ω(uf+uo) update time.

algorithms
strings
rather-interesting
formal-languages
representation
rewriting-systems
to-write-about
consider:rediscovery
nudge-targets
5 days ago by Vaguery

[1706.01557] The minimum Manhattan distance and minimum jump of permutations

5 days ago by Vaguery

Let π be a permutation of {1,2,…,n}. If we identify a permutation with its graph, namely the set of n dots at positions (i,π(i)), it is natural to consider the minimum L1 (Manhattan) distance, d(π), between any pair of dots. The paper computes the expected value of d(π) when n→∞ and π is chosen uniformly, and settles a conjecture of Bevan, Homberger and Tenner (motivated by permutation patterns), showing that when d is fixed and n→∞, the probability that d(π)≥d+2 tends to e−d2−d.

The minimum jump 𝗆𝗃(π) of π, defined by 𝗆𝗃(π)=min1≤i≤n−1|π(i+1)−π(i)|, is another natural measure in this context. The paper computes the expected value of 𝗆𝗃(π), and the asymptotic probability that 𝗆𝗃(π)≥d+1 for any constant d.

combinatorics
probability-theory
permutations
rather-interesting
to-write-about
to-simulate
nudge-targets
consider:looking-to-see
The minimum jump 𝗆𝗃(π) of π, defined by 𝗆𝗃(π)=min1≤i≤n−1|π(i+1)−π(i)|, is another natural measure in this context. The paper computes the expected value of 𝗆𝗃(π), and the asymptotic probability that 𝗆𝗃(π)≥d+1 for any constant d.

5 days ago by Vaguery

[1711.01012] Genetic Policy Optimization

5 days ago by Vaguery

Genetic algorithms have been widely used in many practical optimization problems. Inspired by natural selection, operators, including mutation, crossover and selection, provide effective heuristics for search and black-box optimization. However, they have not been shown useful for deep reinforcement learning, possibly due to the catastrophic consequence of parameter crossovers of neural networks. Here, we present Genetic Policy Optimization (GPO), a new genetic algorithm for sample-efficient deep policy optimization. GPO uses imitation learning for policy crossover in the state space and applies policy gradient methods for mutation. Our experiments on Mujoco tasks show that GPO as a genetic algorithm is able to provide superior performance over the state-of-the-art policy gradient methods and achieves comparable or higher sample efficiency.

machine-learning
genetic-algorithm
representation
reinvented-wheel
to-understand
to-write-about
evolutionary-algorithms
5 days ago by Vaguery

[1610.08123] Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data

5 days ago by Vaguery

A challenge in training discriminative models like neural networks is obtaining enough labeled training data. Recent approaches use generative models to combine weak supervision sources, like user-defined heuristics or knowledge bases, to label training data. Prior work has explored learning accuracies for these sources even without ground truth labels, but they assume that a single accuracy parameter is sufficient to model the behavior of these sources over the entire training set. In particular, they fail to model latent subsets in the training data in which the supervision sources perform differently than on average. We present Socratic learning, a paradigm that uses feedback from a corresponding discriminative model to automatically identify these subsets and augments the structure of the generative model accordingly. Experimentally, we show that without any ground truth labels, the augmented generative model reduces error by up to 56.06% for a relation extraction task compared to a state-of-the-art weak supervision technique that utilizes generative models.

machine-learning
algorithms
performance-space-analysis
coevolution
to-write-about
system-of-professions
reinvented-wheels
5 days ago by Vaguery

[1710.06993] Improved Search in Hamming Space using Deep Multi-Index Hashing

5 days ago by Vaguery

Similarity-preserving hashing is a widely-used method for nearest neighbour search in large-scale image retrieval tasks. There has been considerable research on generating efficient image representation via the deep-network-based hashing methods. However, the issue of efficient searching in the deep representation space remains largely unsolved. To this end, we propose a simple yet efficient deep-network-based multi-index hashing method for simultaneously learning the powerful image representation and the efficient searching. To achieve these two goals, we introduce the multi-index hashing (MIH) mechanism into the proposed deep architecture, which divides the binary codes into multiple substrings. Due to the non-uniformly distributed codes will result in inefficiency searching, we add the two balanced constraints at feature-level and instance-level, respectively. Extensive evaluations on several benchmark image retrieval datasets show that the learned balanced binary codes bring dramatic speedups and achieve comparable performance over the existing baselines.

metrics
image-processing
deep-learning
performance-space-analysis
feature-construction
machine-learning
rather-interesting
similarity-measures
to-write-about
5 days ago by Vaguery

[1610.06934] The K Shortest Paths Problem with Application to Routing

5 days ago by Vaguery

Due to the computational complexity of finding almost shortest simple paths, we propose that identifying a larger collection of (nonbacktracking) paths is more efficient than finding almost shortest simple paths on positively weighted real-world networks. First, we present an easy to implement O(mlogm+kL) solution for finding all (nonbacktracking) paths with bounded length D between two arbitrary nodes on a positively weighted graph, where L is an upperbound for the number of nodes in any of the k outputted paths. Subsequently, we illustrate that for undirected Chung-Lu random graphs, the ratio between the number of nonbacktracking and simple paths asymptotically approaches 1 with high probability for a wide range of parameters. We then consider an application to the almost shortest paths algorithm to measure path diversity for internet routing in a snapshot of the Autonomous System graph subject to an edge deletion process.

planning
data-structures
algorithms
rather-interesting
operations-research
graph-theory
to-write-about
nudge-targets
consider:looking-to-see
5 days ago by Vaguery

[1711.03172] Curve Reconstruction via the Global Statistics of Natural Curves

5 days ago by Vaguery

Reconstructing the missing parts of a curve has been the subject of much computational research, with applications in image inpainting, object synthesis, etc. Different approaches for solving that problem are typically based on processes that seek visually pleasing or perceptually plausible completions. In this work we focus on reconstructing the underlying physically likely shape by utilizing the global statistics of natural curves. More specifically, we develop a reconstruction model that seeks the mean physical curve for a given inducer configuration. This simple model is both straightforward to compute and it is receptive to diverse additional information, but it requires enough samples for all curve configurations, a practical requirement that limits its effective utilization. To address this practical issue we explore and exploit statistical geometrical properties of natural curves, and in particular, we show that in many cases the mean curve is scale invariant and often times it is extensible. This, in turn, allows to boost the number of examples and thus the robustness of the statistics and its applicability. The reconstruction results are not only more physically plausible but they also lead to important insights on the reconstruction problem, including an elegant explanation why certain inducer configurations are more likely to yield consistent perceptual completions than others.

inference
computer-vision
rather-interesting
algorithms
nudge-targets
consider:looking-to-see
consider:representation
performance-measure
to-write-about
5 days ago by Vaguery

[1705.01887] Streaming for Aibohphobes: Longest Palindrome with Mismatches

5 days ago by Vaguery

A palindrome is a string that reads the same as its reverse, such as "aibohphobia" (fear of palindromes). Given an integer d>0, a d-near-palindrome is a string of Hamming distance at most d from its reverse. We study the natural problem of identifying a longest d-near-palindrome in data streams. The problem is relevant to the analysis of DNA databases, and to the task of repairing recursive structures in documents such as XML and JSON. We present an algorithm that returns a d-near-palindrome whose length is within a multiplicative (1+ϵ)-factor of the longest d-near-palindrome. Our algorithm also returns the set of mismatched indices of the d-near-palindrome, using (dlog7nϵlog(1+ϵ)) bits of space, and (dlog6nϵlog(1+ϵ)) update time per arriving symbol. We show that Ω(dlogn) space is necessary for estimating the length of longest d-near-palindromes with high probability. We further obtain an additive-error approximation algorithm and a comparable lower bound, as well as an exact two-pass algorithm that solves the longest d-near-palindrome problem using (d2n‾√log6n) bits of space.

strings
computational-complexity
algorithms
probability-theory
inference
online-learning
rather-interesting
to-write-about
nudge-targets
consider:looking-to-see
consider:representation
5 days ago by Vaguery

[1611.07769] The many facets of community detection in complex networks

5 days ago by Vaguery

Community detection, the decomposition of a graph into essential building blocks, has been a core research topic in network science over the past years. Since a precise notion of what constitutes a community has remained evasive, community detection algorithms have often been compared on benchmark graphs with a particular form of assortative community structure and classified based on the mathematical techniques they employ. However, this comparison can be misleading because apparent similarities in their mathematical machinery can disguise different goals and reasons for why we want to employ community detection in the first place. Here we provide a focused review of these different motivations that underpin community detection. This problem-driven classification is useful in applied network science, where it is important to select an appropriate algorithm for the given purpose. Moreover, highlighting the different facets of community detection also delineates the many lines of research and points out open directions and avenues for future research.

community-detection
social-networks
philosophy-of-science
define-your-terms
review
to-write-about
5 days ago by Vaguery

[1702.01522] Inverse statistical problems: from the inverse Ising problem to data science

5 days ago by Vaguery

Inverse problems in statistical physics are motivated by the challenges of `big data' in different fields, in particular high-throughput experiments in biology. In inverse problems, the usual procedure of statistical physics needs to be reversed: Instead of calculating observables on the basis of model parameters, we seek to infer parameters of a model based on observations. In this review, we focus on the inverse Ising problem and closely related problems, namely how to infer the coupling strengths between spins given observed spin correlations, magnetisations, or other data. We review applications of the inverse Ising problem, including the reconstruction of neural connections, protein structure determination, and the inference of gene regulatory networks. For the inverse Ising problem in equilibrium, a number of controlled and uncontrolled approximate solutions have been developed in the statistical mechanics community. A particularly strong method, pseudolikelihood, stems from statistics. We also review the inverse Ising problem in the non-equilibrium case, where the model parameters must be reconstructed based on non-equilibrium statistics.

data-science
statistics
inverse-problems
complexology
rather-interesting
inference
to-write-about
review
to-simulate
philosophy-of-science
5 days ago by Vaguery

[1709.07485] The Covering Path Problem on a Grid

5 days ago by Vaguery

This paper introduces the covering path problem on a grid (CPPG) which finds the cost-minimizing path connecting a subset of points in a grid such that each point is within a predetermined distance of a point from the chosen subset. We leverage the geometric properties of the grid graph which captures the road network structure in many transportation problems, including our motivating setting of school bus routing. As defined in this paper, the CPPG is a bi-objective optimization problem comprised of one cost term related to path length and one cost term related to stop count. We develop a trade-off constraint which quantifies the trade-off between path length and stop count and provides a lower bound for the bi-objective optimization problem. We introduce simple construction techniques to provide feasible paths that match the lower bound within a constant factor. Importantly, this solution approach uses transformations of the general CPPG to either a discrete CPPG or continuous CPPG based on the value of the coverage radius. For both the discrete and continuous versions, we provide fast constant-factor approximations, thus solving the general CPPG.

operations-research
planning
multiobjective-optimization
rather-interesting
to-write-about
5 days ago by Vaguery

plaidml/plaidml: PlaidML is a framework for making deep learning work everywhere.

10 days ago by Vaguery

PlaidML is a multi-language acceleration framework that:

Enables practitioners to deploy high-performance neural nets on any device

Allows hardware developers to quickly integrate with high-level frameworks

Allows framework developers to easily add support for many kinds of hardware

For background and early benchmarks see our blog post announcing the release. PlaidML is under active development and should be thought of as early alpha quality.

machine-learning
library
software-development
to-learn
to-write-about
Enables practitioners to deploy high-performance neural nets on any device

Allows hardware developers to quickly integrate with high-level frameworks

Allows framework developers to easily add support for many kinds of hardware

For background and early benchmarks see our blog post announcing the release. PlaidML is under active development and should be thought of as early alpha quality.

10 days ago by Vaguery

[1710.05183] Inferring Mesoscale Models of Neural Computation

10 days ago by Vaguery

Recent years have seen dramatic progress in the development of techniques for measuring the activity and connectivity of large populations of neurons in the brain. However, as these techniques grow ever more powerful---allowing us to even contemplate measuring every neuron in entire brain---a new problem arises: how do we make sense of the mountains of data that these techniques produce? Here, we argue that the time is ripe for building an intermediate or "mesoscale" computational theory that can bridge between single-cell (microscale) accounts of neural function and behavioral (macroscale) accounts of animal cognition and environmental complexity. Just as digital accounts of computation in conventional computers abstract away the non-essential dynamics of the analog circuits that implementing gates and registers, so too a computational account of animal cognition can afford to abstract from the non-essential dynamics of neurons. We argue that the geometry of neural circuits is essential in explaining the computational limitations and technological innovations inherent in biological information processing. We propose a blueprint for how to employ tools from modern machine learning to automatically infer a satisfying mesoscale account of neural computation that combines functional and structural data, with an emphasis on learning and exploiting regularity and repeating motifs in neuronal circuits. Rather than suggest a specific theory, we present a new class of scientific instruments that can enable neuroscientists to design, propose, implement and test mesoscale theories of neural computation.

dynamical-systems
machine-learning
deep-learning
representation
temporal-models
rather-interesting
inference
to-write-about
10 days ago by Vaguery

[1512.00507] Beyond Aztec Castles: Toric Cascades in the $dP_3$ Quiver

10 days ago by Vaguery

Given one of an infinite class of supersymmetric quiver gauge theories, string theorists can associate a corresponding toric variety (which is a Calabi-Yau 3-fold) as well as an associated combinatorial model known as a brane tiling. In combinatorial language, a brane tiling is a bipartite graph on a torus and its perfect matchings are of interest to both combinatorialists and physicists alike. A cluster algebra may also be associated to such quivers and in this paper we study the generators of this algebra, known as cluster variables, for the quiver associated to the cone over the del Pezzo surface dP3. In particular, mutation sequences involving mutations exclusively at vertices with two in-coming arrows and two out-going arrows are referred to as toric cascades in the string theory literature. Such toric cascades give rise to interesting discrete integrable systems on the level of cluster variable dynamics. We provide an explicit algebraic formula for all cluster variables which are reachable by toric cascades as well as a combinatorial interpretation involving perfect matchings of subgraphs of the dP3 brane tiling for these formulas in most cases.

tiling
domino-tiling
enumeration
combinatorics
to-write-about
to-simulate
review
rather-interesting
10 days ago by Vaguery

[1504.00303] A Generalization of Aztec Dragons

10 days ago by Vaguery

Aztec dragons are lattice regions first introduced by James Propp, which have the number of tilings given by a power of 2. This family of regions has been investigated further by a number of authors. In this paper, we consider a generalization of the Aztec dragons to two new families of 6-sided regions. By using Kuo's graphical condensation method, we prove that the tilings of the new regions are always enumerated by powers of 2 and 3.

tiling
combinatorics
enumeration
out-of-the-box
to-write-about
to-simulate
10 days ago by Vaguery

[1504.00291] On the numbers of perfect matchings of trimmed Aztec rectangles

10 days ago by Vaguery

We consider several new families of graphs obtained from Aztec rectangle and augmented Aztec rectangle graphs by trimming two opposite corners. We prove that the perfect matchings of these new graphs are enumerated by powers of 2, 3, 5, and 11. The result yields a proof of a conjectured posed by Ciucu. In addition, we reveal a hidden relation between our graphs and the hexagonal dungeons introduced by Blum.

combinatorics
Aztec-diamond
domino-tiling
enumeration
to-write-about
to-simulate
10 days ago by Vaguery

[1411.0146] Double Aztec Rectangles

10 days ago by Vaguery

We investigate the connection between lozenge tilings and domino tilings by introducing a new family of regions obtained by attaching two different Aztec rectangles. We prove a simple product formula for the generating functions of the tilings of the new regions, which involves the statistics as in the Aztec diamond theorem (Elkies, Kuperberg, Larsen, and Propp, J. Algebraic Combin. 1992). Moreover, we consider the connection between the generating function and MacMahon's q-enumeration of plane partitions fitting in a given box

combinatorics
enumeration
tiling
domino-tiling
Aztec-diamond
to-write-about
to-simulate
10 days ago by Vaguery

[1403.4493] Enumeration of tilings of quartered Aztec rectangles

10 days ago by Vaguery

We generalize a theorem of W. Jockusch and J. Propp on quartered Aztec diamonds by enumerating the tilings of quartered Aztec rectangles. We use subgraph replacement method to transform the dual graph of a quartered Aztec rectangle to the dual graph of a quartered lozenge hexagon, and then use Lindstr\"{o}m-Gessel-Viennot methodology to find the number of tilings of a quartered lozenge hexagon.

combinatorics
enumeration
Aztec-diamond
domino-tiling
to-write-about
to-simulate
10 days ago by Vaguery

[1711.02818] Enumeration of lozenge tilings of a hexagon with a shamrock missing on the symmetry axis

10 days ago by Vaguery

In their paper about a dual of MacMahon's classical theorem on plane partitions, Ciucu and Krattenthaler proved a closed form product formula for the tiling number of a hexagon with a "shamrock", a union of four adjacent triangles, removed in the center (Proc. Natl. Acad. Sci. USA 2013). Lai later presented a q-enumeration for lozenge tilings of a hexagon with a shamrock removed from the boundary (European J. Combin. 2017). It appears that the above are the only two positions of the shamrock hole that yield nice tiling enumerations. In this paper, we show that in the case of symmetric hexagons, we always have a simple product formula for the number of tilings when removing a shamrock at any position along the symmetry axis. Our result also generalizes Eisenk\"olbl's related work about lozenge tilings of a hexagon with two unit triangles missing on the symmetry axis (Electron. J. Combin. 1999).

tiling
domino-tiling
rather-interesting
combinatorics
enumeration
the-mangle-in-practice
representation
to-write-about
consider:the-convoluted-approach
10 days ago by Vaguery

**related tags**

Copy this bookmark: