to-write-about   822

« earlier    

[1708.08691] Extremal solutions to some art gallery and terminal-pairability problems
The chosen tool of this thesis is an extremal type approach. The lesson drawn by the theorems proved in the thesis is that surprisingly small compromise is necessary on the efficacy of the solutions to make the approach work. The problems studied have several connections to other subjects and practical applications.
The first part of the thesis is concerned with orthogonal art galleries. A sharp extremal bound is proved on partitioning orthogonal polygons into at most 8-vertex polygons using established techniques in the field of art gallery problems. This fills in the gap between already known results for partitioning into at most 6- and 10-vertex orthogonal polygons.
Next, these techniques are further developed to prove a new type of extremal art gallery result. The novelty provided by this approach is that it establishes a connection between mobile and stationary guards. This theorem has strong computational consequences, in fact, it provides the basis for an 83-approximation algorithm for guarding orthogonal polygons with rectangular vision.
In the second part, the graph theoretical concept of terminal-pairability is studied in complete and complete grid graphs. Once again, the extremal approach is conductive to discovering efficient methods to solve the problem.
In the case of a complete base graph, the new demonstrated lower bound on the maximum degree of realizable demand graphs is 4 times higher than previous best results. The techniques developed are then used to solve the classical extremal edge number problem for the terminal-pairability problem in complete base graphs.
The complete grid base graph lies on the other end of the spectrum in terms density amongst path-pairable graphs. It is shown that complete grid graphs are relatively efficient in routing edge-disjoint paths.
computational-geometry  combinatorics  plane-geometry  extreme-values  rather-interesting  to-write-about  nudge-targets  consider:approximation  consider:looking-to-see 
5 days ago by Vaguery
[1710.03395] Efficient Dynamic Dictionary Matching with DAWGs and AC-automata
The dictionary matching is a task to find all occurrences of pattern strings in a set D (called a dictionary) on a text string T. The Aho-Corasick-automaton (AC-automaton) is a data structure which enables us to solve the dictionary matching problem in O(dlogσ) preprocessing time and O(nlogσ+occ) matching time, where d is the total length of the patterns in D, n is the length of the text, σ is the alphabet size, and occ is the total number of occurrences of all the patterns in the text. The dynamic dictionary matching is a variant where patterns may dynamically be inserted into and deleted from D. This problem is called semi-dynamic dictionary matching if only insertions are allowed. In this paper, we propose two efficient algorithms. For a pattern of length m, our first algorithm supports insertions in O(mlogσ+logd/loglogd) time and pattern matching in O(nlogσ+occ) time for the semi-dynamic setting and supports both insertions and deletions in O(σm+logd/loglogd) time and pattern matching in O(n(logd/loglogd+logσ)+occ(logd/loglogd)) time for the dynamic setting by some modifications. This algorithm is based on the directed acyclic word graph. Our second algorithm, which is based on the AC-automaton, supports insertions in O(mlogσ+uf+uo) time for the semi-dynamic setting and supports both insertions and deletions in O(σm+uf+uo) time for the dynamic setting, where uf and uo respectively denote the numbers of states of which the failure function and the output function need to be updated. This algorithm performs pattern matching in O(nlogσ+occ) time for both settings. Our algorithm achieves optimal update time for AC-automaton based methods, since any algorithm which explicitly maintains the AC-automaton requires Ω(uf+uo) update time.
algorithms  strings  rather-interesting  formal-languages  representation  rewriting-systems  to-write-about  consider:rediscovery  nudge-targets 
5 days ago by Vaguery
[1706.01557] The minimum Manhattan distance and minimum jump of permutations
Let π be a permutation of {1,2,…,n}. If we identify a permutation with its graph, namely the set of n dots at positions (i,π(i)), it is natural to consider the minimum L1 (Manhattan) distance, d(π), between any pair of dots. The paper computes the expected value of d(π) when n→∞ and π is chosen uniformly, and settles a conjecture of Bevan, Homberger and Tenner (motivated by permutation patterns), showing that when d is fixed and n→∞, the probability that d(π)≥d+2 tends to e−d2−d.
The minimum jump 𝗆𝗃(π) of π, defined by 𝗆𝗃(π)=min1≤i≤n−1|π(i+1)−π(i)|, is another natural measure in this context. The paper computes the expected value of 𝗆𝗃(π), and the asymptotic probability that 𝗆𝗃(π)≥d+1 for any constant d.
combinatorics  probability-theory  permutations  rather-interesting  to-write-about  to-simulate  nudge-targets  consider:looking-to-see 
5 days ago by Vaguery
[1711.01012] Genetic Policy Optimization
Genetic algorithms have been widely used in many practical optimization problems. Inspired by natural selection, operators, including mutation, crossover and selection, provide effective heuristics for search and black-box optimization. However, they have not been shown useful for deep reinforcement learning, possibly due to the catastrophic consequence of parameter crossovers of neural networks. Here, we present Genetic Policy Optimization (GPO), a new genetic algorithm for sample-efficient deep policy optimization. GPO uses imitation learning for policy crossover in the state space and applies policy gradient methods for mutation. Our experiments on Mujoco tasks show that GPO as a genetic algorithm is able to provide superior performance over the state-of-the-art policy gradient methods and achieves comparable or higher sample efficiency.
machine-learning  genetic-algorithm  representation  reinvented-wheel  to-understand  to-write-about  evolutionary-algorithms 
5 days ago by Vaguery
[1610.08123] Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data
A challenge in training discriminative models like neural networks is obtaining enough labeled training data. Recent approaches use generative models to combine weak supervision sources, like user-defined heuristics or knowledge bases, to label training data. Prior work has explored learning accuracies for these sources even without ground truth labels, but they assume that a single accuracy parameter is sufficient to model the behavior of these sources over the entire training set. In particular, they fail to model latent subsets in the training data in which the supervision sources perform differently than on average. We present Socratic learning, a paradigm that uses feedback from a corresponding discriminative model to automatically identify these subsets and augments the structure of the generative model accordingly. Experimentally, we show that without any ground truth labels, the augmented generative model reduces error by up to 56.06% for a relation extraction task compared to a state-of-the-art weak supervision technique that utilizes generative models.
machine-learning  algorithms  performance-space-analysis  coevolution  to-write-about  system-of-professions  reinvented-wheels 
5 days ago by Vaguery
[1710.06993] Improved Search in Hamming Space using Deep Multi-Index Hashing
Similarity-preserving hashing is a widely-used method for nearest neighbour search in large-scale image retrieval tasks. There has been considerable research on generating efficient image representation via the deep-network-based hashing methods. However, the issue of efficient searching in the deep representation space remains largely unsolved. To this end, we propose a simple yet efficient deep-network-based multi-index hashing method for simultaneously learning the powerful image representation and the efficient searching. To achieve these two goals, we introduce the multi-index hashing (MIH) mechanism into the proposed deep architecture, which divides the binary codes into multiple substrings. Due to the non-uniformly distributed codes will result in inefficiency searching, we add the two balanced constraints at feature-level and instance-level, respectively. Extensive evaluations on several benchmark image retrieval datasets show that the learned balanced binary codes bring dramatic speedups and achieve comparable performance over the existing baselines.
metrics  image-processing  deep-learning  performance-space-analysis  feature-construction  machine-learning  rather-interesting  similarity-measures  to-write-about 
5 days ago by Vaguery
[1610.06934] The K Shortest Paths Problem with Application to Routing
Due to the computational complexity of finding almost shortest simple paths, we propose that identifying a larger collection of (nonbacktracking) paths is more efficient than finding almost shortest simple paths on positively weighted real-world networks. First, we present an easy to implement O(mlogm+kL) solution for finding all (nonbacktracking) paths with bounded length D between two arbitrary nodes on a positively weighted graph, where L is an upperbound for the number of nodes in any of the k outputted paths. Subsequently, we illustrate that for undirected Chung-Lu random graphs, the ratio between the number of nonbacktracking and simple paths asymptotically approaches 1 with high probability for a wide range of parameters. We then consider an application to the almost shortest paths algorithm to measure path diversity for internet routing in a snapshot of the Autonomous System graph subject to an edge deletion process.
planning  data-structures  algorithms  rather-interesting  operations-research  graph-theory  to-write-about  nudge-targets  consider:looking-to-see 
5 days ago by Vaguery
[1711.03172] Curve Reconstruction via the Global Statistics of Natural Curves
Reconstructing the missing parts of a curve has been the subject of much computational research, with applications in image inpainting, object synthesis, etc. Different approaches for solving that problem are typically based on processes that seek visually pleasing or perceptually plausible completions. In this work we focus on reconstructing the underlying physically likely shape by utilizing the global statistics of natural curves. More specifically, we develop a reconstruction model that seeks the mean physical curve for a given inducer configuration. This simple model is both straightforward to compute and it is receptive to diverse additional information, but it requires enough samples for all curve configurations, a practical requirement that limits its effective utilization. To address this practical issue we explore and exploit statistical geometrical properties of natural curves, and in particular, we show that in many cases the mean curve is scale invariant and often times it is extensible. This, in turn, allows to boost the number of examples and thus the robustness of the statistics and its applicability. The reconstruction results are not only more physically plausible but they also lead to important insights on the reconstruction problem, including an elegant explanation why certain inducer configurations are more likely to yield consistent perceptual completions than others.
inference  computer-vision  rather-interesting  algorithms  nudge-targets  consider:looking-to-see  consider:representation  performance-measure  to-write-about 
5 days ago by Vaguery
[1705.01887] Streaming for Aibohphobes: Longest Palindrome with Mismatches
A palindrome is a string that reads the same as its reverse, such as "aibohphobia" (fear of palindromes). Given an integer d>0, a d-near-palindrome is a string of Hamming distance at most d from its reverse. We study the natural problem of identifying a longest d-near-palindrome in data streams. The problem is relevant to the analysis of DNA databases, and to the task of repairing recursive structures in documents such as XML and JSON. We present an algorithm that returns a d-near-palindrome whose length is within a multiplicative (1+ϵ)-factor of the longest d-near-palindrome. Our algorithm also returns the set of mismatched indices of the d-near-palindrome, using (dlog7nϵlog(1+ϵ)) bits of space, and (dlog6nϵlog(1+ϵ)) update time per arriving symbol. We show that Ω(dlogn) space is necessary for estimating the length of longest d-near-palindromes with high probability. We further obtain an additive-error approximation algorithm and a comparable lower bound, as well as an exact two-pass algorithm that solves the longest d-near-palindrome problem using (d2n‾√log6n) bits of space.
strings  computational-complexity  algorithms  probability-theory  inference  online-learning  rather-interesting  to-write-about  nudge-targets  consider:looking-to-see  consider:representation 
5 days ago by Vaguery
[1611.07769] The many facets of community detection in complex networks
Community detection, the decomposition of a graph into essential building blocks, has been a core research topic in network science over the past years. Since a precise notion of what constitutes a community has remained evasive, community detection algorithms have often been compared on benchmark graphs with a particular form of assortative community structure and classified based on the mathematical techniques they employ. However, this comparison can be misleading because apparent similarities in their mathematical machinery can disguise different goals and reasons for why we want to employ community detection in the first place. Here we provide a focused review of these different motivations that underpin community detection. This problem-driven classification is useful in applied network science, where it is important to select an appropriate algorithm for the given purpose. Moreover, highlighting the different facets of community detection also delineates the many lines of research and points out open directions and avenues for future research.
community-detection  social-networks  philosophy-of-science  define-your-terms  review  to-write-about 
5 days ago by Vaguery
[1702.01522] Inverse statistical problems: from the inverse Ising problem to data science
Inverse problems in statistical physics are motivated by the challenges of `big data' in different fields, in particular high-throughput experiments in biology. In inverse problems, the usual procedure of statistical physics needs to be reversed: Instead of calculating observables on the basis of model parameters, we seek to infer parameters of a model based on observations. In this review, we focus on the inverse Ising problem and closely related problems, namely how to infer the coupling strengths between spins given observed spin correlations, magnetisations, or other data. We review applications of the inverse Ising problem, including the reconstruction of neural connections, protein structure determination, and the inference of gene regulatory networks. For the inverse Ising problem in equilibrium, a number of controlled and uncontrolled approximate solutions have been developed in the statistical mechanics community. A particularly strong method, pseudolikelihood, stems from statistics. We also review the inverse Ising problem in the non-equilibrium case, where the model parameters must be reconstructed based on non-equilibrium statistics.
data-science  statistics  inverse-problems  complexology  rather-interesting  inference  to-write-about  review  to-simulate  philosophy-of-science 
5 days ago by Vaguery
[1709.07485] The Covering Path Problem on a Grid
This paper introduces the covering path problem on a grid (CPPG) which finds the cost-minimizing path connecting a subset of points in a grid such that each point is within a predetermined distance of a point from the chosen subset. We leverage the geometric properties of the grid graph which captures the road network structure in many transportation problems, including our motivating setting of school bus routing. As defined in this paper, the CPPG is a bi-objective optimization problem comprised of one cost term related to path length and one cost term related to stop count. We develop a trade-off constraint which quantifies the trade-off between path length and stop count and provides a lower bound for the bi-objective optimization problem. We introduce simple construction techniques to provide feasible paths that match the lower bound within a constant factor. Importantly, this solution approach uses transformations of the general CPPG to either a discrete CPPG or continuous CPPG based on the value of the coverage radius. For both the discrete and continuous versions, we provide fast constant-factor approximations, thus solving the general CPPG.
operations-research  planning  multiobjective-optimization  rather-interesting  to-write-about 
5 days ago by Vaguery
plaidml/plaidml: PlaidML is a framework for making deep learning work everywhere.
PlaidML is a multi-language acceleration framework that:

Enables practitioners to deploy high-performance neural nets on any device
Allows hardware developers to quickly integrate with high-level frameworks
Allows framework developers to easily add support for many kinds of hardware
For background and early benchmarks see our blog post announcing the release. PlaidML is under active development and should be thought of as early alpha quality.
machine-learning  library  software-development  to-learn  to-write-about 
10 days ago by Vaguery
[1710.05183] Inferring Mesoscale Models of Neural Computation
Recent years have seen dramatic progress in the development of techniques for measuring the activity and connectivity of large populations of neurons in the brain. However, as these techniques grow ever more powerful---allowing us to even contemplate measuring every neuron in entire brain---a new problem arises: how do we make sense of the mountains of data that these techniques produce? Here, we argue that the time is ripe for building an intermediate or "mesoscale" computational theory that can bridge between single-cell (microscale) accounts of neural function and behavioral (macroscale) accounts of animal cognition and environmental complexity. Just as digital accounts of computation in conventional computers abstract away the non-essential dynamics of the analog circuits that implementing gates and registers, so too a computational account of animal cognition can afford to abstract from the non-essential dynamics of neurons. We argue that the geometry of neural circuits is essential in explaining the computational limitations and technological innovations inherent in biological information processing. We propose a blueprint for how to employ tools from modern machine learning to automatically infer a satisfying mesoscale account of neural computation that combines functional and structural data, with an emphasis on learning and exploiting regularity and repeating motifs in neuronal circuits. Rather than suggest a specific theory, we present a new class of scientific instruments that can enable neuroscientists to design, propose, implement and test mesoscale theories of neural computation.
dynamical-systems  machine-learning  deep-learning  representation  temporal-models  rather-interesting  inference  to-write-about 
10 days ago by Vaguery
[1512.00507] Beyond Aztec Castles: Toric Cascades in the $dP_3$ Quiver
Given one of an infinite class of supersymmetric quiver gauge theories, string theorists can associate a corresponding toric variety (which is a Calabi-Yau 3-fold) as well as an associated combinatorial model known as a brane tiling. In combinatorial language, a brane tiling is a bipartite graph on a torus and its perfect matchings are of interest to both combinatorialists and physicists alike. A cluster algebra may also be associated to such quivers and in this paper we study the generators of this algebra, known as cluster variables, for the quiver associated to the cone over the del Pezzo surface dP3. In particular, mutation sequences involving mutations exclusively at vertices with two in-coming arrows and two out-going arrows are referred to as toric cascades in the string theory literature. Such toric cascades give rise to interesting discrete integrable systems on the level of cluster variable dynamics. We provide an explicit algebraic formula for all cluster variables which are reachable by toric cascades as well as a combinatorial interpretation involving perfect matchings of subgraphs of the dP3 brane tiling for these formulas in most cases.
tiling  domino-tiling  enumeration  combinatorics  to-write-about  to-simulate  review  rather-interesting 
10 days ago by Vaguery
[1504.00303] A Generalization of Aztec Dragons
Aztec dragons are lattice regions first introduced by James Propp, which have the number of tilings given by a power of 2. This family of regions has been investigated further by a number of authors. In this paper, we consider a generalization of the Aztec dragons to two new families of 6-sided regions. By using Kuo's graphical condensation method, we prove that the tilings of the new regions are always enumerated by powers of 2 and 3.
tiling  combinatorics  enumeration  out-of-the-box  to-write-about  to-simulate 
10 days ago by Vaguery
[1504.00291] On the numbers of perfect matchings of trimmed Aztec rectangles
We consider several new families of graphs obtained from Aztec rectangle and augmented Aztec rectangle graphs by trimming two opposite corners. We prove that the perfect matchings of these new graphs are enumerated by powers of 2, 3, 5, and 11. The result yields a proof of a conjectured posed by Ciucu. In addition, we reveal a hidden relation between our graphs and the hexagonal dungeons introduced by Blum.
combinatorics  Aztec-diamond  domino-tiling  enumeration  to-write-about  to-simulate 
10 days ago by Vaguery
[1411.0146] Double Aztec Rectangles
We investigate the connection between lozenge tilings and domino tilings by introducing a new family of regions obtained by attaching two different Aztec rectangles. We prove a simple product formula for the generating functions of the tilings of the new regions, which involves the statistics as in the Aztec diamond theorem (Elkies, Kuperberg, Larsen, and Propp, J. Algebraic Combin. 1992). Moreover, we consider the connection between the generating function and MacMahon's q-enumeration of plane partitions fitting in a given box
combinatorics  enumeration  tiling  domino-tiling  Aztec-diamond  to-write-about  to-simulate 
10 days ago by Vaguery
[1403.4493] Enumeration of tilings of quartered Aztec rectangles
We generalize a theorem of W. Jockusch and J. Propp on quartered Aztec diamonds by enumerating the tilings of quartered Aztec rectangles. We use subgraph replacement method to transform the dual graph of a quartered Aztec rectangle to the dual graph of a quartered lozenge hexagon, and then use Lindstr\"{o}m-Gessel-Viennot methodology to find the number of tilings of a quartered lozenge hexagon.
combinatorics  enumeration  Aztec-diamond  domino-tiling  to-write-about  to-simulate 
10 days ago by Vaguery
[1711.02818] Enumeration of lozenge tilings of a hexagon with a shamrock missing on the symmetry axis
In their paper about a dual of MacMahon's classical theorem on plane partitions, Ciucu and Krattenthaler proved a closed form product formula for the tiling number of a hexagon with a "shamrock", a union of four adjacent triangles, removed in the center (Proc. Natl. Acad. Sci. USA 2013). Lai later presented a q-enumeration for lozenge tilings of a hexagon with a shamrock removed from the boundary (European J. Combin. 2017). It appears that the above are the only two positions of the shamrock hole that yield nice tiling enumerations. In this paper, we show that in the case of symmetric hexagons, we always have a simple product formula for the number of tilings when removing a shamrock at any position along the symmetry axis. Our result also generalizes Eisenk\"olbl's related work about lozenge tilings of a hexagon with two unit triangles missing on the symmetry axis (Electron. J. Combin. 1999).
tiling  domino-tiling  rather-interesting  combinatorics  enumeration  the-mangle-in-practice  representation  to-write-about  consider:the-convoluted-approach 
10 days ago by Vaguery

« earlier    

related tags

academia  academic-culture  agent-based  algorithms  alternative-computational-models  aperiodic-tiling  approximation  artificial-intelligence  artificial-life  asynchronous  automata  aztec-diamond  bin-packing  bioinformatics  biological-engineering  capitalism-as-a-bug  cellular-automata  chaos  classification  clustering  coevolution  collective-behavior  coloring  combinatorics  community-detection  community-formation  complexology  computation  computational-complexity  computational-geometry  computer-science  computer-vision  conferences  consider:algorithms  consider:approximation  consider:benford's-law?  consider:classification  consider:comparing-to-random  consider:coordination  consider:evolving-tools  consider:feature-discovery  consider:find-shape-for-points  consider:fitness  consider:game-applications  consider:gp-applications  consider:hardest-problems  consider:heuristics  consider:including-operators  consider:inverse-problem  consider:lexicase  consider:looking-to-see  consider:multiobjective-approach  consider:performance-measures  consider:rediscovery  consider:representation  consider:robustness  consider:simulation  consider:splinegons  consider:stress-testing  consider:the-convoluted-approach  construction  control-theory  criticism  cultural-dynamics  data-science  data-structures  deep-learning  define-your-terms  depression  disintermediation-in-action  distance  distributed-processing  domino-tiling  dynamical-systems  engineering-criticism  engineering-design  enumeration  epigenetics  evolutionary-algorithms  evolutionary-economics  experimental-design  extreme-values  feature-construction  feature-extraction  formal-languages  game-theory  gene-regulatory-networks  generative-art  generative-models  genetic-algorithm  geometry  graph-layout  graph-theory  hard-problems  hardware  have-watched  heuristics  history  hypergraphs  i-agree  image-processing  inference  information-theory  integer-programming  inverse-problems  ipd  library  life-o'-the-mind  looking-to-see  ludics  machine-learning  management  manufacturability  matchings  materials-science  math  mathematical-recreations  mathematics  matrices  maybe-not-the-representation-i'd-start-from  metrics  mindfulness  modeling-is-not-mathematics  modeling  multiobjective-optimization  music  natural-language-processing  neural-networks  nonlinear-dynamics  nudge-targets  nudge  number-theory  numerical-methods  online-learning  open-problems  open-questions  operations-research  optimization  out-of-the-box  p-systems  paper-folding  partial-ordering  pattern-matching  patterns  pedagogy  performance-measure  performance-space-analysis  permutations  philosophy-of-engineering  philosophy-of-science  philosophy  plane-geometry  planning  political-economy  prediction  probability-theory  proceedings  programming-language  programming  proof  purdy-pitchers  puzzles  python  quotes  rather-interesting  reinvented-wheel  reinvented-wheels  representation  review  rewriting-systems  rhythm  robustness  saliency  sandpiles  see-also:exploding-dots  semantics  set-theory  signal-processing  similarity-measures  similarity  simulation  social-networks  software-development-is-not-programming  software-development  statistics  strings  superresolution  system-of-professions  temporal-models  the-mangle-in-practice  theoretical-biology  tiling  time-series  time-warping  to-do  to-draw  to-learn  to-read  to-simulate  to-understand  topology  transients  try-not-to-do-this  universality  unsupervised-learning  workalike 

Copy this bookmark: