**Vaguery + consider:representation**
383

[1709.04109] Empower Sequence Labeling with Task-Aware Neural Language Model

22 days ago by Vaguery

Linguistic sequence labeling is a general modeling approach that encompasses a variety of problems, such as part-of-speech tagging and named entity recognition. Recent advances in neural networks (NNs) make it possible to build reliable models without handcrafted features. However, in many cases, it is hard to obtain sufficient annotations to train these models. In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task. Besides word-level knowledge contained in pre-trained word embeddings, character-aware neural language models are incorporated to extract character-level knowledge. Transfer learning techniques are further adopted to mediate different components and guide the language model towards the key knowledge. Comparing to previous methods, these task-specific knowledge allows us to adopt a more concise model and conduct more efficient training. Different from most transfer learning methods, the proposed framework does not rely on any additional supervision. It extracts knowledge from self-contained order information of training sequences. Extensive experiments on benchmark datasets demonstrate the effectiveness of leveraging character-level knowledge and the efficiency of co-training. For example, on the CoNLL03 NER task, model training completes in about 6 hours on a single GPU, reaching F1 score of 91.71±0.10 without using any extra annotation.

natural-language-processing
deep-learning
neural-networks
nudge-targets
consider:feature-discovery
consider:representation
to-write-about
22 days ago by Vaguery

A hybrid placement strategy for the three-dimensional strip packing problem - ScienceDirect

4 weeks ago by Vaguery

This paper presents a hybrid placement strategy for the three-dimensional strip packing problem which involves packing a set of cuboids (‘boxes’) into a three-dimensional bin (parallelepiped) of fixed width and height but unconstrained length (the ‘container’). The goal is to pack all of the boxes into the container, minimising its resulting length. This problem has potential industry application in stock cutting (wood, polystyrene, etc. – minimising wastage) and also cargo loading, as well as other applications in areas such as multi-dimensional resource scheduling. In addition to the proposed strategy a number of test results on available literature benchmark problems are presented and analysed. The results of empirical testing of the algorithm show that it out-performs other methods from the literature, consistently in terms of speed and solution quality-producing 28 best known results from 35 test cases.

operations-research
hyperheuristics
metaheuristics
optimization
constraint-satisfaction
nudge-targets
consider:representation
consider:looking-to-see
to-write-about
4 weeks ago by Vaguery

[1709.08359] On the expressive power of query languages for matrices

4 weeks ago by Vaguery

We investigate the expressive power of 𝖬𝖠𝖳𝖫𝖠𝖭𝖦, a formal language for matrix manipulation based on common matrix operations and linear algebra. The language can be extended with the operation 𝗂𝗇𝗏 of inverting a matrix. In 𝖬𝖠𝖳𝖫𝖠𝖭𝖦+𝗂𝗇𝗏 we can compute the transitive closure of directed graphs, whereas we show that this is not possible without inversion. Indeed we show that the basic language can be simulated in the relational algebra with arithmetic operations, grouping, and summation. We also consider an operation 𝖾𝗂𝗀𝖾𝗇 for diagonalizing a matrix, which is defined so that different eigenvectors returned for a same eigenvalue are orthogonal. We show that 𝗂𝗇𝗏 can be expressed in 𝖬𝖠𝖳𝖫𝖠𝖭𝖦+𝖾𝗂𝗀𝖾𝗇. We put forward the open question whether there are boolean queries about matrices, or generic queries about graphs, expressible in 𝖬𝖠𝖳𝖫𝖠𝖭𝖦+𝖾𝗂𝗀𝖾𝗇 but not in 𝖬𝖠𝖳𝖫𝖠𝖭𝖦+𝗂𝗇𝗏. The evaluation problem for 𝖬𝖠𝖳𝖫𝖠𝖭𝖦+𝖾𝗂𝗀𝖾𝗇 is shown to be complete for the complexity class ∃R.

matrices
programming-language
representation
rather-interesting
nudge
consider:representation
to--do
4 weeks ago by Vaguery

[1801.01922] Vectorization of Line Drawings via PolyVector Fields

5 weeks ago by Vaguery

Image tracing is a foundational component of the workflow in graphic design, engineering, and computer animation, linking hand-drawn concept images to collections of smooth curves needed for geometry processing and editing. Even for clean line drawings, modern algorithms often fail to faithfully vectorize junctions, or points at which curves meet; this produces vector drawings with incorrect connectivity. This subtle issue undermines the practical application of vectorization tools and accounts for hesitance among artists and engineers to use automatic vectorization software. To address this issue, we propose a novel image vectorization method based on state-of-the-art mathematical algorithms for frame field processing. Our algorithm is tailored specifically to disambiguate junctions without sacrificing quality.

graphics
algorithms
inference
rather-interesting
feature-construction
nudge-targets
consider:representation
consider:looking-to-see
consider:performance-measures
5 weeks ago by Vaguery

[1802.06415] Solving Large-Scale Minimum-Weight Triangulation Instances to Provable Optimality

6 weeks ago by Vaguery

We consider practical methods for the problem of finding a minimum-weight triangulation (MWT) of a planar point set, a classic problem of computational geometry with many applications. While Mulzer and Rote proved in 2006 that computing an MWT is NP-hard, Beirouti and Snoeyink showed in 1998 that computing provably optimal solutions for MWT instances of up to 80,000 uniformly distributed points is possible, making use of clever heuristics that are based on geometric insights. We show that these techniques can be refined and extended to instances of much bigger size and different type, based on an array of modifications and parallelizations in combination with more efficient geometric encodings and data structures. As a result, we are able to solve MWT instances with up to 30,000,000 uniformly distributed points in less than 4 minutes to provable optimality. Moreover, we can compute optimal solutions for a vast array of other benchmark instances that are not uniformly distributed, including normally distributed instances (up to 30,000,000 points), all point sets in the TSPLIB (up to 85,900 points), and VLSI instances with up to 744,710 points. This demonstrates that from a practical point of view, MWT instances can be handled quite well, despite their theoretical difficulty.

computational-geometry
algorithms
plane-geometry
optimization
to-write-about
nudge-targets
consider:representation
6 weeks ago by Vaguery

[1709.08004] Slow-scale split-step tau-leap method for stiff stochastic chemical systems

7 weeks ago by Vaguery

Tau-leaping is a family of algorithms for the approximate simulation of discrete state continuous time Markov chains. The motivation for the development of such methods can be found, for instance, in the fields of chemical kinetics and systems biology. It is well known that the dynamical behavior of biochemical systems is often intrinsically stiff representing a serious challenge for their numerical approximation. The naive extension of stiff deterministic solvers to stochastic integration usually yields numerical solutions with either impractically large relaxation times or incorrectly resolved covariance. In this paper, we propose a novel splitting heuristic which allows to resolve these issues. The proposed numerical integrator takes advantage of the special structure of the linear systems with explicitly available equations for the mean and the covariance which we use to calibrate the parameters of the scheme. It is shown that the method is able to reproduce the exact mean and variance of the linear scalar test equation and has very good accuracy for the arbitrarily stiff systems at least in linear case. The numerical examples for both linear and nonlinear systems are also provided and the obtained results confirm the efficiency of the considered splitting approach.

numerical-methods
diffy-Qs
approximation
systems-biology
algorithms
to-write-about
consider:representation
rather-interesting
nudge
7 weeks ago by Vaguery

Discovering Types for Entity Disambiguation

7 weeks ago by Vaguery

Our system uses the following steps:

Extract every Wikipedia-internal link to determine, for each word, the set of conceivable entities it can refer to. For example, when encountering the link [jaguar](https://en.wikipedia.org/wiki/Jaguar) in a Wikipedia page, we conclude that https://en.wikipedia.org/wiki/Jaguar is one of the meanings of jaguar.

Walk the Wikipedia category tree (using the Wikidata knowledge graph) to determine, for each entity, the set of categories it belongs to. For example, at the bottom of https://en.wikipedia.org/wiki/Jaguar_Cars’s Wikipedia page, are the following categories (which themselves have their own categories, such as Automobiles):

Pick a list of ~100 categories to be your “type” system, and optimize over this choice of categories so that they compactly express any entity. We know the mapping of entities to categories, so given a type system, we can represent each entity as a ~100-dimensional binary vector indicating membership in each category.

Using every Wikipedia-internal link and its surrounding context, produce training data mapping a word plus context to the ~100-dimensional binary representation of the corresponding entity, and train a neural network to predict this mapping. This chains together the previous steps: Wikipedia links map a word to an entity, we know the categories for each entity from step 2, and step 3 picked the categories in our type system.

At test time, given a word and surrounding context, our neural network’s output can be interpreted as the probability that the word belongs to each category. If we knew the exact set of category memberships, we would narrow down to one entity (assuming well-chosen categories). But instead, we must play a probabilistic 20 questions: use Bayes’ theorem to calculate the chance of the word disambiguating to each of its possible entities.

machine-learning
disambiguation
data-fusion
to-write-about
consider:representation
Extract every Wikipedia-internal link to determine, for each word, the set of conceivable entities it can refer to. For example, when encountering the link [jaguar](https://en.wikipedia.org/wiki/Jaguar) in a Wikipedia page, we conclude that https://en.wikipedia.org/wiki/Jaguar is one of the meanings of jaguar.

Walk the Wikipedia category tree (using the Wikidata knowledge graph) to determine, for each entity, the set of categories it belongs to. For example, at the bottom of https://en.wikipedia.org/wiki/Jaguar_Cars’s Wikipedia page, are the following categories (which themselves have their own categories, such as Automobiles):

Pick a list of ~100 categories to be your “type” system, and optimize over this choice of categories so that they compactly express any entity. We know the mapping of entities to categories, so given a type system, we can represent each entity as a ~100-dimensional binary vector indicating membership in each category.

Using every Wikipedia-internal link and its surrounding context, produce training data mapping a word plus context to the ~100-dimensional binary representation of the corresponding entity, and train a neural network to predict this mapping. This chains together the previous steps: Wikipedia links map a word to an entity, we know the categories for each entity from step 2, and step 3 picked the categories in our type system.

At test time, given a word and surrounding context, our neural network’s output can be interpreted as the probability that the word belongs to each category. If we knew the exact set of category memberships, we would narrow down to one entity (assuming well-chosen categories). But instead, we must play a probabilistic 20 questions: use Bayes’ theorem to calculate the chance of the word disambiguating to each of its possible entities.

7 weeks ago by Vaguery

[1802.01021] DeepType: Multilingual Entity Linking by Neural Type System Evolution

7 weeks ago by Vaguery

The wealth of structured (e.g. Wikidata) and unstructured data about the world available today presents an incredible opportunity for tomorrow's Artificial Intelligence. So far, integration of these two different modalities is a difficult process, involving many decisions concerning how best to represent the information so that it will be captured or useful, and hand-labeling large amounts of data. DeepType overcomes this challenge by explicitly integrating symbolic information into the reasoning process of a neural network with a type system. First we construct a type system, and second, we use it to constrain the outputs of a neural network to respect the symbolic structure. We achieve this by reformulating the design problem into a mixed integer problem: create a type system and subsequently train a neural network with it. In this reformulation discrete variables select which parent-child relations from an ontology are types within the type system, while continuous variables control a classifier fit to the type system. The original problem cannot be solved exactly, so we propose a 2-step algorithm: 1) heuristic search or stochastic optimization over discrete variables that define a type system informed by an Oracle and a Learnability heuristic, 2) gradient descent to fit classifier parameters. We apply DeepType to the problem of Entity Linking on three standard datasets (i.e. WikiDisamb30, CoNLL (YAGO), TAC KBP 2010) and find that it outperforms all existing solutions by a wide margin, including approaches that rely on a human-designed type system or recent deep learning-based entity embeddings, while explicitly using symbolic information lets it integrate new entities without retraining.

data-fusion
machine-learning
deep-learning
rather-interesting
inference
classification
to-write-about
consider:representation
7 weeks ago by Vaguery

[1607.01985] A Common Derivation for Markov Chain Monte Carlo Algorithms with Tractable and Intractable Targets

7 weeks ago by Vaguery

Markov chain Monte Carlo is a class of algorithms for drawing Markovian samples from high-dimensional target densities to approximate the numerical integration associated with computing statistical expectation, especially in Bayesian statistics. However, many Markov chain Monte Carlo algorithms do not seem to share the same theoretical support and each algorithm is proven in a different way. This incurs many terminologies and ancillary concepts, which makes Markov chain Monte Carlo literature seems to be scattered and intimidating to researchers from many other fields, including new researchers of Bayesian statistics.

A generalised version of the Metropolis-Hastings algorithm is constructed with a random number generator and a self-reverse mapping. This formulation admits many other Markov chain Monte Carlo algorithms as special cases. A common derivation for many Markov chain Monte Carlo algorithms is useful in drawing connections and comparisons between these algorithms. As a result, we now can construct many novel combinations of multiple Markov chain Monte Carlo algorithms that amplify the efficiency of each individual algorithm. Specifically, we propose two novel sampling schemes that combine slice sampling with directional or Hamiltonian sampling. Our Hamiltonian slice sampling scheme is also applicable in the pseudo-marginal context where the target density is intractable but can be unbiasedly estimated, e.g. using particle filtering.

sampling
algorithms
representation
rather-interesting
nudge-targets
consider:representation
define-your-terms
A generalised version of the Metropolis-Hastings algorithm is constructed with a random number generator and a self-reverse mapping. This formulation admits many other Markov chain Monte Carlo algorithms as special cases. A common derivation for many Markov chain Monte Carlo algorithms is useful in drawing connections and comparisons between these algorithms. As a result, we now can construct many novel combinations of multiple Markov chain Monte Carlo algorithms that amplify the efficiency of each individual algorithm. Specifically, we propose two novel sampling schemes that combine slice sampling with directional or Hamiltonian sampling. Our Hamiltonian slice sampling scheme is also applicable in the pseudo-marginal context where the target density is intractable but can be unbiasedly estimated, e.g. using particle filtering.

7 weeks ago by Vaguery

Battleship

8 weeks ago by Vaguery

I’m going to continue my analysis of classic card and board games by looking at the game of Battleship. (See early postings for analysis of Chutes & Ladders, Candyland and Risk).

dammit-arthegall
games
game-theory
probability-theory
simulation
nudge-targets
to-write-about
consider:representation
8 weeks ago by Vaguery

[1802.04196] Universal quantum computing and three-manifolds

8 weeks ago by Vaguery

A single qubit may be represented on the Bloch sphere or similarly on the 3-sphere S3. Our goal is to dress this correspondence by converting the language of universal quantum computing (uqc) to that of 3-manifolds. A magic state and the Pauli group acting on it define a model of uqc as a POVM that one recognizes to be a 3-manifold M3. E. g., the d-dimensional POVMs defined from subgroups of finite index of the modular group PSL(2,ℤ) correspond to d-fold M3- coverings over the trefoil knot. In this paper, one also investigates quantum information on a few \lq universal' knots and links such as the figure-of-eight knot, the Whitehead link and Borromean rings, making use of the catalog of platonic manifolds available on SnapPy. Further connections between POVMs based uqc and M3's obtained from Dehn fillings are explored.

quantums
quantum-computing
representation
rather-interesting
to-write-about
nudge
consider:representation
8 weeks ago by Vaguery

[1709.04851] Factor Analysis of Interval Data

8 weeks ago by Vaguery

This paper presents a factor analysis model for symbolic data, focusing on the particular case of interval-valued variables. The proposed method describes the correlation structure among the measured interval-valued variables in terms of a few underlying, but unobservable, uncorrelated interval-valued variables, called \textit{common factors}. Uniform and Triangular distributions are considered within each observed interval. We obtain the corresponding sample mean, variance and covariance assuming a general Triangular distribution.

In our proposal, factors are extracted either by Principal Component or by Principal Axis Factoring, performed on the interval-valued variables correlation matrix. To estimate the values of the common factors, usually called \textit{factor scores}, two approaches are considered, which are inspired in methods for real-valued data: the Bartlett and the Anderson-Rubin methods. In both cases, the estimated values are obtained solving an optimization problem that minimizes a function of the weighted squared Mallows distance between quantile functions. Explicit expressions for the quantile function and the squared Mallows distance are derived assuming a general Triangular distribution.

The applicability of the method is illustrated using two sets of data: temperature and precipitation in cities of the United States of America between the years 1971 and 2000 and measures of car characteristics of different makes and models. Moreover, the method is evaluated on synthetic data with predefined correlation structures.

representation
statistics
algorithms
rather-interesting
to-write-about
nudge
consider:representation
In our proposal, factors are extracted either by Principal Component or by Principal Axis Factoring, performed on the interval-valued variables correlation matrix. To estimate the values of the common factors, usually called \textit{factor scores}, two approaches are considered, which are inspired in methods for real-valued data: the Bartlett and the Anderson-Rubin methods. In both cases, the estimated values are obtained solving an optimization problem that minimizes a function of the weighted squared Mallows distance between quantile functions. Explicit expressions for the quantile function and the squared Mallows distance are derived assuming a general Triangular distribution.

The applicability of the method is illustrated using two sets of data: temperature and precipitation in cities of the United States of America between the years 1971 and 2000 and measures of car characteristics of different makes and models. Moreover, the method is evaluated on synthetic data with predefined correlation structures.

8 weeks ago by Vaguery

Creating Intricate Art with Neural Style Transfer – Becoming Human: Artificial Intelligence Magazine

11 weeks ago by Vaguery

The architecture is based on Gatys’ style transfer algorithm with a few minor modifications. In this case, the content image is a silhouette and style image can be any pattern (ranging from simple black and white doodle to more complex color mosaics). The code also contains a module to invert and create a mask based on the content image, which will eventually be applied to the generated pattern.

generative-art
generative-models
style-transfer
computational-recreations
to-write-about
to-do
nudge-targets
consider:representation
11 weeks ago by Vaguery

Maverick Solitaire

11 weeks ago by Vaguery

An early form of Poker solitaire is actually a puzzle of sorts, one which has been called Maverick solitaire (after its appearance on the 1950's/1960's Western T.V. show Maverick, in the first season episode Rope of Cards). Twenty five cards are dealt from a shuffled 52-card deck. The object is to divide the 25 cards into five groups of five cards so that each is a pat hand in draw poker (a hand which does not need to be drawn to). Maverick Solitaire is well-known as a sucker bet, as the probability of success with a random group of 25 cards would seem low, but is actually quite high: the eponymous author of Maverick's Guide to Poker estimates the odds to be at least 98 percent (he disallows four of a kind). This is remarkably accurate: Mark Masten's computer solver, allowing four of a kind, solved about 98.1 percent of a random set of 1000 deals. Deals with unique solutions are even less common than impossible ones: the sample above had 19 impossible deals and only 8 with unique solutions.

mathematical-recreations
cards
nudge-targets
consider:representation
consider:looking-to-see
11 weeks ago by Vaguery

[1707.05994] Computing Tutte Paths

12 weeks ago by Vaguery

Tutte paths are one of the most successful tools for attacking Hamiltonicity problems in planar graphs. Unfortunately, results based on them are non-constructive, as their proofs inherently use an induction on overlapping subgraphs and these overlaps hinder to bound the running time to a polynomial. For special cases however, computational results of Tutte paths are known: For 4-connected planar graphs, Tutte paths are in fact Hamiltonian paths and Chiba and Nishizeki showed how to compute such paths in linear time. For 3-connected planar graphs, Tutte paths have a more complicated structure, and it has only recently been shown that they can be computed in polynomial time. However, Tutte paths are defined for general 2-connected planar graphs and this is what most applications need. Unfortunately, no computational results are known. We give the first efficient algorithm that computes a Tutte path (for the general case of 2-connected planar graphs). One of the strongest existence results about such Tutte paths is due to Sanders, which allows to prescribe the end vertices and an intermediate edge of the desired path. Encompassing and strengthening all previous computational results on Tutte paths, we show how to compute this special Tutte path efficiently. Our method refines both, the results of Thomassen and Sanders, and avoids overlapping subgraphs by using a novel iterative decomposition along 2-separators. Finally, we show that our algorithm runs in quadratic time.

graph-theory
algorithms
representation
rather-interesting
to-understand
nudge-targets
consider:representation
consider:feature-discovery
12 weeks ago by Vaguery

[1709.08878] Generating Sentences by Editing Prototypes

12 weeks ago by Vaguery

We propose a new generative model of sentences that first samples a prototype sentence from the training corpus and then edits it into a new sentence. Compared to traditional models that generate from scratch either left-to-right or by first sampling a latent sentence vector, our prototype-then-edit model improves perplexity on language modeling and generates higher quality outputs according to human evaluation. Furthermore, the model gives rise to a latent edit vector that captures interpretable semantics such as sentence similarity and sentence-level analogies.

machine-learning
natural-language-processing
rather-interesting
representation
to-write-about
nudge-targets
consider:representation
consider:performance-measures
generative-models
12 weeks ago by Vaguery

Tanya Khovanova's Math Blog » Blog Archive » 2015 Coin Problem Solution

12 weeks ago by Vaguery

A while ago I posted my second favorite problem from the 2015 All-Russian Math Olympiad:

mathematical-recreations
puzzles
nudge-targets
consider:looking-to-see
consider:representation
to-write-about
12 weeks ago by Vaguery

[1709.05725] FlashProfile: Interactive Synthesis of Syntactic Profiles

12 weeks ago by Vaguery

We address the problem of learning comprehensive syntactic profiles for a set of strings. Real-world datasets, typically curated from multiple sources, often contain data in various formats. Thus any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify various formats is infeasible in standard big-data scenarios.

We present a technique for generating comprehensive syntactic profiles in terms of user-defined patterns that also allows for interactive refinement. We define a syntactic profile as a set of succinct patterns that describe the entire dataset. Our approach efficiently learns such profiles, and allows refinement by exposing a desired number of patterns.

Our implementation, FlashProfile, shows a median profiling time of 0.7s over 142 tasks on 74 real datasets. We also show that access to the generated data profiles allow for more accurate synthesis of programs, using fewer examples in programming-by-example workflows.

pattern-discovery
rather-interesting
strings
data-mining
statistics
nudge-targets
consider:looking-to-see
consider:representation
consider:performance-measures
We present a technique for generating comprehensive syntactic profiles in terms of user-defined patterns that also allows for interactive refinement. We define a syntactic profile as a set of succinct patterns that describe the entire dataset. Our approach efficiently learns such profiles, and allows refinement by exposing a desired number of patterns.

Our implementation, FlashProfile, shows a median profiling time of 0.7s over 142 tasks on 74 real datasets. We also show that access to the generated data profiles allow for more accurate synthesis of programs, using fewer examples in programming-by-example workflows.

12 weeks ago by Vaguery

[1712.08558] Lattice-based Locality Sensitive Hashing is Optimal

january 2018 by Vaguery

Locality sensitive hashing (LSH) was introduced by Indyk and Motwani (STOC `98) to give the first sublinear time algorithm for the c-approximate nearest neighbor (ANN) problem using only polynomial space. At a high level, an LSH family hashes "nearby" points to the same bucket and "far away" points to different buckets. The quality of measure of an LSH family is its LSH exponent, which helps determine both query time and space usage.

In a seminal work, Andoni and Indyk (FOCS `06) constructed an LSH family based on random ball partitioning of space that achieves an LSH exponent of 1/c^2 for the l_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA `07) and O'Donnell, Wu and Zhou (TOCT `14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. So, in the same work, Andoni and Indyk proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, thus leaving open the question of finding the exponent of lattice based LSH.

In this work, we resolve this question by showing the existence of lattices achieving the optimal LSH exponent of 1/c^2 using techniques from the geometry of numbers. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g. to yield low space and query complexity, remains an important open problem.

algorithms
numerical-methods
rather-interesting
approximation
computational-complexity
nudge-targets
consider:looking-to-see
consider:representation
In a seminal work, Andoni and Indyk (FOCS `06) constructed an LSH family based on random ball partitioning of space that achieves an LSH exponent of 1/c^2 for the l_2 norm, which was later shown to be optimal by Motwani, Naor and Panigrahy (SIDMA `07) and O'Donnell, Wu and Zhou (TOCT `14). Although optimal in the LSH exponent, the ball partitioning approach is computationally expensive. So, in the same work, Andoni and Indyk proposed a simpler and more practical hashing scheme based on Euclidean lattices and provided computational results using the 24-dimensional Leech lattice. However, no theoretical analysis of the scheme was given, thus leaving open the question of finding the exponent of lattice based LSH.

In this work, we resolve this question by showing the existence of lattices achieving the optimal LSH exponent of 1/c^2 using techniques from the geometry of numbers. At a more conceptual level, our results show that optimal LSH space partitions can have periodic structure. Understanding the extent to which additional structure can be imposed on these partitions, e.g. to yield low space and query complexity, remains an important open problem.

january 2018 by Vaguery

[1711.03172] Curve Reconstruction via the Global Statistics of Natural Curves

november 2017 by Vaguery

Reconstructing the missing parts of a curve has been the subject of much computational research, with applications in image inpainting, object synthesis, etc. Different approaches for solving that problem are typically based on processes that seek visually pleasing or perceptually plausible completions. In this work we focus on reconstructing the underlying physically likely shape by utilizing the global statistics of natural curves. More specifically, we develop a reconstruction model that seeks the mean physical curve for a given inducer configuration. This simple model is both straightforward to compute and it is receptive to diverse additional information, but it requires enough samples for all curve configurations, a practical requirement that limits its effective utilization. To address this practical issue we explore and exploit statistical geometrical properties of natural curves, and in particular, we show that in many cases the mean curve is scale invariant and often times it is extensible. This, in turn, allows to boost the number of examples and thus the robustness of the statistics and its applicability. The reconstruction results are not only more physically plausible but they also lead to important insights on the reconstruction problem, including an elegant explanation why certain inducer configurations are more likely to yield consistent perceptual completions than others.

inference
computer-vision
rather-interesting
algorithms
nudge-targets
consider:looking-to-see
consider:representation
performance-measure
to-write-about
november 2017 by Vaguery

[1705.01887] Streaming for Aibohphobes: Longest Palindrome with Mismatches

november 2017 by Vaguery

A palindrome is a string that reads the same as its reverse, such as "aibohphobia" (fear of palindromes). Given an integer d>0, a d-near-palindrome is a string of Hamming distance at most d from its reverse. We study the natural problem of identifying a longest d-near-palindrome in data streams. The problem is relevant to the analysis of DNA databases, and to the task of repairing recursive structures in documents such as XML and JSON. We present an algorithm that returns a d-near-palindrome whose length is within a multiplicative (1+ϵ)-factor of the longest d-near-palindrome. Our algorithm also returns the set of mismatched indices of the d-near-palindrome, using (dlog7nϵlog(1+ϵ)) bits of space, and (dlog6nϵlog(1+ϵ)) update time per arriving symbol. We show that Ω(dlogn) space is necessary for estimating the length of longest d-near-palindromes with high probability. We further obtain an additive-error approximation algorithm and a comparable lower bound, as well as an exact two-pass algorithm that solves the longest d-near-palindrome problem using (d2n‾√log6n) bits of space.

strings
computational-complexity
algorithms
probability-theory
inference
online-learning
rather-interesting
to-write-about
nudge-targets
consider:looking-to-see
consider:representation
november 2017 by Vaguery

Proofs are not only often beautiful but also necessary | Dan McQuillan

november 2017 by Vaguery

I hope you get the idea. When mathematicians seem obsessed with rigor, it is at least in part due to our history of making mistakes, seemingly every time we tried to jump to a conclusion. But perhaps there’s something more. So for that, we end with a quote from William Thurston:

what we are doing is finding ways for people to understand and think about mathematics… what they really want is usually not some collection of of ‘answers’-what they want is understanding.

mathematics
proof
philosophy
to-write-about
consider:representation
what we are doing is finding ways for people to understand and think about mathematics… what they really want is usually not some collection of of ‘answers’-what they want is understanding.

november 2017 by Vaguery

[1710.01410] Learning Registered Point Processes from Idiosyncratic Observations

november 2017 by Vaguery

A parametric point process model is developed, with modeling based on the assumption that sequential observations often share latent phenomena, while also possessing idiosyncratic effects. An alternating optimization method is proposed to learn a "registered" point process that accounts for shared structure, as well as "warping" functions that characterize idiosyncratic aspects of each observed sequence. Under reasonable constraints, in each iteration we update the sample-specific warping functions by solving a set of constrained nonlinear programming problems in parallel, and update the model by maximum likelihood estimation. The justifiability, complexity and robustness of the proposed method are investigated in detail. Experiments on both synthetic and real-world data demonstrate that the method yields explainable point process models, achieving encouraging results compared to state-of-the-art methods.

modeling-is-not-mathematics
rather-interesting
representation
machine-learning
algorithms
nudge
nudge-targets
consider:representation
consider:performance-measures
november 2017 by Vaguery

[1212.4771] Necklaces, Convolutions, and X+Y

november 2017 by Vaguery

We give subquadratic algorithms that, given two necklaces each with n beads at arbitrary positions, compute the optimal rotation of the necklaces to best align the beads. Here alignment is measured according to the p norm of the vector of distances between pairs of beads from opposite necklaces in the best perfect matching. We show surprisingly different results for p = 1, p even, and p = \infty. For p even, we reduce the problem to standard convolution, while for p = \infty and p = 1, we reduce the problem to (min, +) convolution and (median, +) convolution. Then we solve the latter two convolution problems in subquadratic time, which are interesting results in their own right. These results shed some light on the classic sorting X + Y problem, because the convolutions can be viewed as computing order statistics on the antidiagonals of the X + Y matrix. All of our algorithms run in o(n^2) time, whereas the obvious algorithms for these problems run in \Theta(n^2) time.

optimization
operations-research
combinatorics
distance
nudge-targets
consider:looking-to-see
consider:representation
november 2017 by Vaguery

[1601.01298] Visibility Graphs, Dismantlability, and the Cops and Robbers Game

november 2017 by Vaguery

We study versions of cop and robber pursuit-evasion games on the visibility graphs of polygons, and inside polygons with straight and curved sides. Each player has full information about the other player's location, players take turns, and the robber is captured when the cop arrives at the same point as the robber. In visibility graphs we show the cop can always win because visibility graphs are dismantlable, which is interesting as one of the few results relating visibility graphs to other known graph classes. We extend this to show that the cop wins games in which players move along straight line segments inside any polygon and, more generally, inside any simply connected planar region with a reasonable boundary. Essentially, our problem is a type of pursuit-evasion using the link metric rather than the Euclidean metric, and our result provides an interesting class of infinite cop-win graphs.

computational-geometry
game-theory
rather-interesting
to-write-about
nudge-targets
consider:representation
consider:splinegons
november 2017 by Vaguery

[1702.02017] Pushing the Bounds for Matrix-Matrix Multiplication

november 2017 by Vaguery

A tight lower bound for required I/O when computing a matrix-matrix multiplication on a processor with two layers of memory is established. Prior work obtained weaker lower bounds by reasoning about the number of \textit{phases} needed to perform C:=AB, where each phase is a series of operations involving S reads and writes to and from fast memory, and S is the size of fast memory. A lower bound on the number of phases was then determined by obtaining an upper bound on the number of scalar multiplications performed per phase. This paper follows the same high level approach, but improves the lower bound by considering C:=AB+C instead of C:=AB, and obtains the maximum number of scalar fused multiply-adds (FMAs) per phase instead of scalar additions. Key to obtaining the new result is the decoupling of the per-phase I/O from the size of fast memory. The new lower bound is 2mnk/S‾√−2S. The constant for the leading term is an improvement of a factor 42‾√. A theoretical algorithm that attains the lower bound is given, and how the state-of-the-art Goto's algorithm also in some sense meets the lower bound is discussed.

computational-complexity
algorithms
matrices
rather-interesting
nudge-targets
consider:looking-to-see
consider:representation
consider:performance-measures
november 2017 by Vaguery

[1604.08797] Ortho-polygon Visibility Representations of Embedded Graphs

november 2017 by Vaguery

An ortho-polygon visibility representation of an n-vertex embedded graph G (OPVR of G) is an embedding-preserving drawing of G that maps every vertex to a distinct orthogonal polygon and each edge to a vertical or horizontal visibility between its end-vertices. The vertex complexity of an OPVR of G is the minimum k such that every polygon has at most k reflex corners. We present polynomial time algorithms that test whether G has an OPVR and, if so, compute one of minimum vertex complexity. We argue that the existence and the vertex complexity of an OPVR of G are related to its number of crossings per edge and to its connectivity. More precisely, we prove that if G has at most one crossing per edge (i.e., G is a 1-plane graph), an OPVR of G always exists while this may not be the case if two crossings per edge are allowed. Also, if G is a 3-connected 1-plane graph, we can compute an OPVR of G whose vertex complexity is bounded by a constant in O(n) time. However, if G is a 2-connected 1-plane graph, the vertex complexity of any OPVR of G may be Ω(n). In contrast, we describe a family of 2-connected 1-plane graphs for which an embedding that guarantees constant vertex complexity can be computed in O(n) time. Finally, we present the results of an experimental study on the vertex complexity of ortho-polygon visibility representations of 1-plane graphs.

graph-layout
computational-geometry
optimization
rather-interesting
to-write-about
nudge-targets
consider:representation
consider:feature-discovery
algorithms
computational-complexity
november 2017 by Vaguery

[1710.01992] Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks

november 2017 by Vaguery

Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve the bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize the recursive layers to share parameters across as well as within pyramid levels, and thus drastically reduce the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.

superresolution
neural-networks
representation
algorithms
image-processing
generative-models
to-write-about
nudge-targets
consider:representation
consider:performance-measures
november 2017 by Vaguery

[1707.05425] Fast and Accurate Image Super Resolution by Deep CNN with Skip Connection and Network in Network

november 2017 by Vaguery

We propose a highly efficient and faster Single Image Super-Resolution (SISR) model with Deep Convolutional neural networks (Deep CNN). Deep CNN have recently shown that they have a significant reconstruction performance on single-image super-resolution. Current trend is using deeper CNN layers to improve performance. However, deep models demand larger computation resources and is not suitable for network edge devices like mobile, tablet and IoT devices. Our model achieves state of the art reconstruction performance with at least 10 times lower calculation cost by Deep CNN with Residual Net, Skip Connection and Network in Network (DCSCN). A combination of Deep CNNs and Skip connection layers is used as a feature extractor for image features on both local and global area. Parallelized 1x1 CNNs, like the one called Network in Network, is also used for image reconstruction. That structure reduces the dimensions of the previous layer's output for faster computation with less information loss, and make it possible to process original images directly. Also we optimize the number of layers and filters of each CNN to significantly reduce the calculation cost. Thus, the proposed algorithm not only achieves the state of the art performance but also achieves faster and efficient computation. Code is available at this https URL

superresolution
image-processing
deep-learning
neural-networks
generative-models
nudge-targets
consider:representation
november 2017 by Vaguery

[1710.00228] Physics-based Motion Planning: Evaluation Criteria and Benchmarking

october 2017 by Vaguery

Motion planning has evolved from coping with simply geometric problems to physics-based ones that incorporate the kinodynamic and the physical constraints imposed by the robot and the physical world. Therefore, the criteria for evaluating physics-based motion planners goes beyond the computational complexity (e.g. in terms of planning time) usually used as a measure for evaluating geometrical planners, in order to consider also the quality of the solution in terms of dynamical parameters. This study proposes an evaluation criteria and analyzes the performance of several kinodynamic planners, which are at the core of physics-based motion planning, using different scenarios with fixed and manipulatable objects. RRT, EST, KPIECE and SyCLoP are used for the benchmarking. The results show that KPIECE computes the time-optimal solution with heighest success rate, whereas, SyCLoP compute the most power-optimal solution among the planners used.

simulation
planning
algorithms
rather-interesting
feature-construction
approximation
nudge-targets
consider:representation
october 2017 by Vaguery

[1706.10206] Sums of Palindromes: an Approach via Automata

october 2017 by Vaguery

Recently, Cilleruelo, Luca, & Baxter proved, for all bases b >= 5, that every natural number is the sum of at most 3 natural numbers whose base-b representation is a palindrome. However, the cases b = 2, 3, 4 were left unresolved.

We prove, using a decision procedure based on automata, that every natural number is the sum of at most 4 natural numbers whose base-2 representation is a palindrome. Here the constant 4 is optimal. We obtain similar results for bases 3 and 4, thus completely resolving the problem.

We consider some other variations on this problem, and prove similar results. We argue that heavily case-based proofs are a good signal that a decision procedure may help to automate the proof.

number-theory
rather-interesting
lovely
algorithms
proof
nudge-targets
consider:rediscovery
consider:representation
consider:looking-to-see
We prove, using a decision procedure based on automata, that every natural number is the sum of at most 4 natural numbers whose base-2 representation is a palindrome. Here the constant 4 is optimal. We obtain similar results for bases 3 and 4, thus completely resolving the problem.

We consider some other variations on this problem, and prove similar results. We argue that heavily case-based proofs are a good signal that a decision procedure may help to automate the proof.

october 2017 by Vaguery

[1305.6596] On the Coloring of Pseudoknots

october 2017 by Vaguery

Pseudodiagrams are diagrams of knots where some information about which strand goes over/under at certain crossings may be missing. Pseudoknots are equivalence classes of pseudodiagrams, with equivalence defined by a class of Reidemeister-type moves. In this paper, we introduce two natural extensions of classical knot colorability to this broader class of knot-like objects. We use these definitions to define the determinant of a pseudoknot (i.e. the pseudodeterminant) that agrees with the classical determinant for classical knots. Moreover, we extend Conway notation to pseudoknots to facilitate the investigation of families of pseudoknots and links. The general formulae for pseudodeterminants of pseudoknot families may then be used as a criterion for p-colorability of pseudoknots.

knot-theory
representation
topology
to-understand
nudge-targets
consider:representation
consider:looking-to-see
consider:algorithms
october 2017 by Vaguery

[1603.00051] Algebraic Method in Tilings

october 2017 by Vaguery

In this paper we introduce a new algebraic method in tilings. Combining this method with Hilbert's Nullstellensatz we obtain a necessary condition for tiling n-space by translates of a cluster of cubes. Further, the polynomial method will enable us to show that if there exists a tiling of n-space by translates of a cluster V of prime size then there is a lattice tiling by V as well. Finally, we provide supporting evidence for a conjecture that each tiling by translates of a prime size cluster V is lattice if V generates n-space.

polyominoes
algebra
representation
rather-interesting
nudge-targets
consider:representation
october 2017 by Vaguery

[math/0204106] The Second Hull of a Knotted Curve

october 2017 by Vaguery

The convex hull of a set K in space consists of points which are, in a certain sense, "surrounded" by K. When K is a closed curve, we define its higher hulls, consisting of points which are "multiply surrounded" by the curve. Our main theorem shows that if a curve is knotted then it has a nonempty second hull. This provides a new proof of the Fary/Milnor theorem that every knotted curve has total curvature at least 4pi.

knot-theory
geometry
topology
rather-interesting
feature-construction
proof
nudge-targets
consider:representation
consider:looking-to-see
october 2017 by Vaguery

[1505.00667] An Unusual Continued Fraction

october 2017 by Vaguery

We consider the real number σ with continued fraction expansion [a0,a1,a2,…]=[1,2,1,4,1,2,1,8,1,2,1,4,1,2,1,16,…], where ai is the largest power of 2 dividing i+1. We compute the irrationality measure of σ2 and demonstrate that σ2 (and σ) are both transcendental numbers. We also show that certain partial quotients of σ2 grow doubly exponentially, thus confirming a conjecture of Hanna and Wilson.

continued-fractions
strings
number-theory
rather-interesting
to-understand
nudge-targets
consider:representation
october 2017 by Vaguery

[1606.07163] Interpretable Machine Learning Models for the Digital Clock Drawing Test

october 2017 by Vaguery

The Clock Drawing Test (CDT) is a rapid, inexpensive, and popular neuropsychological screening tool for cognitive conditions. The Digital Clock Drawing Test (dCDT) uses novel software to analyze data from a digitizing ballpoint pen that reports its position with considerable spatial and temporal precision, making possible the analysis of both the drawing process and final product. We developed methodology to analyze pen stroke data from these drawings, and computed a large collection of features which were then analyzed with a variety of machine learning techniques. The resulting scoring systems were designed to be more accurate than the systems currently used by clinicians, but just as interpretable and easy to use. The systems also allow us to quantify the tradeoff between accuracy and interpretability. We created automated versions of the CDT scoring systems currently used by clinicians, allowing us to benchmark our models, which indicated that our machine learning models substantially outperformed the existing scoring systems.

machine-learning
benchmarking
rather-interesting
nudge-targets
consider:representation
consider:looking-to-see
october 2017 by Vaguery

[1602.01241] Using separable non-negative matrix factorization techniques for the analysis of time-resolved Raman spectra

october 2017 by Vaguery

The key challenge of time-resolved Raman spectroscopy is the identification of the constituent species and the analysis of the kinetics of the underlying reaction network. In this work we present an integral approach that allows for determining both the component spectra and the rate constants simultaneously from a series of vibrational spectra. It is based on an algorithm for non-negative matrix factorization which is applied to the experimental data set following a few pre-processing steps. As a prerequisite for physically unambiguous solutions, each component spectrum must include one vibrational band that does not significantly interfere with vibrational bands of other species. The approach is applied to synthetic "experimental" spectra derived from model systems comprising a set of species with component spectra differing with respect to their degree of spectral interferences and signal-to-noise ratios. In each case, the species involved are connected via monomolecular reaction pathways. The potential and limitations of the approach for recovering the respective rate constants and component spectra are discussed.

spectroscopy
data-analysis
inference
numerical-methods
modeling
statistics
rather-interesting
nudge-targets
consider:representation
october 2017 by Vaguery

[1708.08377] Two-Dimensional Indirect Binary Search for the Positive One-in-Three Satisfiability Problem

october 2017 by Vaguery

In this paper, we propose an algorithm for the positive one-in-three satisfiability problem (Pos1in3SAT). The proposed algorithm can efficiently decide the existence of a satisfying assignment in all assignments for a given formula by using a 2-dimensional binary search method without constructing an exponential number of assignments.

satisfiability
algorithms
computational-complexity
machine-learning
nudge-targets
consider:looking-to-see
consider:representation
matrices
october 2017 by Vaguery

[1709.09022] On Integer Images of Max-plus Linear Mappings

october 2017 by Vaguery

Let us extend the pair of operations (max,+) over real numbers to matrices in the same way as in conventional linear algebra. We study integer images of max-plus linear mappings. The question whether Ax (in the max-plus algebra) is an integer vector for at least one x has been studied for some time but polynomial solution methods seem to exist only in special cases. In the terminology of combinatorial matrix theory this question reads: is it possible to add constants to the columns of a given matrix so that all row maxima are integer? This problem has been motivated by attempts to solve a class of job-scheduling problems. We present two polynomially solvable special cases aiming to move closer to a polynomial solution method in the general case.

representation
linear-algebra
out-of-the-box
mathematics
algebra
nudge-targets
consider:looking-to-see
consider:representation
october 2017 by Vaguery

[1708.02891v1] Area difference bounds for dissections of a square into an odd number of triangles

september 2017 by Vaguery

Monsky's theorem from 1970 states that a square cannot be dissected into an odd number of triangles of the same area, but it does not give a lower bound for the area differences that must occur.

We extend Monsky's theorem to "constrained framed maps"; based on this we can apply a gap theorem from semi-algebraic geometry to a polynomial area difference measure and thus get a lower bound for the area differences that decreases doubly-exponentially with the number of triangles. On the other hand, we obtain the first superpolynomial upper bounds for this problem, derived from an explicit construction that uses the Thue-Morse sequence.

computational-geometry
limits
geometry
constraint-satisfaction
nudge-targets
consider:looking-to-see
consider:representation
to-write-about
We extend Monsky's theorem to "constrained framed maps"; based on this we can apply a gap theorem from semi-algebraic geometry to a polynomial area difference measure and thus get a lower bound for the area differences that decreases doubly-exponentially with the number of triangles. On the other hand, we obtain the first superpolynomial upper bounds for this problem, derived from an explicit construction that uses the Thue-Morse sequence.

september 2017 by Vaguery

[1702.03615v2] Online Prediction with Selfish Experts

september 2017 by Vaguery

We consider the problem of binary prediction with expert advice in settings where experts have agency and seek to maximize their credibility. This paper makes three main contributions. First, it defines a model to reason formally about settings with selfish experts, and demonstrates that "incentive compatible" (IC) algorithms are closely related to the design of proper scoring rules. Designing a good IC algorithm is easy if the designer's loss function is quadratic, but for other loss functions, novel techniques are required. Second, we design IC algorithms with good performance guarantees for the absolute loss function. Third, we give a formal separation between the power of online prediction with selfish experts and online prediction with honest experts by proving lower bounds for both IC and non-IC algorithms. In particular, with selfish experts and the absolute loss function, there is no (randomized) algorithm for online prediction-IC or otherwise-with asymptotically vanishing regret.

mechanism-design
prediction
rather-interesting
collective-behavior
markets
game-theory
nudge-targets
consider:representation
september 2017 by Vaguery

[1709.07241] Satisfiability Modulo Theory based Methodology for Floorplanning in VLSI Circuits

september 2017 by Vaguery

This paper proposes a Satisfiability Modulo Theory based formulation for floorplanning in VLSI circuits. The proposed approach allows a number of fixed blocks to be placed within a layout region without overlapping and at the same time minimizing the area of the layout region. The proposed approach is extended to allow a number of fixed blocks with ability to rotate and flexible blocks (with variable width and height) to be placed within a layout without overlap. Our target in all cases is reduction in area occupied on a chip which is of vital importance in obtaining a good circuit design. Satisfiability Modulo Theory combines the problem of Boolean satisfiability with domains such as convex optimization. Satisfiability Modulo Theory provides a richer modeling language than is possible with pure Boolean SAT formulas. We have conducted our experiments on MCNC and GSRC benchmark circuits to calculate the total area occupied, amount of deadspace and the total CPU time consumed while placing the blocks without overlapping. The results obtained shows clearly that the amount of dead space or wasted space is reduced if rotation is applied to the blocks.

operations-research
optimization
representation
engineering-design
rather-interesting
nudge-targets
consider:representation
mathematical-programming
to-write-about
september 2017 by Vaguery

[1709.06309] Aspect-Based Relational Sentiment Analysis Using a Stacked Neural Network Architecture

september 2017 by Vaguery

Sentiment analysis can be regarded as a relation extraction problem in which the sentiment of some opinion holder towards a certain aspect of a product, theme or event needs to be extracted. We present a novel neural architecture for sentiment analysis as a relation extraction problem that addresses this problem by dividing it into three subtasks: i) identification of aspect and opinion terms, ii) labeling of opinion terms with a sentiment, and iii) extraction of relations between opinion terms and aspect terms. For each subtask, we propose a neural network based component and combine all of them into a complete system for relational sentiment analysis. The component for aspect and opinion term extraction is a hybrid architecture consisting of a recurrent neural network stacked on top of a convolutional neural network. This approach outperforms a standard convolutional deep neural architecture as well as a recurrent network architecture and performs competitively compared to other methods on two datasets of annotated customer reviews. To extract sentiments for individual opinion terms, we propose a recurrent architecture in combination with word distance features and achieve promising results, outperforming a majority baseline by 18% accuracy and providing the first results for the USAGE dataset. Our relation extraction component outperforms the current state-of-the-art in aspect-opinion relation extraction by 15% F-Measure.

sentiment-analysis
natural-language-processing
deep-learning
neural-networks
machine-learning
architecture
nudge-targets
consider:representation
september 2017 by Vaguery

[1709.06389] A General Framework for the Recognition of Online Handwritten Graphics

september 2017 by Vaguery

We propose a new framework for the recognition of online handwritten graphics. Three main features of the framework are its ability to treat symbol and structural level information in an integrated way, its flexibility with respect to different families of graphics, and means to control the tradeoff between recognition effectiveness and computational cost. We model a graphic as a labeled graph generated from a graph grammar. Non-terminal vertices represent subcomponents, terminal vertices represent symbols, and edges represent relations between subcomponents or symbols. We then model the recognition problem as a graph parsing problem: given an input stroke set, we search for a parse tree that represents the best interpretation of the input. Our graph parsing algorithm generates multiple interpretations (consistent with the grammar) and then we extract an optimal interpretation according to a cost function that takes into consideration the likelihood scores of symbols and structures. The parsing algorithm consists in recursively partitioning the stroke set according to structures defined in the grammar and it does not impose constraints present in some previous works (e.g. stroke ordering). By avoiding such constraints and thanks to the powerful representativeness of graphs, our approach can be adapted to the recognition of different graphic notations. We show applications to the recognition of mathematical expressions and flowcharts. Experimentation shows that our method obtains state-of-the-art accuracy in both applications.

OCR
handwriting-recognition
machine-learning
image-processing
nudge-targets
consider:looking-to-see
consider:representation
september 2017 by Vaguery

[1709.05397] Zero-Shot Learning to Manage a Large Number of Place-Specific Compressive Change Classifiers

september 2017 by Vaguery

With recent progress in large-scale map maintenance and long-term map learning, the task of change detection on a large-scale map from a visual image captured by a mobile robot has become a problem of increasing criticality. Previous approaches for change detection are typically based on image differencing and require the memorization of a prohibitively large number of mapped images in the above context. In contrast, this study follows the recent, efficient paradigm of change-classifier-learning and specifically employs a collection of place-specific change classifiers. Our change-classifier-learning algorithm is based on zero-shot learning (ZSL) and represents a place-specific change classifier by its training examples mined from an external knowledge base (EKB). The proposed algorithm exhibits several advantages. First, we are required to memorize only training examples (rather than the classifier itself), which can be further compressed in the form of bag-of-words (BoW). Secondly, we can incorporate the most recent map into the classifiers by straightforwardly adding or deleting a few training examples that correspond to these classifiers. Thirdly, we can share the BoW vocabulary with other related task scenarios (e.g., BoW-based self-localization), wherein the vocabulary is generally designed as a rich, continuously growing, and domain-adaptive knowledge base. In our contribution, the proposed algorithm is applied and evaluated on a practical long-term cross-season change detection system that consists of a large number of place-specific object-level change classifiers.

machine-learning
data-fusion
image-processing
deep-learning
zero-shot-learning
algorithms
nudge-targets
consider:representation
september 2017 by Vaguery

[1705.01991] Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU

september 2017 by Vaguery

Attentional sequence-to-sequence models have become the new standard for machine translation, but one challenge of such models is a significant increase in training and decoding cost compared to phrase-based systems. Here, we focus on efficient decoding, with a goal of achieving accuracy close the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughput close to that of a phrasal decoder.

We approach this problem from two angles: First, we describe several techniques for speeding up an NMT beam search decoder, which obtain a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. Second, we propose a simple but powerful network architecture which uses an RNN (GRU/LSTM) layer at bottom, followed by a series of stacked fully-connected layers applied at every timestep. This architecture achieves similar accuracy to a deep recurrent model, at a small fraction of the training and decoding cost. By combining these techniques, our best system achieves a very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014, while decoding at 100 words/sec on single-threaded CPU. We believe this is the best published accuracy/speed trade-off of an NMT system.

machine-learning
representation
algorithms
deep-learning
neural-networks
approximation
nudge-targets
consider:performance-measures
consider:representation
natural-language-processing
We approach this problem from two angles: First, we describe several techniques for speeding up an NMT beam search decoder, which obtain a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. Second, we propose a simple but powerful network architecture which uses an RNN (GRU/LSTM) layer at bottom, followed by a series of stacked fully-connected layers applied at every timestep. This architecture achieves similar accuracy to a deep recurrent model, at a small fraction of the training and decoding cost. By combining these techniques, our best system achieves a very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014, while decoding at 100 words/sec on single-threaded CPU. We believe this is the best published accuracy/speed trade-off of an NMT system.

september 2017 by Vaguery

[1608.08324] Topological Drawings of Complete Bipartite Graphs

september 2017 by Vaguery

Topological drawings are natural representations of graphs in the plane, where vertices are represented by points, and edges by curves connecting the points. Topological drawings of complete graphs and of complete bipartite graphs have been studied extensively in the context of crossing number problems. We consider a natural class of simple topological drawings of complete bipartite graphs, in which we require that one side of the vertex set bipartition lies on the outer boundary of the drawing.

We investigate the combinatorics of such drawings. For this purpose, we define combinatorial encodings of the drawings by enumerating the distinct drawings of subgraphs isomorphic to K2,2 and K3,2, and investigate the constraints they must satisfy. We prove that a drawing of Kk,n exists if and only if some simple local conditions are satisfied by the encodings. This directly yields a polynomial-time algorithm for deciding the existence of such a drawing given the encoding. We show the encoding is equivalent to specifying which pairs of edges cross, yielding a similar polynomial-time algorithm for the realizability of abstract topological graphs.

We also completely characterize and enumerate such drawings of Kk,n in which the order of the edges around each vertex is the same for vertices on the same side of the bipartition. Finally, we investigate drawings of Kk,n using straight lines and pseudolines, and consider the complexity of the corresponding realizability problems.

computational-geometry
graph-layout
rather-interesting
purdy-pitchers
nudge-targets
consider:representation
consider:performance-measures
We investigate the combinatorics of such drawings. For this purpose, we define combinatorial encodings of the drawings by enumerating the distinct drawings of subgraphs isomorphic to K2,2 and K3,2, and investigate the constraints they must satisfy. We prove that a drawing of Kk,n exists if and only if some simple local conditions are satisfied by the encodings. This directly yields a polynomial-time algorithm for deciding the existence of such a drawing given the encoding. We show the encoding is equivalent to specifying which pairs of edges cross, yielding a similar polynomial-time algorithm for the realizability of abstract topological graphs.

We also completely characterize and enumerate such drawings of Kk,n in which the order of the edges around each vertex is the same for vertices on the same side of the bipartition. Finally, we investigate drawings of Kk,n using straight lines and pseudolines, and consider the complexity of the corresponding realizability problems.

september 2017 by Vaguery

[1709.05769] Where to Focus: Deep Attention-based Spatially Recurrent Bilinear Networks for Fine-Grained Visual Recognition

september 2017 by Vaguery

Fine-grained visual recognition typically depends on modeling subtle difference from object parts. However, these parts often exhibit dramatic visual variations such as occlusions, viewpoints, and spatial transformations, making it hard to detect. In this paper, we present a novel attention-based model to automatically, selectively and accurately focus on critical object regions with higher importance against appearance variations. Given an image, two different Convolutional Neural Networks (CNNs) are constructed, where the outputs of two CNNs are correlated through bilinear pooling to simultaneously focus on discriminative regions and extract relevant features. To capture spatial distributions among the local regions with visual attention, soft attention based spatial Long-Short Term Memory units (LSTMs) are incorporated to realize spatially recurrent yet visually selective over local input patterns. All the above intuitions equip our network with the following novel model: two-stream CNN layers, bilinear pooling layer, spatial recurrent layer with location attention are jointly trained via an end-to-end fashion to serve as the part detector and feature extractor, whereby relevant features are localized and extracted attentively. We show the significance of our network against two well-known visual recognition tasks: fine-grained image classification and person re-identification.

image-processing
neural-networks
attention
feature-extraction
deep-learning
architecture
constraint-satisfaction
nudge-targets
consider:feature-discovery
consider:representation
september 2017 by Vaguery

[1709.07153] Large Vocabulary Automatic Chord Estimation Using Deep Neural Nets: Design Framework, System Variations and Limitations

september 2017 by Vaguery

In this paper, we propose a new system design framework for large vocabulary automatic chord estimation. Our approach is based on an integration of traditional sequence segmentation processes and deep learning chord classification techniques. We systematically explore the design space of the proposed framework for a range of parameters, namely deep neural nets, network configurations, input feature representations, segment tiling schemes, and training data sizes. Experimental results show that among the three proposed deep neural nets and a baseline model, the recurrent neural network based system has the best average chord quality accuracy that significantly outperforms the other considered models. Furthermore, our bias-variance analysis has identified a glass ceiling as a potential hindrance to future improvements of large vocabulary automatic chord estimation systems.

sound-recognition
feature-extraction
algorithms
machine-learning
nudge-targets
consider:representation
consider:looking-to-see
september 2017 by Vaguery

[1608.02025] Boundary-based MWE segmentation with text partitioning

september 2017 by Vaguery

This work presents a fine-grained, text-chunking algorithm designed for the task of multiword expressions (MWEs) segmentation. As a lexical class, MWEs include a wide variety of idioms, whose automatic identification are a necessity for the handling of colloquial language. This algorithm's core novelty is its use of non-word tokens, i.e., boundaries, in a bottom-up strategy. Leveraging boundaries refines token-level information, forging high-level performance from relatively basic data. The generality of this model's feature space allows for its application across languages and domains. Experiments spanning 19 different languages exhibit a broadly-applicable, state-of-the-art model. Evaluation against recent shared-task data places text partitioning as the overall, best performing MWE segmentation algorithm, covering all MWE classes and multiple English domains (including user-generated text). This performance, coupled with a non-combinatorial, fast-running design, produces an ideal combination for implementations at scale, which are facilitated through the release of open-source software.

natural-language-processing
algorithms
parsing
nudge-targets
consider:looking-to-see
consider:representation
consider:performance-measures
september 2017 by Vaguery

[1606.06570] Metastability-Containing Circuits

september 2017 by Vaguery

In digital circuits, metastability can cause deteriorated signals that neither are logical 0 or logical 1, breaking the abstraction of Boolean logic. Unfortunately, any way of reading a signal from an unsynchronized clock domain or performing an analog-to-digital conversion incurs the risk of a metastable upset; no digital circuit can deterministically avoid, resolve, or detect metastability (Marino, 1981). Synchronizers, the only traditional countermeasure, exponentially decrease the odds of maintained metastability over time. Trading synchronization delay for an increased probability to resolve metastability to logical 0 or 1, they do not guarantee success.

We propose a fundamentally different approach: It is possible to contain metastability by fine-grained logical masking so that it cannot infect the entire circuit. This technique guarantees a limited degree of metastability in---and uncertainty about---the output.

At the heart of our approach lies a time- and value-discrete model for metastability in synchronous clocked digital circuits. Metastability is propagated in a worst-case fashion, allowing to derive deterministic guarantees, without and unlike synchronizers. The proposed model permits positive results and passes the test of reproducing Marino's impossibility results. We fully classify which functions can be computed by circuits with standard registers. Regarding masking registers, we show that they become computationally strictly more powerful with each clock cycle, resulting in a non-trivial hierarchy of computable functions.

robustness
engineering-design
performance-measure
rather-interesting
nudge-targets
consider:looking-to-see
simulation
consider:representation
We propose a fundamentally different approach: It is possible to contain metastability by fine-grained logical masking so that it cannot infect the entire circuit. This technique guarantees a limited degree of metastability in---and uncertainty about---the output.

At the heart of our approach lies a time- and value-discrete model for metastability in synchronous clocked digital circuits. Metastability is propagated in a worst-case fashion, allowing to derive deterministic guarantees, without and unlike synchronizers. The proposed model permits positive results and passes the test of reproducing Marino's impossibility results. We fully classify which functions can be computed by circuits with standard registers. Regarding masking registers, we show that they become computationally strictly more powerful with each clock cycle, resulting in a non-trivial hierarchy of computable functions.

september 2017 by Vaguery

[1705.01842] A Deep Learning Perspective on the Origin of Facial Expressions

september 2017 by Vaguery

Facial expressions play a significant role in human communication and behavior. Psychologists have long studied the relationship between facial expressions and emotions. Paul Ekman et al., devised the Facial Action Coding System (FACS) to taxonomize human facial expressions and model their behavior. The ability to recognize facial expressions automatically, enables novel applications in fields like human-computer interaction, social gaming, and psychological research. There has been a tremendously active research in this field, with several recent papers utilizing convolutional neural networks (CNN) for feature extraction and inference. In this paper, we employ CNN understanding methods to study the relation between the features these computational networks are using, the FACS and Action Units (AU). We verify our findings on the Extended Cohn-Kanade (CK+), NovaEmotions and FER2013 datasets. We apply these models to various tasks and tests using transfer learning, including cross-dataset validation and cross-task performance. Finally, we exploit the nature of the FER based CNN models for the detection of micro-expressions and achieve state-of-the-art accuracy using a simple long-short-term-memory (LSTM) recurrent neural network (RNN).

computer-vision
deep-learning
face-recognition
rather-interesting
feature-extraction
feature-construction
nudge-targets
consider:looking-to-see
consider:representation
september 2017 by Vaguery

[1704.08676] A quantitative assessment of the effect of different algorithmic schemes to the task of learning the structure of Bayesian Networks

august 2017 by Vaguery

One of the most challenging tasks when adopting Bayesian Networks (BNs) is the one of learning their structure from data. This task is complicated by the huge search space of possible solutions and turned out to be a well-known NP-hard problem and, hence, approximations are required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics to solve this problem has never been done before.

For this reason, in this work, we provide a detailed study of the different state-of-the-arts methods for structural learning on simulated data considering both BNs with discrete and continuous variables, and with different rates of noise in the data. In particular, we investigate the characteristics of different widespread scores proposed for the inference and the statistical pitfalls within them.

learning-from-data
machine-learning
statistics
algorithms
rather-interesting
inference
nudge-targets
consider:looking-to-see
consider:representation
For this reason, in this work, we provide a detailed study of the different state-of-the-arts methods for structural learning on simulated data considering both BNs with discrete and continuous variables, and with different rates of noise in the data. In particular, we investigate the characteristics of different widespread scores proposed for the inference and the statistical pitfalls within them.

august 2017 by Vaguery

[1410.3564v1] Randomized Triangle Algorithms for Convex Hull Membership

may 2017 by Vaguery

We present randomized versions of the {\it triangle algorithm} introduced in \cite{kal14}. The triangle algorithm tests membership of a distinguished point p∈ℝm in the convex hull of a given set S of n points in ℝm. Given any {\it iterate} p′∈conv(S), it searches for a {\it pivot}, a point v∈S so that d(p′,v)≥d(p,v). It replaces p′ with the point on the line segment p′v closest to p and repeats this process. If a pivot does not exist, p′ certifies that p∉conv(S). Here we propose two random variations of the triangle algorithm that allow relaxed steps so as to take more effective steps possible in subsequent iterations. One is inspired by the {\it chaos game} known to result in the Sierpinski triangle. The incentive is that randomized iterates together with a property of Sierpinski triangle would result in effective pivots. Bounds on their expected complexity coincides with those of the deterministic version derived in \cite{kal14}.

computational-geometry
algorithms
convex-hulls
rather-interesting
nudge-targets
consider:representation
consider:looking-to-see
may 2017 by Vaguery

[1704.00899] Dynamic Rank Maximal Matchings

may 2017 by Vaguery

We consider the problem of matching applicants to posts where applicants have preferences over posts. Thus the input to our problem is a bipartite graph G = (A U P,E), where A denotes a set of applicants, P is a set of posts, and there are ranks on edges which denote the preferences of applicants over posts. A matching M in G is called rank-maximal if it matches the maximum number of applicants to their rank 1 posts, subject to this the maximum number of applicants to their rank 2 posts, and so on.

We consider this problem in a dynamic setting, where vertices and edges can be added and deleted at any point. Let n and m be the number of vertices and edges in an instance G, and r be the maximum rank used by any rank-maximal matching in G. We give a simple O(r(m+n))-time algorithm to update an existing rank-maximal matching under each of these changes. When r = o(n), this is faster than recomputing a rank-maximal matching completely using a known algorithm like that of Irving et al., which takes time O(min((r + n, r*sqrt(n))m).

operations-research
optimization
rostering
scheduling
dynamic-optimization
to-write-about
consider:representation
consider:multiobjective-optimization
We consider this problem in a dynamic setting, where vertices and edges can be added and deleted at any point. Let n and m be the number of vertices and edges in an instance G, and r be the maximum rank used by any rank-maximal matching in G. We give a simple O(r(m+n))-time algorithm to update an existing rank-maximal matching under each of these changes. When r = o(n), this is faster than recomputing a rank-maximal matching completely using a known algorithm like that of Irving et al., which takes time O(min((r + n, r*sqrt(n))m).

may 2017 by Vaguery

[1703.08052] Dynamic Bernoulli Embeddings for Language Evolution

may 2017 by Vaguery

Word embeddings are a powerful approach for unsupervised analysis of language. Recently, Rudolph et al. (2016) developed exponential family embeddings, which cast word embeddings in a probabilistic framework. Here, we develop dynamic embeddings, building on exponential family embeddings to capture how the meanings of words change over time. We use dynamic embeddings to analyze three large collections of historical texts: the U.S. Senate speeches from 1858 to 2009, the history of computer science ACM abstracts from 1951 to 2014, and machine learning papers on the Arxiv from 2007 to 2015. We find dynamic embeddings provide better fits than classical embeddings and capture interesting patterns about how language changes.

natural-language-processing
representation
rather-interesting
to-write-about
nudge-targets
consider:representation
may 2017 by Vaguery

[1704.00708] No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis

may 2017 by Vaguery

In this paper we develop a new framework that captures the common landscape underlying the common non-convex low-rank matrix problems including matrix sensing, matrix completion and robust PCA. In particular, we show for all above problems (including asymmetric cases): 1) all local minima are also globally optimal; 2) no high-order saddle points exists. These results explain why simple algorithms such as stochastic gradient descent have global converge, and efficiently optimize these non-convex objective functions in practice. Our framework connects and simplifies the existing analyses on optimization landscapes for matrix sensing and symmetric matrix completion. The framework naturally leads to new results for asymmetric matrix completion and robust PCA.

compressed-sensing
matrices
optimization
approximation
rather-interesting
machine-learning
nudge-targets
consider:representation
may 2017 by Vaguery

[1704.08156] Locally optimal 2-periodic sphere packings

may 2017 by Vaguery

The sphere packing problem is an old puzzle. We consider packings with m spheres in the unit cell (m-periodic packings). For the case m = 1 (lattice packings), Voronoi presented an algorithm to enumerate all local optima in a finite computation, which has been implemented in up to d = 8 dimensions. We generalize Voronoi's algorithm to m > 1 and use this new algorithm to enumerate all locally optimal 2-periodic sphere packings in d = 3, 4, and 5. In particular, we show that no 2-periodic packing surpasses the density of the optimal lattice in these dimensions. A partial enumeration is performed in d = 6.

packing
generalization
geometry
optimization
nudge-targets
consider:rediscovery
consider:representation
may 2017 by Vaguery

[1704.07378] Geometry-Based Optimization of One-Way Quantum Computation Measurement Patterns

may 2017 by Vaguery

In one-way quantum computation (1WQC) model, an initial highly entangled state called a graph state is used to perform universal quantum computations by a sequence of adaptive single-qubit measurements and post-measurement Pauli-X and Pauli-Z corrections. The needed computations are organized as measurement patterns, or simply patterns, in the 1WQC model. The entanglement operations in a pattern can be shown by a graph which together with the set of its input and output qubits is called the geometry of the pattern. Since a one-way quantum computation pattern is based on quantum measurements, which are fundamentally nondeterministic evolutions, there must be conditions over geometries to guarantee determinism. Causal flow is a sufficient and generalized flow (gflow) is a necessary and sufficient condition over geometries to identify a dependency structure for the measurement sequences in order to achieve determinism. Previously, three optimization methods have been proposed to simplify 1WQC patterns which are called standardization, signal shifting and Pauli simplification. These optimizations can be performed using measurement calculus formalism by rewriting rules. However, maintaining and searching these rules in the library can be complicated with respect to implementation. Moreover, serial execution of these rules is time consuming due to executing many ineffective commutation rules. To overcome this problem, in this paper, a new scheme is proposed to perform optimization techniques on patterns with flow or gflow only based on their geometries instead of using rewriting rules. Furthermore, the proposed scheme obtains the maximally delayed gflow order for geometries with flow. It is shown that the time complexity of the proposed approach is improved over the previous ones.

quantums
quantum-computing
algorithms
nudge-targets
consider:rediscovery
consider:representation
see-also:Lee's-book
may 2017 by Vaguery

Balanced ternary - Wikipedia

april 2017 by Vaguery

Balanced ternary is a non-standard positional numeral system (a balanced form), useful for comparison logic. While it is a ternary (base 3) number system, in the standard (unbalanced) ternary system, digits have values 0, 1 and 2. The digits in the balanced ternary system have values −1, 0, and 1.

Different sources use different glyphs used to represent the three digits in balanced ternary. In this article, T (which resembles a ligature of the minus sign and 1) represents −1, while 0 and 1 represent themselves. Other conventions include using '−' and '+' to represent −1 and 1 respectively, or using Greek letter theta (Θ), which resembles a minus sign in a circle, to represent −1.

exotic-math
number-theory
nudge-targets
consider:representation
consider:rediscovery
Different sources use different glyphs used to represent the three digits in balanced ternary. In this article, T (which resembles a ligature of the minus sign and 1) represents −1, while 0 and 1 represent themselves. Other conventions include using '−' and '+' to represent −1 and 1 respectively, or using Greek letter theta (Θ), which resembles a minus sign in a circle, to represent −1.

april 2017 by Vaguery

[1703.04381] On the Transformation Capability of Feasible Mechanisms for Programmable Matter

april 2017 by Vaguery

In this work, we study theoretical models of \emph{programmable matter} systems. The systems under consideration consist of spherical modules, kept together by magnetic forces and able to perform two minimal mechanical operations (or movements): \emph{rotate} around a neighbor and \emph{slide} over a line. In terms of modeling, there are n nodes arranged in a 2-dimensional grid and forming some initial \emph{shape}. The goal is for the initial shape A to \emph{transform} to some target shape B by a sequence of movements. Most of the paper focuses on \emph{transformability} questions, meaning whether it is in principle feasible to transform a given shape to another. We first consider the case in which only rotation is available to the nodes. Our main result is that deciding whether two given shapes A and B can be transformed to each other, is in P. We then insist on rotation only and impose the restriction that the nodes must maintain global connectivity throughout the transformation. We prove that the corresponding transformability question is in PSPACE and study the problem of determining the minimum \emph{seeds} that can make feasible, otherwise infeasible transformations. Next we allow both rotations and slidings and prove universality: any two connected shapes A,B of the same order, can be transformed to each other without breaking connectivity. The worst-case number of movements of the generic strategy is Ω(n2). We improve this to O(n) parallel time, by a pipelining strategy, and prove optimality of both by matching lower bounds. In the last part of the paper, we turn our attention to distributed transformations. The nodes are now distributed processes able to perform communicate-compute-move rounds. We provide distributed algorithms for a general type of transformations.

programmable-matter
engineering-design
self-organization
rather-interesting
to-write-about
nudge-targets
consider:planning
consider:representation
april 2017 by Vaguery

[1602.05150v3] Approximation and Hardness for Token Swapping

april 2017 by Vaguery

Given a graph G=(V,E) with V={1,…,n}, we place on every vertex a token T1,…,Tn. A swap is an exchange of tokens on adjacent vertices. We consider the algorithmic question of finding a shortest sequence of swaps such that token Ti is on vertex i. We are able to achieve essentially matching upper and lower bounds, for exact algorithms and approximation algorithms. For exact algorithms, we rule out any 2o(n) algorithm under the ETH. This is matched with a simple 2O(nlogn) algorithm based on a breadth-first search in an auxiliary graph. We show one general 4-approximation and show APX-hardness. Thus, there is a small constant δ>1 such that every polynomial time approximation algorithm has approximation factor at least δ. Our results also hold for a generalized version, where tokens and vertices are colored. In this generalized version each token must go to a vertex with the same color.

computational-complexity
graph-theory
computational-geometry
mathematical-recreations
rather-interesting
feature-construction
to-write-about
nudge-targets
consider:looking-to-see
consider:representation
consider:robustness
april 2017 by Vaguery

[cs/0106019v2] Playing Games with Algorithms: Algorithmic Combinatorial Game Theory

april 2017 by Vaguery

Combinatorial games lead to several interesting, clean problems in algorithms and complexity theory, many of which remain open. The purpose of this paper is to provide an overview of the area to encourage further research. In particular, we begin with general background in Combinatorial Game Theory, which analyzes ideal play in perfect-information games, and Constraint Logic, which provides a framework for showing hardness. Then we survey results about the complexity of determining ideal play in these games, and the related problems of solving puzzles, in terms of both polynomial-time algorithms and computational intractability results. Our review of background and survey of algorithmic results are by no means complete, but should serve as a useful primer.

games
mathematical-recreations
rewriting-systems
rather-interesting
to-write-about
puzzles
nudge-targets
consider:looking-to-see
consider:representation
april 2017 by Vaguery

[cond-mat/0203436] Entropy estimation of symbol sequences

april 2017 by Vaguery

We discuss algorithms for estimating the Shannon entropy h of finite symbol sequences with long range correlations. In particular, we consider algorithms which estimate h from the code lengths produced by some compression algorithm. Our interest is in describing their convergence with sequence length, assuming no limits for the space and time complexities of the compression algorithms. A scaling law is proposed for extrapolation from finite sample lengths. This is applied to sequences of dynamical systems in non-trivial chaotic regimes, a 1-D cellular automaton, and to written English texts.

statistics
information-theory
probability-theory
models
dynamical-systems
nudge-targets
consider:looking-to-see
consider:representation
april 2017 by Vaguery

[1308.0419] Inverse Procedural Modeling of Facade Layouts

april 2017 by Vaguery

In this paper, we address the following research problem: How can we generate a meaningful split grammar that explains a given facade layout? To evaluate if a grammar is meaningful, we propose a cost function based on the description length and minimize this cost using an approximate dynamic programming framework. Our evaluation indicates that our framework extracts meaningful split grammars that are competitive with those of expert users, while some users and all competing automatic solutions are less successful.

grammar
L-systems
generative-models
image-processing
learning-from-data
machine-learning
inverse-problems
nudge-targets
consider:representation
consider:feature-discovery
april 2017 by Vaguery

[1701.06790] Parallel Graph Rewriting with Overlapping Rules

april 2017 by Vaguery

We tackle the problem of simultaneous transformations of networks represented as graphs. Roughly speaking, one may distinguish two kinds of simultaneous or parallel rewrite relations over complex structures such as graphs: (i) those which transform disjoint subgraphs in parallel and hence can be simulated by successive mere sequential and local transformations and (ii) those which transform overlapping subgraphs simultaneously. In the latter situations, parallel transformations cannot be simulated in general by means of successive local rewrite steps. We investigate this last problem in the framework of overlapping graph transformation systems. As parallel transformation of a graph does not produce a graph in general, we propose first some sufficient conditions that ensure the closure of graphs by parallel rewrite relations. Then we mainly introduce and discuss two parallel rewrite relations over graphs. One relation is functional and thus deterministic, the other one is not functional for which we propose sufficient conditions which ensure its confluence.

rewriting-systems
graph-theory
generative-models
representation
dynamical-systems
to-write-about
nudge-targets
consider:looking-to-see
consider:representation
algorithms
april 2017 by Vaguery

[1602.01401] Applications of AI for Magic Squares

april 2017 by Vaguery

In recreational mathematics, a normal magic square is an n×n square matrix whose entries are distinctly the integers 1…n2, such that each row, column, and major and minor traces sum to one constant μ. It has been proven that there are 7,040 fourth order normal magic squares and 2,202,441,792 fifth order normal magic squares, with higher orders unconfirmed. Previous work related to fourth order normal squares has shown that symmetries such as the dihedral group exist and that (under certain conditions) normal magic squares can be categorized into four distinct subsets. With the implementation of an efficient backtracking algorithm along with supervised machine learning techniques for classification, it will be shown that the entire set of fourth order normal magic squares can be generated by expanding the symmetry groups of 95 asymmetric parents. Discussion will suggest that methods employed in this project could similarly apply to higher orders.

combinatorics
magic-squares
constraint-satisfaction
rather-interesting
to-write-about
nudge-targets
consider:representation
consider:looking-to-see
april 2017 by Vaguery

[1412.1545] On Magic Finite Projective Space

april 2017 by Vaguery

This paper studies a generalization of magic squares to finite projective space ℙn(q). We classify at all functions from ℙn(q) into a finite field where the sum along any r-flat is 0. In doing so we show connections to elementary number theory and the modular representation theory of GL(n,q).

mathematical-recreations
combinatorics
rather-interesting
to-write-about
generalization
nudge-targets
consider:looking-to-see
consider:representation
april 2017 by Vaguery

[1107.4120] Generalized packing designs

april 2017 by Vaguery

Generalized t-designs, which form a common generalization of objects such as t-designs, resolvable designs and orthogonal arrays, were defined by Cameron [P.J. Cameron, A generalisation of t-designs, \emph{Discrete Math.}\ {\bf 309} (2009), 4835--4842]. In this paper, we define a related class of combinatorial designs which simultaneously generalize packing designs and packing arrays. We describe the sometimes surprising connections which these generalized designs have with various known classes of combinatorial designs, including Howell designs, partial Latin squares and several classes of triple systems, and also concepts such as resolvability and block colouring of ordinary designs and packings, and orthogonal resolutions and colourings. Moreover, we derive bounds on the size of a generalized packing design and construct optimal generalized packings in certain cases. In particular, we provide methods for constructing maximum generalized packings with t=2 and block size k=3 or 4.

combinatorics
constraint-satisfaction
algorithms
representation
experimental-design
rather-interesting
to-write
nudge-targets
consider:representation
april 2017 by Vaguery

[math/0611293] Descending Dungeons and Iterated Base-Changing

april 2017 by Vaguery

For real numbers a, b> 1, let as a_b denote the result of interpreting a in base b instead of base 10. We define ``dungeons'' (as opposed to ``towers'') to be numbers of the form a_b_c_d_..._e, parenthesized either from the bottom upwards (preferred) or from the top downwards. Among other things, we show that the sequences of dungeons with n-th terms 10_11_12_..._(n-1)_n or n_(n-1)_..._12_11_10 grow roughly like 10^{10^{n log log n}}, where the logarithms are to the base 10. We also investigate the behavior as n increases of the sequence a_a_a_..._a, with n a's, parenthesized from the bottom upwards. This converges either to a single number (e.g. to the golden ratio if a = 1.1), to a two-term limit cycle (e.g. if a = 1.05) or else diverges (e.g. if a = frac{100{99).

number-theory
mathematical-recreations
representation
rather-interesting
to-write-about
nudge-targets
consider:representation
consider:feature-discovery
sequences
april 2017 by Vaguery

[1606.05172] Asynchronous simulation of Boolean networks by monotone Boolean networks

april 2017 by Vaguery

We prove that the fully asynchronous dynamics of a Boolean network f:{0,1}n→{0,1}n without negative loop can be simulated, in a very specific way, by a monotone Boolean network with 2n components. We then use this result to prove that, for every even n, there exists a monotone Boolean network f:{0,1}n→{0,1}n, an initial configuration x and a fixed point y of f such that: (i) y can be reached from x with a fully asynchronous updating strategy, and (ii) all such strategies contains at least 2n2 updates. This contrasts with the following known property: if f:{0,1}n→{0,1}n is monotone, then, for every initial configuration x, there exists a fixed point y such that y can be reached from x with a fully asynchronous strategy that contains at most n updates.

boolean-networks
Kauffmania
numerical-methods
rather-interesting
nudge-targets
distributed-processing
consider:looking-to-see
consider:representation
to-write-about
april 2017 by Vaguery

[1511.02667] An Efficient Multilinear Optimization Framework for Hypergraph Matching

april 2017 by Vaguery

Hypergraph matching has recently become a popular approach for solving correspondence problems in computer vision as it allows to integrate higher-order geometric information. Hypergraph matching can be formulated as a third-order optimization problem subject to the assignment constraints which turns out to be NP-hard. In recent work, we have proposed an algorithm for hypergraph matching which first lifts the third-order problem to a fourth-order problem and then solves the fourth-order problem via optimization of the corresponding multilinear form. This leads to a tensor block coordinate ascent scheme which has the guarantee of providing monotonic ascent in the original matching score function and leads to state-of-the-art performance both in terms of achieved matching score and accuracy. In this paper we show that the lifting step to a fourth-order problem can be avoided yielding a third-order scheme with the same guarantees and performance but being two times faster. Moreover, we introduce a homotopy type method which further improves the performance.

algorithms
numerical-methods
graph-theory
hypergraphs
feature-construction
representation
computer-vision
nudge-targets
consider:looking-to-see
consider:representation
to-write-about
april 2017 by Vaguery

[1703.04823] Classification in biological networks with hypergraphlet kernels

april 2017 by Vaguery

Biological and cellular systems are often modeled as graphs in which vertices represent objects of interest (genes, proteins, drugs) and edges represent relational ties among these objects (binds-to, interacts-with, regulates). This approach has been highly successful owing to the theory, methodology and software that support analysis and learning on graphs. Graphs, however, often suffer from information loss when modeling physical systems due to their inability to accurately represent multiobject relationships. Hypergraphs, a generalization of graphs, provide a framework to mitigate information loss and unify disparate graph-based methodologies. In this paper, we present a hypergraph-based approach for modeling physical systems and formulate vertex classification, edge classification and link prediction problems on (hyper)graphs as instances of vertex classification on (extended, dual) hypergraphs in a semi-supervised setting. We introduce a novel kernel method on vertex- and edge-labeled (colored) hypergraphs for analysis and learning. The method is based on exact and inexact (via hypergraph edit distances) enumeration of small simple hypergraphs, referred to as hypergraphlets, rooted at a vertex of interest. We extensively evaluate this method and show its potential use in a positive-unlabeled setting to estimate the number of missing and false positive links in protein-protein interaction networks.

machine-learning
representation
rather-interesting
hypergraphs
to-write-about
nudge-targets
consider:representation
consider:looking-to-see
bioinformatics
clustering
metrics
april 2017 by Vaguery

[1305.1880] A Heuristic for Magic and Antimagic Graph Labellings

april 2017 by Vaguery

Graph labellings have been a very fruitful area of research in the last four decades. However, despite the staggering number of papers published in the field (over 1000), few general results are available, and most papers deal with particular classes of graphs and methods. Here we approach the problem from the computational viewpoint, and in a quite general way. We present the existence problem of a particular labelling as a combinatorial optimization problem, then we discuss the possible strategies to solve it, and finally we present a heuristic for finding different classes of labellings, like vertex-, edge-, or face-magic, and $(a, d)$-antimagic $(v, e, f)$-labellings. The algorithm has been implemented in C++ and MATLAB, and with its aid we have been able to derive new results for some classes of graphs, in particular, vertex-antimagic edge labellings for small graphs of the type $P_2^r \times P_3^s$, for which no general construction is known so far.

mathematical-recreations
integer-programming
constraint-satisfaction
construction
heuristics
nudge-targets
consider:looking-to-see
consider:representation
april 2017 by Vaguery

[1509.07756] Unraveling the secret of Benjamin Franklin: Constructing Franklin squares of higher order

april 2017 by Vaguery

Benjamin Franklin constructed three squares which have amazing properties, and his method of construction has been a mystery to date. In this article, we divulge his secret and show how to construct such squares for any order.

mathematical-recreations
combinatorics
magic-squares
history
to-write-about
constraint-satisfaction
nudge-targets
consider:representation
consider:looking-to-see
april 2017 by Vaguery

[math/0602300] Deterministic Random Walks on the Integers

april 2017 by Vaguery

Jim Propp's P-machine, also known as the "rotor router model" is a simple deterministic process that simulates a random walk on a graph. Instead of distributing chips to randomly chosen neighbors, it serves the neighbors in a fixed order.

We investigate how well this process simulates a random walk. For the graph being the infinite path, we show that, independent of the starting configuration, at each time and on each vertex, the number of chips on this vertex deviates from the expected number of chips in the random walk model by at most a constant c_1, which is approximately 2.29. For intervals of length L, this improves to a difference of O(log L), for the L_2 average of a contiguous set of intervals even to O(sqrt{log L}). All these bounds are tight.

graph-theory
rotor-router
feature-construction
combinatorics
rather-interesting
to-write-about
nudge-targets
consider:representation
We investigate how well this process simulates a random walk. For the graph being the infinite path, we show that, independent of the starting configuration, at each time and on each vertex, the number of chips on this vertex deviates from the expected number of chips in the random walk model by at most a constant c_1, which is approximately 2.29. For intervals of length L, this improves to a difference of O(log L), for the L_2 average of a contiguous set of intervals even to O(sqrt{log L}). All these bounds are tight.

april 2017 by Vaguery

[1702.06135] Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages

april 2017 by Vaguery

In this paper, we propose a novel and elegant solution to "Multi-Source Neural Machine Translation" (MSNMT) which only relies on preprocessing a N-way multilingual corpus without modifying the Neural Machine Translation (NMT) architecture or training procedure. We simply concatenate the source sentences to form a single long multi-source input sentence while keeping the target side sentence as it is and train an NMT system using this preprocessed corpus. We evaluate our method in resource poor as well as resource rich settings and show its effectiveness (up to 4 BLEU using 2 source languages and up to 6 BLEU using 5 source languages) by comparing against existing methods for MSNMT. We also provide some insights on how the NMT system leverages multilingual information in such a scenario by visualizing attention.

natural-language-processing
representation
neural-networks
rather-interesting
nudge-targets
consider:representation
to-write-about
consider:other-applications
april 2017 by Vaguery

[1704.00224] A Time-Frequency Domain Approach of Heart Rate Estimation From Photoplethysmographic (PPG) Signal

april 2017 by Vaguery

Objective- Heart rate monitoring using wrist type Photoplethysmographic (PPG) signals is getting popularity because of construction simplicity and low cost of wearable devices. The task becomes very difficult due to the presence of various motion artifacts. The objective is to develop algorithms to reduce the effect of motion artifacts and thus obtain accurate heart rate estimation. Methods- Proposed heart rate estimation scheme utilizes both time and frequency domain analyses. Unlike conventional single stage adaptive filter, multi-stage cascaded adaptive filtering is introduced by using three channel accelerometer data to reduce the effect of motion artifacts. Both recursive least squares (RLS) and least mean squares (LMS) adaptive filters are tested. Moreover, singular spectrum analysis (SSA) is employed to obtain improved spectral peak tracking. The outputs from the filter block and SSA operation are logically combined and used for spectral domain heart rate estimation. Finally, a tracking algorithm is incorporated considering neighbouring estimates. Results- The proposed method provides an average absolute error of 1.16 beat per minute (BPM) with a standard deviation of 1.74 BPM while tested on publicly available database consisting of recordings from 12 subjects during physical activities. Conclusion- It is found that the proposed method provides consistently better heart rate estimation performance in comparison to that recently reported by TROIKA, JOSS and SPECTRAP methods. Significance- The proposed method offers very low estimation error and a smooth heart rate tracking with simple algorithmic approach and thus feasible for implementing in wearable devices to monitor heart rate for fitness and clinical purpose.

signal-processing
time-series
engineering-design
rather-interesting
inference
nudge-targets
consider:looking-to-see
consider:representation
april 2017 by Vaguery

[1701.01649] Existence of Some Signed Magic Arrays

april 2017 by Vaguery

We consider the notion of a signed magic array, which is an $m \times n$ rectangular array with the same number of filled cells $s$ in each row and the same number of filled cells $t$ in each column, filled with a certain set of numbers that is symmetric about the number zero, such that every row and column has a zero sum. We attempt to make progress toward a characterization of for which $(m, n, s, t)$ there exists such an array. This characterization is complete in the case where $n = s$ and in the case where $n = m$; we also characterize three-fourths of the cases where $n = 2m$.

number-theory
constraint-satisfaction
mathematical-recreations
nudge-targets
consider:representation
consider:looking-to-see
april 2017 by Vaguery

[1704.00794] Time Series Cluster Kernel for Learning Similarities between Multivariate Time Series with Missing Data

april 2017 by Vaguery

Similarity-based approaches represent a promising direction for time series analysis. However, many such methods rely on parameter tuning and have shortcomings if the time series are multivariate (MTS) and contain missing data. In this paper, we address these challenges within the powerful context of kernel methods by proposing the robust \emph{time series cluster kernel} (TCK). The approach taken is to leverage the missing data handling properties of Gaussian mixture models (GMM) augmented with informative prior distributions. An ensemble learning approach is exploited to ensure robustness to parameters by combining the clustering results of many GMM to form the final kernel.

We evaluate the TCK on synthetic and real data and compare to other state-of-the-art techniques. The experimental results demonstrate that the TCK is robust to parameter choices, provides competitive results for MTS without missing data and outstanding results for missing data.

time-series
clustering
feature-extraction
statistics
algorithms
nudge-targets
consider:representation
We evaluate the TCK on synthetic and real data and compare to other state-of-the-art techniques. The experimental results demonstrate that the TCK is robust to parameter choices, provides competitive results for MTS without missing data and outstanding results for missing data.

april 2017 by Vaguery

**related tags**

Copy this bookmark: