consider:feature-discovery   249

« earlier    

Exactly how bad is the 13 times table? | The Aperiodical
Along the way, OEIS editor Charles R Greathouse IV added this intriguing conjecture:

Conjecture: a(n)≤N
for all n
. Perhaps N
can be taken as 81
.
number-theory  mathematical-recreations  open-questions  to-write-about  consider:feature-discovery 
5 weeks ago by Vaguery
[1709.04109] Empower Sequence Labeling with Task-Aware Neural Language Model
Linguistic sequence labeling is a general modeling approach that encompasses a variety of problems, such as part-of-speech tagging and named entity recognition. Recent advances in neural networks (NNs) make it possible to build reliable models without handcrafted features. However, in many cases, it is hard to obtain sufficient annotations to train these models. In this study, we develop a novel neural framework to extract abundant knowledge hidden in raw texts to empower the sequence labeling task. Besides word-level knowledge contained in pre-trained word embeddings, character-aware neural language models are incorporated to extract character-level knowledge. Transfer learning techniques are further adopted to mediate different components and guide the language model towards the key knowledge. Comparing to previous methods, these task-specific knowledge allows us to adopt a more concise model and conduct more efficient training. Different from most transfer learning methods, the proposed framework does not rely on any additional supervision. It extracts knowledge from self-contained order information of training sequences. Extensive experiments on benchmark datasets demonstrate the effectiveness of leveraging character-level knowledge and the efficiency of co-training. For example, on the CoNLL03 NER task, model training completes in about 6 hours on a single GPU, reaching F1 score of 91.71±0.10 without using any extra annotation.
natural-language-processing  deep-learning  neural-networks  nudge-targets  consider:feature-discovery  consider:representation  to-write-about 
7 weeks ago by Vaguery
Estimating barriers to gene flow from distorted isolation by distance patterns | bioRxiv
In continuous populations with local migration, nearby pairs of individuals have on average more similar genotypes than geographically well separated pairs. A barrier to gene flow distorts this classical pattern of isolation by distance. Genetic similarity is decreased for sample pairs on different sides of the barrier and increased for pairs on the same side near the barrier. Here, we introduce an inference scheme that utilizes this signal to detect and estimate the strength of a linear barrier to gene flow in two-dimensions. We use a diffusion approximation to model the effects of a barrier on the geographical spread of ancestry backwards in time. This approach allows us to calculate the chance of recent coalescence and probability of identity by descent. We introduce an inference scheme that fits these theoretical results to the geographical covariance structure of bialleleic genetic markers. It can estimate the strength of the barrier as well as several demographic parameters. We investigate the power of our inference scheme to detect barriers by applying it to a wide range of simulated data. We also showcase an example application to a Antirrhinum majus (snapdragon) flower color hybrid zone, where we do not detect any signal of a strong genome wide barrier to gene flow.
population-biology  theoretical-biology  rather-interesting  simulation  to-write-about  consider:feature-discovery 
7 weeks ago by Vaguery
[1712.08373] Notes on complexity of packing coloring
A packing k-coloring for some integer k of a graph G=(V,E) is a mapping
φ:V→{1,…,k} such that any two vertices u,v of color φ(u)=φ(v) are in distance at least φ(u)+1. This concept is motivated by frequency assignment problems. The \emph{packing chromatic number} of G is the smallest k such that there exists a packing k-coloring of G.
Fiala and Golovach showed that determining the packing chromatic number for chordal graphs is \NP-complete for diameter exactly 5. While the problem is easy to solve for diameter 2, we show \NP-completeness for any diameter at least 3. Our reduction also shows that the packing chromatic number is hard to approximate within n1/2−ε for any ε>0.
In addition, we design an \FPT algorithm for interval graphs of bounded diameter. This leads us to exploring the problem of finding a partial coloring that maximizes the number of colored vertices.
graph-theory  algorithms  combinatorics  proof  approximation  nudge-targets  consider:looking-to-see  consider:feature-discovery 
9 weeks ago by Vaguery
[1710.02271] Unsupervised Extraction of Representative Concepts from Scientific Literature
This paper studies the automated categorization and extraction of scientific concepts from titles of scientific articles, in order to gain a deeper understanding of their key contributions and facilitate the construction of a generic academic knowledgebase. Towards this goal, we propose an unsupervised, domain-independent, and scalable two-phase algorithm to type and extract key concept mentions into aspects of interest (e.g., Techniques, Applications, etc.). In the first phase of our algorithm we propose PhraseType, a probabilistic generative model which exploits textual features and limited POS tags to broadly segment text snippets into aspect-typed phrases. We extend this model to simultaneously learn aspect-specific features and identify academic domains in multi-domain corpora, since the two tasks mutually enhance each other. In the second phase, we propose an approach based on adaptor grammars to extract fine grained concept mentions from the aspect-typed phrases without the need for any external resources or human effort, in a purely data-driven manner. We apply our technique to study literature from diverse scientific domains and show significant gains over state-of-the-art concept extraction techniques. We also present a qualitative analysis of the results obtained.
natural-language-processing  POS-tagging  algorithms  data-fusion  machine-learning  text-mining  nudge-targets  consider:feature-discovery 
12 weeks ago by Vaguery
[1802.08435] Efficient Neural Audio Synthesis
Sequential models achieve state-of-the-art results in audio, visual and textual domains with respect to both estimating the data distribution and generating high-quality samples. Efficient sampling for this class of models has however remained an elusive problem. With a focus on text-to-speech synthesis, we describe a set of general techniques for reducing sampling time while maintaining high output quality. We first describe a single-layer recurrent neural network, the WaveRNN, with a dual softmax layer that matches the quality of the state-of-the-art WaveNet model. The compact form of the network makes it possible to generate 24kHz 16-bit audio 4x faster than real time on a GPU. Second, we apply a weight pruning technique to reduce the number of weights in the WaveRNN. We find that, for a constant number of parameters, large sparse networks perform better than small dense networks and this relationship holds for sparsity levels beyond 96%. The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time. Finally, we propose a new generation scheme based on subscaling that folds a long sequence into a batch of shorter sequences and allows one to generate multiple samples at once. The Subscale WaveRNN produces 16 samples per step without loss of quality and offers an orthogonal method for increasing sampling efficiency.
audio-synthesis  machine-learning  WaveNet  neural-networks  signal-processing  time-series  generative-models  to-write-about  nudge-targets  recurrent-networks  performance-measure  consider:feature-discovery 
12 weeks ago by Vaguery
[1506.09039] Scalable Discrete Sampling as a Multi-Armed Bandit Problem
Drawing a sample from a discrete distribution is one of the building components for Monte Carlo methods. Like other sampling algorithms, discrete sampling suffers from the high computational burden in large-scale inference problems. We study the problem of sampling a discrete random variable with a high degree of dependency that is typical in large-scale Bayesian inference and graphical models, and propose an efficient approximate solution with a subsampling approach. We make a novel connection between the discrete sampling and Multi-Armed Bandits problems with a finite reward population and provide three algorithms with theoretical guarantees. Empirical evaluations show the robustness and efficiency of the approximate algorithms in both synthetic and real-world large-scale problems.
sampling  inverse-problems  rather-interesting  probability-theory  simulation  engineering-design  nudge-targets  consider:feature-discovery 
february 2018 by Vaguery
[1707.05994] Computing Tutte Paths
Tutte paths are one of the most successful tools for attacking Hamiltonicity problems in planar graphs. Unfortunately, results based on them are non-constructive, as their proofs inherently use an induction on overlapping subgraphs and these overlaps hinder to bound the running time to a polynomial. For special cases however, computational results of Tutte paths are known: For 4-connected planar graphs, Tutte paths are in fact Hamiltonian paths and Chiba and Nishizeki showed how to compute such paths in linear time. For 3-connected planar graphs, Tutte paths have a more complicated structure, and it has only recently been shown that they can be computed in polynomial time. However, Tutte paths are defined for general 2-connected planar graphs and this is what most applications need. Unfortunately, no computational results are known. We give the first efficient algorithm that computes a Tutte path (for the general case of 2-connected planar graphs). One of the strongest existence results about such Tutte paths is due to Sanders, which allows to prescribe the end vertices and an intermediate edge of the desired path. Encompassing and strengthening all previous computational results on Tutte paths, we show how to compute this special Tutte path efficiently. Our method refines both, the results of Thomassen and Sanders, and avoids overlapping subgraphs by using a novel iterative decomposition along 2-separators. Finally, we show that our algorithm runs in quadratic time.
graph-theory  algorithms  representation  rather-interesting  to-understand  nudge-targets  consider:representation  consider:feature-discovery 
january 2018 by Vaguery
[1710.04640] Hard and Easy Instances of L-Tromino Tilings
In this work we study tilings of regions in the square lattice with L-shaped trominoes. Deciding the existence of a tiling with L-trominoes for an arbitrary region in general is NP-complete, nonetheless, we indentify restrictions to the problem where either it remains NP-complete or it has a polynomial time algorithm. First we show that an aztec diamond of order n always has an L-tromino tiling if and only if n(n+1)≡0mod3; if an aztec diamond has at least two defects or holes, however, the problem of deciding a tiling is NP-complete. Then we study tilings of arbitrary regions where only 180∘ rotations of L-trominoes are available. For this particular case we show that deciding the existence of a tiling remains NP-complete, yet, if a region contains certain so-called "forbidden polyominoes" as subregions, then there exists a polynomial time algorithm for deciding a tiling.
polyominoes  tiling  benchmarking  rather-interesting  problem-solving  nudge-targets  consider:feature-discovery  updated 
november 2017 by Vaguery
[1405.2378] Covering Folded Shapes
Can folding a piece of paper flat make it larger? We explore whether a shape S must be scaled to cover a flat-folded copy of itself. We consider both single folds and arbitrary folds (continuous piecewise isometries S→R2). The underlying problem is motivated by computational origami, and is related to other covering and fixturing problems, such as Lebesgue's universal cover problem and force closure grasps. In addition to considering special shapes (squares, equilateral triangles, polygons and disks), we give upper and lower bounds on scale factors for single folds of convex objects and arbitrary folds of simply connected objects.
computational-geometry  to-write  paper-folding  rather-interesting  nudge-targets  consider:algorithms  consider:feature-discovery 
november 2017 by Vaguery
[1411.6371] Folding a Paper Strip to Minimize Thickness
In this paper, we study how to fold a specified origami crease pattern in order to minimize the impact of paper thickness. Specifically, origami designs are often expressed by a mountain-valley pattern (plane graph of creases with relative fold orientations), but in general this specification is consistent with exponentially many possible folded states. We analyze the complexity of finding the best consistent folded state according to two metrics: minimizing the total number of layers in the folded state (so that a "flat folding" is indeed close to flat), and minimizing the total amount of paper required to execute the folding (where "thicker" creases consume more paper). We prove both problems strongly NP-complete even for 1D folding. On the other hand, we prove the first problem fixed-parameter tractable in 1D with respect to the number of layers.
paper-folding  computational-geometry  optimization  rather-interesting  to-write-about  to-simulate  nudge-targets  consider:feature-discovery 
november 2017 by Vaguery
[1501.00561] A linear-time algorithm for the geodesic center of a simple polygon
Given two points in a simple polygon P of n vertices, its geodesic distance is the length of the shortest path that connects them among all paths that stay within P. The geodesic center of P is the unique point in P that minimizes the largest geodesic distance to all other points of P. In 1989, Pollack, Sharir and Rote [Disc. \& Comput. Geom. 89] showed an O(nlogn)-time algorithm that computes the geodesic center of P. Since then, a longstanding question has been whether this running time can be improved (explicitly posed by Mitchell [Handbook of Computational Geometry, 2000]). In this paper we affirmatively answer this question and present a linear time algorithm to solve this problem.
computational-complexity  computational-geometry  optimization  rather-interesting  algorithms  distance  nudge-targets  consider:rediscovery  consider:feature-discovery 
november 2017 by Vaguery
[1604.08797] Ortho-polygon Visibility Representations of Embedded Graphs
An ortho-polygon visibility representation of an n-vertex embedded graph G (OPVR of G) is an embedding-preserving drawing of G that maps every vertex to a distinct orthogonal polygon and each edge to a vertical or horizontal visibility between its end-vertices. The vertex complexity of an OPVR of G is the minimum k such that every polygon has at most k reflex corners. We present polynomial time algorithms that test whether G has an OPVR and, if so, compute one of minimum vertex complexity. We argue that the existence and the vertex complexity of an OPVR of G are related to its number of crossings per edge and to its connectivity. More precisely, we prove that if G has at most one crossing per edge (i.e., G is a 1-plane graph), an OPVR of G always exists while this may not be the case if two crossings per edge are allowed. Also, if G is a 3-connected 1-plane graph, we can compute an OPVR of G whose vertex complexity is bounded by a constant in O(n) time. However, if G is a 2-connected 1-plane graph, the vertex complexity of any OPVR of G may be Ω(n). In contrast, we describe a family of 2-connected 1-plane graphs for which an embedding that guarantees constant vertex complexity can be computed in O(n) time. Finally, we present the results of an experimental study on the vertex complexity of ortho-polygon visibility representations of 1-plane graphs.
graph-layout  computational-geometry  optimization  rather-interesting  to-write-about  nudge-targets  consider:representation  consider:feature-discovery  algorithms  computational-complexity 
november 2017 by Vaguery
[1609.06972] Minimal completely asymmetric (4,n)-regular matchstick graphs
A matchstick graph is a graph drawn with straight edges in the plane such that the edges have unit length, and non-adjacent edges do not intersect. We call a matchstick graph $(m,n)$-regular if every vertex has only degree $m$ or $n$. In this article we present the latest known $(4,n)$-regular matchstick graphs for $4\leq n\leq11$ with a minimum number of vertices and a completely asymmetric structure. We call a matchstick graph completely asymmetric, if the following conditions are complied. 1) The graph is rigid. 2) The graph has no point, rotational or mirror symmetry. 3) The graph has an asymmetric outer shape. 4) The graph can not be decomposed into rigid subgraphs and rearrange to a similar graph which contradicts to any of the other conditions.
computational-geometry  graph-theory  rather-interesting  purdy-pitchers  to-write-about  nudge-targets  consider:looking-to-see  consider:feature-discovery 
october 2017 by Vaguery
[1512.03421] The extended 1-perfect trades in small hypercubes
An extended 1-perfect trade is a pair (T0,T1) of two disjoint binary distance-4 even-weight codes such that the set of words at distance 1 from T0 coincides with the set of words at distance 1 from T1. Such trade is called primary if any pair of proper subsets of T0 and T1 is not a trade. Using a computer-aided approach, we classify nonequivalent primary extended 1-perfect trades of length 10, constant-weight extended 1-perfect trades of length 12, and Steiner trades derived from them. In particular, all Steiner trades with parameters (5,6,12) are classified.
combinatorics  to-understand  graph-theory  strings  optimization  constraint-satisfaction  looking-to-see  nudge-targets  consider:looking-to-see  consider:classification  consider:feature-discovery 
october 2017 by Vaguery
[1006.4176] Unknotting Unknots
A knot is an an embedding of a circle into three-dimensional space. We say that a knot is unknotted if there is an ambient isotopy of the embedding to a standard circle. By representing knots via planar diagrams, we discuss the problem of unknotting a knot diagram when we know that it is unknotted. This problem is surprisingly difficult, since it has been shown that knot diagrams may need to be made more complicated before they may be simplified. We do not yet know, however, how much more complicated they must get. We give an introduction to the work of Dynnikov who discovered the key use of arc--presentations to solve the problem of finding a way to detect the unknot directly from a diagram of the knot. Using Dynnikov's work, we show how to obtain a quadratic upper bound for the number of crossings that must be introduced into a sequence of unknotting moves. We also apply Dynnikov's results to find an upper bound for the number of moves required in an unknotting sequence.
knot-theory  rather-interesting  representation  algorithms  classification  nudge-targets  consider:looking-to-see  consider:feature-discovery 
october 2017 by Vaguery
[1605.08396] Robust Downbeat Tracking Using an Ensemble of Convolutional Networks
In this paper, we present a novel state of the art system for automatic downbeat tracking from music signals. The audio signal is first segmented in frames which are synchronized at the tatum level of the music. We then extract different kind of features based on harmony, melody, rhythm and bass content to feed convolutional neural networks that are adapted to take advantage of each feature characteristics. This ensemble of neural networks is combined to obtain one downbeat likelihood per tatum. The downbeat sequence is finally decoded with a flexible and efficient temporal model which takes advantage of the metrical continuity of a song. We then perform an evaluation of our system on a large base of 9 datasets, compare its performance to 4 other published algorithms and obtain a significant increase of 16.8 percent points compared to the second best system, for altogether a moderate cost in test and training. The influence of each step of the method is studied to show its strengths and shortcomings.
neural-networks  music  deep-learning  rather-interesting  to-write-about  nudge-targets  consider:feature-discovery 
october 2017 by Vaguery
[cs/0302027] Tiling space and slabs with acute tetrahedra
We show it is possible to tile three-dimensional space using only tetrahedra with acute dihedral angles. We present several constructions to achieve this, including one in which all dihedral angles are less than 77.08∘, and another which tiles a slab in space.
computational-geometry  tiling  rather-interesting  constraint-satisfaction  nudge-targets  consider:feature-discovery  consider:simpler-planar-problem 
october 2017 by Vaguery
[1406.5156] Pattern-avoiding permutations and Brownian excursion Part I: Shapes and fluctuations
Permutations that avoid given patterns are among the most classical objects in combinatorics and have strong connections to many fields of mathematics, computer science and biology. In this paper we study the scaling limits of a random permutation avoiding a pattern of length 3 and their relations to Brownian excursion. Exploring this connection to Brownian excursion allows us to strengthen the recent results of Madras and Pehlivan, and Miner and Pak as well as to understand many of the interesting phenomena that had previously gone unexplained.
permutations  combinatorics  review  rather-interesting  feature-extraction  ranimuladom-walks  simulation  nudge-targets  consider:feature-discovery 
october 2017 by Vaguery
[1406.0670] Decision Algorithms for Fibonacci-Automatic Words, with Applications to Pattern Avoidance
We implement a decision procedure for answering questions about a class of infinite words that might be called (for lack of a better name) "Fibonacci-automatic". This class includes, for example, the famous Fibonacci word f = 01001010..., the fixed point of the morphism 0 -> 01 and 1 -> 0. We then recover many results about the Fibonacci word from the literature (and improve some of them), such as assertions about the occurrences in f of squares, cubes, palindromes, and so forth. As an application of our method we prove a new result: there exists an aperiodic infinite binary word avoiding the pattern x x x^R. This is the first avoidability result concerning a nonuniform morphism proven purely mechanically.
strings  combinatorics  classification  nudge-targets  consider:looking-to-see  consider:feature-discovery 
october 2017 by Vaguery

« earlier    

related tags

acoustics  affect  agent-based  algebra  algorithms  approximation  architecture  attention  audio-synthesis  autoencoders  bayesian  benchmarking  benchmarks  bioinformatics  biological-engineering  boolean-networks  causality  cellular-automata  classification  collective-behavior  collective-intelligence  combinatorics  community-detection  community-formation  compass-and-straightedge  complex-systems  complexology  composition  compressed-sensing  computational-complexity  computational-geometry  computer-vision  conjecture  consider:algorithms  consider:cause-and-effect  consider:classification  consider:complex-mixtures  consider:control-theory  consider:engineering-design  consider:generalizations  consider:how-far-can-it-go  consider:interpretability  consider:looking-to-see  consider:modeling  consider:performance-measures  consider:prediction  consider:rediscovery  consider:representation  consider:simple-examples  consider:simpler-planar-problem  consider:stress-testing  constraint-satisfaction  control-theory  convex-hulls  corpus  coupled-oscillators  curse-of-dimensionality  data-fusion  deep-learning  define-your-terms  distance  dynamical-systems  ecology  electromagnetism  emergent-design  empirical-modeling  empirical-models  engineering-design  enumeration  experiment  extreme-values  feature-construction  feature-extraction  financial-engineering  food-webs  formal-languages  game-theory  generative-art  generative-models  geometry  grammar  graph-layout  graph-theory  graphical-models  hard-problems  hey-i-know-this-guy  horse-races  hypergraphs  image-analysis  image-processing  image-segmentation  inference  information-theory  inverse-problems  iteration  kauffmania  knot-theory  l-systems  learning-from-data  like-the-soft-push-scheme  linguistics  looking-to-see  machine-learning  magic-squares  martin-gardner  materials-science  mathematical-programming  mathematical-recreations  matrices  metaheuristics  metrics  models  multiobjective-optimization  music  nanotechnology  natural-language-processing  network-theory  neural-networks  nonlinear-dynamics  nudge-targets  number-theory  numerical-methods  open-questions  operations-research  optimization  outliers  packing  paper-folding  pascal's-triangle  pattern-discovery  pattern-formation  performance-measure  permutations  phase-transitions  physics!  plane-geometry  planning  polyominoes  population-biology  portfolio-theory  pos-tagging  prediction  probability-theory  problem-solving  proof  purdy-pitchers  random-sampling  random-walks  ranimuladom-walks  rather-interesting  recurrent-networks  representation  review  rewriting-systems  robotics  robustness  sampling  sandpiles  satisfiability  self-assembly  self-organization  sentiment-analysis  sequences  signal-processing  simple-games  simulation  social-psychology  space-filling-bearing  stamp-collecting  statistics  strings  structural-biology  substitution-systems  superresolution  symbiosis  symbolic-regression  systems-biology  text-mining  theoretical-biology  tiling  time-series  to-code  to-do  to-simulate  to-understand  to-write-about  to-write  topology  unsupervised-learning  updated  video  visualization  wavelets  wavenet  whoa-again  whoa 

Copy this bookmark:



description:


tags: