feature-construction   202

« earlier    

[1806.01378] Strong Pseudo Transitivity and Intersection Graphs
A directed graph G=(V,E) is {\it strongly pseudo transitive} if there is a partition {A,E−A} of E so that graphs G1=(V,A) and G2=(V,E−A) are transitive, and additionally, if ab∈A and bc∈E implies that ac∈E. A strongly pseudo transitive graph G=(V,E) is strongly pseudo transitive of the first type, if ab∈A and bc∈E implies ac∈A. An undirected graph is co-strongly pseudo transitive (co-strongly pseudo transitive of the first type) if its complement has an orientation which is strongly pseudo transitive (co-strongly pseudo transitive of the first type). Our purpose is show that the results in computational geometry \cite{CFP, Lu} and intersection graph theory \cite{Ga2, ES} can be unified and extended, using the notion of strong pseudo transitivity. As a consequence the general algorithmic framework in \cite{Sh} is applicable to solve the maximum independent set in O(n3) time in a variety of problems, thereby, avoiding case by case lengthily arguments for each problem. We show that the intersection graphs of axis parallel rectangles intersecting a diagonal line from bottom, and half segments are co-strongly pseudo transitive. In addition, we show that the class of the interval filament graphs is co-strongly transitive of the first type, and hence the class of polygon circle graphs which is contained in the class of interval filament graphs (but contains the classes of chordal graphs, circular arc, circle, and outer planar graphs), and the class of incomparability graphs are strongly transitive of the first type. For class of chordal graphs we give two different proofs, using two different characterizations, verifying that they are co-strongly transitive of the first type. We present some containment results.
graph-theory  feature-extraction  feature-construction  algorithms  computational-complexity  nudge-targets  consider:feature-discovery 
2 days ago by Vaguery
Democracy as an information system — Crooked Timber
Democracy is an information system.

That’s the starting place of our new paper: “Common-Knowledge Attacks on Democracy.” In it, we look at democracy through the lens of information security, trying to understand the current waves of Internet disinformation attacks. Specifically, we wanted to explain why the same disinformation campaigns that act as a stabilizing influence in Russia are destabilizing in the United States.

The answer revolves around the different ways autocracies and democracies work as information systems. We start by differentiating between two types of knowledge that societies use in their political systems. The first is common political knowledge, which is the body of information that people in a society broadly agree on. People agree on who the rulers are and what their claim to legitimacy is. People agree broadly on how their government works, even if they don’t like it. In a democracy, people agree about how elections work: how districts are created and defined, how candidates are chosen, and that their votes count­—even if only roughly and imperfectly.
social-norms  democracy  cultural-dynamics  propaganda  public-policy  political-economy  rather-interesting  epidemiology  feature-construction  discriminators  fascism  signaling 
7 days ago by Vaguery
[1706.07900] Tree-Residue Vertex-Breaking: a new tool for proving hardness
In this paper, we introduce a new problem called Tree-Residue Vertex-Breaking (TRVB): given a multigraph $G$ some of whose vertices are marked "breakable," is it possible to convert $G$ into a tree via a sequence of "vertex-breaking" operations (replacing a degree-$k$ breakable vertex by $k$ degree-$1$ vertices, disconnecting the $k$ incident edges)?
We characterize the computational complexity of TRVB with any combination of the following additional constraints: $G$ must be planar, $G$ must be a simple graph, the degree of every breakable vertex must belong to an allowed list $B$, and the degree of every unbreakable vertex must belong to an allowed list $U$. The two results which we expect to be most generally applicable are that (1) TRVB is polynomially solvable when breakable vertices are restricted to have degree at most $3$; and (2) for any $k \ge 4$, TRVB is NP-complete when the given multigraph is restricted to be planar and to consist entirely of degree-$k$ breakable vertices. To demonstrate the use of TRVB, we give a simple proof of the known result that Hamiltonicity in max-degree-$3$ square grid graphs is NP-hard.
We also demonstrate a connection between TRVB and the Hypergraph Spanning Tree problem. This connection allows us to show that the Hypergraph Spanning Tree problem in $k$-uniform $2$-regular hypergraphs is NP-complete for any $k \ge 4$, even when the incidence graph of the hypergraph is planar.
feature-construction  graph-theory  computational-complexity  rather-interesting  explanation  to-write-about 
7 days ago by Vaguery
[1803.09473] code2vec: Learning Distributed Representations of Code
We present a neural model for representing snippets of code as continuous distributed vectors ("code embeddings"). The main idea is to represent a code snippet as a single fixed-length code vector, which can be used to predict semantic properties of the snippet. This is performed by decomposing code to a collection of paths in its abstract syntax tree, and learning the atomic representation of each path simultaneously with learning how to aggregate a set of them. We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 14M methods. We show that code vectors trained on this dataset can predict method names from files that were completely unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies. Comparing previous techniques over the same data set, our approach obtains a relative improvement of over 75%, being the first to successfully predict method names based on a large, cross-project, corpus. Our trained model, visualizations and vector similarities are available as an interactive online demo at this http URL. The code, data, and trained models are available at this https URL.
representation  genetic-programming  (it-ain't)  deep-learning  neural-networks  feature-construction  to-write-about  discrete-and-continuous-sittin-in-a-tree 
9 days ago by Vaguery
[1812.01717] Towards Accurate Generative Models of Video: A New Metric & Challenges
Recent advances in deep generative models have lead to remarkable progress in synthesizing high quality images. Following their successful application in image processing and representation learning, an important next step is to consider videos. Learning generative models of video is a much harder task, requiring a model to capture the temporal dynamics of a scene, in addition to the visual presentation of objects. Although recent attempts at formulating generative models of video have had some success, current progress is hampered by (1) the lack of qualitative metrics that consider visual quality, temporal coherence, and diversity of samples, and (2) the wide gap between purely synthetic video datasets and challenging real-world datasets in terms of complexity. To this extent we propose Fréchet Video Distance (FVD), a new metric for generative models of video based on FID, and StarCraft 2 Videos (SCV), a collection of progressively harder datasets that challenge the capabilities of the current iteration of generative models for video. We conduct a large-scale human study, which confirms that FVD correlates well with qualitative human judgment of generated videos, and provide initial benchmark results on SCV.
metrics  Frechet-distance  generative-models  representation  rather-interesting  video  feature-construction  to-write-about 
9 days ago by Vaguery
Fréchet distance - Wikipedia
In mathematics, the Fréchet distance is a measure of similarity between curves that takes into account the location and ordering of the points along the curves. It is named after Maurice Fréchet.
measurement  metrics  data-analysis  feature-construction  distance  computational-geometry  to-consider  ReQ 
20 days ago by Vaguery
Ecological theory provides insights about evolutionary computation [PeerJ Preprints]
Evolutionary algorithms often incorporate ecological concepts to help maintain diverse populations and drive continued innovation. However, while there is strong evidence for the value of ecological dynamics, a lack of overarching theoretical framework renders the precise mechanisms behind these results unclear. These gaps in our understanding make it challenging to predict which approaches will be most appropriate for a given problem. Biologists have been developing ecological theory for decades, but the resulting body of work has yet to be translated into an evolutionary computation context. This paper lays the groundwork for such a translation by applying ecological theory to three different selection mechanisms in evolutionary computation: fitness sharing, lexicase selection, and Eco-EA. First, we use ecological ideas to establish a framework that clarifies how these selection schemes are alike and how they differ. We then build upon this framework by using metrics from ecology to gather empirical data about the underlying differences in the population dynamics that these approaches produce. Specifically, we measure interaction networks and phylogenetic diversity within the population to explore long-term stable coexistence. Notably, we find that selection methods affect phylogenetic diversity differently than phenotypic diversity. These results can inform parameter selection, choice of selection scheme, and the development of new selection schemes.
evolutionary-algorithms  selection  looking-to-see  rather-interesting  hey-I-know-these-folks  artificial-life  feature-construction  community-formation 
6 weeks ago by Vaguery
[1810.10577] Cops and Robbers on Toroidal Chess Graphs
We investigate multiple variants of the game Cops and Robbers. Playing it on an n×n toroidal chess graph, the game is varied by defining moves for cops and robbers differently, always mimicking moves of certain chess pieces. In these cases, the cop number is completely determined.
graph-theory  games  rather-interesting  mathematical-recreations  feature-construction  to-write-about  to-simulate  nudge-targets  consider:classification 
6 weeks ago by Vaguery
[1302.1883] Mesh patterns with superfluous mesh
Mesh patterns are a generalization of classical permutation patterns that encompass classical, bivincular, Bruhat-restricted patterns, and some barred patterns. In this paper, we describe all mesh patterns whose avoidance is coincident with classical avoidance, in a sense declaring that the additional data of a mesh was unnecessary for these patterns. We also describe the permutations having the fewest superfluous meshes, and the permutations having the most, enumerating the superfluous meshes in each case.
permutations  combinatorics  representation  to-write-about  mathematical-recreations  pattern-discovery  feature-construction 
6 weeks ago by Vaguery
[1608.06931] Prolific permutations and permuted packings: downsets containing many large patterns
A permutation of n letters is k-prolific if each (n-k)-subset of the letters in its one-line notation forms a unique pattern. We present a complete characterization of k-prolific permutations for each k, proving that k-prolific permutations of m letters exist for every m \ge k^2/2+2k+1, and that none exist of smaller size. Key to these results is a natural bijection between k-prolific permutations and certain "permuted" packings of diamonds.
combinatorics  permutations  representation  tiling  to-understand  enumeration  optimization  packing  feature-construction 
6 weeks ago by Vaguery
[1704.05494] The pinnacle set of a permutation
Peak sets of a permutation record the indices of its peaks. These sets have been studied in a variety of contexts, including recent work by Billey, Burdzy, and Sagan, which enumerated permutations with prescribed peak sets. In this article, we look at a natural analogue of the peak set of a permutation, instead recording the values of the peaks. We define the "pinnacle set" of a permutation w to be the set {w(i) : i is a peak of w}. Although peak sets and pinnacle sets mark the same phenomenon, these objects differ in notable ways. In the work below, we characterize admissible pinnacle sets and study various enumerative questions related to these objects.
combinatorics  permutations  representation  to-understand  feature-construction  to-write-about 
6 weeks ago by Vaguery
[1803.08530] A Coloring Book Approach to Finding Coordination Sequences
An elementary method is described for finding the coordination sequences for a tiling, based on coloring the underlying graph. We illustrate the method by first applying it to the two kinds of vertices (tetravalent and trivalent) in the Cairo (or dual-3^ tiling. The coordination sequence for a tetravalent vertex turns out, surprisingly, to be 1, 4, 8 ,12, 16, ..., the same as for a vertex in the familiar square (or 4^4) tiling. We thought that such a simple fact should have a simple proof, and this article is the result. We also apply the method to obtain coordination sequences for the 3^,, 4.8^2, 3.12^2, and 3^4.6 uniform tilings, as well as the snub-632 and bew tilings. In several cases the results provide proofs for previously conjectured formulas.
combinatorics  feature-construction  representation  rather-interesting  enumeration  to-write-about  mathematical-recreations  consider:random-graphs  consider:non-tilings 
july 2018 by Vaguery
[1806.00521] The lemniscate tree of a random polynomial
To each generic complex polynomial p(z) there is associated a labeled binary tree (here referred to as a "lemniscate tree") that encodes the topological type of the graph of |p(z)|. The branching structure of the lemniscate tree is determined by the configuration (i.e., arrangement in the plane) of the singular components of those level sets |p(z)|=t passing through a critical point.
In this paper, we address the question "How many branches appear in a typical lemniscate tree?" We answer this question first for a lemniscate tree sampled uniformly from the combinatorial class and second for the lemniscate tree arising from a random polynomial generated by i.i.d. zeros. From a more general perspective, these results take a first step toward a probabilistic treatment (within a specialized setting) of Arnold's program of enumerating algebraic Morse functions.
geometry  topology  rather-interesting  probability-theory  data-structures  feature-construction  to-understand  to-write-about 
june 2018 by Vaguery
[1607.01117] Anagram-free Graph Colouring
An anagram is a word of the form WP where W is a non-empty word and P is a permutation of W. We study anagram-free graph colouring and give bounds on the chromatic number. Alon et al. (2002) asked whether anagram-free chromatic number is bounded by a function of the maximum degree. We answer this question in the negative by constructing graphs with maximum degree 3 and unbounded anagram-free chromatic number. We also prove upper and lower bounds on the anagram-free chromatic number of trees in terms of their radius and pathwidth. Finally, we explore extensions to edge colouring and k-anagram-free colouring.
graph-theory  algorithms  graph-coloring  feature-construction  nudge-targets  consider:looking-to-see  computational-complexity 
may 2018 by Vaguery
[1802.08995] Using Information Invariants to Compare Swarm Algorithms and General Multi-Robot Algorithms: A Technical Report
Robotic swarms are decentralized multi-robot systems whose members use local information from proximal neighbors to execute simple reactive control laws that result in emergent collective behaviors. In contrast, members of a general multi-robot system may have access to global information, all- to-all communication or sophisticated deliberative collabora- tion. Some algorithms in the literature are applicable to robotic swarms. Others require the extra complexity of general multi- robot systems. Given an application domain, a system designer or supervisory operator must choose an appropriate system or algorithm respectively that will enable them to achieve their goals while satisfying mission constraints (e.g. bandwidth, energy, time limits). In this paper, we compare representative swarm and general multi-robot algorithms in two application domains - navigation and dynamic area coverage - with respect to several metrics (e.g. completion time, distance trav- elled). Our objective is to characterize each class of algorithms to inform offline system design decisions by engineers or online algorithm selection decisions by supervisory operators. Our contributions are (a) an empirical performance comparison of representative swarm and general multi-robot algorithms in two application domains, (b) a comparative analysis of the algorithms based on the theory of information invariants, which provides a theoretical characterization supported by our empirical results.
swarms  robotics  planning  collective-intelligence  information-theory  feature-construction  rather-interesting  to-write-about 
march 2018 by Vaguery
[1801.01922] Vectorization of Line Drawings via PolyVector Fields
Image tracing is a foundational component of the workflow in graphic design, engineering, and computer animation, linking hand-drawn concept images to collections of smooth curves needed for geometry processing and editing. Even for clean line drawings, modern algorithms often fail to faithfully vectorize junctions, or points at which curves meet; this produces vector drawings with incorrect connectivity. This subtle issue undermines the practical application of vectorization tools and accounts for hesitance among artists and engineers to use automatic vectorization software. To address this issue, we propose a novel image vectorization method based on state-of-the-art mathematical algorithms for frame field processing. Our algorithm is tailored specifically to disambiguate junctions without sacrificing quality.
graphics  algorithms  inference  rather-interesting  feature-construction  nudge-targets  consider:representation  consider:looking-to-see  consider:performance-measures 
march 2018 by Vaguery
[1606.05381] Deep Image Set Hashing
In applications involving matching of image sets, the information from multiple images must be effectively exploited to represent each set. State-of-the-art methods use probabilistic distribution or subspace to model a set and use specific distance measure to compare two sets. These methods are slow to compute and not compact to use in a large scale scenario. Learning-based hashing is often used in large scale image retrieval as they provide a compact representation of each sample and the Hamming distance can be used to efficiently compare two samples. However, most hashing methods encode each image separately and discard knowledge that multiple images in the same set represent the same object or person. We investigate the set hashing problem by combining both set representation and hashing in a single deep neural network. An image set is first passed to a CNN module to extract image features, then these features are aggregated using two types of set feature to capture both set specific and database-wide distribution information. The computed set feature is then fed into a multilayer perceptron to learn a compact binary embedding. Triplet loss is used to train the network by forming set similarity relations using class labels. We extensively evaluate our approach on datasets used for image matching and show highly competitive performance compared to state-of-the-art methods.
algorithms  machine-learning  rather-interesting  feature-construction  nudge-targets  consider:looking-at-code  consider:performance-measures 
march 2018 by Vaguery
[1709.08119] Analogies between the crossing number and the tangle crossing number
Tanglegrams are special graphs that consist of a pair of rooted binary trees with the same number of leaves, and a perfect matching between the two leaf-sets. These objects are of use in phylogenetics and are represented with straightline drawings where the leaves of the two plane binary trees are on two parallel lines and only the matching edges can cross. The tangle crossing number of a tanglegram is the minimum crossing number over all such drawings and is related to biologically relevant quantities, such as the number of times a parasite switched hosts.
Our main results for tanglegrams which parallel known theorems for crossing numbers are as follows. The removal of a single matching edge in a tanglegram with n leaves decreases the tangle crossing number by at most n−3, and this is sharp. Additionally, if γ(n) is the maximum tangle crossing number of a tanglegram with n leaves, we prove 12(n2)(1−o(1))≤γ(n)<12(n2). Further, we provide an algorithm for computing non-trivial lower bounds on the tangle crossing number in O(n4) time. This lower bound may be tight, even for tanglegrams with tangle crossing number Θ(n2).
graph-theory  feature-construction  phylogenetics  metrics  distance-measures  algorithms  rather-interesting  computational-complexity  nudge-targets 
february 2018 by Vaguery
[1710.02034] Weight = nonlinearity for all small weight Boolean functions
Given a Boolean function f, the (Hamming) weight wt(f) and the nonlinearity N(f) are well known to be important in designing functions that are useful in cryptography. The nonlinearity is expensive to compute, in general, so any shortcuts for doing that for particular functions f are significant. It is clear from the definitions that wt(f) \leq N(f). We give a useful and practical sufficient condition for the weight and nonlinearity to be equal, namely wt(f) \leq (2 to the power n-2) if f has n variables. It is shown that this inequality cannot be improved, in general. This condition can also be used to actually compute the nonlinearity in some cases. As an application, we give a simple proof for the value of the nonlinearity of the well known majority function. Previous proofs relied on many detailed results for the Krawtchouk polynomials.
information-theory  Boolean-functions  representation  distance  cryptography  feature-construction  rather-interesting  to-write-about  consider:genetic-programming 
january 2018 by Vaguery
[1705.00441] Learning Topic-Sensitive Word Representations
Distributed word representations are widely used for modeling words in NLP tasks. Most of the existing models generate one representation per word and do not consider different meanings of a word. We present two approaches to learn multiple topic-sensitive representations per word by using Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements for the lexical substitution task indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task.
natural-language-processing  feature-construction  neural-networks  algorithms  nudge-targets  consider:looking-to-see  consider:performance-measures 
january 2018 by Vaguery

« earlier    

related tags

(it-ain't)  aesthetics  agent-based  algorithms  alternative-computational-models  approximation  artificial-life  auto  automata  benchmarking  biophysics  boolean-functions  cellular-automata  cheminformatics  classification  clustering  collective-intelligence  combinatorics  community-detection  community-formation  compare-to-pareto-gp-features  complexology  compression  computational-complexity  computational-geometry  computer-science  computer-vision  consider:approximation  consider:classification  consider:counterexamples  consider:evolutionary-algorithms  consider:exploration-vs-exploitation  consider:feature-discovery  consider:for-push  consider:generalization  consider:generalizing  consider:genetic-programming  consider:looking-at-code  consider:looking-to-see  consider:non-tilings  consider:performance-measures  consider:performance-space-analysis  consider:prediction-from-graph  consider:prediction  consider:random-graphs  consider:rediscovery  consider:reds  consider:representation  consider:review  consider:robustness  consider:simulation  consider:variant-overlays  coupled-oscillators  cryptography  cultural-dynamics  curse-of-dimensionality  data-analysis  data-structures  deep-learning  define-your-terms  democracy  digital-humanities  dimension-reduction  discrete-and-continuous-sittin-in-a-tree  discrete-mathematics  discriminators  distance-measures  distance  distributed-processing  dynamic-programming  dynamical-systems  edit-distance  emergent-design  encoders  energy-landscapes  engineering-design  enumeration  epidemiology  evolutionary-algorithms  explanation  extreme-values  face-recognition  fascism  feature-discovery  feature-extraction  fitness-landscapes  formal-languages  frechet-distance  game-theory  games  generative-art  generative-models  genetic-programming  geometry  graph-coloring  graph-layout  graph-theory  graphics  hard-problems  heuristics  hey-i-know-these-folks  hey-i-know-this-guy  hypergraphs  image-analysis  image-processing  image-synthesis  inference  information-theory  interpretability  introspection  knot-theory  lattice-proteins  learning-from-data  looking-to-see  machine-learning  magic-squares  mathematical-programming  mathematical-recreations  matrices  measurement  meta-optimization  metrics  modeling-is-not-mathematics  modeling  natural-language-processing  network-theory  neural-networks  nonlinear-dynamics  nudge-targets  number-theory  numerical-methods  ontology  optimization  out-of-the-box  outliers  packing  parametrization  party-like-its-1999  pattern-discovery  patterns  performance-measure  performance-space-analysis  permutations  pharmaceutical  philosophy-of-engineering  phylogenetics  plane-geometry  planning  political-economy  polygons  probability-theory  proof  propaganda  public-policy  quantum-computing  quantums  random-walks  rather-interesting  recognition  recommendations  representation  req  rigidity  robotics  robustness  rotor-router  sandpiles  selection  sentiment-analysis  signaling  similarity-measures  simulation  social-networks  social-norms  speech-synthesis  statistics  storytelling  strings  structural-biology  superresolution  swarms  tiling  time-series  time-warping  to-consider  to-do  to-simulate  to-understand  to-write-about  topology  tqsk-decoomposition  universality  updated  variable-selection  video  visualization  wavelets 

Copy this bookmark: