consider:performance-measures   197


[1801.01922] Vectorization of Line Drawings via PolyVector Fields
Image tracing is a foundational component of the workflow in graphic design, engineering, and computer animation, linking hand-drawn concept images to collections of smooth curves needed for geometry processing and editing. Even for clean line drawings, modern algorithms often fail to faithfully vectorize junctions, or points at which curves meet; this produces vector drawings with incorrect connectivity. This subtle issue undermines the practical application of vectorization tools and accounts for hesitance among artists and engineers to use automatic vectorization software. To address this issue, we propose a novel image vectorization method based on state-of-the-art mathematical algorithms for frame field processing. Our algorithm is tailored specifically to disambiguate junctions without sacrificing quality.
graphics  algorithms  inference  rather-interesting  feature-construction  nudge-targets  consider:representation  consider:looking-to-see  consider:performance-measures 
4 days ago by Vaguery
[1801.06769] Deep joint rain and haze removal from single images
Rain removal from a single image is a challenge that has been studied for a long time. In this paper, a novel convolutional neural network based on wavelets and the dark channel is proposed. On one hand, we observe that rain streaks correspond to the high-frequency components of the image; the Haar wavelet transform is therefore a good choice for separating the rain streaks from the background, at least to some extent. More specifically, the LL subband of a rain image tends to express the background information, while the LH, HL, and HH subbands tend to represent the rain streaks and edges. On the other hand, the accumulation of rain streaks over long distances makes a rain image look like a haze veil. We extract the dark channel of the rain image as a feature map in the network. By introducing this mapping between the dark channels of the input and output images, we achieve haze removal in an indirect way. All of the parameters are optimized by back-propagation. Experiments on both synthetic and real-world datasets reveal that our method outperforms other state-of-the-art methods from both a qualitative and a quantitative perspective.
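As a quick illustration of the two hand-crafted features this abstract leans on (not the paper's network itself), here is a minimal sketch; pywt, scipy, and the 15-pixel window are my assumptions, not choices from the paper.

    # Haar subbands and the dark channel of an image, assuming a
    # float grayscale / RGB image with values in [0, 1].
    import numpy as np
    import pywt
    from scipy.ndimage import minimum_filter

    def haar_subbands(gray):
        # One-level 2D Haar transform: LL carries mostly background,
        # LH/HL/HH carry rain streaks and edges.
        LL, (LH, HL, HH) = pywt.dwt2(gray, 'haar')
        return LL, LH, HL, HH

    def dark_channel(rgb, window=15):
        # Per-pixel minimum over color channels, then a local minimum
        # filter: the classic haze prior, used here as a feature map.
        return minimum_filter(rgb.min(axis=2), size=window)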
image-processing  machine-learning  deep-learning  algorithms  rather-interesting  consider:performance-measures 
11 days ago by Vaguery
[1606.05381] Deep Image Set Hashing
In applications involving the matching of image sets, information from multiple images must be effectively exploited to represent each set. State-of-the-art methods use a probabilistic distribution or a subspace to model each set, and a specific distance measure to compare two sets. These methods are slow to compute and not compact enough to use at large scale. Learning-based hashing is often used in large-scale image retrieval because it provides a compact representation of each sample, and the Hamming distance can be used to compare two samples efficiently. However, most hashing methods encode each image separately, discarding the knowledge that multiple images in the same set represent the same object or person. We investigate the set-hashing problem by combining both set representation and hashing in a single deep neural network. An image set is first passed to a CNN module to extract image features; these features are then aggregated using two types of set features, capturing both set-specific and database-wide distribution information. The computed set feature is fed into a multilayer perceptron to learn a compact binary embedding. Triplet loss is used to train the network, forming set similarity relations from class labels. We extensively evaluate our approach on datasets used for image matching and show highly competitive performance compared to state-of-the-art methods.
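A toy, non-learned version of the pipeline can make the set-feature idea concrete; the mean/max pooling and median-threshold binarization below are stand-ins for the paper's learned CNN and MLP components, not its method.

    import numpy as np

    def set_feature(image_feats):
        # image_feats: (n_images, d) array of per-image features.
        # Mean and max pooling give two complementary set summaries,
        # loosely echoing the paper's two set-feature types.
        return np.concatenate([image_feats.mean(axis=0),
                               image_feats.max(axis=0)])

    def binary_code(feat):
        return (feat > np.median(feat)).astype(np.uint8)

    def hamming(a, b):
        return int(np.count_nonzero(a != b))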
algorithms  machine-learning  rather-interesting  feature-construction  nudge-targets  consider:looking-at-code  consider:performance-measures 
11 days ago by Vaguery
[1709.08878] Generating Sentences by Editing Prototypes
We propose a new generative model of sentences that first samples a prototype sentence from the training corpus and then edits it into a new sentence. Compared to traditional models that generate from scratch either left-to-right or by first sampling a latent sentence vector, our prototype-then-edit model improves perplexity on language modeling and generates higher quality outputs according to human evaluation. Furthermore, the model gives rise to a latent edit vector that captures interpretable semantics such as sentence similarity and sentence-level analogies.
machine-learning  natural-language-processing  rather-interesting  representation  to-write-about  nudge-targets  consider:representation  consider:performance-measures  generative-models 
7 weeks ago by Vaguery
[1705.00441] Learning Topic-Sensitive Word Representations
Distributed word representations are widely used for modeling words in NLP tasks. Most existing models generate one representation per word and do not consider the different meanings a word can have. We present two approaches for learning multiple topic-sensitive representations per word using hierarchical Dirichlet processes. We observe that by modeling topics and integrating topic distributions for each document, we obtain representations that are able to distinguish between the different meanings of a given word. Our models yield statistically significant improvements on the lexical substitution task, indicating that the commonly used single representation per word, even when combined with contextual information, is insufficient for this task.
natural-language-processing  feature-construction  neural-networks  algorithms  nudge-targets  consider:looking-to-see  consider:performance-measures 
7 weeks ago by Vaguery
[1709.05725] FlashProfile: Interactive Synthesis of Syntactic Profiles
We address the problem of learning comprehensive syntactic profiles for a set of strings. Real-world datasets, typically curated from multiple sources, often contain data in various formats. Thus any data processing task is preceded by the critical step of data format identification. However, manual inspection of data to identify various formats is infeasible in standard big-data scenarios.
We present a technique for generating comprehensive syntactic profiles in terms of user-defined patterns that also allows for interactive refinement. We define a syntactic profile as a set of succinct patterns that describe the entire dataset. Our approach efficiently learns such profiles, and allows refinement by exposing a desired number of patterns.
Our implementation, FlashProfile, shows a median profiling time of 0.7 s over 142 tasks on 74 real datasets. We also show that access to the generated data profiles allows for more accurate synthesis of programs, using fewer examples, in programming-by-example workflows.
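For flavor, here is a much cruder relative of a syntactic profile: collapse every string to runs of character classes and count the distinct patterns. FlashProfile's learned, refinable patterns are far richer; this only sketches the idea.

    from collections import Counter

    def pattern(s):
        def cls(c):
            if c.isdigit(): return 'D'
            if c.isalpha(): return 'A'
            return c
        out = []
        for c in map(cls, s):    # collapse runs: "2017-11-03" -> "D-D-D"
            if not out or out[-1] != c:
                out.append(c)
        return ''.join(out)

    def profile(strings):
        return Counter(pattern(s) for s in strings)

    print(profile(["2017-11-03", "2016-05-21", "n/a"]))
    # Counter({'D-D-D': 2, 'A/A': 1})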
pattern-discovery  rather-interesting  strings  data-mining  statistics  nudge-targets  consider:looking-to-see  consider:representation  consider:performance-measures 
7 weeks ago by Vaguery
Self-Replicating Functions
There are many exciting directions to explore from here, but I’m going to take a break now, considering this as a nice initial step into the realm of self-replicating functions.
mathematical-recreations  self-similarity  nudge-targets  to-write-about  consider:performance-measures 
12 weeks ago by Vaguery
[1703.07244] A Hybrid Feasibility Constraints-Guided Search to the Two-Dimensional Bin Packing Problem with Due Dates
The two-dimensional non-oriented bin packing problem with due dates packs a set of rectangular items, which may be rotated by 90 degrees, into identical rectangular bins. The bins have equal processing times. An item's lateness is the completion time of its bin minus the item's due date. The problem packs all items without overlap so as to minimize the maximum lateness Lmax.
The paper proposes a tight lower bound that improves an existing bound on Lmax for 24.07% of the benchmark instances and matches it in 30.87% of cases. In addition, it models the problem using mixed integer programming (MIP) and solves small instances exactly using CPLEX. It solves larger instances approximately using a two-stage heuristic. The first stage constructs an initial solution via a first-fit heuristic that applies an iterative constraint programming (CP)-based neighborhood search. The second stage, which is also iterative, approximately solves a series of low-level assignment MIPs that are guided by feasibility constraints; it then improves the solution via a high-level random local search. The approximate approach improves existing upper bounds by 27.45% on average and obtains the optimum for 33.93% of the instances. Overall, the exact and approximate approaches identify the optimum in 39.07% of cases.
The proposed approach is applicable to complex problems. It applies CP and MIP sequentially, exploiting the advantages of each, and hybridizes heuristic search with MIP. It embeds a new lookahead strategy that guards against infeasible search directions and constrains the search to improving directions only; it thus differs from traditional lookahead beam searches.
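A worked example of the objective, with made-up completion times and due dates:

    # Lateness = completion time of the item's bin minus its due date;
    # the objective tracks the maximum over all items.
    bin_completion = {0: 10, 1: 20}        # bin -> completion time
    item_bin = {'a': 0, 'b': 0, 'c': 1}    # item -> bin it is packed in
    due = {'a': 12, 'b': 8, 'c': 25}       # item -> due date

    L_max = max(bin_completion[item_bin[i]] - due[i] for i in due)
    print(L_max)  # 2: item 'b' finishes at 10 but was due at 8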
operations-research  planning  algorithms  hard-problems  bin-packing  computational-complexity  to-write-about  rather-interesting  nudge-targets  consider:multiobjective-approach  consider:performance-measures 
november 2017 by Vaguery
[1710.01410] Learning Registered Point Processes from Idiosyncratic Observations
A parametric point process model is developed, with modeling based on the assumption that sequential observations often share latent phenomena, while also possessing idiosyncratic effects. An alternating optimization method is proposed to learn a "registered" point process that accounts for shared structure, as well as "warping" functions that characterize idiosyncratic aspects of each observed sequence. Under reasonable constraints, in each iteration we update the sample-specific warping functions by solving a set of constrained nonlinear programming problems in parallel, and update the model by maximum likelihood estimation. The justifiability, complexity and robustness of the proposed method are investigated in detail. Experiments on both synthetic and real-world data demonstrate that the method yields explainable point process models, achieving encouraging results compared to state-of-the-art methods.
modeling-is-not-mathematics  rather-interesting  representation  machine-learning  algorithms  nudge  nudge-targets  consider:representation  consider:performance-measures 
november 2017 by Vaguery
[1611.08035] Automating the Last-Mile for High Performance Dense Linear Algebra
High performance dense linear algebra (DLA) libraries often rely on a general matrix multiply (Gemm) kernel that is implemented using assembly or with vector intrinsics. In particular, the real-valued Gemm kernels provide the overwhelming fraction of performance for the complex-valued Gemm kernels, along with the entire level-3 BLAS and many of the real and complex LAPACK routines. Thus, achieving high performance for the Gemm kernel translates into a high-performance linear algebra stack above this kernel. However, it is a monumental task for a domain expert to manually implement the kernel for every library-supported architecture. This leads to the belief that the craft of a Gemm kernel is more dark art than science. It is this premise that drives the popularity of autotuning with code generation in the domain of DLA.
This paper instead focuses on an analytical approach to code generation of the Gemm kernel for different architectures, in order to shed light on the details, or voodoo, required for implementing a high-performance Gemm kernel. We distill the implementation of the kernel into an even smaller kernel, an outer product, and analytically determine how the available SIMD instructions can be used to compute the outer product efficiently. We codify this approach into a system that automatically generates a high-performance SIMD implementation of the Gemm kernel. Experimental results demonstrate that our approach yields generated kernels with performance that is competitive with kernels implemented manually or found via empirical search.
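The distillation the authors describe, reduced to its loop structure (real kernels map the rank-1 update onto SIMD FMA registers; this sketch is only the shape of the computation):

    import numpy as np

    def gemm_outer(A, B, C):
        # C := A B + C as a sequence of k rank-1 (outer-product) updates.
        m, k = A.shape
        for p in range(k):
            C += np.outer(A[:, p], B[p, :])   # the per-step FMAs
        return C

    A, B = np.random.rand(4, 3), np.random.rand(3, 5)
    assert np.allclose(gemm_outer(A, B, np.zeros((4, 5))), A @ B)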
matrices  algorithms  rather-interesting  optimization  code-generation  nudge-targets  consider:looking-to-see  consider:performance-measures 
november 2017 by Vaguery
[1406.5895] Universal Lyndon Words
A word w over an alphabet Σ is a Lyndon word if there exists an order defined on Σ for which w is lexicographically smaller than all of its conjugates (other than itself). We introduce and study universal Lyndon words, which are words over an n-letter alphabet that have length n! and whose conjugates are all Lyndon words. We show that universal Lyndon words exist for every n and exhibit combinatorial and structural properties of these words. We then define particular prefix codes, which we call Hamiltonian lex-codes, and show that every Hamiltonian lex-code is in bijection with the set of the shortest unrepeated prefixes of the conjugates of a universal Lyndon word. This allows us to give an algorithm for constructing all the universal Lyndon words.
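The definition is easy to check directly for the usual lexicographic order (checking "for some order", as universal Lyndon words require, would mean trying permutations of the alphabet):

    def conjugates(w):
        return [w[i:] + w[:i] for i in range(len(w))]

    def is_lyndon(w):
        # Strictly smaller than every conjugate other than itself.
        return all(w < c for c in conjugates(w)[1:])

    print(is_lyndon("aab"))   # True:  aab < aba and aab < baa
    print(is_lyndon("aba"))   # False: aab is a smaller conjugate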
combinatorics  strings  to-write-about  nudge-targets  consider:rediscovery  consider:performance-measures 
november 2017 by Vaguery
[1702.02017] Pushing the Bounds for Matrix-Matrix Multiplication
A tight lower bound on the I/O required when computing a matrix-matrix multiplication on a processor with two layers of memory is established. Prior work obtained weaker lower bounds by reasoning about the number of phases needed to perform C := AB, where each phase is a series of operations involving S reads and writes to and from fast memory, and S is the size of fast memory. A lower bound on the number of phases was then determined by obtaining an upper bound on the number of scalar multiplications performed per phase. This paper follows the same high-level approach, but improves the lower bound by considering C := AB + C instead of C := AB, and by bounding the maximum number of scalar fused multiply-adds (FMAs) per phase instead of scalar additions. Key to obtaining the new result is the decoupling of the per-phase I/O from the size of fast memory. The new lower bound is 2mnk/√S − 2S. The constant for the leading term is an improvement by a factor of 4√2. A theoretical algorithm that attains the lower bound is given, and it is discussed how Goto's state-of-the-art algorithm also, in some sense, meets the lower bound.
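Plugging illustrative numbers into the bound (sizes here are arbitrary, not from the paper):

    from math import sqrt

    m = n = k = 1024
    S = 32 * 1024     # assumed fast-memory size, in elements
    print(2 * m * n * k / sqrt(S) - 2 * S)   # minimum element transfers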
computational-complexity  algorithms  matrices  rather-interesting  nudge-targets  consider:looking-to-see  consider:representation  consider:performance-measures 
november 2017 by Vaguery
[1710.01992] Fast and Accurate Image Super-Resolution with Deep Laplacian Pyramid Networks
Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize recursive layers to share parameters both across and within pyramid levels, drastically reducing the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.
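The pyramid the network is organized around can be built classically in a few lines; scipy's gaussian_filter and zoom stand in here for proper pyrDown/pyrUp filtering, so this is a background sketch rather than the paper's construction.

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def laplacian_pyramid(img, levels=3):
        pyramid, current = [], img.astype(float)
        for _ in range(levels):
            down = gaussian_filter(current, sigma=1)[::2, ::2]
            up = zoom(down, 2, order=1)[:current.shape[0], :current.shape[1]]
            pyramid.append(current - up)      # high-frequency residual
            current = down
        pyramid.append(current)               # low-frequency base
        return pyramid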
superresolution  neural-networks  representation  algorithms  image-processing  generative-models  to-write-about  nudge-targets  consider:representation  consider:performance-measures 
november 2017 by Vaguery
[1401.6200] Three Series for the Generalized Golden Mean
As is well-known, the ratio of adjacent Fibonacci numbers tends to φ = (1 + √5)/2, and the ratio of adjacent Tribonacci numbers (where each term is the sum of the three preceding numbers) tends to the real root η of x³ − x² − x − 1 = 0. Letting α(n) denote the corresponding ratio for the generalized Fibonacci numbers, where each term is the sum of the n preceding terms, we obtain rapidly converging series for α(n), 1/α(n), and 1/(2 − α(n)).
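Looking-to-see for the limit claims, via direct computation of the ratios (the rapidly converging series are the paper's contribution and aren't reproduced here):

    def alpha(n, terms=64):
        # n-step Fibonacci: each term is the sum of the n preceding.
        seq = [0] * (n - 1) + [1]
        for _ in range(terms):
            seq.append(sum(seq[-n:]))
        return seq[-1] / seq[-2]

    print(alpha(2))  # ~1.6180339887  (phi)
    print(alpha(3))  # ~1.8392867552  (the Tribonacci constant, eta)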
number-theory  approximation  rather-interesting  series  convergence  nudge-targets  consider:evolving-some  consider:performance-measures  consider:convergence  consider:complexity 
october 2017 by Vaguery
[1607.05342] On Integer Programming and the Path-width of the Constraint Matrix
In the classic Integer Programming (IP) problem, the objective is to decide whether, for a given m×n matrix A and an m-vector b = (b₁, …, bₘ), there is a non-negative integer n-vector x such that Ax = b. Solving IP is an important step in numerous algorithms, and it is important to obtain an understanding of the precise complexity of this problem as a function of natural parameters of the input. Two significant results in this line of research are the pseudo-polynomial time algorithms for IP when the number of constraints is a constant [Papadimitriou, J. ACM 1981] and when the branch-width of the column-matroid corresponding to the constraint matrix is a constant [Cunningham and Geelen, IPCO 2007]. In this paper, we prove matching upper and lower bounds for IP when the path-width of the corresponding column-matroid is a constant. These lower bounds provide evidence that the algorithm of Cunningham and Geelen is probably optimal. We also obtain a separate lower bound providing evidence that the algorithm of Papadimitriou is close to optimal.
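The decision problem itself fits in a few lines of brute force for a tiny instance (the search bound is ad hoc; actual IP algorithms are what the complexity results above are about):

    from itertools import product

    def ip_feasible(A, b, bound=10):
        # Is there a non-negative integer vector x with Ax = b?
        n = len(A[0])
        for x in product(range(bound + 1), repeat=n):
            if all(sum(A[i][j] * x[j] for j in range(n)) == b[i]
                   for i in range(len(A))):
                return x
        return None

    print(ip_feasible([[2, 3], [1, 1]], [8, 3]))  # (1, 2)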
classification  benchmarking  rather-interesting  mathematical-programming  matrices  feature-construction  nudge-targets  consider:rediscovery  consider:performance-measures  computational-complexity 
october 2017 by Vaguery
[1610.04041] A resource dependent protein synthesis model for evaluating synthetic circuits
Reliable in-silico design of synthetic gene networks necessitates novel approaches to modeling the process of protein synthesis under the influence of limited resources. We present such a novel protein synthesis model, which originates from the Ribosome Flow Model and, among other things, describes the movement of ribosomes and RNA polymerase along mRNA and DNA templates, respectively. By analyzing the convergence properties of this model using geometric considerations, we present additional insights into the dynamic mechanisms of protein synthesis. Further, we show by example how this model can be used to evaluate the performance of synthetic gene circuits under different loading scenarios.
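A minimal numerical integration of the Ribosome Flow Model the authors build on, with made-up rates; the paper's model extends this considerably, so treat the sketch as background only.

    import numpy as np

    def rfm_step(x, lam, dt=0.01):
        # x[i]: occupancy of site i, in [0, 1]; lam: rates lam[0..n].
        # Flow into each site is gated by how empty that site is.
        n = len(x)
        dx = np.empty(n)
        inflow = lam[0] * (1 - x[0])                  # initiation
        for i in range(n):
            if i < n - 1:
                outflow = lam[i + 1] * x[i] * (1 - x[i + 1])
            else:
                outflow = lam[n] * x[i]               # termination
            dx[i] = inflow - outflow
            inflow = outflow
        return x + dt * dx

    x, lam = np.zeros(5), np.ones(6)
    for _ in range(5000):
        x = rfm_step(x, lam)
    print(x)   # steady-state occupancy profile along the template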
systems-biology  reaction-networks  engineering-design  emergent-design  biological-engineering  to-write-about  performance-measure  nudge-targets  consider:performance-measures 
october 2017 by Vaguery
[1604.03159] Phase Transitions and a Model Order Selection Criterion for Spectral Graph Clustering
One of the longstanding open problems in spectral graph clustering (SGC) is the so-called model order selection problem: automated selection of the correct number of clusters. This is equivalent to the problem of finding the number of connected components or communities in an undirected graph. We propose automated model order selection (AMOS), a solution to the SGC model selection problem under a random interconnection model (RIM) using a novel selection criterion that is based on an asymptotic phase transition analysis. AMOS can more generally be applied to discovering hidden block diagonal structure in symmetric non-negative matrices. Numerical experiments on simulated graphs validate the phase transition analysis, and real-world network data is used to validate the performance of the proposed model selection procedure.
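A simpler baseline for the same model-order question is the eigengap heuristic: pick k at the largest gap in the Laplacian spectrum. AMOS is more principled; this is only the common point of comparison.

    import numpy as np

    def eigengap_k(adjacency, k_max=10):
        d = adjacency.sum(axis=1)
        L = np.diag(d) - adjacency                 # unnormalized Laplacian
        eigvals = np.sort(np.linalg.eigvalsh(L))[:k_max + 1]
        return int(np.argmax(np.diff(eigvals))) + 1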
network-theory  clustering  community-detection  algorithms  performance-measure  rather-interesting  consider:looking-to-see  consider:performance-measures 
october 2017 by Vaguery
Math Notes - Futility Closet
Can a sum of such fractions produce any natural number? That’s conjectured — but so far unproven.
number-theory  continued-fractions  mathematical-recreations  open-questions  nudge-targets  consider:looking-to-see  consider:performance-measures 
october 2017 by Vaguery
[1602.04967] Strongly Universal Reversible Gate Sets
It is well-known that the Toffoli gate and the negation gate together yield a universal gate set, in the sense that every permutation of {0,1}ⁿ can be implemented as a composition of these gates. Since every bit operation that does not use all of the bits performs an even permutation, we need to use at least one auxiliary bit to perform every permutation, and it is known that one bit is indeed enough. Without auxiliary bits, all even permutations can be implemented. We generalize these results to non-binary logic: If A is a finite set of odd cardinality then a finite gate set can generate all permutations of Aⁿ for all n, without any auxiliary symbols. If the cardinality of A is even then, by the same argument as above, only even permutations of Aⁿ can be implemented for large n, and we show that indeed all even permutations can be obtained from a finite universal gate set. We also consider the conservative case, that is, those permutations of Aⁿ that preserve the weight of the input word. The weight is the vector that records how many times each symbol occurs in the word. It turns out that no finite conservative gate set can, for all n, implement all conservative even permutations of Aⁿ without auxiliary bits. But we provide a finite gate set that can implement all those conservative permutations that are even within each weight class of Aⁿ.
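The opening claim is easy to verify by brute force: the Toffoli gate permutes {0,1}³, swapping exactly the pair 110 and 111. On n > 3 wires it repeats that swap once per setting of the untouched bits, 2^(n−3) times in total, which is the parity argument above.

    from itertools import product

    def toffoli(a, b, c):
        return (a, b, c ^ (a & b))   # flip target iff both controls set

    table = {x: toffoli(*x) for x in product((0, 1), repeat=3)}
    assert sorted(table.values()) == sorted(table)   # a bijection
    print([x for x in table if table[x] != x])       # [(1, 1, 0), (1, 1, 1)]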
automata  non-binary-representation  representation  engineering-design  formal-languages  rather-interesting  distributed-processing  digital-processing  nudge-targets  consider:performance-measures  combinatorics 
october 2017 by Vaguery
[1705.01991] Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Attentional sequence-to-sequence models have become the new standard for machine translation, but one challenge of such models is a significant increase in training and decoding cost compared to phrase-based systems. Here, we focus on efficient decoding, with a goal of achieving accuracy close to the state-of-the-art in neural machine translation (NMT), while achieving CPU decoding speed/throughput close to that of a phrasal decoder.
We approach this problem from two angles: First, we describe several techniques for speeding up an NMT beam search decoder, which obtain a 4.4x speedup over a very efficient baseline decoder without changing the decoder output. Second, we propose a simple but powerful network architecture which uses an RNN (GRU/LSTM) layer at the bottom, followed by a series of stacked fully-connected layers applied at every timestep. This architecture achieves accuracy similar to a deep recurrent model, at a small fraction of the training and decoding cost. By combining these techniques, our best system achieves a very competitive accuracy of 38.3 BLEU on WMT English-French NewsTest2014, while decoding at 100 words/sec on a single-threaded CPU. We believe this is the best published accuracy/speed trade-off for an NMT system.
machine-learning  representation  algorithms  deep-learning  neural-networks  approximation  nudge-targets  consider:performance-measures  consider:representation  natural-language-processing 
september 2017 by Vaguery


related tags

3d  aesthetics  algebra  algorithms  approximation  archaeology  architecture  artificial-intelligence  assumptions  automata  benchmarking  bin-packing  bioinformatics  biological-engineering  biologically-inspired  cellular-automata  classification  clustering  code-generation  coding  collective-behavior  collective-intelligence  color-theory  combinatorics  community-detection  complexology  compressed-sensing  compression  computational-complexity  computational-geometry  computer-science  consider:complexity  consider:convergence  consider:evolving-some  consider:feature-discovery  consider:interestingness  consider:looking-at-code  consider:looking-to-see  consider:metaheuristics  consider:multiobjective-approach  consider:multiobjective-versions  consider:novelty-search  consider:parsimony  consider:rediscovery  consider:reds  consider:representation  consider:robustness  consider:stress-testing  consider:symbolic-regression  constraint-satisfaction  construction  continued-fractions  convergence  counting  cryptography  data-analysis  data-fusion  data-geometry  data-mining  database  databases  dataset  decision-making  deep-learning  define-your-terms  developmental-biology  digital-processing  discrete-mathematics  distributed-processing  dna-assembly  dynamical-systems  economics  emergent-design  energy-landscapes  engineering-design  enumeration  error  exploratory-data-analysis  expressive-representations  feature-construction  feature-extraction  fibonacci-everything  financial-engineering  finite-elements  fitness-landscapes  formal-languages  fractals  game-theory  games  generalization  generative-art  generative-models  generative-processes  genetic-algorithm  geometry  good-ol-dykstra  gossip-algorithm  graph-databases  graph-layout  graph-theory  graphic-design  graphics  greedy?  hard-problems  horse-races  how-much-is-that-in-utiles?  
hyperheuristics  hyperresolution  hypersets  image-analysis  image-processing  image-segmentation  inference  information-theory  introspection  inverse-problems  latin-squares  learning-from-data  linear-algebra  looking-to-see  machine-learning  magic-squares  materials-science  mathematical-programming  mathematical-recreations  mathematics  matrices  medical-technology  metaheuristics  metamaterials  metrics  modeling-is-not-mathematics  modeling  models-and-modes  molecular-biology  multiobjective-optimization  multitask-learning  natural-language-processing  network-theory  neural-networks  non-binary-representation  nonlinear-dynamics  nonstandard-computation  not-necessarily-good-philosophy-of-science  nudge-targets  nudge  number-theory  numerical-methods  oh-computer-science  ontology  open-problems  open-questions  open-source  operations-research  optimization  optimizers  order-statistics  out-of-the-box  packing  parallel  parsing  pattern-discovery  patterns  performance-measure  permutations  philosophy-of-engineering  philosophy-of-science  phyllotaxis  physics  plane-geometry  planning  portfolio-theory  prediction  preferences  probability-theory  proof  public-policy  purdy-pitchers  queueing-theory  rather-interesting  reaction-networks  reinforcement-learning  reminds-me-of-wim's-paper  representation  rewriting-systems  robustness  salience  sampling  satisfiability  scheduling  self-driving-juggernauts  self-organization  self-similarity  sequences  series  signal-processing  simulation  software-development  statistics  strings  superresolution  systems-biology  tiling  time-series  to-read  to-understand  to-write-about  to-write  tomography  user-centric?  video  visualization 
