13029
[1705.00927] $(22_4)$ and $(26_4)$ configurations of lines
We present a technique to produce arrangements of lines with nice properties. As an application, we construct (224) and (264) configurations of lines. Thus concerning the existence of geometric (n4) configurations, only the case n=23 remains open.
enumeration  counting  design  rather-interesting  workaround  heuristics  combinatorial-explosion  to-write-about  consider:ontology  consider:rediscovery
yesterday
Khanlou | Continuous Integration
Code that isn’t integrated into your team’s mainline branch should be considered a liability at best, and dead at worst. Your job is to get the small, workable units of code merged in as early as possible.
yesterday
[1907.03902] Uncertainty and causal emergence in complex networks
The connectivity of a network conveys information about the dependencies between nodes. We show that this information can be analyzed by measuring the uncertainty (and certainty) contained in paths along nodes and links in a network. Specifically, we derive from first principles a measure known as effective information and describe its behavior in common network models. Networks with higher effective information contain more information within the dependencies between nodes. We show how subgraphs of nodes can be grouped into macro-nodes, reducing the size of a network while increasing its effective information, a phenomenon known as causal emergence. We find that causal emergence is common in simulated and real networks across biological, social, informational, and technological domains. Ultimately, these results show that the emergence of higher scales in networks can be directly assessed, and that these higher scales offer a way to create certainty out of uncertainty.
network-theory  rather-interesting  robustness  looking-to-see  to-write-about  to-understand  feature-construction  consider:causal-gp-dynamics
yesterday
[1908.01817] Sparsity Emerges Naturally in Neural Language Models
Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks. If sparsity is important for NLP, might well-trained neural models naturally become roughly sparse? Using the Taxi-Euclidean norm to measure sparsity, we find that frequent input words are associated with concentrated or sparse activations, while frequent target words are associated with dispersed activations but concentrated gradients. We find that gradients associated with function words are more concentrated than the gradients of content words, even controlling for word frequency.
deep-learning  interpretability  approximation  sparse-matrices  to-write-about  machine-learning  yeah-but-it's-still-a-constrained-ontology  consider:genetic-programming
yesterday
Voronoi Percolation
See this article about percolation over triangulations. Construct a graph by picking random points and then running the Delaunay Triangulation. Assign a random color to each point. The percolations keep only the edges connecting points to their own color. Here’s the result:
explorations  rather-interesting  examples  javascript  interactive-exposition  to-emulate
5 days ago
[1903.05802] A generalization of Edelman--Greene insertion for Schubert polynomials
Edelman and Greene generalized the Robinson--Schensted--Knuth correspondence to reduced words in order to give a bijective proof of the Schur positivity of Stanley symmetric functions. Stanley symmetric functions may be regarded as the stable limits of Schubert polynomials, and similarly Schur functions may be regarded as the stable limits of Demazure characters for the general linear group. We modify the Edelman--Greene correspondence to give an analogous, explicit formula for the Demazure character expansion of Schubert polynomials. Our techniques utilize dual equivalence and its polynomial variation, but here we demonstrate how to extract explicit formulas from that machinery which may be applied to other positivity problems as well.
combinatorics  permutations  group-theory  category-theory  to-understand  rewriting-systems
10 days ago
[1908.00971] Physical machine learning outperforms "human learning" in Quantum Chemistry
Two types of approaches to modeling molecular systems have demonstrated high practical efficiency. Density functional theory (DFT), the most widely used quantum chemical method, is a physical approach predicting energies and electron densities of molecules. Recently, numerous papers on machine learning (ML) of molecular properties have also been published. ML models greatly outperform DFT in terms of computational costs, and may even reach comparable accuracy, but they are missing physicality - a direct link to Quantum Physics - which limits their applicability. Here, we propose an approach that combines the strong sides of DFT and ML, namely, physicality and low computational cost. We derive general equations for exact electron densities and energies that can naturally guide applications of ML in Quantum Chemistry. Based on these equations, we build a deep neural network that can compute electron densities and energies of a wide range of organic molecules not only much faster, but also closer to exact physical values than current versions of DFT. In particular, we reached a mean absolute error in energies of molecules with up to eight non-hydrogen atoms as low as 0.9 kcal/mol relative to CCSD(T) values, noticeably lower than those of DFT (approaching ~2 kcal/mol) and ML (~1.5 kcal/mol) methods. A simultaneous improvement in the accuracy of predictions of electron densities and energies suggests that the proposed approach describes the physics of molecules better than DFT functionals developed by "human learning" earlier. Thus, physics-based ML offers exciting opportunities for modeling, with high-theory-level quantum chemical accuracy, of much larger molecular systems than currently possible.
emergent-design  machine-learning  neural-networks  quantums  molecular-design  complexology  to-write-about
12 days ago
Metaphors we believe by: the pantheon of 2019
IT’S 2013, AND I’M LOUNGING on the hardwood floor of my childhood bedroom with a blasphemous book that I smuggled home from college: Sam Harris’ Letter to a Christian Nation. I devour the whole thing in a single sitting. It feels like Harris is letting me in on a giant secret, a Big Truth that my very Christian upbringing had kept hidden from me. God is dead, he says. We killed him a few hundred years ago and replaced him with a sophisticated scientific understanding of life, the universe, and everything.

Fast-forward six years, and I’ve started to realize that the God vs. science situation is a bit more complicated than Sam Harris led me to believe. The more I learn, the more I suspect that rationalists only managed to kill a very narrow and anthropomorphic conception of God. People who study complex systems started using new words to talk about god-like phenomena — metaphors that are more palatable to secular minds. I believe these new words can help scientifically-minded people better understand what it actually felt like to believe in God before science became a Thing. Let’s take a tour through the pantheon of 2019 and explore what these seven “gods” might teach us in our era of ecological crisis and post-truth confusion.
12 days ago
Cybergothic Acid Communism Now | Sarah Jaffe
Capitalist Realism exists as a tight little bomb of a book that no one really has any excuse not to read. But in case anyone hasn’t, the concept threads through the k-punk collection; the idea that we live under the shadow of “there is no alternative,” unable to imagine a better way to organize society, let alone to struggle for one. Such “realism,” Fisher explained, was deeply unreal, particularly as we all live in the shadow of climate catastrophe; the tsk-tsking of the centrist ruling class is death drive posing as maturity, and the power of capitalist realism an expression of class decomposition, the fading of class consciousness. Peering through this gloom, Fisher nonetheless glimpsed some endings. After 2008, he wrote, “Neoliberalism is finished as a project, even if it lurches on, thrashing around like a decorticated terminator.”
12 days ago
Summer Maths Puzzles - Home Page | Isaac Newton Institute for Mathematical Sciences
Each weekday throughout August we will be publishing a new maths-based puzzle.  They won't require sophisticated maths to solve, but equally they won't be easy. Discussing your ideas might help.
12 days ago
Subspace Neural Physics: Fast Data-Driven Interactive Simulation
Abstract: Data-driven methods for physical simulation are an attractive option for interactive applications due to their ability to trade precomputation and memory footprint in exchange for improved runtime performance. Yet, existing data-driven methods fall short of the extreme memory and performance constraints imposed by modern interactive applications like AAA games and virtual reality. Here, performance budgets for physics simulation range from tens to hundreds of micro-seconds per frame, per object. We present a data-driven physical simulation method that meets these constraints. Our method combines subspace simulation techniques with machine learning which, when coupled, enables a very efficient subspace-only physics simulation that supports interactions with external objects – a longstanding challenge for existing subspace techniques. We also present an interpretation of our method as a special case of subspace Verlet integration, where we apply machine learning to efficiently approximate the physical forces of the system directly in the subspace. We propose several practical solutions required to make effective use of such a model, including a novel training methodology required for prediction stability, and a GPU-friendly subspace decompression algorithm to accelerate rendering.
12 days ago
[1604.00222] Ordering of nested square roots of 2 according to Gray code
In this paper we discuss some relations between zeros of Lucas-Lehmer polynomials and Gray code. We study nested square roots of 2 applying a "binary code" that associates bits 0 and 1 to ⊕ and ⊖ signs in the nested form. This gives the possibility to obtain an ordering for the zeros of Lucas-Lehmer polynomials, which assume the form of nested square roots of 2.
number-theory  approximation  dynamical-systems  rather-interesting  Ramanujan  representation  to-simulate  to-write-about
12 days ago
[1403.2882] A Fibonacci control system with application to hyper-redundant manipulators
We study a robot snake model based on a discrete linear control system involving Fibonacci sequence and closely related to the theory of expansions in non-integer bases. The present paper includes an investigation of the reachable workspace, a more general analysis of the control system underlying the model, its reachability and local controllability properties and the relation with expansions in non-integer bases and with iterated function systems.
control-theory  robotics  rather-odd  to-understand  the-words-and-pictures-are-confusing?
12 days ago
[1606.05674] Kinetic models of collective decision-making in the presence of equality bias
We introduce and discuss kinetic models describing the influence of the competence in the evolution of decisions in a multi-agent system. The original exchange mechanism, which is based on the human tendency to compromise and change opinion through self-thinking, is here modified to include the role of the agents' competence. In particular, we take into account the agents' tendency to behave in the same way as if they were as good, or as bad, as their partner: the so-called equality bias. This occurred in a situation where a wide gap separated the competence of group members. We discuss the main properties of the kinetic models and numerically investigate some examples of collective decision under the influence of the equality bias. The results confirm that the equality bias leads the group to suboptimal decisions.
decision-making  models  agent-based  rather-convoluted  economics  game-theory  wisdom-of-crowds  to-understand
12 days ago
[1606.09597] $π$-formulas and Gray code
In previous papers we introduced a class of polynomials which follow the same recursive formula as the Lucas-Lehmer numbers, studying the distribution of their zeros and remarking that this distribution follows a sequence related to the binary Gray code. It allowed us to give an order for all the zeros of every polynomial Ln. In this paper, the zeros, expressed in terms of nested radicals, are used to obtain two formulas for π: the first can be seen as a generalization of the known formula
π=limn→∞2n+1⋅2−2+2+2+...+2‾√‾‾‾‾‾‾‾‾‾‾√‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾√‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾√n‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾‾⎷ ,
related to the smallest positive zero of Ln; the second is an exact formula for π achieved thanks to some identities valid for Ln.
numerical-methods  approximation  continued-fractions  (sorta)  number-theory  algorithms  to-simulate  to-write-about  consider:genetic-programming
12 days ago
[1705.06314] Tire tracks and integrable curve evolution
We study a simple model of bicycle motion: a segment of fixed length in multi-dimensional Euclidean space, moving so that the velocity of the rear end is always aligned with the segment. If the front track is prescribed, the trajectory of the rear wheel is uniquely determined via a certain first order differential equation -- the bicycle equation. The same model, in dimension two, describes another mechanical device, the hatchet planimeter.
Here is a sampler of our results. We express the linearized flow of the bicycle equation in terms of the geometry of the rear track; in dimension three, for closed front and rear tracks, this is a version of the Berry phase formula. We show that in all dimensions a sufficiently long bicycle also serves as a planimeter: it measures, approximately, the area bivector defined by the closed front track. We prove that the bicycle equation also describes rolling, without slipping and twisting, of hyperbolic space along Euclidean space. We relate the bicycle problem with two completely integrable systems: the AKNS (Ablowitz, Kaup, Newell and Segur) system and the vortex filament equation. We show that "bicycle correspondence" of space curves (front tracks sharing a common back track) is a special case of a Darboux transformation associated with the AKNS system. We show that the filament hierarchy, encoded as a single generating equation, describes a 3-dimensional bike of imaginary length. We show that a series of examples of "ambiguous" closed bicycle curves (front tracks admitting self bicycle correspondence), found recently F. Wegner, are buckled rings, or solitons of the planar filament equation. As a case study, we give a detailed analysis of such curves, arising from bicycle correspondence with multiply traversed circles.
plane-geometry  rather-interesting  topology  constraint-satisfaction  dynamical-systems  algorithms  consider:looking-to-see  to-simulate  to-write-about  consider:rediscovery  nudge-targets
12 days ago
[1609.04106] The Inverse Gamma Distribution and Benford's Law
According to Benford's Law, many data sets have a bias towards lower leading digits (about 30% are 1's). The applications of Benford's Law vary: from detecting tax, voter and image fraud to determining the possibility of match-fixing in competitive sports. There are many common distributions that exhibit such bias, i.e. they are almost Benford. These include the exponential and the Weibull distributions. Motivated by these examples and the fact that the underlying distribution of factors in protein structure follows an inverse gamma distribution, we determine the closeness of this distribution to a Benford distribution as its parameters change.
probability-theory  Benford's-law  counterintuitive-stuff  algorithms  rather-interesting  to-write-about  to-simulate  consider:rediscovery
12 days ago
Bayesian synthesis of probabilistic programs for automatic data modeling
We present new techniques for automatically constructing probabilistic programs for data analysis, interpretation, and prediction. These techniques work with probabilistic domain-specific data modeling languages that capture key properties of a broad class of data generating processes, using Bayesian inference to synthesize probabilistic programs in these modeling languages given observed data. We provide a precise formulation of Bayesian synthesis for automatic data modeling that identifies sufficient conditions for the resulting synthesis procedure to be sound. We also derive a general class of synthesis algorithms for domain-specific languages specified by probabilistic context-free grammars and establish the soundness of our approach for these languages. We apply the techniques to automatically synthesize probabilistic programs for time series data and multivariate tabular data. We show how to analyze the structure of the synthesized programs to compute, for key qualitative properties of interest, the probability that the underlying data generating process exhibits each of these properties. Second, we translate probabilistic programs in the domain-specific language into probabilistic programs in Venture, a general-purpose probabilistic programming system. The translated Venture programs are then executed to obtain predictions of new time series data and new multivariate data records. Experimental results show that our techniques can accurately infer qualitative structure in multiple real-world data sets and outperform standard data analysis methods in forecasting and predicting new data.
12 days ago
code2vec: learning distributed representations of code
We present a neural model for representing snippets of code as continuous distributed vectors (code embeddings''). The main idea is to represent a code snippet as a single fixed-length code vector, which can be used to predict semantic properties of the snippet. To this end, code is first decomposed to a collection of paths in its abstract syntax tree. Then, the network learns the atomic representation of each path while simultaneously learning how to aggregate a set of them.
We demonstrate the effectiveness of our approach by using it to predict a method's name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 12M methods. We show that code vectors trained on this dataset can predict method names from files that were unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies.
A comparison of our approach to previous techniques over the same dataset shows an improvement of more than 75%, making it the first to successfully predict method names based on a large, cross-project corpus. Our trained model, visualizations and vector similarities are available as an interactive online demo at http://code2vec.org. The code, data and trained models are available at https://github.com/tech-srl/code2vec.
12 days ago
[1902.11214] A Multilayer Structure Facilitates the Production of Antifragile Systems in Boolean Network Models
Antifragility is a property to not only resist stress and but also to benefit from it. Even though antifragile dynamics are found in various real-world complex systems where multiple subsystems interact with each other, the attribute has not been quantitatively explored yet in those complex systems which can be regarded as multilayer networks. Here we study how the multilayer structure affects the antifragility of the whole system. By comparing single-layer and multilayer Boolean networks based on our recently proposed antifragility measure, we found that the multilayer structure facilitated the production of antifragile systems. Our measure and findings can be utilized for many applications from understanding properties of biological systems with multilayer structures to designing more antifragile engineered systems.
Talebism  Kauffmania  boolean-networks  automata  engineering-design  emergent-design  to-write-about  to-clean-up  performance-measure  meh  an-abstraction-doing-a-bit-too-much-work
13 days ago
[1802.10400] A framework for (de)composing with Boolean automata networks
Boolean automata networks (BANs) are a generalisation of Boolean cellular automata. In such, any theorem describing the way BANs compute information is a strong tool that can be applied to a wide range of models of computation. In this paper we explore a way of working with BANs which involves adding external inputs to the base model (via modules), and more importantly, a way to link networks together using the above mentioned inputs (via wirings). Our aim is to develop a powerful formalism for BAN (de)composition. We formulate two results: the first one shows that our modules/wirings definition is complete, the second one uses modules/wirings to prove simulation results amongst BANs.
Kauffmania  boolean-networks  automata  rather-interesting  to-understand  to-simulate  generalization  consider:inverse-problems  modularity
13 days ago
[1801.06620] A high-performance analog Max-SAT solver and its application to Ramsey numbers
We introduce a continuous-time analog solver for MaxSAT, a quintessential class of NP-hard discrete optimization problems, where the task is to find a truth assignment for a set of Boolean variables satisfying the maximum number of given logical constraints. We show that the scaling of an invariant of the solver's dynamics, the escape rate, as function of the number of unsatisfied clauses can predict the global optimum value, often well before reaching the corresponding state. We demonstrate the performance of the solver on hard MaxSAT competition problems. We then consider the two-color Ramsey number R(m,m) problem, translate it to SAT, and apply our algorithm to the still unknown R(5,5). We find edge colorings without monochromatic 5-cliques for complete graphs up to 42 vertices, while on 43 vertices we find colorings with only two monochromatic 5-cliques, the best coloring found so far, supporting the conjecture that R(5,5)=43.
computational-complexity  analog-computing  algorithms  rather-interesting  to-write-about  satisfiability  optimization  combinatorics
13 days ago
[1810.08237] Large-scale Hierarchical Alignment for Data-driven Text Rewriting
We propose a simple unsupervised method for extracting pseudo-parallel monolingual sentence pairs from comparable corpora representative of two different text styles, such as news articles and scientific papers. Our approach does not require a seed parallel corpus, but instead relies solely on hierarchical search over pre-trained embeddings of documents and sentences. We demonstrate the effectiveness of our method through automatic and extrinsic evaluation on text simplification from the normal to the Simple Wikipedia. We show that pseudo-parallel sentences extracted with our method not only supplement existing parallel data, but can even lead to competitive performance on their own.
natural-language-processing  text-mining  translation  rather-interesting  algorithms  representation  machine-learning  clustering  unsupervised-learning  to-understand
13 days ago
[1902.01530] Flip cycles in plabic graphs
Planar bicolored (plabic) graphs are combinatorial objects introduced by Postnikov to give parameterizations of the positroid cells of the totally nonnegative Grassmannian Gr≥0(n,k). Any two plabic graphs for the same positroid cell can be related by a sequence of certain moves. The flip graph has plabic graphs as vertices and has edges connecting the plabic graphs which are related by a single move. A recent result of Galashin shows that plabic graphs can be seen as cross-sections of zonotopal tilings for the cyclic zonotope Z(n,3). Taking this perspective, we show that the fundamental group of the flip graph is generated by cycles of length 4, 5, and 10, and use this result to prove a related conjecture of Dylan Thurston about triple crossing diagrams. We also apply our result to make progress on an instance of the generalized Baues problem.
combinatorics  domino-tiling  graph-theory  dynamical-systems  rather-interesting  representation  to-write-about  group-theory
13 days ago
[1604.04876] (Almost) Efficient Mechanisms for Bilateral Trading
We study the bilateral trade problem: one seller, one buyer and a single, indivisible item for sale. It is well known that there is no fully-efficient and incentive compatible mechanism for this problem that maintains a balanced budget. We design simple and robust mechanisms that obtain approximate efficiency with these properties. We show that even minimal use of statistical data can yield good approximation results. Finally, we demonstrate how a mechanism for this simple bilateral-trade problem can be used as a "black-box" for constructing mechanisms in more general environments.
economics  game-theory  agent-based  to-simulate  consider:genetic-programming  representation  to-write-about  consider:looking-to-see
13 days ago
[1811.03254] Optimal Parallelism Bound for Fully Asynchronous Coordinate Descent with Linear Speedup
When solving massive optimization problems in areas such as machine learning, it is a common practice to seek speedup via massive parallelism. However, especially in an asynchronous environment, there are limits on the possible parallelism. Accordingly, we seek tight bounds on the viable parallelism in asynchronous implementations of coordinate descent.
We focus on asynchronous coordinate descent (ACD) algorithms on convex functions F:ℝn→ℝ of the form
F(x)=f(x) + ∑k=1nΨk(xk),
where f:ℝn→ℝ is a smooth convex function, and each Ψk:ℝ→ℝ is a univariate and possibly non-smooth convex function.
Our approach is to quantify the shortfall in progress compared to the standard sequential stochastic gradient descent. This leads to a truly simple yet optimal analysis of the standard stochastic ACD in a partially asynchronous environment, which already generalizes and improves on the bounds in prior work. We also give a considerably more involved analysis for general asynchronous environments in which the only constraint is that each update can overlap with at most q others, where q is at most the number of processors times the ratio in the lengths of the longest and shortest updates. The main technical challenge is to demonstrate linear speedup in the latter environment. This stems from the subtle interplay of asynchrony and randomization. This improves Liu and Wright's (SIOPT'15) lower bound on the maximum degree of parallelism almost quadratically; the new bound is essentially optimal.
13 days ago
[1903.02748] (Re)constructing code loops
The Parker loop is a central extension of the extended binary Golay code. It is an example of a general class of non-associative structures known as \emph{code loops}, which have been studied from a number of different algebraic and combinatorial perspectives. This expository paper aims to also highlight an experimental approach to computing in code loops, by a combination of a small amount of precomputed information and making use of the rich identities that code loops' twisted cocycles satisfy. As a biproduct one can reconstruct the multiplication in the Parker loop from a mere fragment of its twisted cocycle, and we have found relatively large subspaces of the Golay code over which the Parker loop splits as a direct product.
combinatorics  coding-theory  to-understand  group-theory  dynamical-systems  generative-models
13 days ago
[1907.08226] Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model
Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones.
Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model.
Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics. We show that there is a well defined region of parameters where the gradient-flow algorithm finds a good global minimum despite the presence of exponentially many spurious local minima.
We show that this is achieved by surfing on saddles that have strong negative direction towards the global minima, a phenomenon that is connected to a BBP-type threshold in the Hessian describing the critical points of the landscapes.
machine-learning  optimization  fitness-landscapes  metaheuristics  representation  to-understand  topology  dynamical-systems  consider:performance-measures  consider:better-faster
13 days ago
[1107.5667] Fractal bodies invisible in 2 and 3 directions
We study the problem of invisibility for bodies with a mirror surface in the framework of geometrical optics. We show that for any two given directions it is possible to construct a two-dimensional fractal body invisible in these directions. Moreover, there exists a three-dimensional fractal body invisible in three orthogonal directions. The work continues the previous study in [A. Aleksenko and A. Plakhov. Bodies of zero resistance and bodies invisible in one direction. Nonlinearity 22, 1247-1258 (2009)], [A Plakhov and V Roshchina. Invisibility in billiards. Nonlinearity 24, 847-854 (2011)], where two-dimensional bodies invisible in one direction and three-dimensional bodies invisible in one and two orthogonal directions were constructed.
constraint-satisfaction  billiards  rather-interesting  engineering-design  open-problems  performance-measure  looking-to-see  to-write-about
13 days ago
Atomization: the kaleidoscope of meaning | Meaningness
The previous, subcultural mode failed because individual subcultures did not provide enough breadth or depth of meaning; and because cliquish subsocieties made it too difficult to access the narrow meaningness they hoarded.
The global internet exploded that. Everything is equally available everywhere—which is fabulous! Now, there are no boundaries, so bits of culture float free. Unfortunately, with no binding contexts, nothing makes sense. Meanings arrive as bite-sized morsels in a jumbled stream, like sushi flowing past on a conveyer belt, or brilliant shards of colored glass in a kaleidoscope.
With no urge for context to make culture understandable, everything is equally appealing everywhere. The atomized mode returns to the universalism of the countercultural mode—but by default, rather than design. In the 1960s, for the first time, everyone in an American generation listened to the same music, regardless of genre—as an expression of solidarity. Now, everyone in the world listens to the same music, regardless of genre, again—just because it’s trending on YouTube.
cultural-dynamics  cultural-norms  Dunbar-all-the-things  rather-interesting  to-write-about  fluidity  humanities  where-I-come-from...  self-definition
13 days ago
[1807.01672] Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization
Adversarial self-play in two-player games has delivered impressive results when used with reinforcement learning algorithms that combine deep neural networks and tree search. Algorithms like AlphaZero and Expert Iteration learn tabula-rasa, producing highly informative training data on the fly. However, the self-play training strategy is not directly applicable to single-player games. Recently, several practically important combinatorial optimisation problems, such as the travelling salesman problem and the bin packing problem, have been reformulated as reinforcement learning problems, increasing the importance of enabling the benefits of self-play beyond two-player games. We present the Ranked Reward (R2) algorithm which accomplishes this by ranking the rewards obtained by a single agent over multiple games to create a relative performance metric. Results from applying the R2 algorithm to instances of a two-dimensional and three-dimensional bin packing problems show that it outperforms generic Monte Carlo tree search, heuristic algorithms and integer programming solvers. We also present an analysis of the ranked reward mechanism, in particular, the effects of problem instances with varying difficulty and different ranking thresholds.
machine-learning  reinforcement-learning  algorithms  self-play  mechanism-design  optimization  performance-measure  to-understand  operations-research  metaheuristics
13 days ago
[1804.02584] Random Order Contention Resolution Schemes
Contention resolution schemes have proven to be an incredibly powerful concept which allows to tackle a broad class of problems. The framework has been initially designed to handle submodular optimization under various types of constraints, that is, intersections of exchange systems (including matroids), knapsacks, and unsplittable flows on trees. Later on, it turned out that this framework perfectly extends to optimization under uncertainty, like stochastic probing and online selection problems, which further can be applied to mechanism design.
We add to this line of work by showing how to create contention resolution schemes for intersection of matroids and knapsacks when we work in the random order setting. More precisely, we do know the whole universe of elements in advance, but they appear in an order given by a random permutation. Upon arrival we need to irrevocably decide whether to take an element or not. We bring a novel technique for analyzing procedures in the random order setting that is based on the martingale theory. This unified approach makes it easier to combine constraints, and we do not need to rely on the monotonicity of contention resolution schemes.
Our paper fills the gaps, extends, and creates connections between many previous results and techniques. The main application of our framework is a k+4+ε approximation ratio for the Bayesian multi-parameter unit-demand mechanism design under the constraint of k matroids intersection, which improves upon the previous bounds of 4k−2 and e(k+1). Other results include improved approximation ratios for stochastic k-set packing and submodular stochastic probing over arbitrary non-negative submodular objective function, whereas previous results required the objective to be monotone.
operations-research  approximation  subproblems  rather-interesting  to-understand  online-learning  mechanism-design  machine-learning  uncertainty
13 days ago
[1711.04226] AON: Towards Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is a hot research topic in computer vision due to its various applications. Despite the enduring research of several decades on optical character recognition (OCR), recognizing texts from natural images is still a challenging task. This is because scene texts are often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted) arrangements, which have not yet been well addressed in the literature. Existing methods on text recognition mainly work with regular (horizontal and frontal) texts and cannot be trivially generalized to handle irregular texts. In this paper, we develop the arbitrary orientation network (AON) to directly capture the deep features of irregular texts, which are combined into an attention-based decoder to generate character sequence. The whole network can be trained end-to-end by using only images and word-level annotations. Extensive experiments on various benchmarks, including the CUTE80, SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed AON-based method achieves the-state-of-the-art performance in irregular datasets, and is comparable to major existing methods in regular datasets.
OCR  image-processing  image-segmentation  machine-learning  invariant-models  generalization  to-write-about  to-simulate
13 days ago
[1905.03427] Variable Neighborhood Search for the Bin Packing Problem with Compatible Categories
Bin Packing with Conflicts (BPC) are problems in which items with compatibility constraints must be packed in the least number of bins, not exceeding the capacity of the bins and ensuring that non-conflicting items are packed in each bin. In this work, we introduce the Bin Packing Problem with Compatible Categories (BPCC), a variant of the BPC in which items belong to conflicting or compatible categories, in opposition to the item-by-item incompatibility found in previous literature. It is a common problem in the context of last mile distribution to nanostores located in densely populated areas. To efficiently solve real-life sized instances of the problem, we propose a Variable Neighborhood Search (VNS) metaheuristic algorithm. Computational experiments suggest that the algorithm yields good solutions in very short times while compared to linear integer programming running on a high-performance computing environment.
operations-research  optimization  constraint-satisfaction  heuristics  metaheuristics  to-read  to-simulate
13 days ago
[1810.02909] On the Art and Science of Machine Learning Explanations
This text discusses several popular explanatory methods that go beyond the error measurements and plots traditionally used to assess machine learning models. Some of the explanatory methods are accepted tools of the trade while others are rigorously derived and backed by long-standing theory. The methods, decision tree surrogate models, individual conditional expectation (ICE) plots, local interpretable model-agnostic explanations (LIME), partial dependence plots, and Shapley explanations, vary in terms of scope, fidelity, and suitable application domain. Along with descriptions of these methods, this text presents real-world usage recommendations supported by a use case and public, in-depth software examples for reproducibility.
13 days ago
[1908.00868] Machine Learning as Ecology
Machine learning methods have had spectacular success on numerous problems. Here we show that a prominent class of learning algorithms - including Support Vector Machines (SVMs) -- have a natural interpretation in terms of ecological dynamics. We use these ideas to design new online SVM algorithms that exploit ecological invasions, and benchmark performance using the MNIST dataset. Our work provides a new ecological lens through which we can view statistical learning and opens the possibility of designing ecosystems for machine learning.
machine-learning  analogies  via:cshalizi  but-I-am-less-skeptical  a-good-analogy-is-a-blessing-to-the-mind  to-read
13 days ago
[1412.8615] On Randomized Algorithms for Matching in the Online Preemptive Model
We investigate the power of randomized algorithms for the maximum cardinality matching (MCM) and the maximum weight matching (MWM) problems in the online preemptive model. In this model, the edges of a graph are revealed one by one and the algorithm is required to always maintain a valid matching. On seeing an edge, the algorithm has to either accept or reject the edge. If accepted, then the adjacent edges are discarded. The complexity of the problem is settled for deterministic algorithms.
Almost nothing is known for randomized algorithms. A lower bound of 1.693 is known for MCM with a trivial upper bound of 2. An upper bound of 5.356 is known for MWM. We initiate a systematic study of the same in this paper with an aim to isolate and understand the difficulty. We begin with a primal-dual analysis of the deterministic algorithm due to McGregor. All deterministic lower bounds are on instances which are trees at every step. For this class of (unweighted) graphs we present a randomized algorithm which is 2815-competitive. The analysis is a considerable extension of the (simple) primal-dual analysis for the deterministic case. The key new technique is that the distribution of primal charge to dual variables depends on the "neighborhood" and needs to be done after having seen the entire input. The assignment is asymmetric: in that edges may assign different charges to the two end-points. Also the proof depends on a non-trivial structural statement on the performance of the algorithm on the input tree.
The other main result of this paper is an extension of the deterministic lower bound of Varadaraja to a natural class of randomized algorithms which decide whether to accept a new edge or not using independent random choices.
algorithms  online-algorithms  dynamic-programming  machine-learning  graph-theory  to-understand  mechanism-design  to-write-about  something-in-it
16 days ago
[1801.03462] Online Maximum Matching with Recourse
We study the online maximum matching problem in a model in which the edges are associated with a known recourse parameter k. An online algorithm for this problem has to maintain a valid matching while edges of the underlying graph are presented one after the other. At any moment the algorithm can decide to include an edge into the matching or to exclude it, under the restriction that at most k such actions per edge take place, where k is typically a small constant. This problem was introduced and studied in the context of general online packing problems with recourse by Avitabile et al. [AMP13], whereas the special case k=2 was studied by Boyar et al. [BFKL17].
In the first part of this paper, we consider the {\em edge arrival} model, in which an arriving edge never disappears from the graph. Here, we first show an improved analysis on the performance of the algorithm AMP given in [AMP13], by exploiting the structure of the matching problem. In addition, we extend the result of [BFKL17] and show that the greedy algorithm has competitive ratio 3/2 for every even k and ratio 2 for every odd k. Moreover, we present and analyze an improvement of the greedy algorithm which we call L-Greedy, and we show that for small values of k it outperforms the algorithm of [AMP13]. In terms of lower bounds, we show that no deterministic algorithm better than 1+1/(k−1) exists, improving upon the lower bound of 1+1/k shown in [AMP13].
The second part of the paper is devoted to the \emph{edge arrival/departure model}, which is the fully dynamic variant of online matching with recourse. The analysis of L-Greedy and AMP carry through in this model; moreover we show a lower bound of k2−3k+6k2−4k+7 for all even k≥4. For k∈{2,3}, the competitive ratio is 3/2.
online-algorithms  online-learning  rather-interesting  incremental-algorithms  decision-making  algorithms  machine-learning  combinatorics  to-understand  to-follow-downstream  to-write-about  ReQ  consider:backtracking  consider:performance-measures
16 days ago
[1608.02153] OCR of historical printings with an application to building diachronic corpora: A case study using the RIDGES herbal corpus
This article describes the results of a case study that applies Neural Network-based Optical Character Recognition (OCR) to scanned images of books printed between 1487 and 1870 by training the OCR engine OCRopus [@breuel2013high] on the RIDGES herbal text corpus [@OdebrechtEtAlSubmitted]. Training specific OCR models was possible because the necessary *ground truth* is available as error-corrected diplomatic transcriptions. The OCR results have been evaluated for accuracy against the ground truth of unseen test sets. Character and word accuracies (percentage of correctly recognized items) for the resulting machine-readable texts of individual documents range from 94% to more than 99% (character level) and from 76% to 97% (word level). This includes the earliest printed books, which were thought to be inaccessible by OCR methods until recently. Furthermore, OCR models trained on one part of the corpus consisting of books with different printing dates and different typesets *(mixed models)* have been tested for their predictive power on the books from the other part containing yet other fonts, mostly yielding character accuracies well above 90%. It therefore seems possible to construct generalized models trained on a range of fonts that can be applied to a wide variety of historical printings still giving good results. A moderate postcorrection effort of some pages will then enable the training of individual models with even better accuracies. Using this method, diachronic corpora including early printings can be constructed much faster and cheaper than by manual transcription. The OCR methods reported here open up the possibility of transforming our printed textual cultural heritage into electronic text by largely automatic means, which is a prerequisite for the mass conversion of scanned books.
OCR  image-processing  natural-language-processing  algorithms  machine-learning  rather-interesting  commodity-software  digital-humanities  to-write-about  consider:swarms  consider:stochastic-resonance
16 days ago
[1812.11224] Planar maps, random walks and circle packing
These are lecture notes of the 48th Saint-Flour summer school, July 2018, on the topic of planar maps, random walks and the circle packing theorem.
packing  graph-theory  plane-geometry  representation  algorithms  combinatorics  review  to-read  dynamical-systems
16 days ago
[1608.02895] The power of online thinning in reducing discrepancy
Consider an infinite sequence of independent, uniformly chosen points from [0,1]d. After looking at each point in the sequence, an overseer is allowed to either keep it or reject it, and this choice may depend on the locations of all previously kept points. However, the overseer must keep at least one of every two consecutive points. We call a sequence generated in this fashion a \emph{two-thinning} sequence. Here, the purpose of the overseer is to control the discrepancy of the empirical distribution of points, that is, after selecting n points, to reduce the maximal deviation of the number of points inside any axis-parallel hyper-rectangle of volume A from nA. Our main result is an explicit low complexity two-thinning strategy which guarantees discrepancy of O(log2d+1n) for all n with high probability (compare with Θ(nloglogn‾‾‾‾‾‾‾‾‾‾√) without thinning). The case d=1 of this result answers a question of Benjamini.
We also extend the construction to achieve the same asymptotic bound for (1+β)-thinning, a set-up in which rejecting is only allowed with probability β independently for each point. In addition, we suggest an improved and simplified strategy which we conjecture to guarantee discrepancy of O(logd+1n) (compare with θ(logdn), the best known construction of a low discrepancy sequence). Finally, we provide theoretical and empirical evidence for our conjecture, and provide simulations supporting the viability of our construction for applications.
online-algorithms  decision-making  probability-theory  online-learning  optimization  rather-interesting  to-understand  consider:selection  consider:performance-measures  consider:data-balancing  to-write-about  consider:approximation
16 days ago
[1807.01132] The power of thinning in balanced allocation
Balls are sequentially allocated into n bins as follows: for each ball, an independent, uniformly random bin is generated. An overseer may then choose to either allocate the ball to this bin, or else the ball is allocated to a new independent uniformly random bin. The goal of the overseer is to reduce the load of the most heavily loaded bin after Θ(n) balls have been allocated. We provide an asymptotically optimal strategy yielding a maximum load of (1+o(1))8lognloglogn‾‾‾‾‾‾‾√ balls.
queueing-theory  operations-research  optimization  online-algorithms  algorithms  branching-processes  probability-theory  to-simulate  to-write-about  consider:genetic-programming
16 days ago
[1812.07933] Dynamic Programming Approach to Template-based OCR
In this paper we propose a dynamic programming solution to the template-based recognition task in OCR case. We formulate a problem of optimal position search for complex objects consisting of parts forming a sequence. We limit the distance between every two adjacent elements with predefined upper and lower thresholds. We choose the sum of penalties for each part in given position as a function to be minimized. We show that such a choice of restrictions allows a faster algorithm to be used than the one for the general form of deformation penalties. We named this algorithm Dynamic Squeezeboxes Packing (DSP) and applied it to solve the two OCR problems: text fields extraction from an image of document Visual Inspection Zone (VIZ) and license plate segmentation. The quality and the performance of resulting solutions were experimentally proved to meet the requirements of the state-of-the-art industrial recognition systems.
OCR  image-segmentation  image-processing  algorithms  optimization  mathematical-programming  to-write-about  to-compare
16 days ago
[1812.03155] Kernelization of Packing Problems
Kernelization algorithms are polynomial-time reductions from a problem to itself that guarantee their output to have a size not exceeding some bound. For example, d-Set Matching for integers d>2 is the problem of finding a matching of size at least k in a given d-uniform hypergraph and has kernels with O(k^d) edges. Bodlaender et al. [JCSS 2009], Fortnow and Santhanam [JCSS 2011], Dell and Van Melkebeek [JACM 2014] developed a framework for proving lower bounds on the kernel size for certain problems, under the complexity-theoretic hypothesis that coNP is not contained in NP/poly. Under the same hypothesis, we show tight lower bounds for the kernelization of d-Set Matching and other packing problems.
Our bounds are tight for d-Set Matching: It does not have kernels with O(k^{d-{\epsilon}}) edges for any {\epsilon}>0 unless the hypothesis fails. By reduction, this transfers to a bound of O(k^{d-1-{\epsilon}}) for the problem of finding k vertex-disjoint cliques of size d in standard graphs. Obtaining tight bounds for graph packing problems is challenging: We make first progress in this direction by showing non-trivial kernels with O(k^2.5) edges for the problem of finding k vertex-disjoint paths of three edges each. If the paths have d edges each, we improve the straightforward O(k^{d+1}) kernel to a uniform polynomial kernel where the exponent of the kernel size is independent of k.
Most of our lower bound proofs follow a general scheme that we discover: To exclude kernels of size O(k^{d-{\epsilon}}) for a problem in d-uniform hypergraphs, one should reduce from a carefully chosen d-partite problem that is still NP-hard. As an illustration, we apply this scheme to the vertex cover problem, which allows us to replace the number-theoretical construction by Dell and Van Melkebeek [JACM 2014] with shorter elementary arguments.
operations-research  optimization  representation  combinatorics  rather-interesting  computational-complexity  to-write-about  to-simulate  consider:rediscovery
16 days ago
[1908.00089] A Model of Random Industrial SAT
One of the most studied models of SAT is random SAT. In this model, instances are composed from clauses chosen uniformly randomly and independently of each other. This model may be unsatisfactory in that it fails to describe various features of SAT instances, arising in real-world applications. Various modifications have been suggested to define models of industrial SAT. Here, we focus on community-structured SAT. Namely, the set of variables consists of a number of disjoint communities, and clauses tend to consist of variables from the same community. Thus, we suggest a model of random community-structured SAT. There has been a lot of work on the satisfiability threshold of random k-SAT, starting with the calculation of the threshold of 2-SAT, up to the recent result that the threshold exists for sufficiently large k. In this paper, we endeavor to study the satisfiability threshold for random industrial SAT. Our main result is that the threshold of random community-structured SAT tends to be smaller than its counterpart for random SAT. Moreover, under some conditions, this threshold even vanishes.
16 days ago
[1109.0516] The bifurcation locus for numbers of bounded type
We define a family B(t) of compact subsets of the unit interval which generalizes the sets of numbers whose continued fraction expansion has bounded digits. We study how the set B(t) changes as one moves the parameter t, and see that the family undergoes period-doubling bifurcations and displays the same transition pattern from periodic to chaotic behavior as the usual family of quadratic polynomials. The set E of bifurcation parameters is a fractal set of measure zero and Hausdorff dimension 1. We also show that the Hausdorff dimension of B(t) varies continuously with the parameter, and the dimension of each individual set equals the dimension of a corresponding section of the bifurcation set E.
continued-fractions  number-theory  dynamical-systems  representation  rather-interesting  to-understand  to-simulate  to-write-about  consider:feature-discovery
17 days ago
[1004.3025] Outer Billiards and the Pinwheel Map
In this paper we establish a kind of bijection between the orbits of a polygonal outer billiards system and the orbits of a related (and simpler to analyze) system called the pinwheel map. One consequence of the result is that the outer billiards system has unbounded orbits if and only if the pinwheel map has unbounded orbits. As the pinwheel map is much easier to analyze directly, we think that this bijection will be helpful in attacking some of the main questions about polyonal outer billiards.
plane-geometry  construction  dynamical-systems  algorithms  iterated-systems  to-write-about  to-simulate
17 days ago
[1410.5115] Remarks on the the circumcenter of mass
Suppose that to every non-degenerate simplex Delta in n-dimensional Euclidean space a `center' C(Delta) is assigned so that the following assumptions hold: (i) The map that assigns C(Delta) to Delta commutes with similarities and is invariant under the permutations of the vertices of the simplex; (ii) The map that assigns Vol(Delta) C(Delta) to Delta is polynomial in the coordinates of the vertices of the simplex. Then C(Delta) is an affine combination of the center of mass and the circumcenter of Delta (with the coefficients independent of the simplex). The motivation for this theorem comes from the recent study of the circumcenter of mass of simplicial polytopes by the authors and by A. Akopyan.
plane-geometry  construction  algorithms  feature-construction  statistics  nudge-targets  consider:looking-to-see
17 days ago
[1906.05588] The effect of random dispersal on competitive exclusion -- A review
Does a high dispersal rate provide a competitive advantage when risking competitive exclusion? To this day, the theoretical literature cannot answer this question in full generality. The present paper focuses on the simplest mathematical model with two populations differing only in dispersal abilities and whose one-dimensional territories are spatially segregated. Although the motion of the border between the two territories remains elusive in general, all cases investigated in the literature concur: either the border does not move at all because of some environmental heterogeneity or the fast diffuser chases the slow diffuser. Counterintuitively, it is better to randomly explore the hostile enemy territory, even if it means certain death of some individuals, than to ''stay united''. This directly contradicts a celebrated result on the intermediate competition case, emphasizing the importance of the competition intensity. Overall, the larger picture remains unclear and the optimal strategy regarding dispersal remains ambiguous. Several open problems worthy of a special attention are raised.
theoretical-biology  ecology  diffy-Qs  models  rather-interesting  to-write-about  to-simulate  the-other-way-I-mean
18 days ago
PsyArXiv Preprints | How do scientists update their beliefs? An investigation of N scientists engaged in replication research
How, if at all, do scientists adjust their beliefs based on observed evidence? Few studies have examined how beliefs change over the course of a study, from developing the protocol to collecting and analyzing original data. N scientists who were contributing to one or more of six multi-lab replication projects were asked to estimate their degree of belief in the original effect and to estimate the true effect size at three timepoints: before data collection, after learning their own results, and after learning the results of all of the other laboratories contributing replication studies. We examine how beliefs changed in response to data, whether scientists are disproportionately influenced by their own study results relative to those of other labs, and whether people optimally update their prior beliefs according to the observed evidence.
science-studies  social-dynamics  sociology  learning  to-read  a-bit-drafty
21 days ago
JCA 14.3-4, p. 191-212 – Old City Publishing
The density classification problem is one of the most studied problems in the context of the computational abilities of cellular automata. Since this problem cannot be solved in the classical sense, we consider a weaker version, by slightly relaxing the assumptions on the output specification. In this paper, we discuss this relaxed problem for two-dimensional Affine Continuous Cellular Automata (ACCAs). We focus on finding the most performant rules solving this problem among the density-conserving ones by evaluating ACCAs experimentally for a predefined set of initial configurations.
cellular-automata  benchmarking  rather-interesting  out-of-the-box  to-simulate  to-write-about  consider:algorothm-structure
21 days ago
The theory of games as a tool for the social epistemologist - Philsci-Archive
Traditionally, epistemologists have distinguished between epistemic and pragmatic goals. In so doing, they presume that much of game theory is irrelevant to epistemic enterprises. I will show that this is a mistake. Even if we restrict attention to purely epistemic motivations, members of epistemic groups will face a multitude of strategic choices. I illustrate several contexts where individuals who are concerned solely with the discovery of truth will nonetheless face difficult game theoretic problems. Examples of purely epistemic coordination problems and social dilemmas will be presented. These show that there is a far deeper connection between economics and epistemology than previous appreciated.
economics  philosophy  political-economy  models-and-modes  pragmatism  utility  rather-interesting  science-studies  to-read  game-theory  social-psychology
21 days ago
Mazes, Puzzles, and Proofs |
Working backwards is indeed a wonderful tool. But for many problems, what works best is to start at both ends and work your way toward the middle, the way Greek engineer Eupalinos of Megara did when he led the excavation of a thousand-meter-long tunnel on the island of Samos over two millennia ago. For instance, when a chess-player seeks to trap her opponent, she thinks about where the pieces are now, and where she wants them to be, and how she can get them there.

The Eupalinos tactic is also well-suited to the mathematical exercises that we math professors assign students who are just beginning to learn the craft of writing proofs, the way a piano teacher assigns finger-exercises to novices.
23 days ago
[1506.08415] PLG2: Multiperspective Processes Randomization and Simulation for Online and Offline Settings
Process mining represents an important field in BPM and data mining research. Recently, it has gained importance also for practitioners: more and more companies are creating business process intelligence solutions. The evaluation of process mining algorithms requires, as any other data mining task, the availability of large amount of real-world data. Despite the increasing availability of such datasets, they are affected by many limitations, in primis the absence of a "gold standard" (i.e., the reference model).
This paper extends an approach, already available in the literature, for the generation of random processes. Novelties have been introduced throughout the work and, in particular, they involve the complete support for multiperspective models and logs (i.e., the control-flow perspective is enriched with time and data information) and for online settings (i.e., generation of multiperspective event streams and concept drifts). The proposed new framework is able to almost entirely cover the spectrum of possible scenarios that can be observed in the real-world. The proposed approach is implemented as a publicly available Java application, with a set of APIs for the programmatic execution of experiments.
process-mining  learning-by-watching  machine-learning  statistics  modeling  dynamical-systems  what-gets-measured  to-write-about  representation  discrete-event-simulators  to-simulate  time-series
25 days ago
[1804.02101] Modeling Popularity in Asynchronous Social Media Streams with Recurrent Neural Networks
Understanding and predicting the popularity of online items is an important open problem in social media analysis. Considerable progress has been made recently in data-driven predictions, and in linking popularity to external promotions. However, the existing methods typically focus on a single source of external influence, whereas for many types of online content such as YouTube videos or news articles, attention is driven by multiple heterogeneous sources simultaneously - e.g. microblogs or traditional media coverage. Here, we propose RNN-MAS, a recurrent neural network for modeling asynchronous streams. It is a sequence generator that connects multiple streams of different granularity via joint inference. We show RNN-MAS not only to outperform the current state-of-the-art Youtube popularity prediction system by 17%, but also to capture complex dynamics, such as seasonal trends of unseen influence. We define two new metrics: promotion score quantifies the gain in popularity from one unit of promotion for a Youtube video; the loudness level captures the effects of a particular user tweeting about the video. We use the loudness level to compare the effects of a video being promoted by a single highly-followed user (in the top 1% most followed users) against being promoted by a group of mid-followed users. We find that results depend on the type of content being promoted: superusers are more successful in promoting Howto and Gaming videos, whereas the cohort of regular users are more influential for Activism videos. This work provides more accurate and explainable popularity predictions, as well as computational tools for content producers and marketers to allocate resources for promotion campaigns.
social-networks  prediction  machine-learning  marketing  cart-before-the-horse  to-write-about  to-simulate  consider:performance-measures  consider:false-positives
25 days ago
Who Mourns the Tenth Heegner Number? |
There’s an episode1 of a science-fiction television series in which space travelers land on a planet peopled by their own descendants. The descendants explain that the travelers will try to leave the planet and fail, accidentally stranding themselves several centuries in the past. Armed with this knowledge, the travelers can try to thwart their destiny; but are they willing to try if their successful escape would doom their descendants, leaving the travelers with the memory of descendants who, thanks to their escape, never were?

This is science fiction, but it’s also math. More specifically, it’s proof by contradiction. As Ben Blum-Smith recently wrote on Twitter: “Sufficiently long contradiction proofs *make me sad*! When you stick with the mathematical characters long enough, you start to get attached, and then they disappear, never to have existed in the first place.”
mathematical-recreations  mathematics  number-theory  the-mangle-in-practice  exploration  history-of-science  rather-interesting  to-write-about  see-if-GP-makes-these-mistakes
25 days ago
[1811.01910] Evolutionary Data Measures: Understanding the Difficulty of Text Classification Tasks
Classification tasks are usually analysed and improved through new model architectures or hyperparameter optimisation but the underlying properties of datasets are discovered on an ad-hoc basis as errors occur. However, understanding the properties of the data is crucial in perfecting models. In this paper we analyse exactly which characteristics of a dataset best determine how difficult that dataset is for the task of text classification. We then propose an intuitive measure of difficulty for text classification datasets which is simple and fast to calculate. We show that this measure generalises to unseen data by comparing it to state-of-the-art datasets and results. This measure can be used to analyse the precise source of errors in a dataset and allows fast estimation of how difficult a dataset is to learn. We searched for this measure by training 12 classical and neural network based models on 78 real-world datasets, then use a genetic algorithm to discover the best measure of difficulty. Our difficulty-calculating code ( this https URL ) and datasets ( this http URL ) are publicly available.
25 days ago
[1809.04563] Finding Higher Order Mutants Using Variational Execution
Mutation testing is an effective but time consuming method for gauging the quality of a test suite. It functions by repeatedly making changes, called mutants, to the source code and checking whether the test suite fails (i.e., whether the mutant is killed). Recent work has shown cases in which applying multiple changes, called a higher order mutation, is more difficult to kill than a single change, called a first order mutation. Specifically, a special kind of higher order mutation, called a strongly subsuming higher order mutation (SSHOM), can enable equivalent accuracy in assessing the quality of the test suite with fewer executions of tests. Little is known about these SSHOMs, as they are difficult to find. Our goal in this research is to identify a faster, more reliable method for finding SSHOMs in order to characterize them in the future. We propose an approach based on variational execution to find SSHOMs. Preliminary results indicate that variational execution performs better than the existing genetic algorithm in terms of speed and completeness of results. Out of a set of 33 first order mutations, our variational execution approach finds all 38 SSHOMs in 4.5 seconds, whereas the genetic algorithm only finds 36 of the 38 SSHOMs in 50 seconds.
software-development  testing  evolutionary-algorithms  rather-interesting  complementation  to-write-about
25 days ago
[1811.00430] GA Based Q-Attack on Community Detection
Community detection plays an important role in social networks, since it can help to naturally divide the network into smaller parts so as to simplify network analysis. However, on the other hand, it arises the concern that individual information may be over-mined, and the concept community deception thus is proposed to protect individual privacy on social networks. Here, we introduce and formalize the problem of community detection attack and develop efficient strategies to attack community detection algorithms by rewiring a small number of connections, leading to individual privacy protection. In particular, we first give two heuristic attack strategies, i.e., Community Detection Attack (CDA) and Degree Based Attack (DBA), as baselines, utilizing the information of detected community structure and node degree, respectively. And then we propose a Genetic Algorithm (GA) based Q-Attack, where the modularity Q is used to design the fitness function. We launch community detection attack based on the above three strategies against three modularity based community detection algorithms on two social networks. By comparison, our Q-Attack method achieves much better attack effects than CDA and DBA, in terms of the larger reduction of both modularity Q and Normalized Mutual Information (NMI). Besides, we find that the adversarial networks obtained by Q-Attack on a specific community detection algorithm can be still effective on others, no matter whether they are modularity based or not, indicating its strong transferability.
25 days ago
[1601.00013] A single hidden layer feedforward network with only one neuron in the hidden layer can approximate any univariate function
The possibility of approximating a continuous function on a compact subset of the real line by a feedforward single hidden layer neural network with a sigmoidal activation function has been studied in many papers. Such networks can approximate an arbitrary continuous function provided that an unlimited number of neurons in a hidden layer is permitted. In this paper, we consider constructive approximation on any finite interval of ℝ by neural networks with only one neuron in the hidden layer. We construct algorithmically a smooth, sigmoidal, almost monotone activation function σ providing approximation to an arbitrary continuous function within any degree of accuracy. This algorithm is implemented in a computer program, which computes the value of σ at any reasonable point of the real axis.
neural-networks  approximation  models-and-modes  old-results-in-new-clothes  to-simulate  to-write-about  perceptrons
25 days ago
[1806.09644] How to hear the shape of a billiard table
The bounce spectrum of a polygonal billiard table is the collection of all bi-infinite sequences of edge labels corresponding to billiard trajectories on the table. We give methods for reconstructing from the bounce spectrum of a polygonal billiard table both the cyclic ordering of its edge labels and the sizes of its angles. We also show that it is impossible to reconstruct the exact shape of a polygonal billiard table from any finite collection of finite words from its bounce spectrum.
billiards  dynamical-systems  spectra  rather-interesting  inverse-problems  to-understand  to-write-about  to-simulate  consider:looking-to-see  consider:classification  impossibility-proof  consider:approximation
26 days ago
[1906.07774] Information matrices and generalization
This work revisits the use of information criteria to characterize the generalization of deep learning models. In particular, we empirically demonstrate the effectiveness of the Takeuchi information criterion (TIC), an extension of the Akaike information criterion (AIC) for misspecified models, in estimating the generalization gap, shedding light on why quantities such as the number of parameters cannot quantify generalization. The TIC depends on both the Hessian of the loss H and the covariance of the gradients C. By exploring the similarities and differences between these two matrices as well as the Fisher information matrix F, we study the interplay between noise and curvature in deep models. We also address the question of whether C is a reasonable approximation to F, as is commonly assumed.
machine-learning  neural-networks  generalization  performance-measure  rather-interesting  to-understand  to-generalize  (hah)  to-write-about
26 days ago
[0911.1984] Perfect Retroreflectors and Billiard Dynamics
We construct semi-infinite billiard domains which reverse the direction of most incoming particles. We prove that almost all particles will leave the open billiard domain after a finite number of reflections. Moreover, with high probability the exit velocity is exactly opposite to the entrance velocity, and the particle's exit point is arbitrarily close to its initial position. The method is based on asymptotic analysis of statistics of entrance times to a small interval for irrational circle rotations. The rescaled entrance times have a limiting distribution in a limit when the number of iterates tends to infinity and the length of the interval vanishes. The proof of the main results follows from the study of related limiting distributions and their regularity properties.
billiards  plane-geometry  dynamical-systems  engineering-design  rather-interesting  existence-proof  to-write-about  consider:looking-to-see  to-simulate
26 days ago
[1811.08225] Self Organizing Classifiers: First Steps in Structured Evolutionary Machine Learning
Learning classifier systems (LCSs) are evolutionary machine learning algorithms, flexible enough to be applied to reinforcement, supervised and unsupervised learning problems with good performance. Recently, self organizing classifiers were proposed which are similar to LCSs but have the advantage that in its structured population no balance between niching and fitness pressure is necessary. However, more tests and analysis are required to verify its benefits. Here, a variation of the first algorithm is proposed which uses a parameterless self organizing map (SOM). This algorithm is applied in challenging problems such as big, noisy as well as dynamically changing continuous input-action mazes (growing and compressing mazes are included) with good performance. Moreover, a genetic operator is proposed which utilizes the topological information of the SOM's population structure, improving the results. Thus, the first steps in structured evolutionary machine learning are shown, nonetheless, the problems faced are more difficult than the state-of-art continuous input-action multi-step ones.
26 days ago
[1609.04439] Unrestricted State Complexity of Binary Operations on Regular and Ideal Languages
We study the state complexity of binary operations on regular languages over different alphabets. It is known that if L′m and Ln are languages of state complexities m and n, respectively, and restricted to the same alphabet, the state complexity of any binary boolean operation on L′m and Ln is mn, and that of product (concatenation) is m2n−2n−1. In contrast to this, we show that if L′m and Ln are over different alphabets, the state complexity of union and symmetric difference is (m+1)(n+1), that of difference is mn+m, that of intersection is mn, and that of product is m2n+2n−1. We also study unrestricted complexity of binary operations in the classes of regular right, left, and two-sided ideals, and derive tight upper bounds. The bounds for product of the unrestricted cases (with the bounds for the restricted cases in parentheses) are as follows: right ideals m+2n−2+2n−1 (m+2n−2); left ideals mn+m+n (m+n−1); two-sided ideals m+2n (m+n−1). The state complexities of boolean operations on all three types of ideals are the same as those of arbitrary regular languages, whereas that is not the case if the alphabets of the arguments are the same. Finally, we update the known results about most complex regular, right-ideal, left-ideal, and two-sided-ideal languages to include the unrestricted cases.
formal-languages  automata  to-understand  to-write-about  consider:looking-to-see  nudge-targets  consider:representation
26 days ago
[1809.09561] Evaluating stochastic seeding strategies in networks
When trying to maximize the adoption of a behavior in a population connected by a social network, it is common to strategize about where in the network to seed the behavior, often with an element of randomness. Selecting seeds uniformly at random is a basic but compelling strategy in that it distributes seeds broadly throughout the network. A more sophisticated stochastic strategy, one-hop targeting, is to select random network neighbors of random individuals; this exploits a version of the friendship paradox, whereby the friend of a random individual is expected to have more friends than a random individual, with the hope that seeding a behavior at more connected individuals leads to more adoption. Many seeding strategies have been proposed, but empirical evaluations have demanded large field experiments designed specifically for this purpose and have yielded relatively imprecise comparisons of strategies. Here we show how stochastic seeding strategies can be evaluated more efficiently in such experiments, how they can be evaluated "off-policy" using existing data arising from experiments designed for other purposes, and how to design more efficient experiments. In particular, we consider contrasts between stochastic seeding strategies and analyze nonparametric estimators adapted from policy evaluation and importance sampling. We use simulations on real networks to show that the proposed estimators and designs can dramatically increase precision while yielding valid inference. We then apply our proposed estimators to two field experiments, one that assigned households to an intensive marketing intervention and one that assigned students to an anti-bullying intervention.
social-networks  feature-construction  rather-interesting  social-engineering  network-theory  to-understand  to-write-about  epidemiology  cultural-dynamics  to-simulate
26 days ago
[1510.07742] Iterating evolutes and involutes
We study iterations of two classical constructions, the evolutes and involutes of plane curves, and we describe the limiting behavior of both constructions on a class of smooth curves with singularities given by their support functions.
Next we study two kinds of discretizations of these constructions: the curves are replaced by polygons, and the evolutes are formed by the circumcenters of the triples of consecutive vertices, or by the incenters of the triples of consecutive sides. The space of polygons is a vector bundle over the space of the side directions, and both kinds of evolutes define vector bundle morphisms. In both cases, we describe the linear maps of the fibers. In the first case, the induced map of the base is periodic, whereas, in the second case, it is an averaging transformation. We also study the dynamics of the related inverse constructions, the involutes of polygons.
In addition to the theoretical study, we performed numerous computer experiments; some of the observations remain unexplained.
plane-geometry  dynamical-systems  construction  to-understand  rather-interesting  to-write-about  consider:genetic-programming  consider:closed-forms  consider:classification
26 days ago
[1805.04038] Packing and domination parameters in digraphs
Given a digraph D=(V,A), a set B⊂V is a packing set in D if there are no arcs joining vertices of B and for any two vertices x,y∈B the sets of in-neighbors of x and y are disjoint. The set S is a dominating set (an open dominating set) in D if every vertex not in S (in V) has an in-neighbor in S. Moreover, a dominating set S is called a total dominating set if the subgraph induced by S has no isolated vertices. The packing sets of maximum cardinality and the (total, open) dominating sets of minimum cardinality in digraphs are studied in this article. We prove that the two optimal sets concerning packing and domination achieve the same value for directed trees, and give some applications of it. We also show analogous equalities for all connected contrafunctional digraphs, and characterize all such digraphs D for which such equalities are satisfied. Moreover, sharp bounds on the maximum and the minimum cardinalities of packing and dominating sets, respectively, are given for digraphs. Finally, we present solutions for two open problems, concerning total and open dominating sets of minimum cardinality, pointed out in [Australas. J. Combin. 39 (2007), 283--292].
graph-theory  feature-construction  combinatorics  consider:looking-to-see  consider:extreme-values  to-simulate
26 days ago
[1808.09167] Hirzebruch-type inequalities viewed as tools in combinatorics
The main purpose of this survey is to provide an introduction, algebro-topological in nature, to Hirzebuch-type inequalities for plane curve arrangements in the complex projective plane. These inequalities gain more and more interest in many combinatorial problems related to point or line arrangements in the plane. We would like to present a summary of the technicalities and also some recent applications, for instance in the context of Weak Dirac's Conjecture. We advertise also some open problems and questions.
combinatorics  line-arrangements  plane-geometry  constraint-satisfaction  rather-interesting  to-understand  proof  heuristics  consider:looking-to-see  to-write-about
26 days ago
[1712.07822] Geometrical Insights for Implicit Generative Modeling
Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion. A careful look at the geometries induced by these distances on the space of probability measures reveals interesting differences. In particular, we can establish surprising approximate global convergence guarantees for the 1-Wasserstein distance,even when the parametric generator has a nonconvex parametrization.
machine-learning  representation  approximation  define-your-terms  looking-to-see  rather-interesting  mathematical-programming  to-write-about  may-not-apply-outside-the-silo
26 days ago
[1802.07481] Celer: a Fast Solver for the Lasso with Dual Extrapolation
Convex sparsity-inducing regularizations are ubiquitous in high-dimensional machine learning, but solving the resulting optimization problems can be slow. To accelerate solvers, state-of-the-art approaches consist in reducing the size of the optimization problem at hand. In the context of regression, this can be achieved either by discarding irrelevant features (screening techniques) or by prioritizing features likely to be included in the support of the solution (working set techniques). Duality comes into play at several steps in these techniques. Here, we propose an extrapolation technique starting from a sequence of iterates in the dual that leads to the construction of improved dual points. This enables a tighter control of optimality as used in stopping criterion, as well as better screening performance of Gap Safe rules. Finally, we propose a working set strategy based on an aggressive use of Gap Safe screening rules. Thanks to our new dual point construction, we show significant computational speedups on multiple real-world problems.
machine-learning  statistics  horse-races  performance-measure  computational-complexity  to-read  algorithms
26 days ago
How the Transformers broke NLP leaderboards - Hacking semantics
If leaderboards are to highlight the actual progress, we need to incentivize new architectures rather than teams outspending each other. Obviously, huge pretrained models are valuable, but unless the authors show that their system consistently behaves differently from its competition with comparable data & compute, it is not clear whether they are presenting a model or a resource.

Furthermore, much of this research is not reproducible: nobody is going to spend \$250,000 just to repeat XLNet training. Given the fact that its ablation study showed only 1-2% gain over BERT in 3 datasets out of 4 (Yang et al., 2019), we don’t actually know for sure that its masking strategy is more successful than BERT’s.

At the same time, the development of leaner models is dis-incentivized, as their task is fundamentally harder and the leaderboard-oriented community only rewards the SOTA. That, in its turn, prices out of competitions academic teams, which will not result in students becoming better engineers when they graduate.

Last but not the least, huge DL models are often overparametrized (Frankle & Carbin, 2019; Wu, Fan, Baevski, Dauphin, & Auli, 2019). As an example, the smaller version of BERT achieves better scores on a number of syntax-testing experiments than the larger one (Goldberg, 2019). The fact that DL models require a lot of compute is not necessarily a bad thing in itself, but wasting compute is not ideal for the environment (Strubell, Ganesh, & McCallum, 2019).
benchmarking  machine-learning  horse-races  performance-measure  multiobjective-optimization  (use-it)
26 days ago
Tutorial:Introduction to GeoGebraScript - GeoGebra Manual
Sometimes the ways students can interact with your worksheets using the basic features provided by GeoGebra are not powerful enough or too difficult to be used by your students, especially younger ones. GeoGebraScript is there to help you creating simple-to-use constructions with more sophisticated interactivity. It is a simple way to write scripts (small programs) in GeoGebra.
26 days ago
[1711.03247] The nonsmooth landscape of phase retrieval
We consider a popular nonsmooth formulation of the real phase retrieval problem. We show that under standard statistical assumptions, a simple subgradient method converges linearly when initialized within a constant relative distance of an optimal solution. Seeking to understand the distribution of the stationary points of the problem, we complete the paper by proving that as the number of Gaussian measurements increases, the stationary points converge to a codimension two set, at a controlled rate. Experiments on image recovery problems illustrate the developed algorithm and theory.
inverse-problems  feature-extraction  approximation  rather-interesting  nonlinear-programming  to-write-about  image-processing  signal-processing
26 days ago
Reference:File Format - GeoGebra Manual
Modifying .ggb or .ggt files (namely the .xml files within them) is clearly a task for the most tech-savvy users of GeoGebra. Whether you want to touch the .xml because you want to modify something which can't be modified by GeoGebra at the moment, like the definition of a custom tool, or you want to trick GeoGebra or just experiment you should take some tips for your journey:
GeoGebra  project  to-do  to-understand  XML  generative-art
26 days ago
The Most Common Annotation Symbols in Early Medieval Western Manuscripts (a cheat sheet) – Mittelalter
This cheat sheet originally came into being on a request of colleagues and students who wanted to have a quick reference guide to the early medieval annotation symbols found in Latin manuscripts. The first version, which can still be found on Academia.edu, was produced and circulated in 2016. This is a second version (in fact a 2.1 version) which is updated and which contains a clearer identification of special classes of signs. Apart from the all-important critical signs, I now set apart the characteristic insular and Greek signs as well as identifying several signs that seem to me clearly post-Carolingian, in fact almost certainly not appearing in the evidence before the late Middle Ages and the early modern period. The main purpose of this cheat sheet is to provide scholars with the names of specific graphic symbols that can be encountered in early medieval Latin manuscripts, so that it will be easier for all of us to identify particular annotation symbols and talk about them. The names assigned to specific graphic symbols are, for the most part, taken from ancient and medieval written sources. In some cases, these sources do not reveal what were the names of particular signs and I have, thus, taken the liberty to assign them a conventional name.
history  marginalia  rather-interesting  Middle-Ages  manuscripts  archivists  paratexts
28 days ago
[1710.08485] Making Faces: Universal Inverse Design of Surfaces with Thin Nematic Elastomer Sheets
Programmable shape-shifting materials can take different physical forms to achieve multifunctionality in a dynamic and controllable manner. Although morphing a shape from 2D to 3D via programmed inhomogeneous local deformations has been demonstrated in various ways, the inverse problem -- programming a sheet to take an arbitrary desired 3D shape -- is much harder yet critical to realize specific functions. Here, we address this inverse problem in thin liquid crystal elastomer (LCE) sheets, where the shape is preprogrammed by precise and local control of the molecular orientation of the liquid crystal monomers. We show how blueprints for arbitrary surface geometries as well as local extrinsic curvatures can be generated using approximate numerical methods. Backed by faithfully alignable and rapidly lockable LCE chemistry, we precisely embed our designs in LCE sheets using advanced top-down microfabrication techniques. We thus successfully produce flat sheets that, upon thermal activation, take an arbitrary desired shape, such as a face. The general design principles presented here for creating an arbitrary 3D shape will allow for exploration of unmet needs in flexible electronics, metamaterials, aerospace and medical devices, and more.
materials-science  indistinguishable-from-magic  inverse-problems  origami  self-assembly  engineering-design  to-write-about  rather-interesting  consider:genetic-programming
5 weeks ago
[1006.2782] Outer Billiards, Arithmetic Graphs, and the Octagon
Outer Billiards is a geometrically inspired dynamical system based on a convex shape in the plane.
When the shape is a polygon, the system has a combinatorial flavor. In the polygonal case, there is a natural acceleration of the map, a first return map to a certain strip in the plane. The arithmetic graph is a geometric encoding of the symbolic dynamics of this first return map.
In the case of the regular octagon, the case we study, the arithmetic graphs associated to periodic orbits are polygonal paths in R^8. We are interested in the asymptotic shapes of these polygonal paths, as the period tends to infinity. We show that the rescaled limit of essentially any sequence of these graphs converges to a fractal curve that simultaneously projects one way onto a variant of the Koch snowflake and another way onto a variant of the Sierpinski carpet. In a sense, this gives a complete description of the asymptotic behavior of the symbolic dynamics of the first return map.
What makes all our proofs work is an efficient (and basically well known) renormalization scheme for the dynamics.
billiards  dynamical-systems  rather-interesting  consider:looking-to-see  to-write-about  to-animate
5 weeks ago
Multiplying Non-Numbers
In last last week's episode of PBS Infinite Series, we talked about different flavors of multiplication (like associativity and commutativity) to think about when multiplying things that aren't numbers. My examples of multiplying non-numbers were vectors and matrices, which come from the land of algebra. Today I'd like to highlight another example: