**Vaguery + visualization**
Genome graphs and the evolution of genome inference

3 days ago by Vaguery

The human reference genome is part of the foundation of modern human biology and a monumental scientific achievement. However, because it excludes a great deal of common human variation, it introduces a pervasive reference bias into the field of human genomics. To reduce this bias, it makes sense to draw on representative collections of human genomes, brought together into reference cohorts. There are a number of techniques to represent and organize data gleaned from these cohorts, many using ideas implicitly or explicitly borrowed from graph-based models. Here, we survey various projects underway to build and apply these graph-based structures—which we collectively refer to as genome graphs—and discuss the improvements in read mapping, variant calling, and haplotype determination that genome graphs are expected to produce.

via:arthegall
review
bioinformatics
clustering
visualization
data-analysis
rather-interesting
consider:nonbiological-genomes

A nice proof for the Law of Cosines | Continuous Everywhere but Differentiable Nowhere

11 days ago by Vaguery

And I kinda told them what to do… Meh. I was jumping way ahead to get to the formula. We weren’t savoring the thinking to get to the formula. Now we are.

That being said, I ran across something quite beautiful. A stunning proof of the Law of Cosines (at least for acute triangles) on the site trigonography.

geometry
pedagogy
visualization
rather-interesting
visual-proof
feature-construction
explanation
consider:the-mangle

Are Pop Lyrics Getting More Repetitive?

7 weeks ago by Vaguery

In 1977, the great computer scientist Donald Knuth published a paper called The Complexity of Songs, which is basically one long joke about the repetitive lyrics of newfangled music (example quote: "the advent of modern drugs has led to demands for still less memory, and the ultimate improvement of Theorem 1 has consequently just been announced").

I'm going to try to test this hypothesis with data. I'll be analyzing the repetitiveness of a dataset of 15,000 songs that charted on the Billboard Hot 100 between 1958 and 2017.

visualization
graphic-design
data-analysis
essay
looking-to-see
javascript
rather-interesting
via:cdzombak

[1805.09280] Dungeons and Dragons: Combinatorics for the $dP_3$ Quiver

8 weeks ago by Vaguery

In this paper, we utilize the machinery of cluster algebras, quiver mutations, and brane tilings to study a variety of historical enumerative combinatorics questions all under one roof. Previous work by the second author and REU students [Zha, LMNT14], and more recently of both authors [LM17], analyzed the cluster algebra associated to the cone over dP₃, the del Pezzo surface of degree 6 (ℂℙ² blown up at three points). By investigating sequences of toric mutations, those occurring only at vertices with two incoming and two outgoing arrows, in this cluster algebra, we obtained a family of cluster variables that could be parameterized by ℤ³ and whose Laurent expansions had elegant combinatorial interpretations in terms of dimer partition functions (in most cases). While the earlier work [Zha, LMNT14, LM17] focused exclusively on one possible initial seed for this cluster algebra, there are in total four relevant initial seeds (up to graph isomorphism). In the current work, we explore the combinatorics of the Laurent expansions from these other initial seeds and how this allows us to relate enumerations of perfect matchings on Dungeons to Dragons.

combinatorics
tiling
visualization
representation
to-understand
enumeration
algebra
to-write-about

[1710.00992] DimReader: Axis lines that explain non-linear projections

9 weeks ago by Vaguery

Non-linear dimensionality reduction (NDR) methods such as LLE and t-SNE are popular with visualization researchers and experienced data analysts, but present serious problems of interpretation. In this paper, we present DimReader, a technique that recovers readable axes from such techniques. DimReader is based on analyzing infinitesimal perturbations of the dataset with respect to variables of interest. The perturbations define exactly how we want to change each point in the original dataset and we measure the effect that these changes have on the projection. The recovered axes are in direct analogy with the axis lines (grid lines) of traditional scatterplots. We also present methods for discovering perturbations on the input data that change the projection the most. The calculation of the perturbations is efficient and easily integrated into programs written in modern programming languages. We present results of DimReader on a variety of NDR methods and datasets both synthetic and real-life, and show how it can be used to compare different NDR methods. Finally, we discuss limitations of our proposal and situations where further research is needed.

user-interface
visualization
dimension-reduction
rather-interesting
data-analysis
explanation
the-mangle-in-practice
to-write-about
to-do

Flowers for Julia | Fronkonstin

12 weeks ago by Vaguery

To color the points, I pick a random palette from the top list of the COLOURLovers site using the colourlovers package. Since each flower involves a huge amount of calculation, I use Reduce to make this process efficient. More examples:

fractals
visualization
color
details-of-note

ChrisKnott/Algojammer: An experimental code editor for writing algorithms

november 2018 by Vaguery

Algojammer is an experimental, proof-of-concept code editor for writing algorithms in Python. It was mainly written to assist with solving the kind of algorithm problems that feature in competitions like Google Code Jam, Topcoder and HackerRank.

algorithms
python
visualization
rather-interesting
text-editor
simulation
to-write-about

Exploded View Diagrams of Mathematical Surfaces - U.C. Berkeley Computer Graphics Research

august 2018 by Vaguery

We present a technique for visualizing complicated mathematical surfaces that is inspired by hand-designed topological illustrations. Our approach generates exploded views that expose the internal structure of such a surface by partitioning it into parallel slices, which are separated from each other along a single linear explosion axis. Our contributions include a set of simple, prescriptive design rules for choosing an explosion axis and placing cutting planes, as well as automatic algorithms for applying these rules. First we analyze the input shape to select the explosion axis based on the detected rotational and reflective symmetries of the input model. We then partition the shape into slices that are designed to help viewers better understand how the shape of the surface and its cross-sections vary along the explosion axis. Our algorithms work directly on triangle meshes, and do not depend on any specific parameterization of the surface. We generate exploded views for a variety of mathematical surfaces using our system.

visualization
mathematics
topology
rather-interesting
algorithms
to-do

John Williamson on Twitter: "First 1e6 integers, represented as binary vectors indicating their prime factors, and laid out using the sparse matrix support in @leland_mcinnes's UMAP dimensionality reduction algorithm. This is from a 1000000x78628 (!) bina

mathematics number-theory visualization rather-interesting graph-theory graph-layout dimension-reduction to-write-about

august 2018 by Vaguery


Three Little Circles

may 2018 by Vaguery

Once upon a time, there were three little circles.

d3
tutorial
to-understand
javascript
visualization
for-a-project

Bird Migration Patterns - Western Hemisphere Dataset | Science On a Sphere

march 2018 by Vaguery

This dataset shows the migration of 118 species of terrestrial bird populations in the Western Hemisphere. Each dot represents the estimated location of the center of each species’ population for each day of the year. These estimations come from millions of observations from the eBird citizen-science database. eBird is a real-time, online checklist program, launched in 2002 by the Cornell Lab of Ornithology and National Audubon Society, that allows birdwatchers to enter their observations.

via:twitter
ethology
migration
visualization
biology
rather-interesting

[1709.07097] Persistence Flamelets: multiscale Persistent Homology for kernel density exploration

march 2018 by Vaguery

In recent years there has been noticeable interest in the study of the "shape of data". Among the many ways a "shape" could be defined, topology is the most general one, as it describes an object in terms of its connectivity structure: connected components (topological features of dimension 0), cycles (features of dimension 1) and so on. There is a growing number of techniques, generally denoted as Topological Data Analysis, aimed at estimating topological invariants of a fixed object; when we allow this object to change, however, little has been done to investigate the evolution in its topology. In this work we define the Persistence Flamelets, a multiscale version of one of the most popular tools in TDA, the Persistence Landscape. We examine its theoretical properties and we show how it could be used to gain insights on the KDE bandwidth parameter.

data-analysis
feature-extraction
representation
topology
rather-interesting
algorithms
visualization
to-understand
exploratory-data-analysis

[1802.03426] UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

march 2018 by Vaguery

UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP as described has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.

clustering
visualization
machine-learning
algorithms
performance-measure
rather-interesting
to-write-about

[1802.08370] Do WaveNets Dream of Acoustic Waves?

february 2018 by Vaguery

Various sources have reported the WaveNet deep learning architecture being able to generate high-quality speech, but to our knowledge there haven't been studies on the interpretation or visualization of trained WaveNets. This study investigates the possibility that WaveNet understands speech by unsupervisedly learning an acoustically meaningful latent representation of the speech signals in its receptive field; we also attempt to interpret the mechanism by which the feature extraction is performed. Suggested by singular value decomposition and linear regression analysis on the activations and known acoustic features (e.g. F0), the key findings are (1) activations in the higher layers are highly correlated with spectral features; (2) WaveNet explicitly performs pitch extraction despite being trained to directly predict the next audio sample and (3) for the said feature analysis to take place, the latent signal representation is converted back and forth between baseband and wideband components.

speech-synthesis
audio
signal-processing
neural-networks
rather-interesting
recurrent-networks
time-series
visualization
to-write-about

Interpretable Machine Learning

february 2018 by Vaguery

Machine learning has a huge potential to improve products, processes and research. But machines usually don’t give an explanation for their predictions, which hurts trust and creates a barrier for the adoption of machine learning. This book is about making machine learning models and their decisions interpretable.

modeling
statistics
rather-interesting
interpretability
explanation
to-write-about
visualization
machine-learning
book

My Quantum Circuit Simulator: Quirk

january 2018 by Vaguery

I've been working on a quantum circuit simulator that runs in your browser. It's called Quirk. Quirk is open source (github repository: Strilanc/Quirk), and there's a live instance you can play with at algorithmicassertions.com/quirk:

quantums
quantum-computing
simulation
javascript
visualization
lovely
to-write-about

[1712.09913] Visualizing the Loss Landscape of Neural Nets

january 2018 by Vaguery

Neural network training relies on our ability to find "good" minimizers of highly non-convex loss functions. It is well known that certain network architecture designs (e.g., skip connections) produce loss functions that train more easily, and well-chosen training parameters (batch size, learning rate, optimizer) produce minimizers that generalize better. However, the reasons for these differences, and their effect on the underlying loss landscape, are not well understood. In this paper, we explore the structure of neural loss functions, and the effect of loss landscapes on generalization, using a range of visualization methods. First, we introduce a simple "filter normalization" method that helps us visualize loss function curvature, and make meaningful side-by-side comparisons between loss functions. Then, using a variety of visualizations, we explore how network architecture affects the loss landscape, and how training parameters affect the shape of minimizers.

visualization
neural-networks
machine-learning
to-understand
to-do
to-write-about

myPhysicsLab Home Page

january 2018 by Vaguery

Click on one of the physics simulations below... you'll see them animating in real time, and be able to interact with them by dragging objects or changing parameters like gravity.

simulation
visualization
online-toys
to-write-about
to-do

[1712.07811] Multi-dimensional Graph Fourier Transform

january 2018 by Vaguery

Many signals on Cartesian product graphs appear in the real world, such as digital images, sensor observation time series, and movie ratings on Netflix. These signals are "multi-dimensional" and have directional characteristics along each factor graph. However, the existing graph Fourier transform does not distinguish these directions, and assigns 1-D spectra to signals on product graphs. Further, these spectra are often multi-valued at some frequencies. Our main result is a multi-dimensional graph Fourier transform that solves such problems associated with the conventional GFT. Using algebraic properties of Cartesian products, the proposed transform rearranges 1-D spectra obtained by the conventional GFT into the multi-dimensional frequency domain, of which each dimension represents a directional frequency along each factor graph. Thus, the multi-dimensional graph Fourier transform enables directional frequency analysis, in addition to frequency analysis with the conventional GFT. Moreover, this rearrangement resolves the multi-valuedness of spectra in some cases. The multi-dimensional graph Fourier transform is a foundation of novel filterings and stationarities that utilize dimensional information of graph signals, which are also discussed in this study. The proposed methods are applicable to a wide variety of data that can be regarded as signals on Cartesian product graphs. This study also notes that multivariate graph signals can be regarded as 2-D univariate graph signals. This correspondence provides natural definitions of the multivariate graph Fourier transform and the multivariate stationarity based on their 2-D univariate versions.

visualization
Fourier-spectra
feature-extraction
rather-interesting
to-understand
to-write-about
consider:uses-in-GP

Visualizing the Uncertainty in Data | FlowingData

january 2018 by Vaguery

Statistics is a game where you figure out these uncertainties and make estimated judgements based on your calculations. But standard errors, confidence intervals, and likelihoods often lose their visual space in data graphics, which leads to judgements based on simplified summaries expressed as means, medians, or extremes.

That’s no good. You miss out on the interesting stuff. The important stuff. So here are some visualization options for the uncertainties in your data, each with its pros, cons, and examples.

uncertainty
visualization
semiotics
rather-interesting
to-write-about
what-about:DarkSky-wiggle-charts

[1712.06179] Organic Visualization of Document Evolution

january 2018 by Vaguery

Recent availability of data of writing processes at keystroke-granularity has enabled research on the evolution of document writing. A natural step is to develop systems that can actually show this data and make it understandable. Here we propose a data structure that captures a document's fine-grained history and an organic visualization that serves as an interface to it. We evaluate a proof-of-concept implementation of the system through a pilot study with documents written by students at a public university. Our results are promising and reveal facets such as general strategies adopted, local edition density and hierarchical structure of the final text.

visualization
metrics
social-media
rather-interesting
to-write-about
consider:for-GP

Visualizing Intersecting Sets

october 2017 by Vaguery

Understanding relationships between sets is an important analysis task. The major challenge in this context is the combinatorial explosion of the number of set intersections if the number of sets exceeds a trivial threshold. To address this, we introduce UpSet, a novel visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections.
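The combinatorial explosion mentioned above is easy to see in miniature: n sets admit up to 2^n - 1 non-empty membership combinations. The following is a minimal Python sketch, not UpSet's own code, that assumes each element is counted once under the exact combination of sets that contain it. The function name `exclusive_intersections` and the toy data are hypothetical.

```python
from collections import Counter


def exclusive_intersections(sets):
    """Count elements by the exact combination of named sets containing them."""
    membership = {}
    for name, members in sets.items():
        for item in members:
            # Record every set name that this item belongs to.
            membership.setdefault(item, set()).add(name)
    # Each distinct combination of names is one candidate row of the plot.
    return Counter(frozenset(names) for names in membership.values())


# Toy data (hypothetical): three small overlapping sets.
toy_sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
counts = exclusive_intersections(toy_sets)

for combo, size in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(combo), size)

# Upper bound on non-empty combinations for n sets.
max_combinations = 2 ** len(toy_sets) - 1
```

Sorting these counts by size, or aggregating them by how many sets take part, gives the kind of quantitative summary the description refers to.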

visualization
set-theory
data-analysis
rather-interesting
to-write-about
to-do

How Neill Blomkamp and Unity are shaping the future of filmmaking - The Verge

october 2017 by Vaguery

The advantage for Oats is that the software helped streamline the studio’s filmmaking efforts. “The biggest thing for me was the cameras,” Blomkamp says. “If you’re working live-action, you have no choice but to work with what you shot six months earlier.” And if a shot comes across badly, directors are stuck with the results, unless they have the time and resources to go back and reshoot. With Unity, Blomkamp explains, he can make shot adjustments immediately. In this particular short film, he says Oats started the film with harsh overhead lighting, but later changed the position of the sun, which led to softer, more appealing visuals.

animation
visualization
filmmaking
rather-interesting
via:?

Story Curves

october 2017 by Vaguery

A nonlinear narrative is a storytelling device that portrays events of a story out of chronological order, e.g., in reverse order or going back and forth between past and future events. Story curves visualize the nonlinear narrative of a movie by showing the order in which events are told in the movie and comparing them to their actual chronological order, resulting in possibly meandering visual patterns in the curve. We also developed Story Explorer, an interactive tool that visualizes a story curve together with complementary information such as characters and settings. Story Explorer further provides a script curation interface that allows users to specify the chronological order of events in movies. We used Story Explorer to analyze 10 popular nonlinear movies and describe the spectrum of narrative patterns that we discovered, including some novel patterns not previously described in the literature.

digital-humanities
visualization
narrative
rather-interesting
to-understand
to-write-about

[1606.06159] BiFold visualization of bipartite datasets

september 2017 by Vaguery

The emerging domain of data-enabled science necessitates development of algorithms and tools for knowledge discovery. Human interaction with data through well-constructed graphical representation can take special advantage of our visual ability to identify patterns. We develop a data visualization framework, called BiFold, for exploratory analysis of bipartite datasets that describe binary relationships between groups of objects. Typical data examples would include voting records, organizational memberships, and pairwise associations, or other binary datasets. BiFold provides a low dimensional embedding of data that represents similarity by visual nearness, analogous to Multidimensional Scaling (MDS). The unique and new feature of BiFold is its ability to simultaneously capture both within-group and between-group relationships among objects, enhancing knowledge discovery. We benchmark BiFold using the Southern Women Dataset, where social groups are now visually evident. We construct BiFold plots for two US voting datasets: For the presidential election outcomes since 1976, BiFold illustrates the evolving geopolitical structures that underlie these election results. For Senate congressional voting, BiFold identifies a partisan coordinate, separating senators into two parties while simultaneously visualizing a bipartisan-coalition coordinate which captures the ultimate fate of the bills (pass/fail). Finally, we consider a global cuisine dataset of the association between recipes and food ingredients. BiFold allows us to visually compare and contrast cuisines while also allowing identification of signature ingredients of individual cuisines.

data-analysis
visualization
rather-interesting
to-write-about
consider:looking-to-see
algorithms
plots
statistics

[1705.00594] A System for Accessible Artificial Intelligence

september 2017 by Vaguery

While artificial intelligence (AI) has become widespread, many commercial AI systems are not yet accessible to individual researchers nor the general public due to the deep knowledge of the systems required to use them. We believe that AI has matured to the point where it should be an accessible technology for everyone. We present an ongoing project whose ultimate goal is to deliver an open source, user-friendly AI system that is specialized for machine learning analysis of complex data in the biomedical and health care domains. We discuss how genetic programming can aid in this endeavor, and highlight specific examples where genetic programming has automated machine learning analyses in previous projects.

hey-I-know-this-guy
user-experience
machine-learning
visualization
user-interface
one-ring
to-write-about
to-go-see

graph-tool: Efficient network analysis with python

september 2017 by Vaguery

Graph-tool is an efficient Python module for manipulation and statistical analysis of graphs (a.k.a. networks). Contrary to most other python modules with similar functionality, the core data structures and algorithms are implemented in C++, making extensive use of template metaprogramming, based heavily on the Boost Graph Library. This confers it a level of performance that is comparable (both in memory usage and computation time) to that of a pure C/C++ library.

graphs
visualization
library
python
networks
to-learn

Models Are Stupid, and We Need More of Them [PDF]

september 2017 by Vaguery

It is my belief that the widespread inability to grasp the solution to the Monty Hall problem stems from a failure to properly model the scenario. You should switch doors because regardless of which door you picked initially, the host can always show you one with a goat. Being shown a goat therefore has no bearing on the probability that your initial choice was correct. Since that probability is 1/3, there is a 2/3 chance that you were wrong and the cash is behind the remaining door. Thus, two out of three times, switching is the right move. The common intuition that the choice is instead a 50-50 split between two options is erroneous.
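The argument above can be checked by modeling the scenario explicitly. The sketch below is an illustration rather than anything from the chapter: it enumerates every combination of prize door, first pick, and host choice with exact fractions, assuming the host always opens a goat door other than the contestant's pick and chooses uniformly when two such doors are available. The function name `win_probabilities` is hypothetical.

```python
from fractions import Fraction


def win_probabilities():
    """Return exact (stay, switch) win probabilities for the three-door game."""
    doors = (0, 1, 2)
    stay = Fraction(0)
    switch = Fraction(0)
    for prize in doors:
        for pick in doors:
            # Prize location and first pick are independent and uniform.
            p_case = Fraction(1, 9)
            # The host may only open a door hiding a goat that was not picked.
            host_options = [d for d in doors if d != pick and d != prize]
            for opened in host_options:
                p = p_case / len(host_options)
                # The door a switching contestant ends up with.
                remaining = next(d for d in doors if d not in (pick, opened))
                if pick == prize:
                    stay += p
                if remaining == prize:
                    switch += p
    return stay, switch


stay, switch = win_probabilities()
print("stay:", stay, "switch:", switch)
```

Because the host can always find a goat door to open, the first pick keeps its 1/3 chance, and the remaining 2/3 goes to the one unopened door.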

Readers of this chapter are likely to be interested in social behaviors and their underlying psychological mechanisms. These systems tend to be quite a bit more complicated than a simple game show problem. This should concern us. Being an expert does not inoculate us from the failure of our limited imaginations, which evolved to solve problems quite different from those of interest to behavioral scientists. We could use some help.

models
philosophy-of-science
psychology
complex-systems
define-your-terms
via:?
complexology
visualization
learning-by-watching

[1709.01456] Improved Bounds for Drawing Trees on Fixed Points with L-shaped Edges

september 2017 by Vaguery

Let T be an n-node tree of maximum degree 4, and let P be a set of n points in the plane with no two points on the same horizontal or vertical line. It is an open question whether T always has a planar drawing on P such that each edge is drawn as an orthogonal path with one bend (an "L-shaped" edge). By giving new methods for drawing trees, we improve the bounds on the size of the point set P for which such drawings are possible to: O(n^{1.55}) for maximum degree 4 trees; O(n^{1.22}) for maximum degree 3 (binary) trees; and O(n^{1.142}) for perfect binary trees.

Drawing ordered trees with L-shaped edges is harder: we give an example that cannot be done and a bound of O(n log n) points for L-shaped drawings of ordered caterpillars, which contrasts with the known linear bound for unordered caterpillars.

graph-layout
computational-geometry
algorithms
rather-interesting
representation
visualization
hard-problems
nudge-targets
constraint-satisfaction
consider:feature-discovery

Safely Footed Spiderwebs – The Inner Frame

september 2017 by Vaguery

The first three are equiangular still, making angles of 10, 90 and 240 degrees at the corners, respectively. The spiderwebs are conformal images of polar coordinates on the disk, thus illustrating the Schwarz-Christoffel formula for circular polygons. The bat down below is a neat optical illusion, too: Would you think that the vertices are at the corners of an equilateral triangle?

mathematical-recreations
visualization
to-write-about
geometry
algebra

jebberjeb/viz.cljc: Generate images from Graphviz dot strings in Clojure and Clojurescript

september 2017 by Vaguery

This library provides one interface viz.core/image for both Clojure and Clojurescript. For Clojure, the dependency on the Graphviz dot binary is not necessary (as it is with other libraries). For Clojurescript, this library eliminates the need to separately include Viz.js.

Everything in this library is self contained, as it includes and uses Viz.js. This also ensures complete consistency (formatting nuances, etc) between Clojure and Clojurescript.

graphviz
programming
Clojure
library
visualization
ClojureScript

Extractor attractor – Almost looks like work

august 2017 by Vaguery

Recently the extractor fan in my bathroom has started malfunctioning, occasionally grinding and stalling. The infuriating thing is that the grinding noise isn’t perfectly periodic – it is approximately so, but there are occasionally long gaps and the short gaps vary slightly. This lack of predictability makes the noise incredibly annoying, and hard to tune out. Before getting it fixed, I decided to investigate it a bit further.

The terminally curious may listen to the sound here:

https://www.dropbox.com/s/4xh1gmrjry10eky/FanSound.ts?dl=0

This was recorded from my phone, you can also hear me puttering around in the background.

After dumping the audio data, I looked at the waveform and realised it was quite difficult to extract the temporal locations of the grinding noises from the volume alone. As a good physicist I therefore had another look in the frequency domain, making a spectrogram.

mathematical-recreations
looking-to-see
data-analysis
visualization
physics
nonlinear-dynamics
amusing

Vega-Lite: A High-Level Visualization Grammar

august 2017 by Vaguery

Vega-Lite is a high-level visualization grammar. It provides a concise JSON syntax for supporting rapid generation of visualizations to support analysis. Vega-Lite specifications can be compiled to Vega specifications.

Vega-Lite specifications describe visualizations as mappings from data to properties of graphical marks (e.g., points or bars). It automatically produces visualization components including axes, legends, and scales. It then determines properties of these components based on a set of carefully designed rules. This approach allows Vega-Lite specifications to be succinct and expressive, but also provide user control. As Vega-Lite is designed for analysis, it supports data transformations such as aggregation, binning, filtering, sorting, and visual transformations including stacking and faceting.
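To illustrate the data-to-mark mapping described above, here is a hedged sketch that builds a small specification as a Python dictionary and serializes it to JSON. The field names `category` and `amount`, the inline values, and the v5 schema URL are assumptions chosen for illustration. The description itself refers to Vega-Lite 1 and 2, so details may differ between versions.

```python
import json

# A bar chart of the mean amount per category (all data is made up).
spec = {
    # Assumption: the v5 schema; the page described here predates it.
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {
        "values": [
            {"category": "A", "amount": 28},
            {"category": "A", "amount": 34},
            {"category": "B", "amount": 55},
            {"category": "C", "amount": 43},
        ]
    },
    # The graphical mark that each aggregated group is drawn as.
    "mark": "bar",
    # Encodings map data fields to visual properties of the mark.
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"aggregate": "mean", "field": "amount", "type": "quantitative"},
    },
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The specification names only the data, the mark, and two encodings. Axes, scales, and the aggregation step are left to the compiler, which is the succinctness the description claims.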

Get started

Try online

Read our introduction article to Vega-Lite 1 on Medium, look at our talk about the new features in Vega-Lite 2, check out the documentation and take a look at our example gallery.

visualization
javascript
DSL
charts
rather-interesting
to-learn

[1703.07915] Perspective: Energy Landscapes for Machine Learning

may 2017 by Vaguery

Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences. Fitting functions that exhibit multiple solutions as local minima can be analysed in terms of the corresponding machine learning landscape. Methods to explore and visualise molecular potential energy landscapes can be applied to these machine learning landscapes to gain new insight into the solution space involved in training and the nature of the corresponding predictions. In particular, we can define quantities analogous to molecular structure, thermodynamics, and kinetics, and relate these emergent properties to the structure of the underlying landscape. This Perspective aims to describe these analogies with examples from recent applications, and suggest avenues for new interdisciplinary research.

machine-learning
introspection
rather-interesting
fitness-landscapes
energy-landscapes
visualization
to-write-about
consider:performance-measures
algorithms
feature-construction
may 2017 by Vaguery

Describing the Local Structure of Sequence Graphs | bioRxiv

april 2017 by Vaguery

Analysis of genetic variation using graph structures is an emerging paradigm of genomics. However, defining genetic sites on sequence graphs remains an open problem. Paten's invention of the ultrabubble and snarl, special subgraphs of sequence graphs which can be identified with efficient algorithms, represents an important first step to segregating graphs into genetic sites. We extend the theory of ultrabubbles to a special subclass where every detail of the ultrabubble can be described in a series and parallel arrangement of genetic sites. We furthermore introduce the concept of bundle structures, which allows us to recognize the graph motifs created by additional combinations of variation in the graph, including but not limited to runs of abutting single nucleotide variants. We demonstrate linear-time identification of bundles in a bidirected graph. These two advances build on initial work on ultrabubbles in bidirected graphs, and define a more granular concept of genetic site.

bioinformatics
visualization
representation
rather-odd
to-understand
ultrabubbles
april 2017 by Vaguery

Why Momentum Really Works

april 2017 by Vaguery

Here’s a popular story about momentum [1, 2, 3]: gradient descent is a man walking down a hill. He follows the steepest path downwards; his progress is slow, but steady. Momentum is a heavy ball rolling down the same hill. The added inertia acts both as a smoother and an accelerator, dampening oscillations and causing us to barrel through narrow valleys, small humps and local minima.

This standard story isn’t wrong, but it fails to explain many important behaviors of momentum. In fact, momentum can be understood far more precisely if we study it on the right model.

One nice model is the convex quadratic. This model is rich enough to reproduce momentum’s local dynamics in real problems, and yet simple enough to be understood in closed form. This balance gives us powerful traction for understanding this algorithm.
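
A small sketch of that model, assuming a diagonal convex quadratic f(w) = ½ Σ λᵢ wᵢ² and the usual heavy-ball update (z ← βz + ∇f(w), then w ← w − αz). The function names, eigenvalues, step size, and β below are illustrative choices, not values from the article. Setting β = 0 recovers plain gradient descent.

```python
def loss(eigs, w):
    """f(w) = 1/2 * sum(lambda_i * w_i^2) for a diagonal quadratic."""
    return 0.5 * sum(lam * wi * wi for lam, wi in zip(eigs, w))


def descend(eigs, w0, alpha, beta, steps):
    """Gradient descent with momentum on the diagonal quadratic.

    beta = 0.0 gives plain gradient descent.
    """
    w = list(w0)
    z = [0.0] * len(w)  # accumulated "velocity"
    for _ in range(steps):
        grad = [lam * wi for lam, wi in zip(eigs, w)]
        z = [beta * zi + gi for zi, gi in zip(z, grad)]
        w = [wi - alpha * zi for wi, zi in zip(w, z)]
    return w


# Ill-conditioned example: curvature 1 in one direction, 100 in the other.
eigs = [1.0, 100.0]
w0 = [1.0, 1.0]

plain = descend(eigs, w0, alpha=0.01, beta=0.0, steps=100)
heavy = descend(eigs, w0, alpha=0.01, beta=0.9, steps=100)

print("plain GD loss:", loss(eigs, plain))
print("momentum loss:", loss(eigs, heavy))
```

With the same step size, the momentum run ends much closer to the minimum, because the flat direction (λ = 1) is what limits plain gradient descent here.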

via:arthegall
visualization
rather-interesting
to-write-about
simulation
javascript
interactivity
nonlinear-dynamics
april 2017 by Vaguery

[1311.6763] Outer Billiards on Regular Polygons

march 2017 by Vaguery

In 1973, J. Moser proposed that his Twist Theorem could be used to show that orbits of the outer billiards map on a sufficiently smooth closed curve were always bounded. Five years later Moser asked the same question for a convex polygon. In 1987 F. Vivaldi and A. Shaidenko showed that all orbits for a regular polygon must be bounded. R. Schwartz recently showed that a quadrilateral known as a Penrose Kite has unbounded orbits and he proposed that 'most' convex polygons support unbounded orbits.

Except for a few special cases, very little is known about the dynamics of the outer billiards map on regular polygons. In this paper we present a unified approach to the analysis of regular polygons - using the canonical 'resonances' which are shared by all regular N-gons. In the case of the regular pentagon and regular octagon these resonances exist on all scales and the fractal structure is well documented, but these are the only non-trivial cases that have been analyzed. We present a partial analysis of the regular heptagon, but the limiting structure is poorly understood and this does not bode well for the remaining regular polygons. The minimal polynomial for the vertices of a regular N-gon has degree Phi(N)/2 where Phi is the Euler totient function, so N = 5, 7 and 11 are respectively quadratic, cubic and quintic. In the words of R. Schwartz, "A case such as N = 11 seems beyond the reach of current technology."
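
The degree claim in the abstract is easy to check numerically. This sketch computes Euler's totient by direct counting and halves it; the function names are hypothetical and the brute-force totient is only meant for small N.

```python
from math import gcd


def totient(n):
    """Euler's totient: how many k in 1..n are coprime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def vertex_polynomial_degree(n):
    """Degree Phi(N)/2 quoted in the abstract, for a regular N-gon (N >= 3)."""
    return totient(n) // 2


degrees = {n: vertex_polynomial_degree(n) for n in (5, 7, 8, 11)}
print(degrees)
```

N = 5 and N = 8 both come out quadratic, which fits the abstract's remark that the pentagon and octagon are the well-understood cases, while N = 7 is cubic and N = 11 is quintic.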

Some of the graphics have embedded high-resolution versions so this file is about 39Mb in size. This file and a smaller version can be downloaded at dynamicsofpolygons.org. Just click on PDFs.

This paper is dedicated to the memory of Eugene Gutkin (1946-2013) who made fundamental contributions to both inner and outer billiards.

billiards
dynamical-systems
chaos
rather-interesting
to-write-about
visualization
march 2017 by Vaguery

[1501.06328] A weakly universal cellular automaton with 2 states on the tiling {11,3}

march 2017 by Vaguery

In this paper, we construct a weakly universal cellular automaton with two states only on the tiling {11,3}. The cellular automaton is rotation invariant and it is a true planar one.

cellular-automata
computer-science
universal-computation
mathematical-recreations
rather-interesting
to-write-about
visualization
march 2017 by Vaguery

[0806.0928] Drawing Binary Tanglegrams: An Experimental Evaluation

march 2017 by Vaguery

A binary tanglegram is a pair <S,T> of binary trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges. For applications, for example in phylogenetics or software engineering, it is required that the individual trees are drawn crossing-free. A natural optimization problem, denoted tanglegram layout problem, is thus to minimize the number of crossings between inter-tree edges.
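
As a sketch of the objective only: once each tree is drawn with its leaves in some fixed top-to-bottom order, the number of crossings among the inter-tree edges equals the number of inversions between the two leaf orders. The function below evaluates one given layout. It does not search over the layouts the two trees permit, which is the hard part, and the name `count_crossings` is made up.

```python
def count_crossings(left_order, right_order):
    """Count crossings between straight inter-tree edges.

    `left_order` and `right_order` list the same leaf labels in the
    top-to-bottom order in which each tree draws them. Two edges cross
    exactly when their endpoints appear in opposite relative order.
    """
    position = {leaf: i for i, leaf in enumerate(right_order)}
    seq = [position[leaf] for leaf in left_order]
    # Simple O(n^2) inversion count; fine for small examples.
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


print(count_crossings("abcd", "abcd"))
print(count_crossings("abcd", "badc"))
print(count_crossings("abcd", "dcba"))
```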

The tanglegram layout problem is NP-hard and is currently considered both in application domains and theory. In this paper we present an experimental comparison of a recursive algorithm of Buchin et al., our variant of their algorithm, the algorithm hierarchy sort of Holten and van Wijk, and an integer quadratic program that yields optimal solutions.

graph-layout
algorithms
visualization
optimization
rather-interesting
to-write-about
nudge-targets
consider:looking-to-see
march 2017 by Vaguery

mathrecreation: polynomial grid division examples

march 2017 by Vaguery

There are not enough examples of polynomial division using the grid method out there. To remedy that, I have posted about 100 billion examples for your viewing pleasure. Please check ‘em out: https://dmackinnon1.github.io/polygrid/

Jokes aside, I was looking for a small JavaScript project, and this one looked like it would be fun. It was, and I learned a few things by building it. The page will generate a small number of examples, but you can get a fresh batch by reloading. Each example is calculated on the fly, and rendered using MathJax. Currently, the displayed calculations look like this:

matrices
javascript
visualization
nudge-targets
to-write-about
algorithms
march 2017 by Vaguery

[1603.06252] Grouping Time-varying Data for Interactive Exploration

march 2017 by Vaguery

We present algorithms and data structures that support the interactive analysis of the grouping structure of one-, two-, or higher-dimensional time-varying data while varying all defining parameters. Grouping structures characterise important patterns in the temporal evaluation of sets of time-varying data. We follow Buchin et al. [JoCG 2015] who define groups using three parameters: group-size, group-duration, and inter-entity distance. We give upper and lower bounds on the number of maximal groups over all parameter values, and show how to compute them efficiently. Furthermore, we describe data structures that can report changes in the set of maximal groups in an output-sensitive manner. Our results hold in $\mathbb{R}^d$ for fixed $d$.

clustering
feature-construction
rather-interesting
to-understand
consider:performance-space-analysis
compare-to-Pareto-GP-features
visualization
approximation
to-write-about
algorithms
computational-geometry
march 2017 by Vaguery

[1702.07815] Subquadratic Algorithms for the Diameter and the Sum of Pairwise Distances in Planar Graphs

march 2017 by Vaguery

We show how to compute for $n$-vertex planar graphs in $O(n^{11/6}\,\mathrm{polylog}(n))$ expected time the diameter and the sum of the pairwise distances. The algorithms work for directed graphs with real weights and no negative cycles. In $O(n^{15/8}\,\mathrm{polylog}(n))$ expected time we can also compute the number of pairs of vertices at distance smaller than a given threshold. These are the first algorithms for these problems using time $O(n^c)$ for some constant $c<2$, even when restricted to undirected, unweighted planar graphs.
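
For contrast, here is the obvious baseline the paper improves on, sketched for the simplest case of a connected, undirected, unweighted graph: one breadth-first search per vertex, which is quadratic in the number of vertices on planar graphs. This is not the paper's algorithm, and the function names and adjacency-list format are assumptions made for the example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_distance_sum(adj):
    """Naive all-sources BFS; assumes the graph is connected and undirected."""
    diameter = 0
    total = 0
    for source in adj:
        dist = bfs_distances(adj, source)
        diameter = max(diameter, max(dist.values()))
        total += sum(dist.values())
    # Every unordered pair was counted once from each endpoint.
    return diameter, total // 2


# A 4-cycle: planar, four adjacent pairs and two opposite pairs.
square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(diameter_and_distance_sum(square))
```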

graph-theory
algorithms
computational-complexity
purdy-pitchers
rather-interesting
to-write-about
visualization
consider:looking-to-see
consider:robustness
consider:violating-constraints
march 2017 by Vaguery

[1703.00983] ASAP: Automatic Smoothing for Attention Prioritization in Streaming Time Series Visualization

march 2017 by Vaguery

Time series visualization of streaming telemetry (i.e., charting of key metrics such as server load over time) is increasingly prevalent in recent application deployments. Existing systems simply plot the raw data streams as they arrive, potentially obscuring large-scale deviations due to local variance and noise. We propose an alternative: to better prioritize attention in time series exploration and monitoring visualizations, smooth the time series as much as possible to remove noise while still retaining large-scale structure. We develop a new technique for automatically smoothing streaming time series that adaptively optimizes this trade-off between noise reduction (i.e., variance) and outlier retention (i.e., kurtosis). We introduce metrics to quantitatively assess the quality of the choice of smoothing parameter and provide an efficient streaming analytics operator, ASAP, that optimizes these metrics by combining techniques from stream processing, user interface design, and signal processing via a novel autocorrelation-based pruning strategy and pixel-aware preaggregation. We demonstrate that ASAP is able to improve users' accuracy in identifying significant deviations in time series by up to 38.4% while reducing response times by up to 44.3%. Moreover, ASAP delivers these results several orders of magnitude faster than alternative optimization strategies.
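
A rough sketch of the trade-off described in the abstract: smooth with a moving average, and among candidate window sizes keep the one with the lowest roughness whose kurtosis is still at least that of the raw series. It is a brute-force search in plain Python, not the paper's pruned streaming operator. Measuring roughness as the standard deviation of first differences, the function names, and the toy data are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def moving_average(xs, window):
    """Simple moving average; the output is shorter by window - 1 points."""
    return [fmean(xs[i:i + window]) for i in range(len(xs) - window + 1)]


def roughness(xs):
    """Standard deviation of first differences (assumed noise measure)."""
    return pstdev([b - a for a, b in zip(xs, xs[1:])])


def kurtosis(xs):
    """Fourth standardised moment; 0.0 for a constant series."""
    mean = fmean(xs)
    var = fmean([(x - mean) ** 2 for x in xs])
    if var == 0:
        return 0.0
    return fmean([(x - mean) ** 4 for x in xs]) / var ** 2


def choose_window(xs, max_window):
    """Smoothest window that does not reduce kurtosis below the raw value."""
    target = kurtosis(xs)
    best_window, best_roughness = 1, roughness(xs)
    for window in range(2, max_window + 1):
        smoothed = moving_average(xs, window)
        r = roughness(smoothed)
        if kurtosis(smoothed) >= target and r < best_roughness:
            best_window, best_roughness = window, r
    return best_window


# Toy telemetry: a small repeating jitter plus one sustained deviation.
series = [((i * 7) % 5) - 2 + (20 if 40 <= i < 45 else 0) for i in range(100)]
window = choose_window(series, max_window=10)
print(window, roughness(series), roughness(moving_average(series, window)))
```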

time-series
visualization
feature-extraction
smoothing
rather-interesting
algorithms
via:arthegall
to-write-about
compare:Pareto-GP-shifts
march 2017 by Vaguery

[1309.1779] Fractal dimension versus process complexity

february 2017 by Vaguery

Complexity measures are designed to capture complex behavior and quantify *how* complex, according to that measure, that particular behavior is. It can be expected that different complexity measures from possibly entirely different fields are related to each other in a non-trivial fashion. Here we study small Turing machines (TMs) with two symbols, and two and three states. For any particular such machine τ and any particular input x we consider what we call the 'space-time' diagram which is the collection of consecutive tape configurations of the computation τ(x). In our setting, we define fractal dimension of a Turing machine as the limiting fractal dimension of the corresponding space-time diagram. It turns out that there is a very strong relation between the fractal dimension of a Turing machine of the above-specified type and its runtime complexity. In particular, a TM with three states and two colors runs in at most linear time iff its dimension is 2, and its dimension is 1 iff it runs in super-polynomial time and it uses polynomial space. If a TM runs in time $O(x^n)$ we have empirically verified that the corresponding dimension is $(n+1)/n$, a result that we can only partially prove. We find the results presented here remarkable because they relate two completely different complexity measures: the geometrical fractal dimension on the one side versus the time complexity of a computation on the other side.

computational-complexity
visualization
fractals
rather-interesting
mathematical-recreations
to-write-about
measurement
february 2017 by Vaguery

Inside an AI 'brain' - What does machine learning look like?

february 2017 by Vaguery

One aspect all recent machine learning frameworks have in common - TensorFlow, MxNet, Caffe, Theano, Torch and others - is that they use the concept of a computational graph as a powerful abstraction. A graph is simply the best way to describe the models you create in a machine learning system. These computational graphs are made up of vertices (think neurons) for the compute elements, connected by edges (think synapses), which describe the communication paths between vertices.
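
A toy version of that abstraction, assuming nothing about how any of the named frameworks represent graphs internally: each vertex is an operation, each edge names the vertex whose output it consumes, and evaluation walks the edges backwards from the requested output. The dictionary layout and the `evaluate` helper are invented for this sketch.

```python
import operator


def evaluate(graph, target, feeds):
    """Evaluate `target` in a tiny computational graph.

    `graph` maps a vertex name to (operation, list of input vertex names);
    `feeds` supplies values for the input vertices. Results are cached so
    each vertex is computed at most once.
    """
    cache = dict(feeds)

    def visit(name):
        if name not in cache:
            op, inputs = graph[name]
            cache[name] = op(*(visit(i) for i in inputs))
        return cache[name]

    return visit(target)


# output = x * w + b, written as two compute vertices and three inputs.
graph = {
    "product": (operator.mul, ["x", "w"]),
    "output": (operator.add, ["product", "b"]),
}

result = evaluate(graph, "output", {"x": 2.0, "w": 3.0, "b": 1.0})
print(result)
```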

visualization
very-nice
to-write-about
graph-theory
deep-learning
february 2017 by Vaguery

Research Blog: Open sourcing the Embedding Projector: a tool for visualizing high dimensional data

december 2016 by Vaguery

Recent advances in Machine Learning (ML) have shown impressive results, with applications ranging from image recognition, language translation, medical diagnosis and more. With the widespread adoption of ML systems, it is increasingly important for research scientists to be able to explore how the data is being interpreted by the models. However, one of the main challenges in exploring this data is that it often has hundreds or even thousands of dimensions, requiring special tools to investigate the space.

To enable a more intuitive exploration process, we are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data recently shown as an A.I. Experiment, as part of TensorFlow. We are also releasing a standalone version at projector.tensorflow.org, where users can visualize their high-dimensional data without the need to install and run TensorFlow.

visualization
dimension-reduction
data-analysis
tools
open-source
google
december 2016 by Vaguery

[1603.02518] A New Method to Visualize Deep Neural Networks

august 2016 by Vaguery

We present a method for visualising the response of a deep neural network to a specific input. For image data for instance our method will highlight areas that provide evidence in favor of, and against choosing a certain class. The method overcomes several shortcomings of previous methods and provides great additional insight into the decision making process of convolutional networks, which is important both to improve models and to accelerate the adoption of such methods in e.g. medicine. In experiments on ImageNet data, we illustrate how the method works and can be applied in different ways to understand deep neural nets.

deep-learning
neural-networks
visualization
interactivity
user-experience
explanation
august 2016 by Vaguery

[1607.06444] The Complexity of Drawing Graphs on Few Lines and Few Planes

august 2016 by Vaguery

It is well known that any graph admits a crossing-free straight-line drawing in $\mathbb{R}^3$ and that any planar graph admits the same even in $\mathbb{R}^2$. For $d \in \{2,3\}$, let $\rho^1_d(G)$ denote the minimum number of lines in $\mathbb{R}^d$ that together can accommodate all edges of a drawing of $G$, where $\rho^1_2(G)$ is defined for planar graphs. We investigate the complexity of computing these parameters and obtain the following hardness and algorithmic results.

- For $d \in \{2,3\}$, we prove that deciding whether $\rho^1_d(G) \le k$ for a given graph $G$ and integer $k$ is equivalent to the decision problem of the existential first-order theory of the ordered field $\mathbb{R}$. This means that both problems are complete for the complexity class $\exists\mathbb{R}$ recently identified in computational geometry. The result concerning $\rho^1_3$ holds even if $G$ is restricted to be planar.

- Since $\mathrm{NP} \subseteq \exists\mathbb{R}$, deciding $\rho^1_d(G) \le k$ is NP-hard for $d \in \{2,3\}$. On the positive side, we show that the problem is fixed-parameter tractable with respect to $k$.

- Since $\exists\mathbb{R} \subseteq \mathrm{PSPACE}$, both $\rho^1_2(G)$ and $\rho^1_3(G)$ are computable in polynomial space. On the negative side, we show that constructing a drawing optimal with respect to $\rho^1_2$ or $\rho^1_3$ requires exponential space in the worst case, if the vertices are drawn at points with rational coordinates.

- Let $\rho^2_3(G)$ be the minimum number of planes in $\mathbb{R}^3$ needed to accommodate a straight-line drawing of a graph $G$. We prove that deciding whether $\rho^2_3(G) \le k$ is NP-hard for any fixed $k$. Hence, the problem is not fixed-parameter tractable with respect to $k$ unless $\mathrm{P} = \mathrm{NP}$.

graph-layout
computational-complexity
algorithms
rather-interesting
visualization
party-pitchers
combinatorics
representation
august 2016 by Vaguery

[1605.08749] Visual Model Validation via Inline Replication

august 2016 by Vaguery

Data visualizations typically show retrospective views of an existing dataset with little or no focus on repeatability. However, consumers of these tools often use insights gleaned from retrospective visualizations as the basis for decisions about future events. In this way, visualizations often serve as visual predictive models despite the fact that they are typically designed to present historical views of the data. This "visual predictive model" approach, however, can lead to invalid inferences. In this paper, we describe an approach to visual model validation called Inline Replication (IR) which, similar to the cross-validation technique used widely in machine learning, provides a nonparametric and broadly applicable technique for visual model assessment and repeatability. This paper describes the overall IR process and outlines how it can be integrated into both traditional and emerging "big data" visualization pipelines. Examples are provided showing IR integrated within common visualization techniques (such as bar charts and linear regression lines) as well as a more fully-featured visualization system designed for complex exploratory analysis tasks.

visualization
data-analysis
user-interface
user-experience
statistics
exploratory-data-analysis
nudge-targets
consider:gp-approach
august 2016 by Vaguery

[1606.09488] A weakly universal cellular automaton in the heptagrid

july 2016 by Vaguery

In this paper, we construct a weakly universal cellular automaton in the heptagrid, the tessellation {7,3} which is not rotation invariant but which is truly planar. This result, under these conditions, cannot be improved for the tessellations {p,3}.

cellular-automata
artificial-life
rather-interesting
visualization
to-explore
july 2016 by Vaguery

[1604.01674] OFFl models: novel schema for dynamical modeling of biological systems

july 2016 by Vaguery

Flow diagrams are a common tool used to help build and interpret models of dynamical systems, often in biological contexts such as consumer-resource models and similar compartmental models. Typically, their usage is intuitive and informal. Here, we present a formalized version of flow diagrams as a kind of weighted directed graph which follow a strict grammar, which translate into a system of ordinary differential equations (ODEs) by a single unambiguous rule, and which have an equivalent representation as a relational database. (We abbreviate this schema of "ODEs and formalized flow diagrams" as OFFl.) Drawing a diagram within this strict grammar encourages a mental discipline on the part of the modeler in which all dynamical processes of a system are thought of as interactions between dynamical species that draw parcels from one or more source species and deposit them into target species according to a set of transformation rules. From these rules, the net rate of change for each species can be derived. The modeling schema can therefore be understood as both an epistemic and practical heuristic for modeling, serving both as an organizational framework for the model building process and as a mechanism for deriving ODEs. All steps of the schema beyond the initial scientific (intuitive, creative) abstraction of natural observations into model variables are algorithmic and easily carried out by a computer, thus enabling the future development of a dedicated software implementation. Such tools would empower the modeler to consider significantly more complex models than practical limitations might have otherwise proscribed, since the modeling framework itself manages that complexity on the modeler's behalf. In this report, we describe the chief motivations for OFFl, outline its implementation, and utilize a range of classic examples from ecology and epidemiology to showcase its features.

models-and-modes
representation
visualization
systems-biology
formalization
amusing
theoretical-biology
systems-thinking
july 2016 by Vaguery

[1606.00667] Converting virtual link diagrams to normal ones

june 2016 by Vaguery

A virtual link diagram is called normal if the associated abstract link diagram is checkerboard colorable, and a virtual link is normal if it has a normal diagram as a representative. In this paper, we introduce a method of converting a virtual link diagram to a normal virtual link diagram by use of the double covering technique. We show that the normal virtual link diagrams obtained from two equivalent virtual link diagrams are related by generalized Reidemeister moves and Kauffman flypes.

knot-theory
visualization
representation
rather-interesting
nudge-targets
consider:representation
consider:FIELD
june 2016 by Vaguery

[1606.06488] Discretized Approaches to Schematization

june 2016 by Vaguery

To produce cartographic maps, simplification is typically used to reduce complexity of the map to a legible level. With schematic maps, however, this simplification is pushed far beyond the legibility threshold and is instead constrained by functional need and resemblance. Moreover, stylistic geometry is often used to convey the schematic nature of the map. In this paper we explore discretized approaches to computing a schematic shape S for a simple polygon P. We do so by overlaying a plane graph G on P as the solution space for the schematic shape. Topological constraints imply that S should describe a simple polygon. We investigate two approaches, simple map matching and connected face selection, based on commonly used similarity metrics.

With the former, S is a simple cycle C in G and we quantify resemblance via the Fréchet distance. We prove that it is NP-hard to compute a cycle that approximates the minimal Fréchet distance over all simple cycles in a plane graph G. This result holds even if G is a partial grid graph, if area preservation is required and if we assume a given sequence of turns is specified.

With the latter, S is a connected face set in G, quantifying resemblance via the symmetric difference. Though the symmetric difference seems a less strict measure, we prove that it is NP-hard to compute the optimal face set. This result holds even if G is full grid graph or a triangular or hexagonal tiling, and if area preservation is required. Moreover, it is independent of whether we allow the set of faces to have holes or not.

computational-geometry
approximation
visualization
graph-theory
rather-interesting
algorithms
summarization
nudge-targets
consider:feature-discovery
june 2016 by Vaguery

[1603.07610v1] Going Out of Business: Auction House Behavior in the Massively Multi-Player Online Game

march 2016 by Vaguery

The in-game economies of massively multi-player online games (MMOGs) are complex systems that have to be carefully designed and managed. This paper presents the results of an analysis of auction house data from the MMOG Glitch, across a 14 month time period, the entire lifetime of the game. The data comprise almost 3 million data points, over 20,000 unique players and more than 650 products. Furthermore, an interactive visualization, based on Sankey flow diagrams, is presented which shows the proportion of the different clusters across each time bin, as well as the flow of players between clusters. The diagram allows evaluation of migration of players between clusters as a function of time, as well as churn analysis. The presented work provides a template analysis and visualization model for progression-based or temporal-based analysis of player behavior broadly applicable to games.

experiment
economics
rather-interesting
MMORPG
games
network-theory
visualization
nice
march 2016 by Vaguery

[1602.08084] Ribbonlength of folded ribbon unknots in the plane

march 2016 by Vaguery

We study Kauffman's model of folded ribbon knots: knots made of a thin strip of paper folded flat in the plane. The ribbonlength is the length to width ratio of such a ribbon, and it turns out that the way the ribbon is folded influences the ribbonlength. We give an upper bound of $n\cot(\pi/n)$ for the ribbonlength of $n$-stick unknots. We prove that the minimum ribbonlength for a 3-stick unknot with the same type of fold at each vertex is $3\sqrt{3}$, and such a minimizer is an equilateral triangle. We end the paper with a discussion of projection stick number and ribbonlength.

knot-theory
visualization
discrete-mathematics
rather-interesting
representation
algorithms
nudge-targets
consider:representation
march 2016 by Vaguery

[1411.1350] Geometric Network Comparison

december 2015 by Vaguery

Network analysis has a crucial need for tools to compare networks and assess the significance of differences between networks. We propose a principled statistical approach to network comparison that approximates networks as probability distributions on negatively curved manifolds. We outline the theory, as well as implement the approach on simulated networks.

network-theory
via:cshalizi
inverse-problems
visualization
classification
nudge-targets
consider:looking-to-see
consider:stress-testing
short-term
december 2015 by Vaguery

[1507.08379] VMF-SNE: Embedding for Spherical Data

november 2015 by Vaguery

T-SNE is a well-known approach to embedding high-dimensional data and has been widely used in data visualization. The basic assumption of t-SNE is that the data are non-constrained in the Euclidean space and the local proximity can be modelled by Gaussian distributions. This assumption does not hold for a wide range of data types in practical applications, for instance spherical data for which the local proximity is better modelled by the von Mises-Fisher (vMF) distribution instead of the Gaussian. This paper presents a vMF-SNE embedding algorithm to embed spherical data. An iterative process is derived to produce an efficient embedding. The results on a simulation data set demonstrated that vMF-SNE produces better embeddings than t-SNE for spherical data.

dimension-reduction
approximation
clustering
rather-interesting
statistics
algorithms
nudge-targets
visualization
november 2015 by Vaguery

[1502.05461] Visualizing Object Detection Features

november 2015 by Vaguery

We introduce algorithms to visualize feature spaces used by object detectors. Our method works by inverting a visual feature back to multiple natural images. We found that these visualizations allow us to analyze object detection systems in new ways and gain new insight into the detector's failures. For example, when we visualize the features for high scoring false alarms, we discovered that, although they are clearly wrong in image space, they do look deceptively similar to true positives in feature space. This result suggests that many of these false alarms are caused by our choice of feature space, and supports that creating a better learning algorithm or building bigger datasets is unlikely to correct these errors. By visualizing feature spaces, we can gain a more intuitive understanding of recognition systems.

deep-learning
image-processing
image-segmentation
visualization
rather-interesting
algorithms
feature-extraction
nudge-targets
consider:rediscovery
november 2015 by Vaguery

[1504.02442] A Visual Formalism for Interacting Systems

november 2015 by Vaguery

Interacting systems are increasingly common. Many examples pervade our everyday lives: automobiles, aircraft, defense systems, telephone switching systems, financial systems, national governments, and so on. Closer to computer science, embedded systems and Systems of Systems are further examples of interacting systems. Common to all of these is that some "whole" is made up of constituent parts, and these parts interact with each other. By design, these interactions are intentional, but it is the unintended interactions that are problematic. The Systems of Systems literature uses the terms "constituent systems" and "constituents" to refer to systems that interact with each other. That practice is followed here. This paper presents a visual formalism, Swim Lane Event-Driven Petri Nets, that is proposed as a basis for Model-Based Testing (MBT) of interacting systems. In the absence of available tools, this model can only support the offline form of Model-Based Testing.

visualization
formalization
UML
diagrams
meh
concurrency
november 2015 by Vaguery

[1503.01034] Quantomatic: A Proof Assistant for Diagrammatic Reasoning

november 2015 by Vaguery

Monoidal algebraic structures consist of operations that can have multiple outputs as well as multiple inputs, which have applications in many areas including categorical algebra, programming language semantics, representation theory, algebraic quantum information, and quantum groups. String diagrams provide a convenient graphical syntax for reasoning formally about such structures, while avoiding many of the technical challenges of a term-based approach. Quantomatic is a tool that supports the (semi-)automatic construction of equational proofs using string diagrams. We briefly outline the theoretical basis of Quantomatic's rewriting engine, then give an overview of the core features and architecture and give a simple example project that computes normal forms for commutative bialgebras.

formal-languages
visualization
rather-interesting
formalization
logic-programming
nudge-targets
consider:rediscovery
november 2015 by Vaguery

[1511.00422] Abelian logic gates

november 2015 by Vaguery

An abelian processor is an automaton whose output is independent of the order of its inputs. Bond and Levine have proved that a network of abelian processors performs the same computation regardless of processing order (subject only to a halting condition). We prove that any finite abelian processor can be emulated by a network of certain very simple abelian processors, which we call gates. The most fundamental gate is a "toppler", which absorbs input particles until their number exceeds some given threshold, at which point it topples, emitting one particle and returning to its initial state. With the exception of an adder gate, which simply combines two streams of particles, each of our gates has only one input wire. Our results can be reformulated in terms of the functions computed by processors, and one consequence is that any increasing function from N^k to N^l that is the sum of a linear function and a periodic function can be expressed in terms of floors of quotients by integers, and addition.
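
The toppler described above is simple enough to simulate. The following is a minimal sketch, assuming a toppler is characterized by the number of particles it absorbs per topple; that number is called `period` here, a hypothetical name (under a literal reading of "exceeds some given threshold" it would be the threshold plus one). The class and helper names are likewise illustrative and not taken from the paper.

```python
class Toppler:
    """Single-input gate: absorbs particles, emits one per `period` inputs."""

    def __init__(self, period: int):
        if period < 1:
            raise ValueError("period must be a positive integer")
        self.period = period
        self.count = 0  # particles absorbed since the last topple

    def receive(self) -> int:
        """Absorb one particle; return the number of particles emitted (0 or 1)."""
        self.count += 1
        if self.count == self.period:
            self.count = 0  # topple and return to the initial state
            return 1
        return 0


def feed(gate: Toppler, particles: int) -> int:
    """Send `particles` inputs into a gate and count the emitted particles."""
    return sum(gate.receive() for _ in range(particles))


if __name__ == "__main__":
    print(feed(Toppler(3), 10))
```

Starting from the initial state, a gate built this way emits `particles // period` particles in total, which connects the toppler to the "floors of quotients by integers" mentioned at the end of the abstract.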

computer-science
rather-interesting
visualization
wow-that-got-odd-real-quick
Peter-Winkler
nudge-targets
consider:looking-to-see
group-theory
proof
november 2015 by Vaguery

[1412.7367] A Framework for Evaluating Complex Networks Measurements

november 2015 by Vaguery

A good deal of current research in complex networks involves the characterization and/or classification of the topological properties of given structures, which has motivated several respective measurements. This letter proposes a framework for evaluating the quality of complex network measurements in terms of their effective resolution, degree of degeneracy and discriminability. The potential of the suggested approach is illustrated with respect to comparing the characterization of several model and real-world networks by using concentric and symmetry measurements. The results indicate a markedly superior performance for the latter type of mapping.

rather-interesting
network-theory
community-detection
statistics
algorithms
visualization
performance-measure
feature-extraction
nudge-targets
consider:visualization
november 2015 by Vaguery

[1505.00343] A first-order logic for string diagrams

november 2015 by Vaguery

Equational reasoning with string diagrams provides an intuitive means of proving equations between morphisms in a symmetric monoidal category. This can be extended to proofs of infinite families of equations using a simple graphical syntax called !-box notation. While this does greatly increase the proving power of string diagrams, previous attempts to go beyond equational reasoning have been largely ad hoc, owing to the lack of a suitable logical framework for diagrammatic proofs involving !-boxes. In this paper, we extend equational reasoning with !-boxes to a fully-fledged first order logic with conjunction, implication, and universal quantification over !-boxes. This logic, called !L, is then rich enough to properly formalise an induction principle for !-boxes. We then build a standard model for !L and give an example proof of a theorem for non-commutative bialgebras using !L, which is unobtainable by equational reasoning alone.

formalization
visualization
representation
mathematics
proof
rather-interesting
category-theory
nudge-targets
logic-programming
consider:representation
november 2015 by Vaguery

[1506.06668] Fairy Lights in Femtoseconds: Aerial and Volumetric Graphics Rendered by Focused Femtosecond Laser Combined with Computational Holographic Fields

november 2015 by Vaguery

We present a method of rendering aerial and volumetric graphics using femtosecond lasers. A high-intensity laser excites a physical matter to emit light at an arbitrary 3D position. Popular applications can then be explored especially since plasma induced by a femtosecond laser is safer than that generated by a nanosecond laser. There are two methods of rendering graphics with a femtosecond laser in air: Producing holograms using spatial light modulation technology, and scanning of a laser beam by a galvano mirror. The holograms and workspace of the system proposed here occupy a volume of up to 1 cm^3; however, this size is scalable depending on the optical devices and their setup. This paper provides details of the principles, system setup, and experimental evaluation, and discussions on scalability, design space, and applications of this system. We tested two laser sources: an adjustable (30-100 fs) laser which projects up to 1,000 pulses per second at energy up to 7 mJ per pulse, and a 269-fs laser which projects up to 200,000 pulses per second at an energy up to 50 uJ per pulse. We confirmed that the spatiotemporal resolution of volumetric displays, implemented with these laser sources, is 4,000 and 200,000 dots per second. Although we focus on laser-induced plasma in air, the discussion presented here is also applicable to other rendering principles such as fluorescence and microbubble in solid/liquid materials.

engineering-design
indistinguishable-from-magic
3d
display
visualization
animation
lasers
november 2015 by Vaguery

[1403.6025] Web-Based Visualization of Very Large Scientific Astronomy Imagery

november 2015 by Vaguery

Visualizing and navigating through large astronomy images from a remote location with current astronomy display tools can be a frustrating experience in terms of speed and ergonomics, especially on mobile devices. In this paper, we present a high performance, versatile and robust client-server system for remote visualization and analysis of extremely large scientific images. Applications of this work include survey image quality control, interactive data query and exploration, citizen science, as well as public outreach. The proposed software is entirely open source and is designed to be generic and applicable to a variety of datasets. It provides access to floating point data at terabyte scales, with the ability to precisely adjust image settings in real-time. The proposed clients are light-weight, platform-independent web applications built on standard HTML5 web technologies and compatible with both touch and mouse-based devices. We put the system to the test and assess the performance of the system and show that a single server can comfortably handle more than a hundred simultaneous users accessing full precision 32 bit astronomy data.

visualization
user-experience
user-interface
computer-science
big-data
rather-interesting
practical-problems
devops
web-design
november 2015 by Vaguery

Understanding Society: A survey of agent-based models

september 2015 by Vaguery

Federico Bianchi and Flaminio Squazzoni have published a very useful survey of the development and uses of agent-based models in the social sciences over the past twenty-five years in WIREs Comput Stat 2015 (link). The article is a very useful reference and discussion for anyone interested in the applicability of ABM within sociology.

hey-I-know-this-guy
agent-based
review
rather-interesting
visualization
evolutionary-economics
september 2015 by Vaguery

[1406.7331] Graphical Constructions for the sl(3), so(3) and G2 Invariants for Virtual Knots, Virtual Braids and Free Knots

september 2015 by Vaguery

We construct graph-valued analogues of the Kuperberg sl(3) and G2 invariants for virtual knots. The restriction of the sl(3) or G2 invariants for classical knots coincides with the usual Homflypt sl(3) invariant and G2 invariants. For virtual knots and graphs these invariants provide new graphical information that allows one to prove minimality theorems and to construct new invariants for free knots (unoriented and unlabeled Gauss codes taken up to abstract Reidemeister moves). A novel feature of this approach is that some knots are of sufficient complexity that they evaluate themselves in the sense that the invariant is the knot itself seen as a combinatorial structure. The paper generalizes these structures to virtual braids and discusses the relationship with the original Penrose bracket for graph colorings.

knot-theory
graph-theory
visualization
ontology
representation
nudge-targets
consider:search-moves
september 2015 by Vaguery

[1405.0193] The complex planetary synchronization structure of the solar system

september 2015 by Vaguery

The complex planetary synchronization structure of the solar system, which since Pythagoras of Samos (ca. 570-495 BC) is known as the music of the spheres, is briefly reviewed from the Renaissance up to contemporary research. Copernicus' heliocentric model from 1543 suggested that the planets of our solar system form a kind of mutually ordered and quasi-synchronized system. From 1596 to 1619 Kepler formulated preliminary mathematical relations of approximate commensurabilities among the planets, which were later reformulated in the Titius-Bode rule (1766-1772) that successfully predicted the orbital position of Ceres and Uranus. Following the discovery of the ~11 yr sunspot cycle, in 1859 Wolf suggested that the observed solar variability could be approximately synchronized with the orbital movements of Venus, Earth, Jupiter and Saturn. Modern research has further confirmed that: (1) the planetary orbital periods can be approximately deduced from a simple system of resonant frequencies; (2) the solar system oscillates with a specific set of gravitational frequencies, and many of them (e.g. within the range between 3 yr and 100 yr) can be approximately constructed as harmonics of a base period of ~178.38 yr; (3) solar and climate records are also characterized by planetary harmonics from the monthly to the millennia time scales. This short review concludes with an emphasis on the contribution of the author's research on the empirical evidence and physical modeling of both solar and climate variability based on astronomical harmonics. The general conclusion is that the solar system works as a resonator characterized by a specific harmonic planetary structure that synchronizes also the Sun's activity and the Earth's climate.

rather-interesting
astronomy
nonlinear-dynamics
visualization
september 2015 by Vaguery

[1504.01727] Some Elementary Aspects of 4-dimensional Geometry

august 2015 by Vaguery

We indicate that Heron's formula (which relates the square of the area of a triangle to a quartic function of its edge lengths) can be interpreted as a scissors congruence in 4-dimensional space. In the process of demonstrating this, we examine a number of decompositions of hypercubes, hyper-parallelograms, and other elementary 4-dimensional solids.
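
For reference, the quartic relation the abstract alludes to can be written as 16·A² = (a+b+c)(−a+b+c)(a−b+c)(a+b−c). The sketch below simply evaluates that standard form of Heron's formula; the function names are illustrative, and the code says nothing about the paper's 4-dimensional scissors-congruence interpretation.

```python
import math


def heron_area_squared(a: float, b: float, c: float) -> float:
    """Squared area of a triangle as a quartic polynomial in its edge lengths."""
    return (a + b + c) * (-a + b + c) * (a - b + c) * (a + b - c) / 16


def heron_area(a: float, b: float, c: float) -> float:
    """Area of the triangle with edge lengths a, b, c."""
    squared = heron_area_squared(a, b, c)
    if squared < 0:
        raise ValueError("edge lengths violate the triangle inequality")
    return math.sqrt(squared)


if __name__ == "__main__":
    # The 3-4-5 right triangle has area (3 * 4) / 2 = 6.
    print(heron_area(3, 4, 5))
```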

rather-interesting
mathematics
visualization
purty-pitchers
geometry
review
history
august 2015 by Vaguery

[1304.5232] Neural network spectral robustness under perturbations of the underlying graph

august 2015 by Vaguery

Recent studies have been using graph theoretical approaches to model complex networks (such as social, infrastructural or biological networks), and how their hardwired circuitry relates to their dynamic evolution in time. Understanding how configuration reflects on the coupled behavior in a system of dynamic nodes can be of great importance, for example in the context of how the brain connectome is affecting brain function. However, the connectivity patterns that appear in brain networks, and their individual effects on network dynamics, are far from being fully understood.

We study the connections between edge configuration and dynamics in a simple oriented network composed of two interconnected cliques (representative of brain feedback regulatory circuitry). In this paper, our main goal is to study the spectra of the graph adjacency and Laplacian matrices, with a focus on three aspects in particular: (1) the sensitivity/robustness of the spectrum in response to varying the intra and inter-modular edge density, (2) the effects on the spectrum of perturbing the edge configuration, while keeping the densities fixed and (3) the effects of increasing the network size. We study some tractable aspects analytically, then simulate more general results numerically. This paper aims to clarify, from analytical and modeling perspectives, the underpinnings of our related work, which further addresses how graph properties affect the network's temporal dynamics and phase transitions.

We propose that results of this type may be helpful when studying small networks such as macroscopic brain circuits. We suggest potential applications to understanding synaptic restructuring in learning networks, and the effects of network configuration on the function of emotion-regulatory neural circuits.

graph-theory
robustness
network-theory
nudge-targets
statistics
visualization
august 2015 by Vaguery

[1412.4096] Materials Cartography: Representing and Mining Material Space Using Structural and Electronic Fingerprints

august 2015 by Vaguery

As the proliferation of high-throughput approaches in materials science is increasing the wealth of data in the field, the gap between accumulated-information and derived-knowledge widens. We address the issue of scientific discovery in materials databases by introducing novel analytical approaches based on structural and electronic materials fingerprints. The framework is employed to (i) query large databases of materials using similarity concepts, (ii) map the connectivity of the materials space (i.e., as a materials cartogram) for rapidly identifying regions with unique organizations/properties, and (iii) develop predictive Quantitative Materials Structure-Property Relationships (QMSPR) models for guiding materials design. In this study, we test these fingerprints by seeking target material properties. As a quantitative example, we model the critical temperatures of known superconductors. Our novel materials fingerprinting and materials cartography approaches contribute to the emerging field of materials informatics by enabling effective computational tools to analyze, visualize, model, and design new materials.

materials-science
visualization
feature-extraction
clustering
information-theory
nudge-targets
consider:rediscovery
august 2015 by Vaguery

[1408.3600] Uncovering the nutritional landscape of food

august 2015 by Vaguery

Recent progress in data-driven analysis methods, including network-based approaches, is revolutionizing many classical disciplines. These techniques can also be applied to food and nutrition, which must be studied to design healthy diets. Using nutritional information from over 1,000 raw foods, we systematically evaluated the nutrient composition of each food with regard to satisfying daily nutritional requirements. The nutrient balance of a food was quantified herein as nutritional fitness, using the food's frequency of occurrence in nutritionally adequate food combinations. Nutritional fitness offers prioritization of recommendable foods within a global network of foods, in which foods are connected based on the similarities of their nutrient compositions. We identified a number of key nutrients, such as choline and alpha-linolenic acid, whose levels in foods can critically affect the foods' nutritional fitness. Analogously, pairs of nutrients can have the same effect. In fact, two nutrients can impact the nutritional fitness synergistically, although the individual nutrients alone may not. This result, involving the tendency among nutrients to show correlations in their abundances across foods, implies a hidden layer of complexity when exploring for foods whose balance of nutrients within pairs holistically helps meet nutritional requirements. Interestingly, foods with high nutritional fitness successfully maintain this nutrient balance. This effect expands our scope to a diverse repertoire of nutrient-nutrient correlations, integrated under a common network framework that yields unexpected yet coherent associations between nutrients. Our nutrient-profiling approach combined with a network-based analysis provides a more unbiased, global view of the relationships between foods and nutrients, and can be extended towards nutritional policies, food marketing, and personalized nutrition.

nutrition
data-analysis
rather-interesting
visualization
exploratory-data-analysis
looking-to-see
pattern-discovery
august 2015 by Vaguery

[1501.00304] Contact Representations of Graphs in 3D

july 2015 by Vaguery

We study contact representations of graphs in which vertices are represented by axis-aligned polyhedra in 3D and edges are realized by non-zero area common boundaries between corresponding polyhedra. We show that for every 3-connected planar graph, there exists a simultaneous representation of the graph and its dual with 3D boxes. We give a linear-time algorithm for constructing such a representation. This result extends the existing primal-dual contact representations of planar graphs in 2D using circles and triangles. While contact graphs in 2D directly correspond to planar graphs, we next study representations of non-planar graphs in 3D. In particular we consider representations of optimal 1-planar graphs. A graph is 1-planar if there exists a drawing in the plane where each edge is crossed at most once, and an optimal n-vertex 1-planar graph has the maximum (4n - 8) number of edges. We describe a linear-time algorithm for representing optimal 1-planar graphs without separating 4-cycles with 3D boxes. However, not every optimal 1-planar graph admits a representation with boxes. Hence, we consider contact representations with the next simplest axis-aligned 3D object, L-shaped polyhedra. We provide a quadratic-time algorithm for representing optimal 1-planar graph with L-shaped polyhedra.

graph-theory
representation
computational-geometry
algorithms
rather-interesting
nudge-targets
consider:rediscovery
visualization
feature-construction
july 2015 by Vaguery

Welcome! — Toyplot 0.6.0 documentation

july 2015 by Vaguery

Welcome to Toyplot, the kid-sized plotting toolkit for Python with grownup-sized goals:

Develop beautiful interactive, animated plots that embrace the unique capabilities of electronic publishing and support reproducibility.

Create the best possible data graphics “out-of-the-box”, maximizing data ink and minimizing chartjunk.

Provide a clean, minimalist interface that scientists and engineers will love.

Read more about our ideas for Toyplot and scientific publishing here.

python
iPython-notebook
library
visualization
via:many
plotting
lovely
july 2015 by Vaguery

[1311.1911] Visualizing the Effects of a Changing Distance on Data Using Continuous Embeddings

july 2015 by Vaguery

Most ML methods, from clustering to classification, rely on a distance function to describe relationships between datapoints. For complex datasets it is hard to avoid making some arbitrary choices when defining a distance function. To compare images, one must choose a spatial scale; for signals, a temporal scale. The right scale is hard to pin down and it is preferable when results do not depend too tightly on the exact value one picked. Topological data analysis seeks to address this issue by focusing on the notion of neighbourhood instead of distance. Here, we show that in some cases a simpler solution is available. One can check how strongly distance relationships depend on a hyperparameter using dimensionality reduction. We formulate a variant of dynamical multi-dimensional scaling (MDS), which embeds datapoints as curves. The resulting algorithm is based on the Concave-Convex Procedure (CCCP) and provides a simple and efficient way of visualizing changes and invariances in distance patterns as a hyperparameter is varied. We also present a variant to analyze the dependence on multiple hyperparameters. We provide a cMDS algorithm that is straightforward to implement, use and extend. To illustrate the possibilities of cMDS, we apply cMDS to several real-world data sets.

clustering
metrics
data-analysis
statistics
algorithms
horse-races
visualization
july 2015 by Vaguery

UpSet

june 2015 by Vaguery

UpSet is an interactive, web-based visualization technique designed to analyze set-based data. UpSet visualizes both the set intersections and their properties, and the items (elements) in the dataset.
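
To make "set intersections and the items in them" concrete, here is a small sketch that groups items by the exact combination of sets they belong to, which is the kind of summary a set-intersection plot displays. It is an illustrative assumption about the underlying bookkeeping, not UpSet's own code; the function name and the toy sets `A`, `B`, `C` are hypothetical.

```python
def exclusive_intersections(named_sets):
    """Group items by the exact combination of named sets that contain them.

    Returns a dict mapping a frozenset of set names to the set of items
    that belong to exactly those sets and no others.
    """
    membership = {}
    for name, items in named_sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)

    groups = {}
    for item, names in membership.items():
        groups.setdefault(frozenset(names), set()).add(item)
    return groups


# Toy example with three overlapping sets.
example_sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
groups = exclusive_intersections(example_sets)

if __name__ == "__main__":
    # Largest combinations first, as a rough textual stand-in for the plot.
    for names, items in sorted(groups.items(), key=lambda kv: -len(kv[0])):
        print(sorted(names), sorted(items))
```

Grouping by exact membership keeps every item in one bucket only, so the bucket sizes add up to the number of distinct items.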

set-theory
data-analysis
visualization
software
javascript
june 2015 by Vaguery

[1504.02381] Inferring the mesoscale structure of layered, edge-valued and time-varying networks

june 2015 by Vaguery

Many network systems are composed of interdependent but distinct types of interactions, which cannot be fully understood in isolation. These different types of interactions are often represented as layers, attributes on the edges or as a time-dependence of the network structure. Although they are crucial for a more comprehensive scientific understanding, these representations offer substantial challenges. Namely, it is an open problem how to precisely characterize the large or mesoscale structure of network systems in relation to these additional aspects. Furthermore, the direct incorporation of these features invariably increases the effective dimension of the network description, and hence aggravates the problem of overfitting, i.e. the use of overly-complex characterizations that mistake purely random fluctuations for actual structure. In this work, we propose a robust and principled method to tackle these problems, by constructing generative models of modular network structure, incorporating layered, attributed and time-varying properties, as well as a Bayesian methodology to infer the parameters from data and select the most appropriate model according to statistical evidence. We show that the method is capable of revealing hidden structure in layered, edge-valued and time-varying networks, and that the most appropriate level of granularity with respect to the additional dimensions can be reliably identified. We illustrate our approach on a variety of empirical systems, including a social network of physicians, the voting correlations of deputies in the Brazilian national congress, the global airport network, and a proximity network of high-school students.

network-theory
visualization
representation
dynamical-systems
models
rather-interesting
nudge-targets
consider:feature-discovery
june 2015 by Vaguery

**related tags**
