rather-interesting   2210

« earlier    

[1704.00454] Clustering in Hilbert simplex geometry
Clustering categorical distributions in the probability simplex is a fundamental task met in many applications dealing with normalized histograms. Traditionally, the differential-geometric structures of the probability simplex have been used either by (i) setting the Riemannian metric tensor to the Fisher information matrix of the categorical distributions, or (ii) defining the dualistic information-geometric structure induced by a smooth dissimilarity measure, the Kullback-Leibler divergence. In this work, we introduce for this clustering task a novel computationally-friendly framework for modeling the probability simplex termed {\em Hilbert simplex geometry}. In the Hilbert simplex geometry, the distance function is described by a polytope. We discuss the pros and cons of those different statistical modelings, and benchmark experimentally these geometries for center-based k-means and k-center clusterings. We show that Hilbert metric in the probability simplex satisfies the property of information monotonicity. Furthermore, since a canonical Hilbert metric distance can be defined on any bounded convex subset of the Euclidean space, we also consider Hilbert's projective geometry of the elliptope of correlation matrices and study its clustering performances.
clustering  representation  categorical-variables  rather-interesting  machine-learning  to-understand 
7 days ago by Vaguery
[1802.04148] On Dendrites Generated By Symmetric Polygonal Systems: The Case of Regular Polygons
We define G-symmetric polygonal systems of similarities and study the properties of symmetric dendrites, which appear as their attractors. This allows us to find the conditions under which the attractor of a zipper becomes a dendrite.
fractals  geometry  nonlinear-dynamics  rather-interesting  purdy-pitchers 
7 days ago by Vaguery
[1312.1001] Optimal detection of intersections between convex polyhedra
For a polyhedron P in ℝd, denote by |P| its combinatorial complexity, i.e., the number of faces of all dimensions of the polyhedra. In this paper, we revisit the classic problem of preprocessing polyhedra independently so that given two preprocessed polyhedra P and Q in ℝd, each translated and rotated, their intersection can be tested rapidly.
For d=3 we show how to perform such a test in O(log|P|+log|Q|) time after linear preprocessing time and space. This running time is the best possible and improves upon the last best known query time of O(log|P|log|Q|) by Dobkin and Kirkpatrick (1990).
We then generalize our method to any constant dimension d, achieving the same optimal O(log|P|+log|Q|) query time using a representation of size O(|P|⌊d/2⌋+ε) for any ε>0 arbitrarily small. This answers an even older question posed by Dobkin and Kirkpatrick 30 years ago.
In addition, we provide an alternative O(log|P|+log|Q|) algorithm to test the intersection of two convex polygons P and Q in the plane.
computational-complexity  computational-geometry  algorithms  rather-interesting  consider:performance-measures  to-write-about  nudge-targets 
7 days ago by Vaguery
[1806.04609] Streaming PCA and Subspace Tracking: The Missing Data Case
For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of memory and computational resources in a timely manner for decision making. This often is coupled with the missing data problem, such that only a small fraction of data attributes are observed. These complications impose significant, and unconventional, constraints on the problem of streaming Principal Component Analysis (PCA) and subspace tracking, which is an essential building block for many inference tasks in signal processing and machine learning. This survey article reviews a variety of classical and recent algorithms for solving this problem with low computational and memory complexities, particularly those applicable in the big data regime with missing data. We illustrate that streaming PCA and subspace tracking algorithms can be understood through algebraic and geometric perspectives, and they need to be adjusted carefully to handle missing data. Both asymptotic and non-asymptotic convergence guarantees are reviewed. Finally, we benchmark the performance of several competitive algorithms in the presence of missing data for both well-conditioned and ill-conditioned systems.
online-learning  algorithms  machine-learning  data-balancing  streaming-algorithms  performance-measure  rather-interesting  to-write-about  to-generalize  consider:multiobjective-versions 
10 days ago by Vaguery
[1804.00746] The simple essence of automatic differentiation
Automatic differentiation (AD) in reverse mode (RAD) is a central component of deep learning and other uses of large-scale optimization. Commonly used RAD algorithms such as backpropagation, however, are complex and stateful, hindering deep understanding, improvement, and parallel execution. This paper develops a simple, generalized AD algorithm calculated from a simple, natural specification. The general algorithm is then specialized by varying the representation of derivatives. In particular, applying well-known constructions to a naive representation yields two RAD algorithms that are far simpler than previously known. In contrast to commonly used RAD implementations, the algorithms defined here involve no graphs, tapes, variables, partial derivatives, or mutation. They are inherently parallel-friendly, correct by construction, and usable directly from an existing programming language with no need for new data types or programming style, thanks to use of an AD-agnostic compiler plugin.
computer-science  differentiation  gradients  numerical-methods  machine-learning  algorithms  category-theory  rather-interesting  to-understand 
10 days ago by Vaguery
The Fermat primality test and the GCD test | The Math Less Traveled
In my previous post we proved that if shares a nontrivial common factor with , then , and this in turn proves that is not prime (by Fermat’s Little Theorem).

But wait a minute, this is silly: if shares a common factor with , then we don’t need anything as complex as Fermat’s Little Theorem to figure out that is not prime! All we have to do is compute using the Euclidean Algorithm, and when we find out that the result isn’t , we can immediately conclude that isn’t prime since it has a nontrivial divisor.
algorithms  number-theory  rather-interesting  nudge-targets  consider:looking-to-see 
10 days ago by Vaguery
Tanya Khovanova's Math Blog » Blog Archive » 3-Symmetric Graphs
In my previous post I described 3-symmetric permutations. Now I want to define 3-symmetric graphs.

A reminder: a k-symmetric permutation is such that the densities of all permutations of size k in it are the same. In particular a 2-symmetric permutation has the same number of inversions and non-inversions. How do we translate this to graphs? I call a graph 2-symmetric if it has the same number of edges as non-edges. 2-symmetric graphs are easy to construct, but they only exist if the number of vertices, n, is such that n choose 2 is even. The simplest non-trivial examples are graphs with 4 vertices and three edges.
graph-theory  feature-construction  rather-interesting  mathematical-recreations 
10 days ago by Vaguery
[1807.04271] A quantum-inspired classical algorithm for recommendation systems
A recommendation system suggests products to users based on data about user preferences. It is typically modeled by a problem of completing an m×n matrix of small rank k. We give the first classical algorithm to produce a recommendation in O(poly(k)polylog(m,n)) time, which is an exponential improvement on previous algorithms that run in time linear in m and n. Our strategy is inspired by a quantum algorithm by Kerenidis and Prakash: like the quantum algorithm, instead of reconstructing a user's full list of preferences, we only seek a randomized sample from the user's preferences. Our main result is an algorithm that samples high-weight entries from a low-rank approximation of the input matrix in time independent of m and n, given natural sampling assumptions on that input matrix. As a consequence, we show that Kerenidis and Prakash's quantum machine learning (QML) algorithm, one of the strongest candidates for provably exponential speedups in QML, does not in fact give an exponential speedup over classical algorithms.
recommendations  algorithms  quantum-computing  classical-computing  rather-interesting  information-theory  to-understand 
10 days ago by Vaguery
Friendship Paradox | Math Section
Your friends are more popular than you. In the year 1991, the sociologist Scott L. Feld made an interesting discovery. He realized that on average, most individual’s friends have more friends than the individual. This phenomenon is called the friendship paradox. [1] In this article, we describe the friendship paradox using graph theory, provide a mathematical proof, and give examples for possible applications. Furthermore, we talk about its applications in monitoring disease outbreaks.
graph-theory  rather-interesting  network-theory  to-write-about 
12 days ago by Vaguery
An overview of semantic image segmentation.
More specifically, the goal of semantic image segmentation is to label each pixel of an image with a corresponding class of what is being represented. Because we're predicting for every pixel in the image, this task is commonly referred to as dense prediction.
image-segmentation  image-processing  distributed-processing  rather-interesting  neural-networks  to-write-about  to-do  nudge-targets  consider:representation 
12 days ago by Vaguery
[1811.11888] Reality Inspired Voter Models: A Mini-review
This mini-review presents extensions of the voter model that incorporate a number of plausible features of real decision-making processes of individuals. Although these generalizations are not calibrated by empirical data, the resulting dynamics are suggestive of real collective social behaviors.
collective-behavior  voting  social-networks  social-dynamics  abstraction  simulation  to-write-about  review  rather-interesting 
12 days ago by Vaguery
[1601.05613] Partial Sum Minimization of Singular Values Representation on Grassmann Manifolds
As a significant subspace clustering method, low rank representation (LRR) has attracted great attention in recent years. To further improve the performance of LRR and extend its applications, there are several issues to be resolved. The nuclear norm in LRR does not sufficiently use the prior knowledge of the rank which is known in many practical problems. The LRR is designed for vectorial data from linear spaces, thus not suitable for high dimensional data with intrinsic non-linear manifold structure. This paper proposes an extended LRR model for manifold-valued Grassmann data which incorporates prior knowledge by minimizing partial sum of singular values instead of the nuclear norm, namely Partial Sum minimization of Singular Values Representation (GPSSVR). The new model not only enforces the global structure of data in low rank, but also retains important information by minimizing only smaller singular values. To further maintain the local structures among Grassmann points, we also integrate the Laplacian penalty with GPSSVR. An effective algorithm is proposed to solve the optimization problem based on the GPSSVR model. The proposed model and algorithms are assessed on some widely used human action video datasets and a real scenery dataset. The experimental results show that the proposed methods obviously outperform other state-of-the-art methods.
dimension-reduction  feature-selection  spatial-embedding  metrics  define-your-terms  rather-interesting  representation  to-write-about  to-simulate  consider:performance-measures  consider:data-balancing 
15 days ago by Vaguery
[1807.05283] When Are Two Gossips the Same? Types of Communication in Epistemic Gossip Protocols
We provide an in-depth study of the knowledge-theoretic aspects of communication in so-called gossip protocols. Pairs of agents communicate by means of calls in order to spread information---so-called secrets---within the group. Depending on the nature of such calls knowledge spreads in different ways within the group. Systematizing existing literature, we identify 18 different types of communication, and model their epistemic effects through corresponding indistinguishability relations. We then provide a classification of these relations and show its usefulness for an epistemic analysis in presence of different communication types. Finally, we explain how to formalise the assumption that the agents have common knowledge of a distributed epistemic gossip protocol.
agent-based  graph-theory  communication  artificial-life  collective-behavior  simulation  define-your-terms  rather-interesting  to-simulate 
15 days ago by Vaguery
[1808.00722] On the Harborth constant of $C_3 oplus C_{3n}$
For a finite abelian group (G,+,0) the Harborth constant 𝗀(G) is the smallest integer k such that each squarefree sequence over G of length k, equivalently each subset of G of cardinality at least k, has a subsequence of length exp(G) whose sum is 0. In this paper, it is established that 𝗀(G)=3n+3 for prime n≠3 and 𝗀(C3⊕C9)=13.
group-theory  combinatorics  algorithms  rather-interesting  feature-construction  to-understand  to-write-about  an-example-would-be-useful-about-now 
15 days ago by Vaguery
[1712.02950] CycleGAN, a Master of Steganography
CycleGAN (Zhu et al. 2017) is one recent successful approach to learn a transformation between two image distributions. In a series of experiments, we demonstrate an intriguing property of the model: CycleGAN learns to "hide" information about a source image into the images it generates in a nearly imperceptible, high-frequency signal. This trick ensures that the generator can recover the original sample and thus satisfy the cyclic consistency requirement, while the generated image remains realistic. We connect this phenomenon with adversarial attacks by viewing CycleGAN's training procedure as training a generator of adversarial examples and demonstrate that the cyclic consistency loss causes CycleGAN to be especially vulnerable to adversarial attacks.
adversarial-attacks  machine-learning  deep-learning  generative-models  tricksy-humans  steganography-as-a-side-effect  rather-interesting  to-write-about 
16 days ago by Vaguery
[1808.02679] Age of Information Upon Decisions
We consider an M/M/1 update-and-decide system where Poisson distributed decisions are made based on the received updates. We propose to characterize the freshness of the received updates at decision epochs with Age upon Decisions (AuD). Under the first-come-first-served policy (FCFS), the closed form average AuD is derived. We show that the average AuD of the system is determined by the arrival rate and the service rate, and is independent of the decision rate. Thus, merely increasing the decision rate does not improve the timeliness of decisions. Nevertheless, increasing the arrival rate and the service rate simultaneously can decrease the average AuD efficiently.
queueing-theory  information-theory  planning  simulation  rather-interesting  to-write-about 
16 days ago by Vaguery
"Created Facts and the Flawed Ontology of Copyright Law" by Justin Hughes
It is black letter doctrine that facts are not copyrightable: facts are discovered, not created—so they will always lack the originality needed for copyright protection. As straightforward as this reasoning seems, it is fundamentally flawed. Using the “social facts” theory of philosopher John Searle, this Article explores a variety of “created facts” cases—designation systems, systematic evaluations, and privately written laws—in which original expression from private individuals is adopted by social convention and generates facts in our social reality. In the course of this discussion, the paper places facts in their historical and philosophical context, explores how courts conflate facts with expressions of fact, and explains the difference between social facts created by expression and the “facts” of literature and fiction. Having established that the copyrighted works discussed in these cases produce facts, the question arises whether copyright's merger doctrine eliminates the copyright protection—a result that is both seemingly harsh and seemingly necessary. This Article proposes a recalibration of the merger doctrine to acknowledge that “created facts” are a unique situation in which the incentive of copyright is needed not just to generate the expression, but also needed to generate the facts. Reprinted by permission of the publisher.
law  pragmatism  philosophy  rather-interesting  to-understand 
16 days ago by Vaguery
CTK Insights » Blog Archive A pizza with a hole - CTK Insights
Now we are in a position to tackle the problem from the Crux. Any line through the center of the rectangular pizza divides it into equal parts. One of these lines stands out. It's the one that divides into equal parts the circular hole, that's the line that passes through the hole's center. Thus to solve the problem we cut through the centers of the rectangle and that of the hole.

But that's not the only solution. In fact there are infinitely many more, although to find any of these in a constructive manner is rather difficult, if not impossible.
mathematical-recreations  plane-geometry  rather-interesting  open-questions  nudge-targets  consider:looking-to-see  compass-and-straightedge 
17 days ago by Vaguery
Supermultiplexed optical imaging and barcoding with engineered polyynes | Nature Methods
Optical multiplexing has a large impact in photonics, the life sciences and biomedicine. However, current technology is limited by a 'multiplexing ceiling' from existing optical materials. Here we engineered a class of polyyne-based materials for optical supermultiplexing. We achieved 20 distinct Raman frequencies, as 'Carbon rainbow', through rational engineering of conjugation length, bond-selective isotope doping and end-capping substitution of polyynes. With further probe functionalization, we demonstrated ten-color organelle imaging in individual living cells with high specificity, sensitivity and photostability. Moreover, we realized optical data storage and identification by combinatorial barcoding, yielding to our knowledge the largest number of distinct spectral barcodes to date. Therefore, these polyynes hold great promise in live-cell imaging and sorting as well as in high-throughput diagnostics and screening.
materials-science  nanotechnology  indistinguishable-from-magic  molecular-design  molecular-biology  to-write-about  rather-interesting 
17 days ago by Vaguery
@dynamicCallable: Unix Tools as Swift Functions – The Always Right Institute – Almost always sometimes definitely right.
A new feature in Swift 5 are Dynamic Callable’s. We combine this with the related Dynamic Member Lookup feature to expose the filesystem and Unix shell commands as regular Swift objects and functions.

Wait what?!
swift  programming-language  library  rather-interesting  to-do 
17 days ago by Vaguery

« earlier    

related tags

a-bit-backwards-though  abstraction  actually  adversarial-attacks  aesthetics  agent-based  algebra  algorithms  also:duh  an-example-would-be-useful-about-now  anthropology-of-data  anthropology  approximation  architecture  arctic-curve  arithmetic  art  artificial-life  autobiographical  automata  benchmarking  billiards  biological-metaphor-of-the-month  biology  biometrics  book-art  boolean-functions  cartography  categorical-variables  category-theory  cellular-automata  chronology  circuits  cladistics  classical-computing  classification  clustering  cognition  collaboration  collective-behavior  combinatorics  communication  community-formation  compass-and-straightedge  competition  compilers  complexity-measures  complexology  computability  computational-complexity  computational-geometry  computational-methods  computer-science  condensed-matter  conjecture  consequences-of-the-rules  consider:classification  consider:data-balancing  consider:feature-discovery  consider:looking-to-see  consider:multiobjective-versions  consider:other-design-patterns  consider:performance-measures  consider:rediscovery  consider:representation  consider:req  consider:simulation  constraint-satisfaction  construction  contingency  cooperation  coopting  corporatism  counting  critical-thinking  cultural-dynamics  cultural-engineering  cultural-predation  data-analysis  data-balancing  data-structures  deception  decomposition  deep-learning  define-your-terms  democracy  design-patterns  design  developmental-biology  differentiation  dimension-reduction  discovery  discriminators  distributed-processing  dynamical-systems  ecology  enumeration  epidemiology-of-ideas  epidemiology  essay  evolutionary-algorithms  exercises  experimentation  explanation  exploding-dots  fascism  feature-construction  feature-extraction  feature-selection  floating-point  for-puzzles  forgery  fractals  frechet-distance  games  generative-art  generative-models  genetic-programming  geometry  gradients  graph-coloring  graph-theory  graphic-design  grassmannian  group-theory  halting-problem  heterogeneity  heuristics  hey-i-know-these-folks  history-of-ideas  history-of-science  history  humanities  hypergraphs  image-processing  image-segmentation  imprecision  indistinguishable-from-magic  inference  information-theory  inspiration  introspection  inverse-problems  it's-more-complicated-than-you-think  javascript  kata  knot-theory  law  learning-from-data  learning-in-public  library  linear-algebra  linguistics  literary-criticism  looking-to-see  machine-learning  management  mastery  materials-science  mathematical-programming  mathematical-recreations  mathematics  memoir  mental-health  metaheuristics  metrics  microsoft  modeling  models-and-modes  molecular-biology  molecular-design  mystery  nanohistory  nanotechnology  network-theory  neural-networks  nonlinear-dynamics  normalization  nostalgia  np-hard  nudge-targets  nudgelike  number-theory  numerical-methods  online-learning  oopsie  open-questions  optimization  organizational-behavior  oulipo  parafiction  parataxis  pattern-discovery  pedagogy  performance-measure  permutations  phase-transitions  philosophy  plane-geometry  planning  political-economy  polynomials  pragmatism  probability-theory  programming-language  progressivism  proof  propaganda  public-policy  purdy-pitchers  python  quantum-computing  queueing-theory  random-walks  recommendation-engines  recommendations  reconfiguration  recurrent-networks  refactoring  representation  req  review  reworks  rewriting-systems  schemes  science-studies  selection  sequences  set-theory  signaling  simulation  social-dynamics  social-media  social-networks  social-norms  sociotechnical-us  software-synthesis  sorting  spam  spatial-embedding  speaking-as  statistical-mechanics  statistics  steganography-as-a-side-effect  strange-loop  streaming-algorithms  strings  swift  symmathesy  teams  text-editor  the-mangle-in-practice  the-use-of-books  theoretical-biology  tiling  to-do  to-follow  to-generalize  to-read  to-simulate  to-think-on  to-understand  to-write-about  topical  topology  training-data  tricksy-humans  uncertainty  user-centric-art  user-centric-design  user-interface  utopianism  video  visualization  voting  walsh-polynomials  wedge-product  whoopsie-daisy  you-know-for-kids 

Copy this bookmark: