nhaliday + papers   198

Anti-hash test. - Codeforces
- Thue-Morse sequence
- nice paper: http://www.mii.lt/olympiads_in_informatics/pdf/INFOL119.pdf
In general, polynomial string hashing is a useful technique in the construction of efficient string algorithms. One simply needs to remember to carefully select the modulus M and the variable of the polynomial p depending on the application. A good rule of thumb is to pick both values as prime numbers, with M as large as possible so that no integer overflow occurs and with p at least the size of the alphabet.
2.2. Upper Bound on M
[stuff about 32- and 64-bit integers]
2.3. Lower Bound on M
On the other side, M is bounded from below due to the well-known birthday paradox: if we consider a collection of m keys with m ≥ 1.2√M, then the chance of a collision occurring within this collection is at least 50% (assuming that the distribution of fingerprints is close to uniform on the set of all strings). Thus, if the birthday paradox applies, one needs to choose M = ω(m^2) to have a fair chance of avoiding a collision. However, one should note that the birthday paradox does not always apply. As a benchmark, consider the following two problems.

I generally prefer to use Schwartz-Zippel to reason about collision probabilities w/ this kind of thing, eg, https://people.eecs.berkeley.edu/~sinclair/cs271/n3.pdf.

A good way to get more accurate results: just use multiple primes and the Chinese remainder theorem to get as large an M as you need w/o going beyond 64-bit arithmetic.
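
[ed.: a minimal sketch of the two-prime idea, with illustrative constants of my own choosing rather than anything from the paper or the Codeforces post. The pair (h1, h2) determines, via CRT, a single residue mod M1*M2 ≈ 10^18, comfortably past the birthday bound m ≈ 1.2√M even for millions of strings, while each component stays within 64-bit arithmetic in a fixed-width language.]

```python
M1, M2 = 1_000_000_007, 998_244_353   # two large primes (common illustrative choices)
P1, P2 = 131, 137                     # prime bases, larger than the alphabet size

def poly_hash(s):
    """Polynomial hash of s under both moduli; compare strings by comparing pairs."""
    h1 = h2 = 0
    for ch in s:
        c = ord(ch) - ord('a') + 1    # map 'a'..'z' to 1..26 (assumes lowercase; avoids hashing any char to 0)
        h1 = (h1 * P1 + c) % M1
        h2 = (h2 * P2 + c) % M2
    return h1, h2

assert poly_hash("thuemorse") == poly_hash("thuemorse")
assert poly_hash("abab") != poly_hash("baba")
```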

more on this: https://codeforces.com/blog/entry/60442
oly  oly-programming  gotchas  howto  hashing  algorithms  strings  random  best-practices  counterexample  multi  pdf  papers  nibble  examples  fields  polynomials  lecture-notes  yoga  probability  estimate  magnitude  hacker  adversarial  CAS  lattice  discrete 
august 2019 by nhaliday
c++ - Which is faster: Stack allocation or Heap allocation - Stack Overflow
On my machine, using g++ 3.4.4 on Windows, I get "0 clock ticks" for both stack and heap allocation for anything less than 100000 allocations, and even then I get "0 clock ticks" for stack allocation and "15 clock ticks" for heap allocation. When I measure 10,000,000 allocations, stack allocation takes 31 clock ticks and heap allocation takes 1562 clock ticks.

so maybe around 100x difference? what does that work out to in terms of total workload?

hmm:
http://vlsiarch.eecs.harvard.edu/wp-content/uploads/2017/02/asplos17mallacc.pdf
Recent work shows that dynamic memory allocation consumes nearly 7% of all cycles in Google datacenters.

That's not too bad actually. Seems like I shouldn't worry about shifting from heap to stack/globals unless profiling says it's important, particularly for non-oly stuff.

edit: Actually, a ~100x factor on something that accounts for 7% of all cycles is pretty high; naively that reads like heap allocation could be inflating constant factors by almost an order of magnitude.

edit: Well, actually that's not the right math: making allocation ~100x cheaper only takes the total from 100% down to 93% + 7%*0.01 ≈ 93.1%, which is not much smaller than 100%.
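
[ed.: spelling out the arithmetic, using the 7% figure from the Mallacc paper and the rough ~100x factor from the answer above:]

```python
# Amdahl-style check: if allocation is 7% of all cycles and a stack-style
# allocator made it ~100x cheaper, the workload only shrinks to ~93.1% of
# its original runtime, i.e. roughly a 1.07x overall speedup.
alloc_fraction = 0.07
alloc_speedup = 100.0
remaining = (1 - alloc_fraction) + alloc_fraction / alloc_speedup
print(remaining, 1 / remaining)   # ~0.9307, ~1.074
```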
q-n-a  stackex  programming  c(pp)  systems  memory-management  performance  intricacy  comparison  benchmarks  data  objektbuch  empirical  google  papers  nibble  time  measure  pro-rata  distribution  multi  pdf  oly-programming  computer-memory 
june 2019 by nhaliday
What every computer scientist should know about floating-point arithmetic
Floating-point arithmetic is considered an esoteric subject by many people. This is rather surprising, because floating-point is ubiquitous in computer systems: Almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow. This paper presents a tutorial on the aspects of floating-point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating point standard, and concludes with examples of how computer system builders can better support floating point.

https://stackoverflow.com/questions/2729637/does-epsilon-really-guarantees-anything-in-floating-point-computations
"you must use an epsilon when dealing with floats" is a knee-jerk reaction of programmers with a superficial understanding of floating-point computations, for comparisons in general (not only to zero).

This is usually unhelpful because it doesn't tell you how to minimize the propagation of rounding errors, it doesn't tell you how to avoid cancellation or absorption problems, and even when your problem is indeed related to the comparison of two floats, it doesn't tell you what value of epsilon is right for what you are doing.

...

Regarding the propagation of rounding errors, there exist specialized analyzers that can help you estimate it, because it is a tedious thing to do by hand.

https://www.di.ens.fr/~cousot/projects/DAEDALUS/synthetic_summary/CEA/Fluctuat/index.html
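
[ed.: a small illustration of the point above about blanket epsilons (mine, not from the linked answer): a single absolute epsilon is meaningless without knowing the magnitudes involved, whereas a relative tolerance is usually closer to what people mean.]

```python
import math

a, b = 1e-9, 2e-9          # differ by a factor of two...
print(abs(a - b) < 1e-6)   # ...yet an absolute epsilon of 1e-6 calls them "equal"

x = 1e9
y = x + 100.0              # a tiny relative difference near 1e9...
print(abs(x - y) < 1e-6)   # ...that the same absolute epsilon calls "different"

# relative comparisons behave more sensibly across magnitudes
print(math.isclose(a, b, rel_tol=1e-9))   # False
print(math.isclose(x, y, rel_tol=1e-6))   # True
```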

This was part of HW1 of CS24:
https://en.wikipedia.org/wiki/Kahan_summation_algorithm
In particular, simply summing n numbers in sequence has a worst-case error that grows proportional to n, and a root mean square error that grows as √n for random inputs (the roundoff errors form a random walk).[2] With compensated summation, the worst-case error bound is independent of n, so a large number of values can be summed with an error that only depends on the floating-point precision.[2]
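
[ed.: a minimal sketch of the compensated-summation loop described above; the demo data is mine, and Python also ships math.fsum, an accurate summation routine, so this is just to show the algorithm.]

```python
def kahan_sum(xs):
    """Kahan (compensated) summation: c carries the low-order bits that
    a plain running sum would discard at each step."""
    total = 0.0
    c = 0.0
    for x in xs:
        y = x - c              # apply the correction from the previous step
        t = total + y          # low-order bits of y may be lost here...
        c = (t - total) - y    # ...and are recovered into c
        total = t
    return total

xs = [0.1] * 1_000_000
print(abs(sum(xs) - 100_000.0))        # naive loop: error on the order of 1e-6
print(abs(kahan_sum(xs) - 100_000.0))  # compensated: down at the ~1e-11 level or below
```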

cf:
https://en.wikipedia.org/wiki/Pairwise_summation
In numerical analysis, pairwise summation, also called cascade summation, is a technique to sum a sequence of finite-precision floating-point numbers that substantially reduces the accumulated round-off error compared to naively accumulating the sum in sequence.[1] Although there are other techniques such as Kahan summation that typically have even smaller round-off errors, pairwise summation is nearly as good (differing only by a logarithmic factor) while having much lower computational cost—it can be implemented so as to have nearly the same cost (and exactly the same number of arithmetic operations) as naive summation.

In particular, pairwise summation of a sequence of n numbers x_n works by recursively breaking the sequence into two halves, summing each half, and adding the two sums: a divide and conquer algorithm. Its worst-case roundoff errors grow asymptotically as at most O(ε log n), where ε is the machine precision (assuming a fixed condition number, as discussed below).[1] In comparison, the naive technique of accumulating the sum in sequence (adding each x_i one at a time for i = 1, ..., n) has roundoff errors that grow at worst as O(εn).[1] Kahan summation has a worst-case error of roughly O(ε), independent of n, but requires several times more arithmetic operations.[1] If the roundoff errors are random, and in particular have random signs, then they form a random walk and the error growth is reduced to an average of O(ε √(log n)) for pairwise summation.[2]
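
[ed.: a minimal sketch of the recursion described above; the block size is an implementation detail (numpy's sum reportedly uses a blocked pairwise scheme of this sort along the contiguous axis).]

```python
def pairwise_sum(xs, block=128):
    """Pairwise / cascade summation: split in half, sum each half, add the halves.
    Worst-case roundoff grows as O(eps * log n) rather than the naive loop's O(eps * n)."""
    n = len(xs)
    if n <= block:                    # small base case keeps recursion overhead low
        total = 0.0
        for x in xs:
            total += x
        return total
    mid = n // 2
    return pairwise_sum(xs[:mid], block) + pairwise_sum(xs[mid:], block)

xs = [0.1] * 1_000_000
print(abs(pairwise_sum(xs) - 100_000.0))   # far smaller error than the naive running sum
```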

A very similar recursive structure of summation is found in many fast Fourier transform (FFT) algorithms, and is responsible for the same slow roundoff accumulation of those FFTs.[2][3]

https://eng.libretexts.org/Bookshelves/Electrical_Engineering/Book%3A_Fast_Fourier_Transforms_(Burrus)/10%3A_Implementing_FFTs_in_Practice/10.8%3A_Numerical_Accuracy_in_FFTs
However, these encouraging error-growth rates only apply if the trigonometric “twiddle” factors in the FFT algorithm are computed very accurately. Many FFT implementations, including FFTW and common manufacturer-optimized libraries, therefore use precomputed tables of twiddle factors calculated by means of standard library functions (which compute trigonometric constants to roughly machine precision). The other common method to compute twiddle factors is to use a trigonometric recurrence formula—this saves memory (and cache), but almost all recurrences have errors that grow as O(√n), O(n), or even O(n^2), which lead to corresponding errors in the FFT.

...

There are, in fact, trigonometric recurrences with the same logarithmic error growth as the FFT, but these seem more difficult to implement efficiently; they require that a table of Θ(log n) values be stored and updated as the recurrence progresses. Instead, in order to gain at least some of the benefits of a trigonometric recurrence (reduced memory pressure at the expense of more arithmetic), FFTW includes several ways to compute a much smaller twiddle table, from which the desired entries can be computed accurately on the fly using a bounded number (usually < 3) of complex multiplications. For example, instead of a twiddle table with n entries ω_n^k, FFTW can use two tables with Θ(√n) entries each, so that ω_n^k is computed by multiplying an entry in one table (indexed with the low-order bits of k) by an entry in the other table (indexed with the high-order bits of k).
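
[ed.: a toy reconstruction of the two-table trick; FFTW splits k by bits, whereas this splits by div/mod around √n, which is the same idea, and a real implementation cares much more about how accurately the table entries themselves are generated.]

```python
import cmath
import math

def build_twiddle_tables(n):
    """Two tables of ~sqrt(n) entries each: with s = ceil(sqrt(n)) and k = q*s + r,
    w_n^k = w_n^(q*s) * w_n^r, so one table holds the 'high part' w_n^(q*s)
    and the other the 'low part' w_n^r."""
    s = math.isqrt(n - 1) + 1
    hi = [cmath.exp(-2j * math.pi * (q * s) / n) for q in range(s)]
    lo = [cmath.exp(-2j * math.pi * r / n) for r in range(s)]
    return s, hi, lo

def twiddle(k, s, hi, lo):
    """w_n^k reconstructed with a single complex multiplication."""
    return hi[k // s] * lo[k % s]

n, k = 1024, 381
s, hi, lo = build_twiddle_tables(n)
direct = cmath.exp(-2j * math.pi * k / n)
assert abs(twiddle(k, s, hi, lo) - direct) < 1e-12
```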

[ed.: Nicholas Higham's "Accuracy and Stability of Numerical Algorithms" seems like a good reference for this kind of analysis.]
nibble  pdf  papers  programming  systems  numerics  nitty-gritty  intricacy  approximation  accuracy  types  sci-comp  multi  q-n-a  stackex  hmm  oly-programming  accretion  formal-methods  yak-shaving  wiki  reference  algorithms  yoga  ground-up  divide-and-conquer  fourier  books  tidbits  chart  caltech  nostalgia 
may 2019 by nhaliday
[1803.00085] Chinese Text in the Wild
We introduce Chinese Text in the Wild, a very large dataset of Chinese text in street view images.

...

We give baseline results using several state-of-the-art networks, including AlexNet, OverFeat, Google Inception and ResNet for character recognition, and YOLOv2 for character detection in images. Overall Google Inception has the best performance on recognition with 80.5% top-1 accuracy, while YOLOv2 achieves an mAP of 71.0% on detection. Dataset, source code and trained models will all be publicly available on the website.
nibble  pdf  papers  preprint  machine-learning  deep-learning  deepgoog  state-of-art  china  asia  writing  language  dataset  error  accuracy  computer-vision  pic  ocr  org:mat  benchmarks  questions 
may 2019 by nhaliday
ON THE GEOMETRY OF NASH EQUILIBRIA AND CORRELATED EQUILIBRIA
Abstract: It is well known that the set of correlated equilibrium distributions of an n-player noncooperative game is a convex polytope that includes all the Nash equilibrium distributions. We demonstrate an elementary yet surprising result: the Nash equilibria all lie on the boundary of the polytope.
pdf  nibble  papers  ORFE  game-theory  optimization  geometry  dimensionality  linear-algebra  equilibrium  structure  differential  correlation  iidness  acm  linear-programming  spatial  characterization  levers 
may 2019 by nhaliday
[1804.04268] Incomplete Contracting and AI Alignment
We suggest that the analysis of incomplete contracting developed by law and economics researchers can provide a useful framework for understanding the AI alignment problem and help to generate a systematic approach to finding solutions. We first provide an overview of the incomplete contracting literature and explore parallels between this work and the problem of AI alignment. As we emphasize, misalignment between principal and agent is a core focus of economic analysis. We highlight some technical results from the economics literature on incomplete contracts that may provide insights for AI alignment researchers. Our core contribution, however, is to bring to bear an insight that economists have been urged to absorb from legal scholars and other behavioral scientists: the fact that human contracting is supported by substantial amounts of external structure, such as generally available institutions (culture, law) that can supply implied terms to fill the gaps in incomplete contracts. We propose a research agenda for AI alignment work that focuses on the problem of how to build AI that can replicate the human cognitive processes that connect individual incomplete contracts with this supporting external structure.
nibble  preprint  org:mat  papers  ai  ai-control  alignment  coordination  contracts  law  economics  interests  culture  institutions  number  context  behavioral-econ  composition-decomposition  rent-seeking  whole-partial-many 
april 2018 by nhaliday
Theory of Self-Reproducing Automata - John von Neumann
Fourth Lecture: THE ROLE OF HIGH AND OF EXTREMELY HIGH COMPLICATION

Comparisons between computing machines and the nervous systems. Estimates of size for computing machines, present and near future.

Estimates for size for the human central nervous system. Excursus about the “mixed” character of living organisms. Analog and digital elements. Observations about the “mixed” character of all componentry, artificial as well as natural. Interpretation of the position to be taken with respect to these.

Evaluation of the discrepancy in size between artificial and natural automata. Interpretation of this discrepancy in terms of physical factors. Nature of the materials used.

The probability of the presence of other intellectual factors. The role of complication and the theoretical penetration that it requires.

Questions of reliability and errors reconsidered. Probability of individual errors and length of procedure. Typical lengths of procedure for computing machines and for living organisms--that is, for artificial and for natural automata. Upper limits on acceptable probability of error in individual operations. Compensation by checking and self-correcting features.

Differences of principle in the way in which errors are dealt with in artificial and in natural automata. The “single error” principle in artificial automata. Crudeness of our approach in this case, due to the lack of adequate theory. More sophisticated treatment of this problem in natural automata: The role of the autonomy of parts. Connections between this autonomy and evolution.

- 10^10 neurons in brain, 10^4 vacuum tubes in the largest computer at the time
- machines faster: 5 ms from neuron potential to neuron potential, 10^-3 ms for vacuum tubes

https://en.wikipedia.org/wiki/John_von_Neumann#Computing
pdf  article  papers  essay  nibble  math  cs  computation  bio  neuro  neuro-nitgrit  scale  magnitude  comparison  acm  von-neumann  giants  thermo  phys-energy  speed  performance  time  density  frequency  hardware  ems  efficiency  dirty-hands  street-fighting  fermi  estimate  retention  physics  interdisciplinary  multi  wiki  links  people  🔬  atoms  duplication  iteration-recursion  turing  complexity  measure  nature  technology  complex-systems  bits  information-theory  circuits  robust  structure  composition-decomposition  evolution  mutation  axioms  analogy  thinking  input-output  hi-order-bits  coding-theory  flexibility  rigidity  automata-languages 
april 2018 by nhaliday
[1410.0369] The Universe of Minds
kinda dumb, don't think this guy is anywhere close to legit (e.g., he claims set of mind designs is countable, but gives no actual reason to believe that)
papers  preprint  org:mat  ratty  miri-cfar  ai  intelligence  philosophy  logic  software  cs  computation  the-self 
march 2018 by nhaliday
[1509.02504] Electric charge in hyperbolic motion: The early history and other geometrical aspects
We revisit the early work of Minkowski and Sommerfeld concerning hyperbolic motion, and we describe some geometrical aspects of the electrodynamic interaction. We discuss the advantages of a time symmetric formulation in which the material points are replaced by infinitesimal length elements.

SPACE AND TIME: An annotated, illustrated edition of Hermann Minkowski's revolutionary essay: http://web.mit.edu/redingtn/www/netadv/SP20130311.html
nibble  preprint  papers  org:mat  physics  electromag  relativity  exposition  history  mostly-modern  pre-ww2  science  the-trenches  discovery  intricacy  classic  explanation  einstein  giants  plots  manifolds  article  multi  liner-notes  org:junk  org:edu  absolute-relative 
november 2017 by nhaliday
Stability of the Solar System - Wikipedia
The stability of the Solar System is a subject of much inquiry in astronomy. Though the planets have been stable when historically observed, and will be in the short term, their weak gravitational effects on one another can add up in unpredictable ways. For this reason (among others) the Solar System is chaotic,[1] and even the most precise long-term models for the orbital motion of the Solar System are not valid over more than a few tens of millions of years.[2]

The Solar System is stable in human terms, and far beyond, given that it is unlikely any of the planets will collide with each other or be ejected from the system in the next few billion years,[3] and the Earth's orbit will be relatively stable.[4]

Since Newton's law of gravitation (1687), mathematicians and astronomers (such as Laplace, Lagrange, Gauss, Poincaré, Kolmogorov, Vladimir Arnold and Jürgen Moser) have searched for evidence for the stability of the planetary motions, and this quest led to many mathematical developments, and several successive 'proofs' of stability of the Solar System.[5]

...

The planets' orbits are chaotic over longer timescales, such that the whole Solar System possesses a Lyapunov time in the range of 2–230 million years.[3] In all cases this means that the position of a planet along its orbit ultimately becomes impossible to predict with any certainty (so, for example, the timing of winter and summer become uncertain), but in some cases the orbits themselves may change dramatically. Such chaos manifests most strongly as changes in eccentricity, with some planets' orbits becoming significantly more—or less—elliptical.[7]

Is the Solar System Stable?: https://www.ias.edu/ideas/2011/tremaine-solar-system

Is the Solar System Stable?: https://arxiv.org/abs/1209.5996
nibble  wiki  reference  article  physics  mechanics  space  gravity  flux-stasis  uncertainty  robust  perturbation  math  dynamical  math.DS  volo-avolo  multi  org:edu  org:inst  papers  preprint  time  data  org:mat 
november 2017 by nhaliday
Karl Pearson and the Chi-squared Test
Pearson's paper of 1900 introduced what subsequently became known as the chi-squared test of goodness of fit. The terminology and allusions of 80 years ago create a barrier for the modern reader, who finds that the interpretation of Pearson's test procedure and the assessment of what he achieved are less than straightforward, notwithstanding the technical advances made since then. An attempt is made here to surmount these difficulties by exploring Pearson's relevant activities during the first decade of his statistical career, and by describing the work by his contemporaries and predecessors which seem to have influenced his approach to the problem. Not all the questions are answered, and others remain for further study.

original paper: http://www.economics.soton.ac.uk/staff/aldrich/1900.pdf

How did Karl Pearson come up with the chi-squared statistic?: https://stats.stackexchange.com/questions/97604/how-did-karl-pearson-come-up-with-the-chi-squared-statistic
He proceeds by working with the multivariate normal, and the chi-square arises as a sum of squared standardized normal variates.

You can see from the discussion on p160-161 he's clearly discussing applying the test to multinomial distributed data (I don't think he uses that term anywhere). He apparently understands the approximate multivariate normality of the multinomial (certainly he knows the margins are approximately normal - that's a very old result - and knows the means, variances and covariances, since they're stated in the paper); my guess is that most of that stuff is already old hat by 1900. (Note that the chi-squared distribution itself dates back to work by Helmert in the mid-1870s.)

Then by the bottom of p163 he derives a chi-square statistic as "a measure of goodness of fit" (the statistic itself appears in the exponent of the multivariate normal approximation).

He then goes on to discuss how to evaluate the p-value*, and then he correctly gives the upper tail area of a χ²_12 beyond 43.87 as 0.000016. [You should keep in mind, however, that he didn't correctly understand how to adjust degrees of freedom for parameter estimation at that stage, so some of the examples in his papers use too high a d.f.]
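
[ed.: the statistic under discussion, in modern notation, on made-up counts; scipy.stats.chisquare computes the same quantity along with a p-value.]

```python
# Pearson's goodness-of-fit statistic: chi^2 = sum over cells of (O - E)^2 / E.
# Toy example: 300 rolls of a supposedly fair die (fabricated counts for illustration).
observed = [43, 49, 56, 45, 66, 41]
expected = [300 / 6] * 6              # 50 expected per face

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2)   # refer to a chi-squared distribution with 6 - 1 = 5 degrees of freedom
```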
nibble  papers  acm  stats  hypothesis-testing  methodology  history  mostly-modern  pre-ww2  old-anglo  giants  science  the-trenches  stories  multi  q-n-a  overflow  explanation  summary  innovation  discovery  distribution  degrees-of-freedom  limits 
october 2017 by nhaliday
[1709.06560] Deep Reinforcement Learning that Matters
https://twitter.com/WAWilsonIV/status/912505885565452288
I’ve been experimenting w/ various kinds of value function approaches to RL lately, and it’s striking how primitive and bad things seem to be
At first I thought it was just that my code sucks, but then I played with the OpenAI baselines and nope, it’s the children that are wrong.
And now, what comes across my desk but this fantastic paper: https://arxiv.org/abs/1709.06560 How long until the replication crisis hits AI?

https://twitter.com/WAWilsonIV/status/911318326504153088
Seriously I’m not blown away by the PhDs’ records over the last 30 years. I bet you’d get better payoff funding eccentrics and amateurs.
There are essentially zero fundamentally new ideas in AI, the papers are all grotesquely hyperparameter tuned, nobody knows why it works.

Deep Reinforcement Learning Doesn't Work Yet: https://www.alexirpan.com/2018/02/14/rl-hard.html
Once, on Facebook, I made the following claim.

Whenever someone asks me if reinforcement learning can solve their problem, I tell them it can’t. I think this is right at least 70% of the time.
papers  preprint  machine-learning  acm  frontier  speedometer  deep-learning  realness  replication  state-of-art  survey  reinforcement  multi  twitter  social  discussion  techtariat  ai  nibble  org:mat  unaffiliated  ratty  acmtariat  liner-notes  critique  sample-complexity  cost-benefit  todo 
september 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Rank aggregation basics: Local Kemeny optimisation | David R. MacIver
This turns our problem from a global search to a local one: Basically we can start from any point in the search space and search locally by swapping adjacent pairs until we hit a minimum. This turns out to be quite easy to do. _We basically run insertion sort_: At step n we have the first n items in a locally Kemeny optimal order. Swap the n+1th item backwards until the majority think its predecessor is < it. This ensures all adjacent pairs are in the majority order, so swapping them would result in a greater than or equal K. This is of course an O(n^2) algorithm. In fact, the problem of merely finding a locally Kemeny optimal solution can be done in O(n log(n)) (for much the same reason as you can sort better than insertion sort). You just take the directed graph of majority votes and find a Hamiltonian Path. The nice thing about the above version of the algorithm is that it gives you a lot of control over where you start your search.
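
[ed.: a small sketch of that insertion-sort procedure; the data and function names are hypothetical, not MacIver's code.]

```python
def majority_prefers(a, b, rankings):
    """True if a strict majority of the rankings place a before b."""
    a_first = sum(1 for r in rankings if r.index(a) < r.index(b))
    return 2 * a_first > len(rankings)

def locally_kemeny_optimal(items, rankings):
    """Insertion sort with the majority vote as comparator: afterwards every adjacent
    pair is in majority order, so no single adjacent swap lowers the Kemeny score
    (a local, not necessarily global, optimum)."""
    order = []
    for item in items:
        order.append(item)
        i = len(order) - 1
        # swap the new item backwards while the majority prefers it to its predecessor
        while i > 0 and majority_prefers(order[i], order[i - 1], rankings):
            order[i - 1], order[i] = order[i], order[i - 1]
            i -= 1
    return order

# three voters over four candidates (a Condorcet cycle among a, b, c; everyone puts d last)
rankings = [["a", "b", "c", "d"],
            ["b", "c", "a", "d"],
            ["c", "a", "b", "d"]]
print(locally_kemeny_optimal(["a", "b", "c", "d"], rankings))   # ['a', 'b', 'c', 'd']
```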
techtariat  liner-notes  papers  tcs  algorithms  machine-learning  acm  optimization  approximation  local-global  orders  graphs  graph-theory  explanation  iteration-recursion  time-complexity  nibble 
september 2017 by nhaliday
Fermat's Library | Cassini, Rømer and the velocity of light annotated/explained version.
Abstract: The discovery of the finite nature of the velocity of light is usually attributed to Rømer. However, a text at the Paris Observatory confirms the minority opinion according to which Cassini was first to propose the ‘successive motion’ of light, while giving a rather correct order of magnitude for the duration of its propagation from the Sun to the Earth. We examine this question, and discuss why, in spite of the criticisms of Halley, Cassini abandoned this hypothesis while leaving Rømer free to publish it.
liner-notes  papers  essay  history  early-modern  europe  the-great-west-whale  giants  the-trenches  mediterranean  nordic  science  innovation  discovery  physics  electromag  space  speed  nibble  org:sci  org:mat 
september 2017 by nhaliday
Correlated Equilibria in Game Theory | Azimuth
Given this, it’s not surprising that Nash equilibria can be hard to find. Last September a paper came out making this precise, in a strong way:

• Yakov Babichenko and Aviad Rubinstein, Communication complexity of approximate Nash equilibria.

The authors show there’s no guaranteed method for players to find even an approximate Nash equilibrium unless they tell each other almost everything about their preferences. This makes the Nash equilibrium prohibitively difficult to find when there are lots of players… in general. There are particular games where it’s not difficult, and that makes these games important: for example, if you’re trying to run a government well. (A laughable notion these days, but still one can hope.)

Klarreich’s article in Quanta gives a nice readable account of this work and also a more practical alternative to the concept of Nash equilibrium. It’s called a ‘correlated equilibrium’, and it was invented by the mathematician Robert Aumann in 1974. You can see an attempt to define it here:
baez  org:bleg  nibble  mathtariat  commentary  summary  news  org:mag  org:sci  popsci  equilibrium  GT-101  game-theory  acm  conceptual-vocab  concept  definition  thinking  signaling  coordination  tcs  complexity  communication-complexity  lower-bounds  no-go  liner-notes  big-surf  papers  research  algorithmic-econ  volo-avolo 
july 2017 by nhaliday
Predicting the outcomes of organic reactions via machine learning: are current descriptors sufficient? | Scientific Reports
As machine learning/artificial intelligence algorithms are defeating chess masters and, most recently, GO champions, there is interest – and hope – that they will prove equally useful in assisting chemists in predicting outcomes of organic reactions. This paper demonstrates, however, that the applicability of machine learning to the problems of chemical reactivity over diverse types of chemistries remains limited – in particular, with the currently available chemical descriptors, fundamental mathematical theorems impose upper bounds on the accuracy with which reaction yields and times can be predicted. Improving the performance of machine-learning methods calls for the development of fundamentally new chemical descriptors.
study  org:nat  papers  machine-learning  chemistry  measurement  volo-avolo  lower-bounds  analysis  realness  speedometer  nibble  🔬  applications  frontier  state-of-art  no-go  accuracy  interdisciplinary 
july 2017 by nhaliday
Why is the Lin and Tegmark paper 'Why does deep and cheap learning work so well?' important? - Quora
To take the analogy further than I probably should, the resolution to the magic key problem might be that the key is magical, but that the locks are particularly magical. For deep learning, my guess is that it’s a bit of both.
q-n-a  qra  papers  liner-notes  deep-learning  off-convex  machine-learning  explanation  nibble  big-picture  explanans 
february 2017 by nhaliday
[1604.03640] Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex
We discuss relations between Residual Networks (ResNet), Recurrent Neural Networks (RNNs) and the primate visual cortex. We begin with the observation that a shallow RNN is exactly equivalent to a very deep ResNet with weight sharing among the layers. A direct implementation of such a RNN, although having orders of magnitude fewer parameters, leads to a performance similar to the corresponding ResNet. We propose 1) a generalization of both RNN and ResNet architectures and 2) the conjecture that a class of moderately deep RNNs is a biologically-plausible model of the ventral stream in visual cortex. We demonstrate the effectiveness of the architectures by testing them on the CIFAR-10 dataset.
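
[ed.: a toy numpy check of the "shallow RNN = very deep ResNet with weight sharing" observation; this is my own stripped-down version with no external input, one shared weight matrix, and a ReLU nonlinearity, not the architectures from the paper.]

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 8, 5                               # state width; depth = number of time steps
W = rng.normal(scale=0.1, size=(d, d))    # one weight matrix shared by every block
x = rng.normal(size=d)

def relu(v):
    return np.maximum(v, 0.0)

# "ResNet" view: T residual blocks h <- h + f(W_i h), with every W_i equal to W.
h = x
for Wi in [W] * T:
    h = h + relu(Wi @ h)
resnet_out = h

# "RNN" view: a single recurrent cell h <- h + f(W h), unrolled for T steps.
h = x
for _ in range(T):
    h = h + relu(W @ h)
rnn_out = h

assert np.allclose(resnet_out, rnn_out)   # the same computation, described two ways
```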
papers  preprint  neuro  biodet  interdisciplinary  deep-learning  model-class  identity  machine-learning  nibble  org:mat  computer-vision 
february 2017 by nhaliday
inequalities - Is the Jaccard distance a distance? - MathOverflow
Steinhaus Transform
the referenced survey: http://kenclarkson.org/nn_survey/p.pdf

It's known that this transformation produces a metric from a metric. Now if you take as the base metric D the symmetric difference between two sets, what you end up with is the Jaccard distance (which actually is known by many other names as well).
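
[ed.: a small sketch of that identity (my own code): the Steinhaus transform of the symmetric-difference metric, anchored at the empty set, is exactly the Jaccard distance.]

```python
def symmetric_difference_size(x, y):
    """Base metric d(x, y) = |x symmetric-difference y| on finite sets."""
    return len(x ^ y)

def steinhaus(d, a):
    """Steinhaus transform of a metric d with respect to a fixed point a:
    D(x, y) = 2 d(x, y) / (d(x, a) + d(y, a) + d(x, y))."""
    def D(x, y):
        if x == y:
            return 0.0                    # also avoids 0/0 when x = y = a
        return 2 * d(x, y) / (d(x, a) + d(y, a) + d(x, y))
    return D

def jaccard_distance(x, y):
    """1 - |intersection| / |union|, with distance 0 for two empty sets."""
    union = x | y
    if not union:
        return 0.0
    return 1 - len(x & y) / len(union)

D = steinhaus(symmetric_difference_size, frozenset())
x, y = frozenset({1, 2, 3}), frozenset({2, 3, 4, 5})
assert abs(D(x, y) - jaccard_distance(x, y)) < 1e-12   # both equal 0.6 here
```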
q-n-a  overflow  nibble  math  acm  sublinear  metrics  metric-space  proofs  math.CO  tcstariat  arrows  reduction  measure  math.MG  similarity  multi  papers  survey  computational-geometry  cs  algorithms  pdf  positivity  msr  tidbits  intersection  curvature  convexity-curvature  intersection-connectedness  signum 
february 2017 by nhaliday