nhaliday + levers   96

Stein's example - Wikipedia
Stein's example (or phenomenon or paradox), in decision theory and estimation theory, is the phenomenon that when three or more parameters are estimated simultaneously, there exist combined estimators more accurate on average (that is, having lower expected mean squared error) than any method that handles the parameters separately. It is named after Charles Stein of Stanford University, who discovered the phenomenon in 1955.[1]

An intuitive explanation is that optimizing for the mean-squared error of a combined estimator is not the same as optimizing for the errors of separate estimators of the individual parameters. In practical terms, if the combined error is in fact of interest, then a combined estimator should be used, even if the underlying parameters are independent; this occurs in channel estimation in telecommunications, for instance (different factors affect overall channel performance). On the other hand, if one is instead interested in estimating an individual parameter, then using a combined estimator does not help and is in fact worse.

...

Many simple, practical estimators achieve better performance than the ordinary estimator. The best-known example is the James–Stein estimator, which works by starting at X and moving towards a particular point (such as the origin) by an amount inversely proportional to the distance of X from that point.
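A minimal simulation sketch of the effect described above, assuming X ~ N(theta, I) in p = 10 dimensions and shrinkage toward the origin. The helper names, the choice of theta, and the trial count are illustrative assumptions, not anything from the article.

```python
import random

def james_stein(x):
    """Shrink x toward the origin: (1 - (p - 2) / ||x||^2) * x, for p >= 3."""
    p = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1.0 - (p - 2) / norm_sq
    return [shrink * v for v in x]

def squared_error(estimate, theta):
    """Total squared error summed over all coordinates."""
    return sum((e - t) ** 2 for e, t in zip(estimate, theta))

def compare_risks(theta, trials=5000, seed=0):
    """Average total squared error of the ordinary estimator X vs. James-Stein."""
    rng = random.Random(seed)
    mle_total = 0.0
    js_total = 0.0
    for _ in range(trials):
        x = [rng.gauss(t, 1.0) for t in theta]  # one noisy observation per parameter
        mle_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x), theta)
    return mle_total / trials, js_total / trials

# theta is an arbitrary illustrative choice; the ordinary estimator's risk is p = 10.
mle_risk, js_risk = compare_risks([1.0] * 10)
print(f"ordinary: {mle_risk:.2f}  James-Stein: {js_risk:.2f}")
```

The comparison is on the combined error over all ten coordinates; the error on any single coordinate is not guaranteed to improve.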
nibble  concept  levers  wiki  reference  acm  stats  probability  decision-theory  estimate  distribution  atoms 
february 2018 by nhaliday
Use and Interpretation of LD Score Regression
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies: https://sci-hub.bz/10.1038/ng.3211
- Po-Ru Loh, Nick Patterson, et al.

https://www.biorxiv.org/content/biorxiv/early/2014/02/21/002931.full.pdf

Both polygenicity (i.e. many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield inflated distributions of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from bias and true signal from polygenicity. We have developed an approach that quantifies the contributions of each by examining the relationship between test statistics and linkage disequilibrium (LD). We term this approach LD Score regression. LD Score regression provides an upper bound on the contribution of confounding bias to the observed inflation in test statistics and can be used to estimate a more powerful correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of test statistic inflation in many GWAS of large sample size.

Supplementary Note: https://images.nature.com/original/nature-assets/ng/journal/v47/n3/extref/ng.3211-S1.pdf

An atlas of genetic correlations across human diseases and traits: https://sci-hub.bz/10.1038/ng.3406

https://www.biorxiv.org/content/early/2015/01/27/014498.full.pdf

Supplementary Note: https://images.nature.com/original/nature-assets/ng/journal/v47/n11/extref/ng.3406-S1.pdf

https://github.com/bulik/ldsc
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
nibble  pdf  slides  talks  bio  biodet  genetics  genomics  GWAS  genetic-correlation  correlation  methodology  bioinformatics  concept  levers  🌞  tutorial  explanation  pop-structure  gene-drift  ideas  multi  study  org:nat  article  repo  software  tools  libraries  stats  hypothesis-testing  biases  confounding  gotchas  QTL  simulation  survey  preprint  population-genetics 
november 2017 by nhaliday
Ancient Admixture in Human History
- Patterson, Reich et al., 2012
Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples that provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present-day Basques and Sardinians and the other related to present-day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean “Iceman.”
nibble  pdf  study  article  methodology  bio  sapiens  genetics  genomics  population-genetics  migration  gene-flow  software  trees  concept  history  antiquity  europe  roots  gavisti  🌞  bioinformatics  metrics  hypothesis-testing  levers  ideas  libraries  tools  pop-structure 
november 2017 by nhaliday
Introduction to Scaling Laws
https://betadecay.wordpress.com/2009/10/02/the-physics-of-scaling-laws-and-dimensional-analysis/
http://galileo.phys.virginia.edu/classes/304/scaling.pdf

Galileo’s Discovery of Scaling Laws: https://www.mtholyoke.edu/~mpeterso/classes/galileo/scaling8.pdf
Days 1 and 2 of Two New Sciences

An example of such an insight is “the surface of a small solid is comparatively greater than that of a large one” because the surface goes like the square of a linear dimension, but the volume goes like the cube. Thus as one scales down macroscopic objects, forces on their surfaces like viscous drag become relatively more important, and bulk forces like weight become relatively less important. Galileo uses this idea on the First Day in the context of resistance in free fall, as an explanation for why similar objects of different size do not fall exactly together, but the smaller one lags behind.
nibble  org:junk  exposition  lecture-notes  physics  mechanics  street-fighting  problem-solving  scale  magnitude  estimate  fermi  mental-math  calculation  nitty-gritty  multi  scitariat  org:bleg  lens  tutorial  guide  ground-up  tricki  skeleton  list  cheatsheet  identity  levers  hi-order-bits  yoga  metabuch  pdf  article  essay  history  early-modern  europe  the-great-west-whale  science  the-trenches  discovery  fluid  architecture  oceans  giants  tidbits 
august 2017 by nhaliday
Inscribed angle - Wikipedia
pf:
- for triangle w/ one side = a diameter, draw isosceles triangle and use supplementary angle identities
- otherwise draw second triangle w/ side = a diameter, and use above result twice
nibble  math  geometry  spatial  ground-up  wiki  reference  proofs  identity  levers  yoga 
august 2017 by nhaliday
Diophantine approximation - Wikipedia
- rationals perfectly approximated by themselves, badly approximated (eps~1/q) by other rationals
- irrationals well-approximated (eps~1/q^2) by rationals: https://en.wikipedia.org/wiki/Dirichlet%27s_approximation_theorem
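A brute-force sketch of the 1/q^2 claim for one irrational, assuming the usual statement of Dirichlet's theorem (for every N there is some q <= N with |q*alpha - p| < 1/N). The function name and the choice alpha = sqrt(2) are illustrative.

```python
import math

def dirichlet_approx(alpha, n):
    """Return (p, q) with 1 <= q <= n minimizing |q * alpha - p|."""
    best = None
    for q in range(1, n + 1):
        p = round(q * alpha)          # nearest integer numerator for this q
        err = abs(q * alpha - p)
        if best is None or err < best[0]:
            best = (err, p, q)
    return best[1], best[2]

alpha = math.sqrt(2)
for n in (10, 100, 1000):
    p, q = dirichlet_approx(alpha, n)
    # The best fraction with denominator <= n beats the 1/q^2 benchmark.
    print(n, p, q, abs(alpha - p / q), 1 / q**2)
```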
nibble  wiki  reference  math  math.NT  approximation  accuracy  levers  pigeonhole-markov  multi  tidbits  discrete  rounding 
august 2017 by nhaliday
Kelly criterion - Wikipedia
In probability theory and intertemporal portfolio choice, the Kelly criterion, Kelly strategy, Kelly formula, or Kelly bet, is a formula used to determine the optimal size of a series of bets. In most gambling scenarios, and some investing scenarios under some simplifying assumptions, the Kelly strategy will do better than any essentially different strategy in the long run (that is, over a span of time in which the observed fraction of bets that are successful equals the probability that any given bet will be successful). It was described by J. L. Kelly, Jr, a researcher at Bell Labs, in 1956.[1] The practical use of the formula has been demonstrated.[2][3][4]

The Kelly Criterion is to bet a predetermined fraction of assets and can be counterintuitive. In one study,[5][6] each participant was given $25 and asked to bet on a coin that would land heads 60% of the time. Participants had 30 minutes to play, so could place about 300 bets, and the prizes were capped at $250. Behavior was far from optimal. "Remarkably, 28% of the participants went bust, and the average payout was just $91. Only 21% of the participants reached the maximum. 18 of the 61 participants bet everything on one toss, while two-thirds gambled on tails at some stage in the experiment." Using the Kelly criterion and based on the odds in the experiment, the right approach would be to bet 20% of the pot on each throw (see first example in Statement below). If losing, the size of the bet gets cut; if winning, the stake increases.
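The 20% figure can be checked with the standard Kelly formula f* = p - (1 - p)/b, where b is the net odds received on a win; even-money odds (b = 1) are assumed here for the coin experiment. A small sketch, with illustrative function names:

```python
import math

def kelly_fraction(p, b=1.0):
    """Kelly stake as a fraction of bankroll: win probability p, net odds b."""
    return p - (1.0 - p) / b

def log_growth(f, p, b=1.0):
    """Expected log growth per bet when staking fraction f of the bankroll."""
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

f_star = kelly_fraction(0.6)  # approximately 0.2 for the 60/40 coin
for f in (0.1, f_star, 0.4):
    print(f"stake {f:.2f}: expected log growth {log_growth(f, 0.6):+.4f}")
```

Staking more than the Kelly fraction lowers long-run growth: at 40% of the pot the expected log growth in this example is already slightly negative.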
nibble  betting  investing  ORFE  acm  checklists  levers  probability  algorithms  wiki  reference  atoms  extrema  parsimony  tidbits  decision-theory  decision-making  street-fighting  mental-math  calculation 
august 2017 by nhaliday
Pearson correlation coefficient - Wikipedia
https://en.wikipedia.org/wiki/Coefficient_of_determination
what does this mean?: https://twitter.com/GarettJones/status/863546692724858880
deleted but it was about the Pearson correlation distance: 1-r
I guess it's a metric

https://en.wikipedia.org/wiki/Explained_variation

http://infoproc.blogspot.com/2014/02/correlation-and-variance.html
A less misleading way to think about the correlation R is as follows: given X,Y from a standardized bivariate distribution with correlation R, an increase in X leads to an expected increase in Y: dY = R dX. In other words, students with +1 SD SAT score have, on average, roughly +0.4 SD college GPAs. Similarly, students with +1 SD college GPAs have on average +0.4 SAT.

this reminds me of the breeder's equation (but it uses r instead of h^2, so it can't actually be the same)

https://www.reddit.com/r/slatestarcodex/comments/631haf/on_the_commentariat_here_and_why_i_dont_think_i/dfx4e2s/
stats  science  hypothesis-testing  correlation  metrics  plots  regression  wiki  reference  nibble  methodology  multi  twitter  social  discussion  best-practices  econotariat  garett-jones  concept  conceptual-vocab  accuracy  causation  acm  matrix-factorization  todo  explanation  yoga  hsu  street-fighting  levers  🌞  2014  scitariat  variance-components  meta:prediction  biodet  s:**  mental-math  reddit  commentary  ssc  poast  gwern  data-science  metric-space  similarity  measure  dependence-independence 
may 2017 by nhaliday
Strings, periods, and borders
A border of x is any proper prefix of x that equals a suffix of x.

...overlapping borders of a string imply that the string is periodic...

In the border array β[1..n] of x, entry β[i] is the length of the longest border of x[1..i].
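A sketch of how the border array can be computed in linear time, using the standard failure-function recurrence (the same one used in Knuth-Morris-Pratt). Indexing is 0-based here, unlike the 1-based definition above, so border[i] refers to the prefix x[0..i]; the function name is illustrative.

```python
def border_array(x):
    """border[i] = length of the longest border of x[:i + 1]."""
    n = len(x)
    border = [0] * n
    k = 0  # length of the longest border of the previous prefix
    for i in range(1, n):
        # Fall back through shorter borders until one can be extended by x[i].
        while k > 0 and x[i] != x[k]:
            k = border[k - 1]
        if x[i] == x[k]:
            k += 1
        border[i] = k
    return border

b = border_array("abcabcab")
period = len(b) - b[-1]  # shortest period = length minus longest border
print(b, period)
```

The last line reflects the link between borders and periodicity: a string of length n with longest border of length k has shortest period n - k.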
pdf  nibble  slides  lectures  algorithms  strings  exposition  yoga  atoms  levers  tidbits  sequential 
may 2017 by nhaliday
Kin selection - Wikipedia
Formally, genes should increase in frequency when

rB > C
where

r=the genetic relatedness of the recipient to the actor, often defined as the probability that a gene picked randomly from each at the same locus is identical by descent.
B=the additional reproductive benefit gained by the recipient of the altruistic act,
C=the reproductive cost to the individual performing the act.
This inequality is known as Hamilton's rule after W. D. Hamilton who in 1964 published the first formal quantitative treatment of kin selection.

The relatedness parameter (r) in Hamilton's rule was introduced in 1922 by Sewall Wright as a coefficient of relationship that gives the probability that at a random locus, the alleles there will be identical by descent.[20] Subsequent authors, including Hamilton, sometimes reformulate this with a regression, which, unlike probabilities, can be negative. A regression analysis producing statistically significant negative relationships indicates that two individuals are less genetically alike than two random ones (Hamilton 1970, Nature & Grafen 1985 Oxford Surveys in Evolutionary Biology). This has been invoked to explain the evolution of spiteful behaviour consisting of acts that result in harm, or loss of fitness, to both the actor and the recipient.

Several scientific studies have found that the kin selection model can be applied to nature. For example, in 2010 researchers used a wild population of red squirrels in Yukon, Canada to study kin selection in nature. The researchers found that surrogate mothers would adopt related orphaned squirrel pups but not unrelated orphans. The researchers calculated the cost of adoption by measuring a decrease in the survival probability of the entire litter after increasing the litter by one pup, while benefit was measured as the increased chance of survival of the orphan. The degree of relatedness of the orphan and surrogate mother for adoption to occur depended on the number of pups the surrogate mother already had in her nest, as this affected the cost of adoption. The study showed that females always adopted orphans when rB > C, but never adopted when rB < C, providing strong support for Hamilton's rule.[21]
bio  nature  evolution  selection  group-selection  kinship  altruism  levers  methodology  population-genetics  genetics  wiki  reference  nibble  stylized-facts  biodet  🌞  concept  metrics  EGT  selfish-gene  cooperate-defect  similarity  interests  ecology 
march 2017 by nhaliday
Fundamental Theorems of Evolution: The American Naturalist: Vol 0, No 0
I suggest that the most fundamental theorem of evolution is the Price equation, both because of its simplicity and broad scope and because it can be used to derive four other familiar results that are similarly fundamental: Fisher’s average-excess equation, Robertson’s secondary theorem of natural selection, the breeder’s equation, and Fisher’s fundamental theorem. These derivations clarify both the relationships behind these results and their assumptions. Slightly less fundamental results include those for multivariate evolution and social selection. A key feature of fundamental theorems is that they have great simplicity and scope, which are often achieved by sacrificing perfect accuracy. Quantitative genetics has been more productive of fundamental theorems than population genetics, probably because its empirical focus on unknown genotypes freed it from the tyranny of detail and allowed it to focus on general issues.
study  essay  bio  evolution  population-genetics  fisher  selection  EGT  dynamical  exposition  methodology  🌞  big-picture  levers  list  nibble  article  chart  explanation  clarity  trees  ground-up  ideas 
march 2017 by nhaliday
More on Multivariate Gaussians
Fact #1: mean and covariance uniquely determine distribution
Fact #3: closure under sum, marginalizing, and conditioning
covariance of conditional distribution is given by a Schur complement (independent of x_B. is that obvious?)
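For reference, the standard conditioning formula, written with the x_A / x_B block notation and assuming Σ_BB is invertible. The conditional covariance is the Schur complement of Σ_BB, and x_B appears only in the conditional mean:

```latex
\[
x = \begin{pmatrix} x_A \\ x_B \end{pmatrix}
\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix},
\begin{pmatrix} \Sigma_{AA} & \Sigma_{AB} \\ \Sigma_{BA} & \Sigma_{BB} \end{pmatrix}
\right)
\]
\[
x_A \mid x_B \sim \mathcal{N}\!\left(
\mu_A + \Sigma_{AB}\Sigma_{BB}^{-1}(x_B - \mu_B),\;
\Sigma_{AA} - \Sigma_{AB}\Sigma_{BB}^{-1}\Sigma_{BA}
\right)
\]
```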
pdf  exposition  lecture-notes  stanford  nibble  distribution  acm  machine-learning  probability  levers  calculation  ground-up  characterization  rigidity  closure  nitty-gritty  linear-algebra  properties 
february 2017 by nhaliday
Structure theorem for finitely generated modules over a principal ideal domain - Wikipedia
- finitely generated modules over PID isomorphic to sum of quotients by decreasing sequences of proper ideals
- never really understood the proof of this in Ma5b
math  algebra  characterization  levers  math.AC  wiki  reference  nibble  proofs  additive  arrows 
february 2017 by nhaliday
Relationships among probability distributions - Wikipedia
- One distribution is a special case of another with a broader parameter space
- Transforms (function of a random variable);
- Combinations (function of several variables);
- Approximation (limit) relationships;
- Compound relationships (useful for Bayesian inference);
- Duality;
- Conjugate priors.
stats  probability  characterization  list  levers  wiki  reference  objektbuch  calculation  distribution  nibble  cheatsheet  closure  composition-decomposition  properties 
february 2017 by nhaliday
bounds - What is the variance of the maximum of a sample? - Cross Validated
- sum of variances is always a bound
- can't do better even for iid Bernoulli
- looks like nice argument from well-known probabilist (using E[(X-Y)^2] = 2Var X), but not clear to me how he gets to sum_i instead of sum_{i,j} in the union bound?
edit: argument is that, for j = argmax_k Y_k, we have r < X_i - Y_j <= X_i - Y_i for all i, including i = argmax_k X_k
- different proof here (later pages): http://www.ism.ac.jp/editsec/aism/pdf/047_1_0185.pdf
Var(X_n:n) <= sum Var(X_k:n) + 2 sum_{i < j} Cov(X_i:n, X_j:n) = Var(sum X_k:n) = Var(sum X_k) = nσ^2
why are the covariances nonnegative? (are they?). intuitively seems true.
- for that, see https://pinboard.in/u:nhaliday/b:ed4466204bb1
- note that this proof shows more generally that sum Var(X_k:n) <= sum Var(X_k)
- apparently that holds for dependent X_k too? http://mathoverflow.net/a/96943/20644
q-n-a  overflow  stats  acm  distribution  tails  bias-variance  moments  estimate  magnitude  probability  iidness  tidbits  concentration-of-measure  multi  orders  levers  extrema  nibble  bonferroni  coarse-fine  expert  symmetry  s:*  expert-experience  proofs 
february 2017 by nhaliday
Kolmogorov's zero–one law - Wikipedia
In probability theory, Kolmogorov's zero–one law, named in honor of Andrey Nikolaevich Kolmogorov, specifies that a certain type of event, called a tail event, will either almost surely happen or almost surely not happen; that is, the probability of such an event occurring is zero or one.

tail events include limsup E_i
math  probability  levers  limits  discrete  wiki  reference  nibble 
february 2017 by nhaliday
Dvoretzky's theorem - Wikipedia
In mathematics, Dvoretzky's theorem is an important structural theorem about normed vector spaces proved by Aryeh Dvoretzky in the early 1960s, answering a question of Alexander Grothendieck. In essence, it says that every sufficiently high-dimensional normed vector space will have low-dimensional subspaces that are approximately Euclidean. Equivalently, every high-dimensional bounded symmetric convex set has low-dimensional sections that are approximately ellipsoids.

http://mathoverflow.net/questions/143527/intuitive-explanation-of-dvoretzkys-theorem
http://mathoverflow.net/questions/46278/unexpected-applications-of-dvoretzkys-theorem
math  math.FA  inner-product  levers  characterization  geometry  math.MG  concentration-of-measure  multi  q-n-a  overflow  intuition  examples  proofs  dimensionality  gowers  mathtariat  tcstariat  quantum  quantum-info  norms  nibble  high-dimension  wiki  reference  curvature  convexity-curvature  tcs 
january 2017 by nhaliday
Wald's equation - Wikipedia
important identity that simplifies the calculation of the expected value of the sum of a random number of random quantities
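A simulation sketch of the identity E[X_1 + ... + X_N] = E[N] E[X_1], under the usual hypotheses (i.i.d. terms with finite mean, N a stopping time with finite expectation). The example is an assumption chosen for illustration: roll a fair die until the first 6 and sum every roll, so E[N] = 6, E[X] = 3.5, and the expected sum is 21.

```python
import random

def sum_until_six(rng):
    """Roll a fair die until a 6 appears; return (sum of all rolls, number of rolls)."""
    total = 0
    count = 0
    while True:
        roll = rng.randint(1, 6)
        total += roll
        count += 1
        if roll == 6:
            return total, count

def estimate(trials=20000, seed=0):
    """Monte Carlo estimates of E[sum] and E[N]."""
    rng = random.Random(seed)
    sum_total = 0
    n_total = 0
    for _ in range(trials):
        s, n = sum_until_six(rng)
        sum_total += s
        n_total += n
    return sum_total / trials, n_total / trials

mean_sum, mean_n = estimate()
print(f"E[sum] ~ {mean_sum:.2f}, E[N] * E[X] ~ {mean_n * 3.5:.2f}")
```

Note that N here depends on the rolls (it is not independent of them), which is exactly the case where the stopping-time version of the identity is needed.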
math  levers  probability  wiki  reference  nibble  expectancy  identity 
january 2017 by nhaliday
probability - How to prove Bonferroni inequalities? - Mathematics Stack Exchange
- integrated version of inequalities for alternating sums of (N choose j), where r.v. N = # of events occurring
- inequalities for alternating binomial coefficients follow from general property of unimodal (increasing then decreasing) sequences, which can be gotten w/ two cases for increasing and decreasing resp.
- the final alternating zero sum property follows for binomial coefficients from expanding (1 - 1)^N = 0
- The idea of proving inequality by integrating simpler inequality of r.v.s is nice. Proof from CS 150 was more brute force from what I remember.
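A quick numeric check of the pointwise inequalities behind the argument above: for a fixed count N of occurring events, the truncated alternating sum of (N choose j) over j = 1..k is at least 1[N >= 1] for odd k and at most 1[N >= 1] for even k. Taking expectations then gives the Bonferroni inequalities. The function names and the tested ranges are illustrative.

```python
from math import comb

def truncated_sum(n, k):
    """sum_{j=1}^{k} (-1)^(j+1) * C(n, j); comb returns 0 when j > n."""
    return sum((-1) ** (j + 1) * comb(n, j) for j in range(1, k + 1))

def check_pointwise(max_n=12):
    """Verify the odd/even truncation bounds for every n <= max_n."""
    for n in range(max_n + 1):
        indicator = 1 if n >= 1 else 0
        for k in range(1, max_n + 2):
            s = truncated_sum(n, k)
            if k % 2 == 1 and s < indicator:
                return False
            if k % 2 == 0 and s > indicator:
                return False
        # The full alternating sum is exact, from expanding (1 - 1)^n = 0.
        if truncated_sum(n, n) != indicator:
            return False
    return True

print(check_pointwise())
```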
q-n-a  overflow  math  probability  tcs  probabilistic-method  estimate  proofs  levers  yoga  multi  tidbits  metabuch  monotonicity  calculation  nibble  bonferroni  tricki  binomial  s:null 
january 2017 by nhaliday
Computational Complexity: Favorite Theorems: The Yao Principle
The Yao Principle applies when we don't consider the algorithmic complexity of the players. For example in communication complexity we have two players who each have a separate half of an input string and they want to compute some function of the input with the minimum amount of communication between them. The Yao principle states that the best probabilistic strategies for the players will achieve exactly the communication bounds as the best deterministic strategy over a worst-case distribution of inputs.

The Yao Principle plays a smaller role where we measure the running time of an algorithm since applying the Principle would require solving an extremely large linear program. But since so many of our bounds are in information-based models like communication and decision-tree complexity, the Yao Principle, though not particularly complicated, plays an important role in lower bounds in a large number of results in our field.
tcstariat  tcs  complexity  adversarial  rand-approx  algorithms  game-theory  yoga  levers  communication-complexity  random  lower-bounds  average-case  nibble  org:bleg 
january 2017 by nhaliday
Carathéodory's theorem (convex hull) - Wikipedia
- any convex combination in R^d can be pared down to at most d+1 points
- eg, in R^2 you can always fit a point in convex hull in a triangle
tcs  acm  math.MG  geometry  levers  wiki  reference  optimization  linear-programming  math  linear-algebra  nibble  spatial  curvature  convexity-curvature 
january 2017 by nhaliday
gt.geometric topology - Intuitive crutches for higher dimensional thinking - MathOverflow
Terry Tao:
I can't help you much with high-dimensional topology - it's not my field, and I've not picked up the various tricks topologists use to get a grip on the subject - but when dealing with the geometry of high-dimensional (or infinite-dimensional) vector spaces such as R^n, there are plenty of ways to conceptualise these spaces that do not require visualising more than three dimensions directly.

For instance, one can view a high-dimensional vector space as a state space for a system with many degrees of freedom. A megapixel image, for instance, is a point in a million-dimensional vector space; by varying the image, one can explore the space, and various subsets of this space correspond to various classes of images.

One can similarly interpret sound waves, a box of gases, an ecosystem, a voting population, a stream of digital data, trials of random variables, the results of a statistical survey, a probabilistic strategy in a two-player game, and many other concrete objects as states in a high-dimensional vector space, and various basic concepts such as convexity, distance, linearity, change of variables, orthogonality, or inner product can have very natural meanings in some of these models (though not in all).

It can take a bit of both theory and practice to merge one's intuition for these things with one's spatial intuition for vectors and vector spaces, but it can be done eventually (much as after one has enough exposure to measure theory, one can start merging one's intuition regarding cardinality, mass, length, volume, probability, cost, charge, and any number of other "real-life" measures).

For instance, the fact that most of the mass of a unit ball in high dimensions lurks near the boundary of the ball can be interpreted as a manifestation of the law of large numbers, using the interpretation of a high-dimensional vector space as the state space for a large number of trials of a random variable.

More generally, many facts about low-dimensional projections or slices of high-dimensional objects can be viewed from a probabilistic, statistical, or signal processing perspective.

Scott Aaronson:
Here are some of the crutches I've relied on. (Admittedly, my crutches are probably much more useful for theoretical computer science, combinatorics, and probability than they are for geometry, topology, or physics. On a related note, I personally have a much easier time thinking about R^n than about, say, R^4 or R^5!)

1. If you're trying to visualize some 4D phenomenon P, first think of a related 3D phenomenon P', and then imagine yourself as a 2D being who's trying to visualize P'. The advantage is that, unlike with the 4D vs. 3D case, you yourself can easily switch between the 3D and 2D perspectives, and can therefore get a sense of exactly what information is being lost when you drop a dimension. (You could call this the "Flatland trick," after the most famous literary work to rely on it.)
2. As someone else mentioned, discretize! Instead of thinking about R^n, think about the Boolean hypercube {0,1}^n, which is finite and usually easier to get intuition about. (When working on problems, I often find myself drawing {0,1}^4 on a sheet of paper by drawing two copies of {0,1}^3 and then connecting the corresponding vertices.)
3. Instead of thinking about a subset S⊆R^n, think about its characteristic function f:R^n→{0,1}. I don't know why that trivial perspective switch makes such a big difference, but it does ... maybe because it shifts your attention to the process of computing f, and makes you forget about the hopeless task of visualizing S!
4. One of the central facts about R^n is that, while it has "room" for only n orthogonal vectors, it has room for exp(n) almost-orthogonal vectors. Internalize that one fact, and so many other properties of R^n (for example, that the n-sphere resembles a "ball with spikes sticking out," as someone mentioned before) will suddenly seem non-mysterious. In turn, one way to internalize the fact that R^n has so many almost-orthogonal vectors is to internalize Shannon's theorem that there exist good error-correcting codes.
5. To get a feel for some high-dimensional object, ask questions about the behavior of a process that takes place on that object. For example: if I drop a ball here, which local minimum will it settle into? How long does this random walk on {0,1}^n take to mix?
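Point 4 can be probed numerically. The sketch below is an illustration under stated assumptions, not anything from the thread: it draws m = 600 random sign vectors in n = 400 dimensions (more vectors than dimensions) and reports the largest pairwise |cosine|. Each vector is stored as an n-bit integer, so the dot product of two sign vectors is n minus twice the number of bits where they differ.

```python
import random
from itertools import combinations

def max_abs_cosine(n, m, seed=0):
    """Largest |cos angle| among m random vectors in {-1, +1}^n."""
    rng = random.Random(seed)
    vectors = [rng.getrandbits(n) for _ in range(m)]  # bit 1 -> +1, bit 0 -> -1
    worst = 0
    for a, b in combinations(vectors, 2):
        disagreements = bin(a ^ b).count("1")
        dot = n - 2 * disagreements        # agreements minus disagreements
        worst = max(worst, abs(dot))
    return worst / n                       # every vector has norm sqrt(n)

worst = max_abs_cosine(n=400, m=600)
print(f"600 vectors in R^400, largest |cosine| = {worst:.3f}")
```

A typical pairwise cosine is on the order of 1/sqrt(n), so even the worst pair stays far from parallel; this is a small-scale hint at the exponential count, not a demonstration of it.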

Gil Kalai:
This is a slightly different point, but Vitali Milman, who works in high-dimensional convexity, likes to draw high-dimensional convex bodies in a non-convex way. This is to convey the point that if you take the convex hull of a few points on the unit sphere of R^n, then for large n very little of the measure of the convex body is anywhere near the corners, so in a certain sense the body is a bit like a small sphere with long thin "spikes".
q-n-a  intuition  math  visual-understanding  list  discussion  thurston  tidbits  aaronson  tcs  geometry  problem-solving  yoga  👳  big-list  metabuch  tcstariat  gowers  mathtariat  acm  overflow  soft-question  levers  dimensionality  hi-order-bits  insight  synthesis  thinking  models  cartoons  coding-theory  information-theory  probability  concentration-of-measure  magnitude  linear-algebra  boolean-analysis  analogy  arrows  lifts-projections  measure  markov  sampling  shannon  conceptual-vocab  nibble  degrees-of-freedom  worrydream  neurons  retrofit  oscillation  paradox  novelty  tricki  concrete  high-dimension  s:***  manifolds  direction  curvature  convexity-curvature 
december 2016 by nhaliday
Information Processing: Search results for compressed sensing
https://www.unz.com/jthompson/the-hsu-boundary/
http://infoproc.blogspot.com/2017/09/phase-transitions-and-genomic.html
Added: Here are comments from "Donoho-Student":
Donoho-Student says:
September 14, 2017 at 8:27 pm GMT • 100 Words

The Donoho-Tanner transition describes the noise-free (h2=1) case, which has a direct analog in the geometry of polytopes.

The n = 30s result from Hsu et al. (specifically the value of the coefficient, 30, when p is the appropriate number of SNPs on an array and h2 = 0.5) is obtained via simulation using actual genome matrices, and is original to them. (There is no simple formula that gives this number.) The D-T transition had only been established in the past for certain classes of matrices, like random matrices with specific distributions. Those results cannot be immediately applied to genomes.

The estimate that s is (order of magnitude) 10k is also a key input.

I think Hsu refers to n = 1 million instead of 30 * 10k = 300k because the effective SNP heritability of IQ might be less than h2 = 0.5 — there is noise in the phenotype measurement, etc.

Donoho-Student says:
September 15, 2017 at 11:27 am GMT • 200 Words

Lasso is a common statistical method but most people who use it are not familiar with the mathematical theorems from compressed sensing. These results give performance guarantees and describe phase transition behavior, but because they are rigorous theorems they only apply to specific classes of sensor matrices, such as simple random matrices. Genomes have correlation structure, so the theorems do not directly apply to the real world case of interest, as is often true.

What the Hsu paper shows is that the exact D-T phase transition appears in the noiseless (h2 = 1) problem using genome matrices, and a smoothed version appears in the problem with realistic h2. These are new results, as is the prediction for how much data is required to cross the boundary. I don’t think most gwas people are familiar with these results. If they did understand the results they would fund/design adequately powered studies capable of solving lots of complex phenotypes, medical conditions as well as IQ, that have significant h2.

Most people who use lasso, as opposed to people who prove theorems, are not even aware of the D-T transition. Even most people who prove theorems have followed the Candes-Tao line of attack (restricted isometry property) and don’t think much about D-T. Although D eventually proved some things about the phase transition using high dimensional geometry, it was initially discovered via simulation using simple random matrices.
hsu  list  stream  genomics  genetics  concept  stats  methodology  scaling-up  scitariat  sparsity  regression  biodet  bioinformatics  norms  nibble  compressed-sensing  applications  search  ideas  multi  albion  behavioral-gen  iq  state-of-art  commentary  explanation  phase-transition  measurement  volo-avolo  regularization  levers  novelty  the-trenches  liner-notes  clarity  random-matrices  innovation  high-dimension  linear-models 
november 2016 by nhaliday
Borel–Cantelli lemma - Wikipedia
- sum of probabilities finite => a.s. only finitely many occur
- "<=" w/ some assumptions (pairwise independence)
- classic result from CS 150 (problem set 1)
wiki  reference  estimate  probability  math  acm  concept  levers  probabilistic-method  limits  nibble  borel-cantelli 
november 2016 by nhaliday
