high-dimension   27

Accurate Genomic Prediction Of Human Height | bioRxiv
Stephen Hsu's compressed sensing application paper

We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ~40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction.
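A rough sketch of the kind of sparse regression this abstract points at. The abstract itself only says "modern methods in high dimensional statistics"; the L1-penalized (LASSO) fit below, the ISTA solver, the synthetic Gaussian "genotypes", and every size and parameter are assumptions made for illustration, not the paper's pipeline or data. It also shows how the quoted numbers fit together: a prediction that correlates about 0.65 with the trait captures about 0.65² ≈ 0.42 of its variance.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the smooth part
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Synthetic stand-in for genotype data (hypothetical sizes, not from the paper).
rng = np.random.default_rng(0)
n_train, n_test, p, s = 400, 400, 500, 10
X = rng.standard_normal((n_train + n_test, p))
beta = np.zeros(p)
beta[:s] = 1.0                      # only s of the p predictors matter (sparsity)
signal = X @ beta
y = signal + rng.standard_normal(n_train + n_test) * signal.std()  # about half the variance is noise

# Fit on the training half, then score on the held-out half.
b_hat = lasso_ista(X[:n_train], y[:n_train], lam=0.5)
pred = X[n_train:] @ b_hat
r = np.corrcoef(pred, y[n_train:])[0, 1]
variance_explained = r ** 2         # squared correlation = share of variance captured

print("nonzero coefficients:", np.count_nonzero(b_hat), "of", p)
print("held-out correlation:", round(r, 2), "variance explained:", round(variance_explained, 2))
print("abstract's numbers: 0.65 ** 2 =", round(0.65 ** 2, 2))
```

The held-out split mirrors the abstract's "replication tests": the correlation is measured on samples the fit never saw.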


I'm in Mountain View to give a talk at 23andMe. Their latest funding round was $250M on a (reported) valuation of $1.5B. If I just add up the Crunchbase numbers it looks like almost half a billion invested at this point...

Slides: Genomic Prediction of Complex Traits

Here's how people + robots handle your spit sample to produce a SNP genotype:

study  bio  preprint  GWAS  state-of-art  embodied  genetics  genomics  compressed-sensing  high-dimension  machine-learning  missing-heritability  hsu  scitariat  education  🌞  frontier  britain  regression  data  visualization  correlation  phase-transition  multi  commentary  summary  pdf  slides  brands  skunkworks  hard-tech  presentation  talks  methodology  intricacy  bioinformatics  scaling-up  stat-power  sparsity  norms  nibble  speedometer  stats  linear-models  2017  biodet 
september 2017 by nhaliday
Overcoming Bias : High Dimensional Societes?
I’ve seen many “spatial” models in social science. Such as models where voters and politicians sit at points in a space of policies. Or where customers and firms sit at points in a space of products. But I’ve never seen a discussion of how one should expect such models to change in high dimensions, such as when there are more dimensions than points.

In small dimensional spaces, the distances between points vary greatly; neighboring points are much closer to each other than are distant points. However, in high dimensional spaces, distances between points vary much less; all points are about the same distance from all other points. When points are distributed randomly, however, these distances do vary somewhat, allowing us to define the few points closest to each point as that point’s “neighbors”. “Hubs” are closest neighbors to many more points than average, while “anti-hubs” are closest neighbors to many fewer points than average. It turns out that in higher dimensions a larger fraction of points are hubs and anti-hubs (Zimek et al. 2012).
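The two effects in this paragraph (distances concentrating, and hubs and anti-hubs appearing) are easy to check numerically. A minimal sketch, assuming i.i.d. standard Gaussian points and k = 5 nearest neighbors; the function name `knn_stats` and all sizes are made up for illustration, and the high-dimensional case uses more dimensions than points, as in the question above.

```python
import numpy as np

def knn_stats(n_points, dim, k=5, seed=0):
    """Return (relative spread of pairwise distances, k-occurrence counts)."""
    rng = np.random.default_rng(seed)
    pts = rng.standard_normal((n_points, dim))

    # Pairwise Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(pts ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * pts @ pts.T
    np.maximum(d2, 0.0, out=d2)          # clamp tiny negatives from rounding
    dist = np.sqrt(d2)

    # Spread of distances relative to their mean (each pair counted once).
    pair = dist[np.triu_indices(n_points, k=1)]
    rel_spread = pair.std() / pair.mean()

    # k-occurrence: how often each point appears in another point's k-NN list.
    np.fill_diagonal(dist, np.inf)       # a point is not its own neighbor
    nn = np.argsort(dist, axis=1)[:, :k]
    counts = np.bincount(nn.ravel(), minlength=n_points)
    return rel_spread, counts

spread_lo, counts_lo = knn_stats(n_points=500, dim=2)
spread_hi, counts_hi = knn_stats(n_points=500, dim=1000)

for name, spread, counts in [("dim=2", spread_lo, counts_lo),
                             ("dim=1000", spread_hi, counts_hi)]:
    print(name,
          "| relative spread of distances:", round(spread, 3),
          "| max k-occurrence (biggest hub):", counts.max(),
          "| points that are nobody's neighbor (anti-hubs):", int(np.sum(counts == 0)))
```

Every point has exactly k neighbors, so the mean k-occurrence is always k; what changes with dimension is how unevenly those neighbor slots are shared out.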

If we think of people or organizations as such points, is being a hub or anti-hub associated with any distinct social behavior? Does it contribute substantially to being popular or unpopular? Or does the fact that real people and organizations are in fact distributed in real space overwhelm such things, which only happen in a truly high dimensional social world?
ratty  hanson  speculation  ideas  thinking  spatial  dimensionality  high-dimension  homo-hetero  analogy  models  network-structure  degrees-of-freedom 
july 2017 by nhaliday
Dvoretzky's theorem - Wikipedia
In mathematics, Dvoretzky's theorem is an important structural theorem about normed vector spaces proved by Aryeh Dvoretzky in the early 1960s, answering a question of Alexander Grothendieck. In essence, it says that every sufficiently high-dimensional normed vector space will have low-dimensional subspaces that are approximately Euclidean. Equivalently, every high-dimensional bounded symmetric convex set has low-dimensional sections that are approximately ellipsoids.
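For reference, the "approximately Euclidean" claim can be written out as follows. This is the standard textbook form of the statement, with symbols ($k$, $\varepsilon$, $N$, $E$, $Q$) chosen here rather than taken from the excerpt; the final remark is Milman's quantitative bound.

```latex
% Dvoretzky's theorem (standard formulation).
For every $k \in \mathbb{N}$ and every $\varepsilon > 0$ there exists
$N(k, \varepsilon) \in \mathbb{N}$ such that every normed space
$(X, \|\cdot\|)$ with $\dim X \ge N(k, \varepsilon)$ contains a subspace
$E \subset X$ with $\dim E = k$ and a positive definite quadratic form $Q$
on $E$ satisfying
\[
  \sqrt{Q(x)} \;\le\; \|x\| \;\le\; (1 + \varepsilon)\,\sqrt{Q(x)}
  \qquad \text{for all } x \in E .
\]
% Milman's quantitative version: one may take
% $N(k, \varepsilon) \le \exp\bigl(C(\varepsilon)\, k\bigr)$,
% i.e. the Euclidean subspace can have dimension
% $k \ge c(\varepsilon) \log N$ in an $N$-dimensional space.
```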

math  math.FA  inner-product  levers  characterization  geometry  math.MG  concentration-of-measure  multi  q-n-a  overflow  intuition  examples  proofs  dimensionality  gowers  mathtariat  tcstariat  quantum  quantum-info  norms  nibble  high-dimension  wiki  reference  curvature  convexity-curvature  tcs 
january 2017 by nhaliday
