nhaliday + bioinformatics   81

Estimation of effect size distribution from genome-wide association studies and implications for future discoveries
We report a set of tools to estimate the number of susceptibility loci and the distribution of their effect sizes for a trait on the basis of discoveries from existing genome-wide association studies (GWASs). We propose statistical power calculations for future GWASs using estimated distributions of effect sizes. Using reported GWAS findings for height, Crohn’s disease and breast, prostate and colorectal (BPC) cancers, we determine that each of these traits is likely to harbor additional loci within the spectrum of low-penetrance common variants. These loci, which can be identified from sufficiently powerful GWASs, together could explain at least 15–20% of the known heritability of these traits. However, for BPC cancers, which have modest familial aggregation, our analysis suggests that risk models based on common variants alone will have modest discriminatory power (63.5% area under curve), even with new discoveries.

later paper:
Distribution of allele frequencies and effect sizes and their interrelationships for common genetic susceptibility variants: http://www.pnas.org/content/108/44/18026.full

Recent discoveries of hundreds of common susceptibility SNPs from genome-wide association studies provide a unique opportunity to examine population genetic models for complex traits. In this report, we investigate distributions of various population genetic parameters and their interrelationships using estimates of allele frequencies and effect-size parameters for about 400 susceptibility SNPs across a spectrum of qualitative and quantitative traits. We calibrate our analysis by statistical power for detection of SNPs to account for overrepresentation of variants with larger effect sizes in currently known SNPs that are expected due to statistical power for discovery. Across all qualitative disease traits, minor alleles conferred “risk” more often than “protection.” Across all traits, an inverse relationship existed between “regression effects” and allele frequencies. Both of these trends were remarkably strong for type I diabetes, a trait that is most likely to be influenced by selection, but were modest for other traits such as human height or late-onset diseases such as type II diabetes and cancers. Across all traits, the estimated effect-size distribution suggested the existence of increasingly large numbers of susceptibility SNPs with decreasingly small effects. For most traits, the set of SNPs with intermediate minor allele frequencies (5–20%) contained an unusually small number of susceptibility loci and explained a relatively small fraction of heritability compared with what would be expected from the distribution of SNPs in the general population. These trends could have several implications for future studies of common and uncommon variants.

...

Relationship Between Allele Frequency and Effect Size. We explored the relationship between allele frequency and effect size in different scales. An inverse relationship between the squared regression coefficient and f(1 − f) was observed consistently across different traits (Fig. 3). For a number of these traits, however, the strengths of these relationships become less pronounced after adjustment for ascertainment due to study power. The strength of the trend, as captured by the slope of the fitted line (Table 2), markedly varies between traits, with an almost 10-fold change between the two extremes of distinct types of traits. After adjustment, the most pronounced trend was seen for type I diabetes and Crohn’s disease among qualitative traits and LDL level among quantitative traits. In exploring the relationship between the frequency of the risk allele and the magnitude of the associated risk coefficient (Fig. S4), we observed a quadratic pattern that indicates increasing risk coefficients as the risk-allele frequency diverges away from 0.50 either toward 0 or toward 1. Thus, it appears that regression coefficients for common susceptibility SNPs increase in magnitude monotonically with decreasing minor-allele frequency, irrespective of whether the minor allele confers risk or protection. However, for some traits, such as type I diabetes, risk alleles were predominantly minor alleles, that is, they had frequencies of less than 0.50.
pdf  nibble  study  article  org:nat  🌞  biodet  genetics  population-genetics  GWAS  QTL  distribution  disease  cancer  stat-power  bioinformatics  magnitude  embodied  prediction  scale  scaling-up  variance-components  multi  missing-heritability  effect-size  regression  correlation  data 
november 2017 by nhaliday
Use and Interpretation of LD Score Regression
LD Score regression distinguishes confounding from polygenicity in genome-wide association studies: https://sci-hub.bz/10.1038/ng.3211
- Po-Ru Loh, Nick Patterson, et al.

https://www.biorxiv.org/content/biorxiv/early/2014/02/21/002931.full.pdf

Both polygenicity (i.e. many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield inflated distributions of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from bias and true signal from polygenicity. We have developed an approach that quantifies the contributions of each by examining the relationship between test statistics and linkage disequilibrium (LD). We term this approach LD Score regression. LD Score regression provides an upper bound on the contribution of confounding bias to the observed inflation in test statistics and can be used to estimate a more powerful correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of test statistic inflation in many GWAS of large sample size.

Supplementary Note: https://images.nature.com/original/nature-assets/ng/journal/v47/n3/extref/ng.3211-S1.pdf

An atlas of genetic correlations across human diseases
and traits: https://sci-hub.bz/10.1038/ng.3406

https://www.biorxiv.org/content/early/2015/01/27/014498.full.pdf

Supplementary Note: https://images.nature.com/original/nature-assets/ng/journal/v47/n11/extref/ng.3406-S1.pdf

https://github.com/bulik/ldsc
ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
nibble  pdf  slides  talks  bio  biodet  genetics  genomics  GWAS  genetic-correlation  correlation  methodology  bioinformatics  concept  levers  🌞  tutorial  explanation  pop-structure  gene-drift  ideas  multi  study  org:nat  article  repo  software  tools  libraries  stats  hypothesis-testing  biases  confounding  gotchas  QTL  simulation  survey  preprint  population-genetics 
november 2017 by nhaliday
Ancient Admixture in Human History
- Patterson, Reich et al., 2012
Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples that provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present-day Basques and Sardinians and the other related to present-day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean “Iceman.”
nibble  pdf  study  article  methodology  bio  sapiens  genetics  genomics  population-genetics  migration  gene-flow  software  trees  concept  history  antiquity  europe  roots  gavisti  🌞  bioinformatics  metrics  hypothesis-testing  levers  ideas  libraries  tools  pop-structure 
november 2017 by nhaliday
Accurate Genomic Prediction Of Human Height | bioRxiv
Stephen Hsu's compressed sensing application paper

We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ~40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction.

https://infoproc.blogspot.com/2017/09/accurate-genomic-prediction-of-human.html

http://infoproc.blogspot.com/2017/11/23andme.html
I'm in Mountain View to give a talk at 23andMe. Their latest funding round was $250M on a (reported) valuation of $1.5B. If I just add up the Crunchbase numbers it looks like almost half a billion invested at this point...

Slides: Genomic Prediction of Complex Traits

Here's how people + robots handle your spit sample to produce a SNP genotype:

https://drive.google.com/file/d/1e_zuIPJr1hgQupYAxkcbgEVxmrDHAYRj/view
study  bio  preprint  GWAS  state-of-art  embodied  genetics  genomics  compressed-sensing  high-dimension  machine-learning  missing-heritability  hsu  scitariat  education  🌞  frontier  britain  regression  data  visualization  correlation  phase-transition  multi  commentary  summary  pdf  slides  brands  skunkworks  hard-tech  presentation  talks  methodology  intricacy  bioinformatics  scaling-up  stat-power  sparsity  norms  nibble  speedometer  stats  linear-models  2017  biodet 
september 2017 by nhaliday
10 million DTC dense marker genotypes by end of 2017? – Gene Expression
Ultimately I do wonder if I was a bit too optimistic that 50% of the US population will be sequenced at 30x by 2025. But the dynamic is quite likely to change rapidly because of a technological shift as the sector goes through a productivity uptick. We’re talking about exponential growth, which humans have weak intuition about….
https://gnxp.nofe.me/2017/06/27/genome-sequencing-for-the-people-is-near/
https://gnxp.nofe.me/2017/07/11/23andme-ancestry-only-is-49-99-for-prime-day/
gnxp  scitariat  commentary  biotech  scaling-up  genetics  genomics  scale  bioinformatics  multi  toys  measurement  duplication  signal-noise  coding-theory 
june 2017 by nhaliday
Genomic analysis of family data reveals additional genetic effects on intelligence and personality | bioRxiv
methodology:
Using Extended Genealogy to Estimate Components of Heritability for 23 Quantitative and Dichotomous Traits: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1003520
Pedigree- and SNP-Associated Genetics and Recent Environment are the Major Contributors to Anthropometric and Cardiometabolic Trait Variation: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1005804

Missing Heritability – found?: https://westhunt.wordpress.com/2017/02/09/missing-heritability-found/
There is an interesting new paper out on genetics and IQ. The claim is that they have found the missing heritability – in rare variants, generally different in each family.

Some of the variants, the ones we find with GWAS, are fairly common and fitness-neutral: the variant that slightly increases IQ confers the same fitness (or very close to the same) as the one that slightly decreases IQ – presumably because of other effects it has. If this weren’t the case, it would be impossible for both of the variants to remain common.

The rare variants that affect IQ will generally decrease IQ – and since pleiotropy is the norm, usually they’ll be deleterious in other ways as well. Genetic load.

Happy families are all alike; every unhappy family is unhappy in its own way.: https://westhunt.wordpress.com/2017/06/06/happy-families-are-all-alike-every-unhappy-family-is-unhappy-in-its-own-way/
It now looks as if the majority of the genetic variance in IQ is the product of mutational load, and the same may be true for many psychological traits. To the extent this is the case, a lot of human psychological variation must be non-adaptive. Maybe some personality variation fulfills an evolutionary function, but a lot does not. Being a dumb asshole may be a bug, rather than a feature. More generally, this kind of analysis could show us whether particular low-fitness syndromes, like autism, were ever strategies – I suspect not.

It’s bad new news for medicine and psychiatry, though. It would suggest that what we call a given type of mental illness, like schizophrenia, is really a grab-bag of many different syndromes. The ultimate causes are extremely varied: at best, there may be shared intermediate causal factors. Not good news for drug development: individualized medicine is a threat, not a promise.

see also comment at: https://pinboard.in/u:nhaliday/b:a6ab4034b0d0

https://www.reddit.com/r/slatestarcodex/comments/5sldfa/genomic_analysis_of_family_data_reveals/
So the big implication here is that it's better than I had dared hope - like Yang/Visscher/Hsu have argued, the old GCTA estimate of ~0.3 is indeed a rather loose lower bound on additive genetic variants, and the rest of the missing heritability is just the relatively uncommon additive variants (ie <1% frequency), and so, like Yang demonstrated with height, using much more comprehensive imputation of SNP scores or using whole-genomes will be able to explain almost all of the genetic contribution. In other words, with better imputation panels, we can go back and squeeze out better polygenic scores from old GWASes, new GWASes will be able to reach and break the 0.3 upper bound, and eventually we can feasibly predict 0.5-0.8. Between the expanding sample sizes from biobanks, the still-falling price of whole genomes, the gradual development of better regression methods (informative priors, biological annotation information, networks, genetic correlations), and better imputation, the future of GWAS polygenic scores is bright. Which obviously will be extremely helpful for embryo selection/genome synthesis.

The argument that this supports mutation-selection balance is weaker but plausible. I hope that it's true, because if that's why there is so much genetic variation in intelligence, then that strongly encourages genetic engineering - there is no good reason or Chesterton fence for intelligence variants being non-fixed, it's just that evolution is too slow to purge the constantly-accumulating bad variants. And we can do better.
https://rubenarslan.github.io/generation_scotland_pedigree_gcta/

The surprising implications of familial association in disease risk: https://arxiv.org/abs/1707.00014
https://spottedtoad.wordpress.com/2017/06/09/personalized-medicine-wont-work-but-race-based-medicine-probably-will/
As Greg Cochran has pointed out, this probably isn’t going to work. There are a few genes like BRCA1 (which makes you more likely to get breast and ovarian cancer) that we can detect and might affect treatment, but an awful lot of disease turns out to be just the result of random chance and deleterious mutation. This means that you can’t easily tailor disease treatment to people’s genes, because everybody is fucked up in their own special way. If Johnny is schizophrenic because of 100 random errors in the genes that code for his neurons, and Jack is schizophrenic because of 100 other random errors, there’s very little way to test a drug to work for either of them- they’re the only one in the world, most likely, with that specific pattern of errors. This is, presumably why the incidence of schizophrenia and autism rises in populations when dads get older- more random errors in sperm formation mean more random errors in the baby’s genes, and more things that go wrong down the line.

The looming crisis in human genetics: http://www.economist.com/node/14742737
Some awkward news ahead
- Geoffrey Miller

Human geneticists have reached a private crisis of conscience, and it will become public knowledge in 2010. The crisis has depressing health implications and alarming political ones. In a nutshell: the new genetics will reveal much less than hoped about how to cure disease, and much more than feared about human evolution and inequality, including genetic differences between classes, ethnicities and races.

2009!
study  preprint  bio  biodet  behavioral-gen  GWAS  missing-heritability  QTL  🌞  scaling-up  replication  iq  education  spearhead  sib-study  multi  west-hunter  scitariat  genetic-load  mutation  medicine  meta:medicine  stylized-facts  ratty  unaffiliated  commentary  rhetoric  wonkish  genetics  genomics  race  pop-structure  poast  population-genetics  psychiatry  aphorism  homo-hetero  generalization  scale  state-of-art  ssc  reddit  social  summary  gwern  methodology  personality  britain  anglo  enhancement  roots  s:*  2017  data  visualization  database  let-me-see  bioinformatics  news  org:rec  org:anglo  org:biz  track-record  prediction  identity-politics  pop-diff  recent-selection  westminster  inequality  egalitarianism-hierarchy  high-dimension  applications  dimensionality  ideas  no-go  volo-avolo  magnitude  variance-components  GCTA  tradeoffs  counter-revolution  org:mat  dysgenics  paternal-age  distribution  chart  abortion-contraception-embryo 
june 2017 by nhaliday
Interview: Mostly Sealing Wax | West Hunter
https://soundcloud.com/user-519115521/greg-cochran-part-2
https://medium.com/@houstoneuler/annotating-part-2-of-the-greg-cochran-interview-with-james-miller-678ba33f74fc

- conformity and Google, defense and spying (China knows prob almost all our "secrets")
- in the past you could just find new things faster than people could reverse-engineer. part of the problem is that innovation is slowing down today (part of the reason for convergence by China/developing world).
- introgression from archaics of various kinds
- mutational load and IQ, wrath of khan neanderthal
- trade and antiquity (not that useful besides ideas tbh), Roman empire, disease, smallpox
- spices needed to be grown elsewhere, but besides that...
- analogy: caste system in India (why no Brahmin car repairmen?), slavery in Greco-Roman times, more water mills in medieval times (rivers better in north, but still could have done it), new elite not liking getting hands dirty, low status of engineers, rise of finance
- crookery in finance, hedge fund edge might be substantially insider trading
- long-term wisdom of moving all manufacturing to China...?
- economic myopia: British financialization before WW1 vis-a-vis Germany. North vs. South and cotton/industry, camels in Middle East vs. wagons in Europe
- Western medicine easier to convert to science than Eastern, pseudoscience and wrong theories better than bag of recipes
- Greeks definitely knew some things that were lost (eg, line in Pliny makes reference to combinatorics calculation rediscovered by German dude much later. think he's referring to Catalan numbers?), Lucio Russo book
- Indo-Europeans, Western Europe, Amerindians, India, British Isles, gender, disease, and conquest
- no farming (Dark Age), then why were people still farming on Shetland Islands north of Scotland?
- "symbolic" walls, bodies with arrows
- family stuff, children learning, talking dog, memory and aging
- Chinese/Japanese writing difficulty and children learning to read
- Hatfield-McCoy feud: the McCoy family was actually a case study in a neurological journal. they had anger management issues because of cancers of their adrenal gland (!!).

the Chinese know...: https://macropolo.org/casting-off-real-beijings-cryptic-warnings-finance-taking-economy/
Over the last couple of years, a cryptic idiom has crept into the way China’s top leaders talk about risks in the country’s financial system: tuo shi xiang xu (脱实向虚), which loosely translates as “casting off the real for the empty.” Premier Li Keqiang warned against it at his press conference at the end of the 2016 National People’s Congress (NPC). At this year’s NPC, Li inserted this very expression into his annual work report. And in April, while on an inspection tour of Guangxi, President Xi Jinping used the term, saying that China must “unceasingly promote industrial modernization, raise the level of manufacturing, and not allow the real to be cast off for the empty.”

Such an odd turn of phrase is easy to overlook, but it belies concerns about a significant shift in the way that China’s economy works. What Xi and Li were warning against is typically called financialization in developed economies. It’s when “real” companies—industrial firms, manufacturers, utility companies, property developers, and anyone else that produces a tangible product or service—take their money and, rather than put it back into their businesses, invest it in “empty”, or speculative, assets. It occurs when the returns on financial investments outstrip those in the real economy, leading to a disproportionate amount of money being routed into the financial system.

https://twitter.com/gcochran99/status/1160589827651203073
https://archive.is/Yzjyv
Bad day for Lehman Bros.
--
Good day for everyone else, then.
west-hunter  interview  audio  podcast  econotariat  cracker-econ  westminster  culture-war  polarization  tech  sv  google  info-dynamics  business  multi  military  security  scitariat  intel  error  government  defense  critique  rant  race  clown-world  patho-altruism  history  mostly-modern  cold-war  russia  technology  innovation  stagnation  being-right  archaics  gene-flow  sapiens  genetics  the-trenches  thinking  sequential  similarity  genomics  bioinformatics  explanation  europe  asia  china  migration  evolution  recent-selection  immune  atmosphere  latin-america  ideas  sky  developing-world  embodied  africa  MENA  genetic-load  unintended-consequences  iq  enhancement  aDNA  gedanken  mutation  QTL  missing-heritability  tradeoffs  behavioral-gen  biodet  iron-age  mediterranean  the-classics  trade  gibbon  disease  parasites-microbiome  demographics  population  urban  transportation  efficiency  cost-benefit  india  agriculture  impact  status  class  elite  vampire-squid  analogy  finance  higher-ed  trends  rot  zeitgeist  🔬  hsu  stories  aphorism  crooked  realne 
may 2017 by nhaliday
Estimating the number of unseen variants in the human genome
To find all common variants (frequency at least 1%) the number of individuals that need to be sequenced is small (∼350) and does not differ much among the different populations; our data show that, subject to sequence accuracy, the 1000 Genomes Project is likely to find most of these common variants and a high proportion of the rarer ones (frequency between 0.1 and 1%). The data reveal a rule of diminishing returns: a small number of individuals (∼150) is sufficient to identify 80% of variants with a frequency of at least 0.1%, while a much larger number (> 3,000 individuals) is necessary to find all of those variants.

A map of human genome variation from population-scale sequencing: http://www.internationalgenome.org/sites/1000genomes.org/files/docs/nature09534.pdf

Scientists using data from the 1000 Genomes Project, which sequenced one thousand individuals from 26 human populations, found that "a typical [individual] genome differs from the reference human genome at 4.1 million to 5.0 million sites … affecting 20 million bases of sequence."[11] Nearly all (>99.9%) of these sites are small differences, either single nucleotide polymorphisms or brief insertion-deletions in the genetic sequence, but structural variations account for a greater number of base-pairs than the SNPs and indels.[11]

Human genetic variation: https://en.wikipedia.org/wiki/Human_genetic_variation

Singleton Variants Dominate the Genetic Architecture of Human Gene Expression: https://www.biorxiv.org/content/early/2017/12/15/219238
study  sapiens  genetics  genomics  population-genetics  bioinformatics  data  prediction  cost-benefit  scale  scaling-up  org:nat  QTL  methodology  multi  pdf  curvature  convexity-curvature  nonlinearity  measurement  magnitude  🌞  distribution  missing-heritability  pop-structure  genetic-load  mutation  wiki  reference  article  structure  bio  preprint  biodet  variance-components  nibble  chart 
may 2017 by nhaliday
Human genome - Wikipedia
There are an estimated 19,000-20,000 human protein-coding genes.[4] The estimate of the number of human genes has been repeatedly revised down from initial predictions of 100,000 or more as genome sequence quality and gene finding methods have improved, and could continue to drop further.[5][6] Protein-coding sequences account for only a very small fraction of the genome (approximately 1.5%), and the rest is associated with non-coding RNA molecules, regulatory DNA sequences, LINEs, SINEs, introns, and sequences for which as yet no function has been determined.[7]
bio  sapiens  genetics  genomics  bioinformatics  scaling-up  data  scale  wiki  reference  QTL  methodology 
may 2017 by nhaliday
Sequencing a genome for less than the cost of an X-ray? Not quite yet
A $100 genome will cost $100 in the same way that the $1,000 genome costs $1,000. As in, it won’t, at least not soon. “The $1,000 genome” — which sequencer makers began promising about five years ago — “costs us $3,000,” said Richard Gibbs, founder of the Baylor College of Medicine Human Genome Sequencing Center and one of the leaders of the original Human Genome Project in the 1990s.
news  org:sci  scaling-up  data  scale  genetics  genomics  biotech  money  efficiency  bioinformatics  cost-benefit  frontier  speedometer  measurement 
april 2017 by nhaliday
Minor allele frequency - Wikipedia
It is widely used in population genetics studies because it provides information to differentiate between common and rare variants in the population.
jargon  genetics  genomics  bioinformatics  population-genetics  QTL  wiki  reference  metrics  distribution 
march 2017 by nhaliday
Information Processing: Big, complicated data sets
This Times article profiles Nick Patterson, a mathematician whose career wandered from cryptography, to finance (7 years at Renaissance) and finally to bioinformatics. “I’m a data guy,” Dr. Patterson said. “What I know about is how to analyze big, complicated data sets.”

If you're a smart guy looking for something to do, there are 3 huge computational problems staring you in the face, for which the data is readily accessible.

1) human genome: 3 GB of data in a single genome; most data freely available on the Web (e.g., Hapmap stores patterns of sequence variation). Got a hypothesis about deep human history (evolution)? Test it yourself...

2) market prediction: every market tick available at zero or minimal subscription-service cost. Can you model short term movements? It's never been cheaper to build and test your model!

3) internet search: about 10^3 Terabytes of data (admittedly, a barrier to entry for an individual, but not for a startup). Can you come up with a better way to index or search it? What about peripheral problems like language translation or picture or video search?

The biggest barrier to entry is, of course, brainpower and a few years (a decade?) of concentrated learning. But the necessary books are all in the library :-)

Patterson has worked in 2 of the 3 areas listed above! Substituting crypto for internet search is understandable given his age, our cold war history, etc.
hsu  scitariat  quotes  links  news  org:rec  profile  giants  stories  huge-data-the-biggest  genomics  bioinformatics  finance  crypto  history  britain  interdisciplinary  the-trenches  🔬  questions  genetics  dataset  search  web  internet  scale  commentary  apollonian-dionysian  magnitude  examples  open-problems  big-surf  markets  securities  ORFE  nitty-gritty  quixotic  google  startups  ideas  measure  space-complexity  minimum-viable  move-fast-(and-break-things) 
february 2017 by nhaliday
Genetics and educational attainment | npj Science of Learning
Figure 1 is quite good
Sibling Correlations for Behavioral Traits. This figure displays sibling correlations for five traits measured in a large sample of Swedish brother pairs born 1951–1970. All outcomes except years of schooling are measured at conscription, around the age of 18.

correlations for IQ/EA for adoptees are actually nontrivial in adulthood, hmm

Figure 2 has GWAS R^2s through 2016 (in-sample, I guess?)
study  org:nat  biodet  education  methodology  essay  survey  genetics  GWAS  variance-components  init  causation  🌞  metrics  population-genetics  explanation  unit  nibble  len:short  big-picture  behavioral-gen  state-of-art  iq  embodied  correlation  twin-study  sib-study  summary  europe  nordic  data  visualization  s:*  tip-of-tongue  spearhead  bioinformatics 
february 2017 by nhaliday
Information Processing: Epistasis vs additivity
On epistasis: why it is unimportant in polygenic directional selection: http://rstb.royalsocietypublishing.org/content/365/1544/1241.short
- James F. Crow

The Evolution of Multilocus Systems Under Weak Selection: http://www.genetics.org/content/genetics/134/2/627.full.pdf
- Thomas Nagylaki

Data and Theory Point to Mainly Additive Genetic Variance for Complex Traits: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1000008
The relative proportion of additive and non-additive variation for complex traits is important in evolutionary biology, medicine, and agriculture. We address a long-standing controversy and paradox about the contribution of non-additive genetic variation, namely that knowledge about biological pathways and gene networks imply that epistasis is important. Yet empirical data across a range of traits and species imply that most genetic variance is additive. We evaluate the evidence from empirical studies of genetic variance components and find that additive variance typically accounts for over half, and often close to 100%, of the total genetic variance. We present new theoretical results, based upon the distribution of allele frequencies under neutral and other population genetic models, that show why this is the case even if there are non-additive effects at the level of gene action. We conclude that interactions at the level of genes are not likely to generate much interaction at the level of variance.
hsu  scitariat  commentary  links  study  list  evolution  population-genetics  genetics  methodology  linearity  nonlinearity  comparison  scaling-up  nibble  lens  bounded-cognition  ideas  bio  occam  parsimony  🌞  summary  quotes  multi  org:nat  QTL  stylized-facts  article  explanans  sapiens  biodet  selection  variance-components  metabuch  thinking  models  data  deep-materialism  chart  behavioral-gen  evidence-based  empirical  mutation  spearhead  model-organism  bioinformatics  linear-models  math  magnitude  limits  physics  interdisciplinary  stat-mech 
february 2017 by nhaliday
The infinitesimal model | bioRxiv
Our focus here is on the infinitesimal model. In this model, one or several quantitative traits are described as the sum of a genetic and a non-genetic component, the first being distributed as a normal random variable centred at the average of the parental genetic components, and with a variance independent of the parental traits. We first review the long history of the infinitesimal model in quantitative genetics. Then we provide a definition of the model at the phenotypic level in terms of individual trait values and relationships between individuals, but including different evolutionary processes: genetic drift, recombination, selection, mutation, population structure, ... We give a range of examples of its application to evolutionary questions related to stabilising selection, assortative mating, effective population size and response to selection, habitat preference and speciation. We provide a mathematical justification of the model as the limit as the number M of underlying loci tends to infinity of a model with Mendelian inheritance, mutation and environmental noise, when the genetic component of the trait is purely additive. We also show how the model generalises to include epistatic effects. In each case, by conditioning on the pedigree relating individuals in the population, we incorporate arbitrary selection and population structure. We suppose that we can observe the pedigree up to the present generation, together with all the ancestral traits, and we show, in particular, that the genetic components of the individual trait values in the current generation are indeed normally distributed with a variance independent of ancestral traits, up to an error of order M^{-1/2}. Simulations suggest that in particular cases the convergence may be as fast as 1/M.

published version:
The infinitesimal model: Definition, derivation, and implications: https://sci-hub.tw/10.1016/j.tpb.2017.06.001

Commentary: Fisher’s infinitesimal model: A story for the ages: http://www.sciencedirect.com/science/article/pii/S0040580917301508?via%3Dihub
This commentary distinguishes three nested approximations, referred to as “infinitesimal genetics,” “Gaussian descendants” and “Gaussian population,” each plausibly called “the infinitesimal model.” The first and most basic is Fisher’s “infinitesimal” approximation of the underlying genetics – namely, many loci, each making a small contribution to the total variance. As Barton et al. (2017) show, in the limit as the number of loci increases (with enough additivity), the distribution of genotypic values for descendants approaches a multivariate Gaussian, whose variance–covariance structure depends only on the relatedness, not the phenotypes, of the parents (or whether their population experiences selection or other processes such as mutation and migration). Barton et al. (2017) call this rigorously defensible “Gaussian descendants” approximation “the infinitesimal model.” However, it is widely assumed that Fisher’s genetic assumptions yield another Gaussian approximation, in which the distribution of breeding values in a population follows a Gaussian — even if the population is subject to non-Gaussian selection. This third “Gaussian population” approximation, is also described as the “infinitesimal model.” Unlike the “Gaussian descendants” approximation, this third approximation cannot be rigorously justified, except in a weak-selection limit, even for a purely additive model. Nevertheless, it underlies the two most widely used descriptions of selection-induced changes in trait means and genetic variances, the “breeder’s equation” and the “Bulmer effect.” Future generations may understand why the “infinitesimal model” provides such useful approximations in the face of epistasis, linkage, linkage disequilibrium and strong selection.
study  exposition  bio  evolution  population-genetics  genetics  methodology  QTL  preprint  models  unit  len:long  nibble  linearity  nonlinearity  concentration-of-measure  limits  applications  🌞  biodet  oscillation  fisher  perturbation  stylized-facts  chart  ideas  article  pop-structure  multi  pdf  piracy  intricacy  map-territory  kinship  distribution  simulation  ground-up  linear-models  applicability-prereqs  bioinformatics 
january 2017 by nhaliday
The deleterious mutation load is insensitive to recent population history : Nature Genetics : Nature Research
contrary:
Distance from sub-Saharan Africa predicts mutational load in diverse human genomes: http://www.pnas.org/content/113/4/E440.abstract
“Out Of Africa” Bottleneck Is What Really Matters For Mutations: https://www.gnxp.com/WordPress/2017/04/26/out-of-africa-bottleneck-is-what-really-matters-for-mutations/
But there is also a lot of archaeological and some ancient genetic DNA now that indicates that the vast majority of non-African ancestry began to expand rapidly around 50-60,000 years ago. This is tens of thousands of years after the lowest value given above. Therefore, again we have to make recourse to a long period of separation before the expansion. This is not implausible on the face of it, but we could do something else: just assume there’s an artifact with their methods and the inferred date of divergence is too old. That would solve many of the issues.

I really don’t know if the above quibbles have any ramification for the site frequency spectrum of deleterious mutations. My own hunch is that no, it doesn’t impact the qualitative results at all.

Figure 3 clearly shows that Europeans are enriched for weak and moderately deleterious mutations (the last category produces weird results, and I wish they’d talked about this more, but they observe that strong deleterious mutations have issues getting detected). Ne is just the effective population size and s is the selection coefficient (bigger number, stronger selection).

Too Much Diversity: https://westhunt.wordpress.com/2012/11/30/too-much-diversity/
There’s a new paper out in Nature, by Wenqing Fu and many other people, about the recent origin of most variants in protein-coding genes. They conclude that most are less than 5-10,000 year old – younger in Europeans than in Africans. This is a natural consequence of the shape of human demographic history – there was a huge population increase with the advent of agriculture, and more people meant more mutations. That agricultural expansion happened somewhat earlier in the Middle East and Europe than in Africa.

...

A very few mutations are beneficial, some are neutral and many are deleterious, although the degree of harm inflicted varies widely. So the population expansion also increased the number of bad mutations – but unless selection also relaxed, it would not have changed the per-capita number of deleterious mutations, or the distribution of their effects (what fraction had large, medium, or small effects on fitness). It increased the diversity of deleterious mutations – they are more motley, not more common. The article never talks about that per-capita number, or, if it did , I was unable to winkle it out. It talks about ages and numbers of mutations – but not the mean number, in either of the two populations studied (European Americans and African Americans) . I think it would been a lot clearer, confused fewer reporters, if it had made that distinction. On the other hand, depending on the facts on the ground, talking about mutational load might be a grant-killer. There was a paper earlier this year (with many of the same authors) that used about half of the same data and did mention per-capita numbers. I’ve discussed it.

...

The paper says that there may be an excess of weakly deleterious mutations in Europeans due to bottlenecks back in the Ice Age. The idea works like this: selection is less efficient in small populations. Deleterious mutations with an effect s < 1/Ne drift freely and are not efficiently removed by selection. This effect takes on the order of Ne generations – so a population reduced to an effective size of of 10,000 for 10,000 generations ( ~250,000 years) would accumulate a large-than-usual number of deleterious mutations of effect size ~10-4. Lohmueller et al wrote about this back in 2008: the scenario they used had a European ancestral bottleneck 200,000 years long, which is A. what you need to make this scenario work and B. impossible, since it’s way before anatomically modern humans left Africa. Back to the drawing board.

disease alleles:
Ascertainment bias can create the illusion of genetic health disparities: https://www.biorxiv.org/content/early/2017/09/28/195768
study  genetics  regularizer  genetic-load  sapiens  europe  africa  comparison  world  recent-selection  org:nat  pop-structure  null-result  pop-diff  multi  evolution  roots  gnxp  scitariat  commentary  summary  migration  gene-drift  long-short-run  bio  preprint  🌞  debate  hmm  idk  disease  genomics  bioinformatics  spreading  west-hunter  antiquity  eden 
january 2017 by nhaliday
The Genetic Architecture of Quantitative Traits Cannot Be Inferred from Variance Component Analysis
Classical quantitative genetic analyses estimate additive and non-additive genetic and environmental components of variance from phenotypes of related individuals without knowing the identities of quantitative trait loci (QTLs). Many studies have found a large proportion of quantitative trait variation can be attributed to the additive genetic variance (VA), providing the basis for claims that non-additive gene actions are unimportant. In this study, we show that arbitrarily defined parameterizations of genetic effects seemingly consistent with non-additive gene actions can also capture the majority of genetic variation. This reveals a logical flaw in using the relative magnitudes of variance components to indicate the relative importance of additive and non-additive gene actions. We discuss the implications and propose that variance component analyses should not be used to infer the genetic architecture of quantitative traits.
study  genetics  QTL  methodology  variance-components  critique  gotchas  nonlinearity  regularizer  🌞  biodet  pro-rata  roots  null-result  bioinformatics 
december 2016 by nhaliday
Science Policy | West Hunter
If my 23andme profile revealed that I was the last of the Plantagenets (as some suspect), and therefore rightfully King of the United States and Defender of Mexico, and I asked you for a general view of the right approach to science and technology – where the most promise is, what should be done, etc – what would you say?

genetically personalized medicine: https://westhunt.wordpress.com/2016/12/08/science-policy/#comment-85698
I have no idea how personalized medicine is supposed to work. Suppose that we sequence your entire genome, and then we intend to tailor a therapeutic approach to your genome.

How do we test it? By trying it on a bunch of genetically similar people? The more genetic details we take into account, the smaller that class is. It could easily become so small that it would be difficult to recruit enough people for a reasonable statistical trial. Second, the more details we take into account, the smaller the class that benefits from the whole testing process – which as far as I can see, is just as expensive as conventional Phasei/II etc trials.

What am I missing?

Now if you are a forethoughtful trillionaire, sure: you manufacture lots of clones just to test therapies you might someday need, and cost is no object.

I think I can see ways you could make it work tho [edit: what did I mean by this?...damnit]
west-hunter  discussion  politics  government  policy  science  technology  the-world-is-just-atoms  🔬  scitariat  meta:science  proposal  genetics  genomics  medicine  meta:medicine  multi  ideas  counter-revolution  poast  homo-hetero  generalization  scale  antidemos  alt-inst  applications  dimensionality  high-dimension  bioinformatics  no-go  volo-avolo  magnitude  trump  2016-election  questions 
december 2016 by nhaliday
Information Processing: Search results for compressed sensing
https://www.unz.com/jthompson/the-hsu-boundary/
http://infoproc.blogspot.com/2017/09/phase-transitions-and-genomic.html
Added: Here are comments from "Donoho-Student":
Donoho-Student says:
September 14, 2017 at 8:27 pm GMT • 100 Words

The Donoho-Tanner transition describes the noise-free (h2=1) case, which has a direct analog in the geometry of polytopes.

The n = 30s result from Hsu et al. (specifically the value of the coefficient, 30, when p is the appropriate number of SNPs on an array and h2 = 0.5) is obtained via simulation using actual genome matrices, and is original to them. (There is no simple formula that gives this number.) The D-T transition had only been established in the past for certain classes of matrices, like random matrices with specific distributions. Those results cannot be immediately applied to genomes.

The estimate that s is (order of magnitude) 10k is also a key input.

I think Hsu refers to n = 1 million instead of 30 * 10k = 300k because the effective SNP heritability of IQ might be less than h2 = 0.5 — there is noise in the phenotype measurement, etc.

Donoho-Student says:
September 15, 2017 at 11:27 am GMT • 200 Words

Lasso is a common statistical method but most people who use it are not familiar with the mathematical theorems from compressed sensing. These results give performance guarantees and describe phase transition behavior, but because they are rigorous theorems they only apply to specific classes of sensor matrices, such as simple random matrices. Genomes have correlation structure, so the theorems do not directly apply to the real world case of interest, as is often true.

What the Hsu paper shows is that the exact D-T phase transition appears in the noiseless (h2 = 1) problem using genome matrices, and a smoothed version appears in the problem with realistic h2. These are new results, as is the prediction for how much data is required to cross the boundary. I don’t think most gwas people are familiar with these results. If they did understand the results they would fund/design adequately powered studies capable of solving lots of complex phenotypes, medical conditions as well as IQ, that have significant h2.

Most people who use lasso, as opposed to people who prove theorems, are not even aware of the D-T transition. Even most people who prove theorems have followed the Candes-Tao line of attack (restricted isometry property) and don’t think much about D-T. Although D eventually proved some things about the phase transition using high dimensional geometry, it was initially discovered via simulation using simple random matrices.
hsu  list  stream  genomics  genetics  concept  stats  methodology  scaling-up  scitariat  sparsity  regression  biodet  bioinformatics  norms  nibble  compressed-sensing  applications  search  ideas  multi  albion  behavioral-gen  iq  state-of-art  commentary  explanation  phase-transition  measurement  volo-avolo  regularization  levers  novelty  the-trenches  liner-notes  clarity  random-matrices  innovation  high-dimension  linear-models  grokkability-clarity 
november 2016 by nhaliday
Wiring the Brain: The dark arts of statistical genomics
GCTA+GWAS

This is where GCTA analyses come in. The idea here is to estimate the total contribution of common risk variants in the population to determining who develops a disease, without necessarily having to identify them all individually first. The basic premise of GCTA analyses is to not worry about picking up the signatures of individual SNPs, but instead to use all the SNPs analysed to simply measure relatedness among people in your study population. Then you can compare that index of (distant) relatedness to an index of phenotypic similarity. For a trait like height, that will be a correlation between two continuous measures. For diseases, however, the phenotypic measure is categorical – you either have been diagnosed with it or you haven’t.
explanation  methodology  genetics  population-genetics  bio  enhancement  GWAS  variance-components  🌞  scaling-up  bioinformatics  genomics  nibble  🔬  article  GCTA  tip-of-tongue  spearhead  pop-structure  psychiatry  autism  disease  models  map-territory  QTL  concept  levers  ideas  biodet 
october 2016 by nhaliday
Information Processing: Evidence for (very) recent natural selection in humans
height (+), infant head circumference (+), some biomolecular stuff, female hip size (+), male BMI (-), age of menarche (+, !!), and birth weight (+)

Strong selection in the recent past can cause allele frequencies to change significantly. Consider two different SNPs, which today have equal minor allele frequency (for simplicity, let this be equal to one half). Assume that one SNP was subject to strong recent selection, and another (neutral) has had approximately zero effect on fitness. The advantageous version of the first SNP was less common in the far past, and rose in frequency recently (e.g., over the last 2k years). In contrast, the two versions of the neutral SNP have been present in roughly the same proportion (up to fluctuations) for a long time. Consequently, in the total past breeding population (i.e., going back tens of thousands of years) there have been many more copies of the neutral alleles (and the chunks of DNA surrounding them) than of the positively selected allele. Each of the chunks of DNA around the SNPs we are considering is subject to a roughly constant rate of mutation.

Looking at the current population, one would then expect a larger variety of mutations in the DNA region surrounding the neutral allele (both versions) than near the favored selected allele (which was rarer in the population until very recently, and whose surrounding region had fewer chances to accumulate mutations). By comparing the difference in local mutational diversity between the two versions of the neutral allele (should be zero modulo fluctuations, for the case MAF = 0.5), and between the (+) and (-) versions of the selected allele (nonzero, due to relative change in frequency), one obtains a sensitive signal for recent selection. See figure at bottom for more detail. In the paper what I call mutational diversity is measured by looking at distance distribution of singletons, which are rare variants found in only one individual in the sample under study.

The 2,000 year selection of the British: http://www.unz.com/gnxp/the-2000-year-selection-of-the-british/

Detection of human adaptation during the past 2,000 years: http://www.biorxiv.org/content/early/2016/05/07/052084

The key idea is that recent selection distorts the ancestral genealogy of sampled haplotypes at a selected site. In particular, the terminal (tip) branches of the genealogy tend to be shorter for the favored allele than for the disfavored allele, and hence, haplotypes carrying the favored allele will tend to carry fewer singleton mutations (Fig. 1A-C and SOM).

To capture this effect, we use the sum of distances to the nearest singleton in each direction from a test SNP as a summary statistic (Fig. 1D).

Figure 1. Illustration of the SDS method.

Figure 2. Properties of SDS.

Based on a recent model of European demography [25], we estimate that the mean tip length for a neutral sample of 3,000 individuals is 75 generations, or roughly 2,000 years (Fig. 2A). Since SDS aims to measure changes in tip lengths of the genealogy, we conjectured that it would be most likely to detect selection approximately within this timeframe.

Indeed, in simulated sweep models with samples of 3,000 individuals (Fig. 2B,C and fig. S2), we find that SDS focuses specifically on very recent time scales, and has equal power for hard and soft sweeps within this timeframe. At individual loci, SDS is powered to detect ~2% selection over 100 generations. Moreover, SDS has essentially no power to detect older selection events that stopped >100 generations before the present. In contrast, a commonly-used test for hard sweeps, iHS [12], integrates signal over much longer timescales (>1,000 generations), has no specificity to the more recent history, and has essentially no power for the soft sweep scenarios.

Catching evolution in the act with the Singleton Density Score: http://www.molecularecologist.com/2016/05/catching-evolution-in-the-act-with-the-singleton-density-score/
The Singleton Density Score (SDS) is a measure based on the idea that changes in allele frequencies induced by recent selection can be observed in a sample’s genealogy as differences in the branch length distribution.

You don’t need a weatherman: https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/
You can do a million cool things with this method. Since the effective time scale goes inversely with sample size, you could look at evolution in England over the past 1000 years or the past 500. Differencing, over the period 1-1000 AD. Since you can look at polygenic traits, you can see whether the alleles favoring higher IQs have increased or decreased in frequency over various stretches of time. You can see if Greg Clark’s proposed mechanism really happened. You can (soon) tell if creeping Pinkerization is genetic, or partly genetic.

You could probably find out if the Middle Easterners really have gotten slower, and when it happened.

Looking at IQ alleles, you could not only show whether the Ashkenazi Jews really are biologically smarter but if so, when it happened, which would give you strong hints as to how it happened.

We know that IQ-favoring alleles are going down (slowly) right now (not counting immigration, which of course drastically speeds it up). Soon we will know if this was true while Russia was under the Mongol yoke – we’ll know how smart Periclean Athenians were and when that boost occurred. And so on. And on!

...

“The pace has been so rapid that humans have changed significantly in body and mind over recorded history."

bicameral mind: https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/#comment-78934

https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/#comment-78939
Chinese, Koreans, Japanese and Ashkenazi Jews all have high levels of myopia. Australian Aborigines have almost none, I think.

https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/#comment-79094
I expect that the fall of all great empires is based on long term dysgenic trends. There is no logical reason why so many empires and civilizations throughout history could grow so big and then not simply keep growing, except for dysgenics.
--
I can think of about twenty other possible explanations off the top of my head, but dysgenics is a possible cause.
--
I agree with DataExplorer. The largest factor in the decay of civilizations is dysgenics. The discussion by R. A. Fisher 1930 p. 193 is very cogent on this matter. Soon we will know for sure.
--
Sometimes it can be rapid. Assume that the upper classes are mostly urban, and somewhat sharper than average. Then the Mongols arrive.
sapiens  study  genetics  evolution  hsu  trends  data  visualization  recent-selection  methodology  summary  GWAS  2016  scitariat  britain  commentary  embodied  biodet  todo  control  multi  gnxp  pop-diff  stat-power  mutation  hypothesis-testing  stats  age-generation  QTL  gene-drift  comparison  marginal  aDNA  simulation  trees  time  metrics  density  measurement  conquest-empire  pinker  population-genetics  aphorism  simler  dennett  👽  the-classics  iron-age  mediterranean  volo-avolo  alien-character  russia  medieval  spearhead  gregory-clark  bio  preprint  domestication  MENA  iq  islam  history  poast  west-hunter  scale  behavioral-gen  gotchas  cost-benefit  genomics  bioinformatics  stylized-facts  concept  levers  🌞  pop-structure  nibble  explanation  ideas  usa  dysgenics  list  applicability-prereqs  cohesion  judaism  visuo  correlation  china  asia  japan  korea  civilization  gibbon  rot  roots  fisher  giants  books  old-anglo  selection  agri-mindset  hari-seldon 
august 2016 by nhaliday
« earlier      
per page:    204080120160

bundles : frame

related tags

2016-election  abortion-contraception-embryo  accuracy  acm  aDNA  advertising  africa  age-generation  agri-mindset  agriculture  ai  ai-control  albion  algorithms  alien-character  alignment  allodium  alt-inst  amazon  analogy  analytical-holistic  anglo  anglosphere  antidemos  antiquity  aphorism  apollonian-dionysian  apple  applicability-prereqs  applications  archaeology  archaics  aristos  art  article  asia  assortative-mating  atmosphere  atoms  attention  audio  authoritarianism  autism  auto-learning  automation  backup  barons  bayesian  behavioral-gen  being-becoming  being-right  benchmarks  benevolence  best-practices  biases  big-peeps  big-picture  big-surf  bio  biodet  bioinformatics  biotech  bits  books  bounded-cognition  brands  britain  broad-econ  business  business-models  calculator  california  canada  cancer  candidate-gene  canon  capital  capitalism  cartoons  causation  chart  checklists  china  civil-liberty  civilization  clarity  class  class-warfare  classic  climate-change  clown-world  coarse-fine  cocktail  coding-theory  cog-psych  cohesion  cold-war  collaboration  commentary  comparison  compensation  competition  complement-substitute  composition-decomposition  compressed-sensing  computation  computer-vision  concentration-of-measure  concept  conceptual-vocab  concrete  conference  confidence  confounding  confusion  conquest-empire  contest  context  contrarianism  control  convexity-curvature  cooperate-defect  correlation  cost-benefit  counter-revolution  courage  course  cracker-econ  creative  crime  critique  crooked  crypto  cs  cultural-dynamics  culture-war  curiosity  curvature  cycles  cynicism-idealism  dark-arts  darwinian  data  data-science  database  dataset  death  debate  debt  decision-making  deep-learning  deep-materialism  defense  definite-planning  degrees-of-freedom  democracy  demographics  dennett  density  detail-architecture  developing-world  dimensionality  direct-indirect  dirty-hands  discussion  disease  distribution  domestication  DP  dropbox  drugs  duplication  dysgenics  early-modern  eastern-europe  economics  econotariat  ed-yong  eden  education  effect-size  efficiency  egalitarianism-hierarchy  einstein  elite  embodied  empirical  ems  endo-exo  endocrine  endogenous-exogenous  energy-resources  engineering  enhancement  ensembles  entrepreneurialism  environment  envy  epidemiology  equilibrium  error  essay  essence-existence  estimate  ethics  europe  evidence-based  evolution  examples  expansionism  explanans  explanation  exploratory  exposition  extra-introversion  facebook  faq  farmers-and-foragers  fashun  FDA  feudal  fiction  finance  fisher  flexibility  fluid  flux-stasis  focus  foreign-lang  frontier  futurism  gallic  games  gavisti  GCTA  gedanken  gender  gender-diff  gene-drift  gene-flow  generalization  generative  genetic-correlation  genetic-load  genetics  genomics  geoengineering  geography  germanic  giants  gibbon  gnosis-logos  gnxp  god-man-beast-victim  google  gotchas  government  grad-school  graphical-models  graphs  gregory-clark  grokkability-clarity  ground-up  growth-econ  guide  GWAS  gwern  GxE  hard-tech  hari-seldon  harvard  hashing  heavy-industry  heterodox  hidden-motives  high-dimension  high-variance  higher-ed  history  hmm  homo-hetero  honor  howto  hsu  huge-data-the-biggest  human-capital  human-ml  hypocrisy  hypothesis-testing  ideas  identity-politics  idk  immune  impact  impetus  india  individualism-collectivism  industrial-revolution  inequality  info-dynamics  information-theory  init  innovation  insight  institutions  intel  interdisciplinary  interests  internet  intersection-connectedness  interview  intricacy  investing  ioannidis  iq  iron-age  is-ought  islam  iteration-recursion  janus  japan  jargon  jobs  journos-pundits  judaism  justice  kaggle  kinship  knowledge  korea  kumbaya-kult  language  latent-variables  latin-america  law  leadership  learning  lecture-notes  lectures  len:long  len:short  lens  let-me-see  levers  leviathan  lexical  libraries  limits  linear-models  linearity  liner-notes  links  list  literature  local-global  long-short-run  longevity  love-hate  low-hanging  machine-learning  macro  magnitude  management  map-territory  marginal  market-power  markets  markov  math  math.CA  math.CO  maxim-gun  measure  measurement  media  medicine  medieval  mediterranean  MENA  mendel-randomization  meta-analysis  meta:medicine  meta:prediction  meta:science  metabolic  metabuch  metameta  methodology  metric-space  metrics  microfoundations  microsoft  migration  military  minimum-viable  missing-heritability  mit  ML-MAP-E  mobile  model-class  model-organism  models  modernity  moments  monetary-fiscal  money  morality  mostly-modern  move-fast-(and-break-things)  multi  multiplicative  musk  mutation  myth  n-factor  narrative  nationalism-globalism  nature  network-structure  neuro  new-religion  news  nibble  nietzschean  nips  nitty-gritty  no-go  noble-lie  nonlinearity  nordic  norms  northeast  novelty  nuclear  null-result  nutrition  nyc  objektbuch  occam  occident  ocw  old-anglo  oly  open-closed  open-problems  optimism  order-disorder  ORFE  org:anglo  org:biz  org:edu  org:foreign  org:gov  org:mag  org:mat  org:med  org:nat  org:rec  org:sci  organizing  orient  oscillation  oss  outcome-risk  outliers  overflow  PAC  papers  paradox  parallax  parasites-microbiome  parsimony  paternal-age  path-dependence  patho-altruism  patience  pdf  peace-violence  pennsylvania  people  personality  perturbation  pessimism  phalanges  pharma  phase-transition  philosophy  physics  pinker  piracy  plots  poast  podcast  polanyi-marx  polarization  policy  polisci  politics  pop-diff  pop-structure  popsci  population  population-genetics  power  power-law  pre-ww2  prediction  preprint  presentation  primitivism  princeton  priors-posteriors  pro-rata  probability  profile  programming  properties  proposal  psych-architecture  psychiatry  psycho-atoms  psychology  psychometrics  python  q-n-a  QTL  quantum  questions  quixotic  quotes  race  random  random-matrices  randy-ayndy  ranking  rant  ratty  reading  realness  reason  recent-selection  recruiting  reddit  redistribution  reference  reflection  regression  regularization  regularizer  regulation  reinforcement  religion  rent-seeking  replication  repo  research  retention  revolution  rhetoric  rhythm  risk  ritual  robotics  roots  rot  russia  rust  s:*  s:***  sapiens  scale  scaling-up  science  scifi-fantasy  scitariat  search  securities  security  selection  sequential  shakespeare  shift  sib-study  signal-noise  signaling  similarity  simler  simulation  sinosphere  skeleton  skunkworks  sky  slides  social  social-choice  social-norms  socs-and-mops  software  space  space-complexity  sparsity  spearhead  speculation  speed  speedometer  spreading  ssc  stackex  stagnation  stanford  startups  stat-mech  stat-power  state-of-art  statesmen  stats  status  stereotypes  stochastic-processes  stock-flow  stories  strategy  stream  stress  structure  study  studying  stylized-facts  sublinear  success  summary  survey  sv  synchrony  systematic-ad-hoc  tactics  tails  talks  taxes  tcs  tech  technology  telos-atelos  the-classics  the-devil  the-founding  the-great-west-whale  the-south  the-trenches  the-watchers  the-west  the-world-is-just-atoms  theory-of-mind  theos  thick-thin  thiel  things  thinking  time  time-preference  tip-of-tongue  todo  tools  top-n  toys  traces  track-record  trade  tradeoffs  transportation  trees  trends  tribalism  trivia  trump  trust  truth  tutorial  twin-study  twitter  unaffiliated  uncertainty  unintended-consequences  unit  urban  urban-rural  us-them  usa  vampire-squid  variance-components  venture  virginia-DC  visual-understanding  visualization  visuo  vitality  volo-avolo  war  wealth  web  welfare-state  west-hunter  westminster  wiki  wild-ideas  winner-take-all  wisdom  within-without  wonkish  world  world-war  X-not-about-Y  yoga  zeitgeist  zero-positive-sum  zooming  🌞  🎩  👽  🔬  🖥 

Copy this bookmark:



description:


tags: