nhaliday + stat-power   24

Estimation of effect size distribution from genome-wide association studies and implications for future discoveries
We report a set of tools to estimate the number of susceptibility loci and the distribution of their effect sizes for a trait on the basis of discoveries from existing genome-wide association studies (GWASs). We propose statistical power calculations for future GWASs using estimated distributions of effect sizes. Using reported GWAS findings for height, Crohn’s disease and breast, prostate and colorectal (BPC) cancers, we determine that each of these traits is likely to harbor additional loci within the spectrum of low-penetrance common variants. These loci, which can be identified from sufficiently powerful GWASs, together could explain at least 15–20% of the known heritability of these traits. However, for BPC cancers, which have modest familial aggregation, our analysis suggests that risk models based on common variants alone will have modest discriminatory power (63.5% area under curve), even with new discoveries.
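The power calculation the abstract refers to can be sketched for a single SNP. This is a minimal illustration of the standard approach (my own sketch, not the authors' released tool): treat the 1-df association z-statistic as N(sqrt(n·v), 1), where v = 2f(1−f)β² is the fraction of trait variance the SNP explains.

```python
import math
from statistics import NormalDist

def gwas_power(n, maf, beta, alpha=5e-8):
    """Approximate power of a single-SNP association test.

    n:    sample size
    maf:  minor allele frequency f
    beta: per-allele effect in phenotypic-SD units
    The SNP explains v = 2*f*(1-f)*beta^2 of trait variance, and the
    z-statistic is ~ N(sqrt(n*v), 1) under the alternative.
    """
    nd = NormalDist()
    v = 2.0 * maf * (1.0 - maf) * beta ** 2
    ncp = math.sqrt(n * v)
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)  # genome-wide threshold, ~5.45
    return (1.0 - nd.cdf(z_crit - ncp)) + nd.cdf(-z_crit - ncp)

# a SNP explaining ~0.1% of variance needs a very large GWAS
print(round(gwas_power(10_000, 0.3, 0.05), 3))   # badly underpowered
print(round(gwas_power(100_000, 0.3, 0.05), 3))  # well powered
```

This is why the paper's message is "more loci exist, but only sufficiently powerful GWASs will find them": power at genome-wide significance is a steep function of n·v.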

later paper:
Distribution of allele frequencies and effect sizes and their interrelationships for common genetic susceptibility variants: http://www.pnas.org/content/108/44/18026.full

Recent discoveries of hundreds of common susceptibility SNPs from genome-wide association studies provide a unique opportunity to examine population genetic models for complex traits. In this report, we investigate distributions of various population genetic parameters and their interrelationships using estimates of allele frequencies and effect-size parameters for about 400 susceptibility SNPs across a spectrum of qualitative and quantitative traits. We calibrate our analysis by statistical power for detection of SNPs to account for overrepresentation of variants with larger effect sizes in currently known SNPs that are expected due to statistical power for discovery. Across all qualitative disease traits, minor alleles conferred “risk” more often than “protection.” Across all traits, an inverse relationship existed between “regression effects” and allele frequencies. Both of these trends were remarkably strong for type I diabetes, a trait that is most likely to be influenced by selection, but were modest for other traits such as human height or late-onset diseases such as type II diabetes and cancers. Across all traits, the estimated effect-size distribution suggested the existence of increasingly large numbers of susceptibility SNPs with decreasingly small effects. For most traits, the set of SNPs with intermediate minor allele frequencies (5–20%) contained an unusually small number of susceptibility loci and explained a relatively small fraction of heritability compared with what would be expected from the distribution of SNPs in the general population. These trends could have several implications for future studies of common and uncommon variants.

...

Relationship Between Allele Frequency and Effect Size. We explored the relationship between allele frequency and effect size in different scales. An inverse relationship between the squared regression coefficient and f(1 − f) was observed consistently across different traits (Fig. 3). For a number of these traits, however, the strengths of these relationships become less pronounced after adjustment for ascertainment due to study power. The strength of the trend, as captured by the slope of the fitted line (Table 2), markedly varies between traits, with an almost 10-fold change between the two extremes of distinct types of traits. After adjustment, the most pronounced trend was seen for type I diabetes and Crohn’s disease among qualitative traits and LDL level among quantitative traits. In exploring the relationship between the frequency of the risk allele and the magnitude of the associated risk coefficient (Fig. S4), we observed a quadratic pattern that indicates increasing risk coefficients as the risk-allele frequency diverges away from 0.50 either toward 0 or toward 1. Thus, it appears that regression coefficients for common susceptibility SNPs increase in magnitude monotonically with decreasing minor-allele frequency, irrespective of whether the minor allele confers risk or protection. However, for some traits, such as type I diabetes, risk alleles were predominantly minor alleles, that is, they had frequencies of less than 0.50.
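The inverse trend between the squared regression coefficient and f(1−f) falls out of the variance-explained identity: if loci explain comparable variance, β² must scale as 1/(f(1−f)). A small sketch (mine, not the paper's code):

```python
import math

def variance_explained(f, beta):
    """Variance fraction explained by a biallelic SNP at frequency f with
    per-allele effect beta (trait in SD units): the genotype variance
    under Hardy-Weinberg is 2*f*(1-f)."""
    return 2.0 * f * (1.0 - f) * beta ** 2

def beta_for_fixed_variance(f, v):
    """Effect size a SNP at frequency f needs in order to explain
    variance v: beta^2 scales as 1/(f*(1-f)), the inverse relationship
    observed across traits."""
    return math.sqrt(v / (2.0 * f * (1.0 - f)))

# rarer alleles need larger effects to explain the same variance
for f in (0.5, 0.2, 0.05):
    print(f, round(beta_for_fixed_variance(f, 0.001), 3))
```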
pdf  nibble  study  article  org:nat  🌞  biodet  genetics  population-genetics  GWAS  QTL  distribution  disease  cancer  stat-power  bioinformatics  magnitude  embodied  prediction  scale  scaling-up  variance-components  multi  missing-heritability  effect-size  regression  correlation  data 
november 2017 by nhaliday
Accurate Genomic Prediction Of Human Height | bioRxiv
Stephen Hsu's compressed sensing application paper

We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ~40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ~0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction.
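The quoted correlation follows directly from the captured variance: for a linear predictor, corr(predicted, actual) = sqrt(R²), and the typical prediction error is trait SD × sqrt(1−R²). A quick check (the ~7 cm height SD is my assumption, not a figure from the paper):

```python
import math

def predictor_correlation(r2):
    """Correlation between a linear predictor and the trait when the
    predictor captures fraction r2 of total variance."""
    return math.sqrt(r2)

def residual_sd(trait_sd, r2):
    """SD of (actual - predicted): the typical size of prediction errors."""
    return trait_sd * math.sqrt(1.0 - r2)

print(round(predictor_correlation(0.40), 2))  # ~0.63, matching the ~0.65 quoted
print(round(residual_sd(7.0, 0.40), 1))       # ~5.4 cm, assuming SD(height) ~ 7 cm
```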

https://infoproc.blogspot.com/2017/09/accurate-genomic-prediction-of-human.html

http://infoproc.blogspot.com/2017/11/23andme.html
I'm in Mountain View to give a talk at 23andMe. Their latest funding round was $250M on a (reported) valuation of $1.5B. If I just add up the Crunchbase numbers it looks like almost half a billion invested at this point...

Slides: Genomic Prediction of Complex Traits

Here's how people + robots handle your spit sample to produce a SNP genotype:

https://drive.google.com/file/d/1e_zuIPJr1hgQupYAxkcbgEVxmrDHAYRj/view
study  bio  preprint  GWAS  state-of-art  embodied  genetics  genomics  compressed-sensing  high-dimension  machine-learning  missing-heritability  hsu  scitariat  education  🌞  frontier  britain  regression  data  visualization  correlation  phase-transition  multi  commentary  summary  pdf  slides  brands  skunkworks  hard-tech  presentation  talks  methodology  intricacy  bioinformatics  scaling-up  stat-power  sparsity  norms  nibble  speedometer  stats  linear-models  2017  biodet 
september 2017 by nhaliday
Econometric Modeling as Junk Science
The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics: https://www.aeaweb.org/articles?id=10.1257/jep.24.2.3

On data, experiments, incentives and highly unconvincing research – papers and hot beverages: https://papersandhotbeverages.wordpress.com/2015/10/31/on-data-experiments-incentives-and-highly-unconvincing-research/
In my view, it has just to do with the fact that academia is a peer monitored organization. In the case of (bad) data collection papers, issues related to measurement are typically boring. They are relegated to appendices, no one really has an incentive to monitor it seriously. The problem is similar in formal theory: no one really goes through the algebra in detail, but it is in principle feasible to do it, and, actually, sometimes these errors are detected. If discussing the algebra of a proof is almost unthinkable in a seminar, going into the details of data collection, measurement and aggregation is not only hard to imagine, but probably intrinsically infeasible.

Something different happens for the experimentalist people. As I was saying, I feel we have come to a point in which many papers are evaluated based on the cleverness and originality of the research design (“Using the World Cup qualifiers as an instrument for patriotism!? Woaw! how cool/crazy is that! I wish I had had that idea”). The sexiness of the identification strategy has too often become a goal in itself. When your peers monitor you paying more attention to the originality of the identification strategy than to the research question, you probably have an incentive to mine reality for ever crazier discontinuities. It is true methodologists have been criticized in the past for analogous reasons, such as being guided by the desire to increase mathematical complexity without a clear benefit. But, if you work with pure formal theory or statistical theory, your work is not meant to immediately answer question about the real world, but instead to serve other researchers in their quest. This is something that can, in general, not be said of applied CI work.

https://twitter.com/pseudoerasmus/status/662007951415238656
This post should have been entitled “Zombies who only think of their next cool IV fix”
https://twitter.com/pseudoerasmus/status/662692917069422592
massive lust for quasi-natural experiments, regression discontinuities
barely matters if the effects are not all that big
I suppose even the best of things must reach their decadent phase; methodological innov. to manias……

https://twitter.com/cblatts/status/920988530788130816
Following this "collapse of small-N social psych results" business, where do I predict econ will collapse? I see two main contenders.
One is lab studies. I dallied with these a few years ago in a Kenya lab. We ran several pilots of N=200 to figure out the best way to treat
and to measure the outcome. Every pilot gave us a different stat sig result. I could have written six papers concluding different things.
I gave up more skeptical of these lab studies than ever before. The second contender is the long run impacts literature in economic history
We should be very suspicious since we never see a paper showing that a historical event had no effect on modern day institutions or dvpt.
On the one hand I find these studies fun, fascinating, and probably true in a broad sense. They usually reinforce a widely believed history
argument with interesting data and a cute empirical strategy. But I don't think anyone believes the standard errors. There's probably a HUGE
problem of nonsignificant results staying in the file drawer. Also, there are probably data problems that don't get revealed, as we see with
the recent Piketty paper (http://marginalrevolution.com/marginalrevolution/2017/10/pikettys-data-reliable.html). So I take that literature with a vat of salt, even if I enjoy and admire the works
I used to think field experiments would show little consistency in results across place. That external validity concerns would be fatal.
In fact the results across different samples and places have proven surprisingly similar across places, and added a lot to general theory
Last, I've come to believe there is no such thing as a useful instrumental variable. The ones that actually meet the exclusion restriction
are so weird & particular that the local treatment effect is likely far different from the average treatment effect in non-transparent ways.
Most of the other IVs don't plausibly meet the exclusion restriction. I mean, we should be concerned when the IV estimate is always 10x
larger than the OLS coefficient. Thus I find myself much more persuaded by simple natural experiments that use OLS, diff in diff, or
discontinuities, alongside randomized trials.

What do others think are the cliffs in economics?
PS All of these apply to political science too. Though I have a special extra target in poli sci: survey experiments! A few are good. I like
Dan Corstange's work. But it feels like 60% of dissertations these days are experiments buried in a survey instrument that measure small
changes in response. These at least have large N. But these are just uncontrolled labs, with negligible external validity in my mind.
The good ones are good. This method has its uses. But it's being way over-applied. More people have to make big and risky investments in big
natural and field experiments. Time to raise expectations and ambitions. This expectation bar, not technical ability, is the big advantage
economists have over political scientists when they compete in the same space.
(Ok. So are there any friends and colleagues I haven't insulted this morning? Let me know and I'll try my best to fix it with a screed)

HOW MUCH SHOULD WE TRUST DIFFERENCES-IN-DIFFERENCES ESTIMATES?∗: https://economics.mit.edu/files/750
Most papers that employ Differences-in-Differences estimation (DD) use many years of data and focus on serially correlated outcomes but ignore that the resulting standard errors are inconsistent. To illustrate the severity of this issue, we randomly generate placebo laws in state-level data on female wages from the Current Population Survey. For each law, we use OLS to compute the DD estimate of its “effect” as well as the standard error of this estimate. These conventional DD standard errors severely understate the standard deviation of the estimators: we find an “effect” significant at the 5 percent level for up to 45 percent of the placebo interventions. We use Monte Carlo simulations to investigate how well existing methods help solve this problem. Econometric corrections that place a specific parametric form on the time-series process do not perform well. Bootstrap (taking into account the auto-correlation of the data) works well when the number of states is large enough. Two corrections based on asymptotic approximation of the variance-covariance matrix work well for moderate numbers of states and one correction that collapses the time series information into a “pre” and “post” period and explicitly takes into account the effective sample size works well even for small numbers of states.
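The paper's placebo exercise can be reproduced in miniature. Below is a toy simulation of my own (not the authors' CPS analysis): AR(1) state-level outcomes, a placebo "law" for half the states, and a naive DD standard error that treats every state-year as independent. The over-rejection it shows is qualitatively the paper's point; the 45% figure comes from real wage data with stronger persistence.

```python
import math
import random

def dd_rejection_rate(n_states=50, n_years=20, rho=0.8,
                      n_sims=300, z_crit=1.96, seed=0):
    """Share of placebo 'laws' declared significant at the 5% level when
    outcomes follow an AR(1) within state but the DD standard error
    naively treats every state-year observation as independent."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        treated = set(range(n_states // 2))           # placebo-law states
        cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
        for s in range(n_states):
            e = rng.gauss(0.0, 1.0)                   # stationary AR(1) start
            for t in range(n_years):
                e = rho * e + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
                g = 1 if s in treated else 0
                p = 1 if t >= n_years // 2 else 0     # law "passes" mid-panel
                cells[g, p].append(e)
        means = {k: sum(v) / len(v) for k, v in cells.items()}
        dd = (means[1, 1] - means[1, 0]) - (means[0, 1] - means[0, 0])
        # naive iid variance of the difference-in-means
        var = sum(sum((x - means[k]) ** 2 for x in v) / (len(v) - 1) / len(v)
                  for k, v in cells.items())
        if abs(dd) / math.sqrt(var) > z_crit:
            rejections += 1
    return rejections / n_sims

print(dd_rejection_rate(rho=0.8))  # far above the nominal 5%
print(dd_rejection_rate(rho=0.0))  # close to 5% when errors really are iid
```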

‘METRICS MONDAY: 2SLS–CHRONICLE OF A DEATH FORETOLD: http://marcfbellemare.com/wordpress/12733
As it turns out, Young finds that
1. Conventional tests tend to overreject the null hypothesis that the 2SLS coefficient is equal to zero.
2. 2SLS estimates are falsely declared significant one third to one half of the time, depending on the method used for bootstrapping.
3. The 99-percent confidence intervals (CIs) of those 2SLS estimates include the OLS point estimate over 90 percent of the time. They include the full OLS 99-percent CI over 75 percent of the time.
4. 2SLS estimates are extremely sensitive to outliers. Removing just one outlying cluster or observation makes almost half of 2SLS results insignificant. Things get worse when removing two outlying clusters or observations, as over 60 percent of 2SLS results then become insignificant.
5. Using a Durbin-Wu-Hausman test, less than 15 percent of regressions can reject the null that OLS estimates are unbiased at the 1-percent level.
6. 2SLS has considerably higher mean squared error than OLS.
7. In one third to one half of published results, the null that the IVs are totally irrelevant cannot be rejected, and so the correlation between the endogenous variable(s) and the IVs is due to finite sample correlation between them.
8. Finally, fewer than 10 percent of 2SLS estimates reject instrument irrelevance and the absence of OLS bias at the 1-percent level using a Durbin-Wu-Hausman test. It gets much worse–fewer than 5 percent–if you add in the requirement that the 2SLS CI excludes the OLS estimate.
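A toy simulation (mine, not Young's) shows the mechanics behind several of these points. A just-identified IV estimate is cov(z, y) / cov(z, x); it removes the endogeneity bias OLS suffers, but with a weak first stage its sampling noise explodes, which is how outlier-sensitive, always-larger-than-OLS estimates arise.

```python
import math
import random

def ols_and_iv(n, pi, rho, beta=1.0, rng=random):
    """One simulated sample of y = beta*x + e with an endogenous x
    (corr(u, e) = rho) and an instrument z of first-stage strength pi.
    Returns (OLS, IV) estimates of beta; IV is just-identified 2SLS,
    i.e. cov(z, y) / cov(z, x)."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    e = [rho * ui + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for ui in u]
    x = [pi * zi + ui for zi, ui in zip(z, u)]      # first stage
    y = [beta * xi + ei for xi, ei in zip(x, e)]    # structural equation

    def cov(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)

    return cov(x, y) / cov(x, x), cov(z, y) / cov(z, x)

random.seed(0)
strong = [ols_and_iv(1000, 1.0, 0.8) for _ in range(100)]
weak = [ols_and_iv(1000, 0.05, 0.8) for _ in range(100)]
# OLS is biased upward (~1.4 here, true beta = 1); a strong IV recovers
# beta; a weak IV produces wildly dispersed estimates
```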

Methods Matter: P-Hacking and Causal Inference in Economics*: http://ftp.iza.org/dp11796.pdf
Applying multiple methods to 13,440 hypothesis tests reported in 25 top economics journals in 2015, we show that selective publication and p-hacking is a substantial problem in research employing DID and (in particular) IV. RCT and RDD are much less problematic. Almost 25% of claims of marginally significant results in IV papers are misleading.

https://twitter.com/NoamJStein/status/1040887307568664577
Ever since I learned social science is completely fake, I've had a lot more time to do stuff that matters, like deadlifting and reading about Mediterranean haplogroups
--
Wait, so, from fakest to realest IV>DD>RCT>RDD? That totally matches my impression.

https://twitter.com/wwwojtekk/status/1190731344336293889
https://archive.is/EZu0h
Great (not completely new but still good to have it in one place) discussion of RCTs and inference in economics by Deaton, my favorite sentences (more general than just about RCT) below
Randomization in the tropics revisited: a theme and eleven variations: https://scholar.princeton.edu/sites/default/files/deaton/files/deaton_randomization_revisited_v3_2019.pdf
org:junk  org:edu  economics  econometrics  methodology  realness  truth  science  social-science  accuracy  generalization  essay  article  hmm  multi  study  🎩  empirical  causation  error  critique  sociology  criminology  hypothesis-testing  econotariat  broad-econ  cliometrics  endo-exo  replication  incentives  academia  measurement  wire-guided  intricacy  twitter  social  discussion  pseudoE  effect-size  reflection  field-study  stat-power  piketty  marginal-rev  commentary  data-science  expert-experience  regression  gotchas  rant  map-territory  pdf  simulation  moments  confidence  bias-variance  stats  endogenous-exogenous  control  meta:science  meta-analysis  outliers  summary  sampling  ensembles  monte-carlo  theory-practice  applicability-prereqs  chart  comparison  shift  ratty  unaffiliated  garett-jones 
june 2017 by nhaliday
Why we should love null results – The 100% CI
https://twitter.com/StuartJRitchie/status/870257682233659392
This is a must-read blog for many reasons, but biggest is: it REALLY matters if a hypothesis is likely to be true.
Strikes me that the areas of psychology with the most absurd hypotheses (ones least likely to be true) *AHEMSOCIALPRIMINGAHEM* are also...
...the ones with extremely small sample sizes. So this already-scary graph from the blogpost becomes all the more terrifying:
scitariat  explanation  science  hypothesis-testing  methodology  null-result  multi  albion  twitter  social  commentary  psychology  social-psych  social-science  meta:science  data  visualization  nitty-gritty  stat-power  priors-posteriors 
june 2017 by nhaliday
Educational Romanticism & Economic Development | pseudoerasmus
https://twitter.com/GarettJones/status/852339296358940672
deleted

https://twitter.com/GarettJones/status/943238170312929280
https://archive.is/p5hRA

Did Nations that Boosted Education Grow Faster?: http://econlog.econlib.org/archives/2012/10/did_nations_tha.html
On average, no relationship. The trendline points down slightly, but for the time being let's just call it a draw. It's a well-known fact that countries that started the 1960's with high education levels grew faster (example), but this graph is about something different. This graph shows that countries that increased their education levels did not grow faster.

Where has all the education gone?: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1016.2704&rep=rep1&type=pdf

https://twitter.com/GarettJones/status/948052794681966593
https://archive.is/kjxqp

https://twitter.com/GarettJones/status/950952412503822337
https://archive.is/3YPic

https://twitter.com/pseudoerasmus/status/862961420065001472
http://hanushek.stanford.edu/publications/schooling-educational-achievement-and-latin-american-growth-puzzle

The Case Against Education: What's Taking So Long, Bryan Caplan: http://econlog.econlib.org/archives/2015/03/the_case_agains_9.html

The World Might Be Better Off Without College for Everyone: https://www.theatlantic.com/magazine/archive/2018/01/whats-college-good-for/546590/
Students don't seem to be getting much out of higher education.
- Bryan Caplan

College: Capital or Signal?: http://www.economicmanblog.com/2017/02/25/college-capital-or-signal/
After his review of the literature, Caplan concludes that roughly 80% of the earnings effect from college comes from signalling, with only 20% the result of skill building. Put this together with his earlier observations about the private returns to college education, along with its exploding cost, and Caplan thinks that the social returns are negative. The policy implications of this will come as very bitter medicine for friends of Bernie Sanders.

Doubting the Null Hypothesis: http://www.arnoldkling.com/blog/doubting-the-null-hypothesis/

Is higher education/college in the US more about skill-building or about signaling?: https://www.quora.com/Is-higher-education-college-in-the-US-more-about-skill-building-or-about-signaling
ballpark: 50% signaling, 30% selection, 20% addition to human capital
more signaling in art history, more human capital in engineering, more selection in philosophy

Econ Duel! Is Education Signaling or Skill Building?: http://marginalrevolution.com/marginalrevolution/2016/03/econ-duel-is-education-signaling-or-skill-building.html
Marginal Revolution University has a brand new feature, Econ Duel! Our first Econ Duel features Tyler and me debating the question, Is education more about signaling or skill building?

Against Tulip Subsidies: https://slatestarcodex.com/2015/06/06/against-tulip-subsidies/

https://www.overcomingbias.com/2018/01/read-the-case-against-education.html

https://nintil.com/2018/02/05/notes-on-the-case-against-education/

https://www.nationalreview.com/magazine/2018-02-19-0000/bryan-caplan-case-against-education-review

https://spottedtoad.wordpress.com/2018/02/12/the-case-against-education/
Most American public school kids are low-income; about half are non-white; most are fairly low skilled academically. For most American kids, the majority of the waking hours they spend not engaged with electronic media are at school; the majority of their in-person relationships are at school; the most important relationships they have with an adult who is not their parent is with their teacher. For their parents, the most important in-person source of community is also their kids’ school. Young people need adult mirrors, models, mentors, and in an earlier era these might have been provided by extended families, but in our own era this all falls upon schools.

Caplan gestures towards work and earlier labor force participation as alternatives to school for many if not all kids. And I empathize: the years that I would point to as making me who I am were ones where I was working, not studying. But they were years spent working in schools, as a teacher or assistant. If schools did not exist, is there an alternative that we genuinely believe would arise to draw young people into the life of their community?

...

It is not an accident that the state that spends the least on education is Utah, where the LDS church can take up some of the slack for schools, while next door Wyoming spends almost the most of any state at $16,000 per student. Education is now the one surviving binding principle of the society as a whole, the one black box everyone will agree to, and so while you can press for less subsidization of education by government, and for privatization of costs, as Caplan does, there’s really nothing people can substitute for it. This is partially about signaling, sure, but it’s also because outside of schools and a few religious enclaves our society is but a darkling plain beset by winds.

This doesn’t mean that we should leave Caplan’s critique on the shelf. Much of education is focused on an insane, zero-sum race for finite rewards. Much of schooling does push kids, parents, schools, and school systems towards a solution ad absurdum, where anything less than 100 percent of kids headed to a doctorate and the big coding job in the sky is a sign of failure of everyone concerned.

But let’s approach this with an eye towards the limits of the possible and the reality of diminishing returns.

https://westhunt.wordpress.com/2018/01/27/poison-ivy-halls/
https://westhunt.wordpress.com/2018/01/27/poison-ivy-halls/#comment-101293
The real reason the left would support Moander: the usual reason. because he’s an enemy.

https://westhunt.wordpress.com/2018/02/01/bright-college-days-part-i/
I have a problem in thinking about education, since my preferences and personal educational experience are atypical, so I can’t just gut it out. On the other hand, knowing that puts me ahead of a lot of people that seem convinced that all real people, including all Arab cabdrivers, think and feel just as they do.

One important fact, relevant to this review. I don’t like Caplan. I think he doesn’t understand – can’t understand – human nature, and although that sometimes confers a different and interesting perspective, it’s not a royal road to truth. Nor would I want to share a foxhole with him: I don’t trust him. So if I say that I agree with some parts of this book, you should believe me.

...

Caplan doesn’t talk about possible ways of improving knowledge acquisition and retention. Maybe he thinks that’s impossible, and he may be right, at least within a conventional universe of possibilities. That’s a bit outside of his thesis, anyhow. Me it interests.

He dismisses objections from educational psychologists who claim that studying a subject improves you in subtle ways even after you forget all of it. I too find that hard to believe. On the other hand, it looks to me as if poorly-digested fragments of information picked up in college have some effect on public policy later in life: it is no coincidence that most prominent people in public life (at a given moment) share a lot of the same ideas. People are vaguely remembering the same crap from the same sources, or related sources. It’s correlated crap, which has a much stronger effect than random crap.

These widespread new ideas are usually wrong. They come from somewhere – in part, from higher education. Along this line, Caplan thinks that college has only a weak ideological effect on students. I don’t believe he is correct. In part, this is because most people use a shifting standard: what’s liberal or conservative gets redefined over time. At any given time a population is roughly half left and half right – but the content of those labels changes a lot. There’s a shift.

https://westhunt.wordpress.com/2018/02/01/bright-college-days-part-i/#comment-101492
I put it this way, a while ago: “When you think about it, falsehoods, stupid crap, make the best group identifiers, because anyone might agree with you when you’re obviously right. Signing up to clear nonsense is a better test of group loyalty. A true friend is with you when you’re wrong. Ideally, not just wrong, but barking mad, rolling around in your own vomit wrong.”
--
You just explained the Credo quia absurdum doctrine. I always wondered if it was nonsense. It is not.
--
Someone on twitter caught it first – got all the way to “sliding down the razor blade of life”. Which I explained is now called “transitioning”

What Catholics believe: https://theweek.com/articles/781925/what-catholics-believe
We believe all of these things, fantastical as they may sound, and we believe them for what we consider good reasons, well attested by history, consistent with the most exacting standards of logic. We will profess them in this place of wrath and tears until the extraordinary event referenced above, for which men and women have hoped and prayed for nearly 2,000 years, comes to pass.

https://westhunt.wordpress.com/2018/02/05/bright-college-days-part-ii/
According to Caplan, employers are looking for conformity, conscientiousness, and intelligence. They use completion of high school, or completion of college as a sign of conformity and conscientiousness. College certainly looks as if it’s mostly signaling, and it’s hugely expensive signaling, in terms of college costs and foregone earnings.

But inserting conformity into the merit function is tricky: things become important signals… because they’re important signals. Otherwise useful actions are contraindicated because they’re “not done”. For example, test scores convey useful information. They could help show that an applicant is smart even though he attended a mediocre school – the same role they play in college admissions. But employers seldom request test scores, and although applicants may provide them, few do. Caplan says ” The word on the street… [more]
econotariat  pseudoE  broad-econ  economics  econometrics  growth-econ  education  human-capital  labor  correlation  null-result  world  developing-world  commentary  spearhead  garett-jones  twitter  social  pic  discussion  econ-metrics  rindermann-thompson  causation  endo-exo  biodet  data  chart  knowledge  article  wealth-of-nations  latin-america  study  path-dependence  divergence  🎩  curvature  microfoundations  multi  convexity-curvature  nonlinearity  hanushek  volo-avolo  endogenous-exogenous  backup  pdf  people  policy  monetary-fiscal  wonkish  cracker-econ  news  org:mag  local-global  higher-ed  impetus  signaling  rhetoric  contrarianism  domestication  propaganda  ratty  hanson  books  review  recommendations  distribution  externalities  cost-benefit  summary  natural-experiment  critique  rent-seeking  mobility  supply-demand  intervention  shift  social-choice  government  incentives  interests  q-n-a  street-fighting  objektbuch  X-not-about-Y  marginal-rev  c:***  qra  info-econ  info-dynamics  org:econlib  yvain  ssc  politics  medicine  stories 
april 2017 by nhaliday
WHAT'S TO KNOW ABOUT THE CREDIBILITY OF EMPIRICAL ECONOMICS? - Ioannidis - 2013 - Journal of Economic Surveys - Wiley Online Library
Abstract. The scientific credibility of economics is itself a scientific question that can be addressed with both theoretical speculations and empirical data. In this review, we examine the major parameters that are expected to affect the credibility of empirical economics: sample size, magnitude of pursued effects, number and pre-selection of tested relationships, flexibility and lack of standardization in designs, definitions, outcomes and analyses, financial and other interests and prejudices, and the multiplicity and fragmentation of efforts. We summarize and discuss the empirical evidence on the lack of a robust reproducibility culture in economics and business research, the prevalence of potential publication and other selective reporting biases, and other failures and biases in the market of scientific information. Overall, the credibility of the economics literature is likely to be modest or even low.

The Power of Bias in Economics Research: http://onlinelibrary.wiley.com/doi/10.1111/ecoj.12461/full
We investigate two critical dimensions of the credibility of empirical economics research: statistical power and bias. We survey 159 empirical economics literatures that draw upon 64,076 estimates of economic parameters reported in more than 6,700 empirical studies. Half of the research areas have nearly 90% of their results under-powered. The median statistical power is 18%, or less. A simple weighted average of those reported results that are adequately powered (power ≥ 80%) reveals that nearly 80% of the reported effects in these empirical economics literatures are exaggerated; typically, by a factor of two and with one-third inflated by a factor of four or more.
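The "exaggerated by a factor of two" finding is a direct consequence of low power (Gelman and Carlin's "Type M error"): conditioning on significance filters for overestimates. A quick Monte Carlo illustration of my own:

```python
import random
from statistics import NormalDist

def exaggeration_ratio(true_effect, se, alpha=0.05, n_draws=100_000, seed=0):
    """Among estimates that cross the significance threshold, the average
    reported magnitude divided by the true effect (Type M error)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    sig = []
    for _ in range(n_draws):
        est = true_effect + se * rng.gauss(0.0, 1.0)
        if abs(est) / se > z_crit:      # "statistically significant"
            sig.append(abs(est))
    return (sum(sig) / len(sig)) / true_effect

# true effect equal to one standard error gives ~17-18% power, close to
# the survey's median; significant results then overstate the truth badly
print(round(exaggeration_ratio(1.0, 1.0), 2))
# at high power the significance filter barely distorts
print(round(exaggeration_ratio(1.0, 0.25), 2))
```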

Economics isn't a bogus science — we just don't use it correctly: http://www.latimes.com/opinion/op-ed/la-oe-ioannidis-economics-is-a-science-20171114-story.html
https://archive.is/AU7Xm
study  ioannidis  social-science  meta:science  economics  methodology  critique  replication  bounded-cognition  error  stat-power  🎩  🔬  info-dynamics  piracy  empirical  biases  econometrics  effect-size  network-structure  realness  paying-rent  incentives  academia  multi  evidence-based  news  org:rec  rhetoric  contrarianism  backup  cycles  finance  huge-data-the-biggest  org:local 
january 2017 by nhaliday
Information Processing: What is medicine’s 5 sigma?
I'm not aware of this history you reference, but I am only a recent entrant into this field. On the other hand Ioannidis is both a long time genomics researcher and someone who does meta-research on science, so he should know. He may have even written a paper on this subject -- I seem to recall he had hard numbers on the rate of replication of candidate gene studies and claimed it was in the low percents. BTW, this result shows that the vaunted intuition of biomedical types about "how things really work" in the human body is worth very little. We are much better off, in my opinion, relying on machine learning methods and brute force statistical power than priors based on, e.g., knowledge of biochemical pathways or cartoon models of cell function. (Even though such things are sometimes deemed sufficient to raise ~$100m in biotech investment!) This situation may change in the future but the record from the first decade of the 21st century is there for any serious scholar of the scientific method to study.

Both Ioannidis and I (through separate and independent analyses) feel that modern genomics is a good example of biomedical science that (now) actually works and produces results that replicate with relatively high confidence. It should be a model for other areas ...
hsu  replication  science  medicine  scitariat  meta:science  evidence-based  ioannidis  video  interview  bio  genomics  lens  methodology  thick-thin  candidate-gene  hypothesis-testing  complex-systems  stat-power  bounded-cognition  postmortem  info-dynamics  stats 
november 2016 by nhaliday
Information Processing: Evidence for (very) recent natural selection in humans
height (+), infant head circumference (+), some biomolecular stuff, female hip size (+), male BMI (-), age of menarche (+, !!), and birth weight (+)

Strong selection in the recent past can cause allele frequencies to change significantly. Consider two different SNPs, which today have equal minor allele frequency (for simplicity, let this be equal to one half). Assume that one SNP was subject to strong recent selection, and another (neutral) has had approximately zero effect on fitness. The advantageous version of the first SNP was less common in the far past, and rose in frequency recently (e.g., over the last 2k years). In contrast, the two versions of the neutral SNP have been present in roughly the same proportion (up to fluctuations) for a long time. Consequently, in the total past breeding population (i.e., going back tens of thousands of years) there have been many more copies of the neutral alleles (and the chunks of DNA surrounding them) than of the positively selected allele. Each of the chunks of DNA around the SNPs we are considering is subject to a roughly constant rate of mutation.

Looking at the current population, one would then expect a larger variety of mutations in the DNA region surrounding the neutral allele (both versions) than near the favored selected allele (which was rarer in the population until very recently, and whose surrounding region had fewer chances to accumulate mutations). By comparing the difference in local mutational diversity between the two versions of the neutral allele (should be zero modulo fluctuations, for the case MAF = 0.5), and between the (+) and (-) versions of the selected allele (nonzero, due to relative change in frequency), one obtains a sensitive signal for recent selection. See figure at bottom for more detail. In the paper what I call mutational diversity is measured by looking at distance distribution of singletons, which are rare variants found in only one individual in the sample under study.

The 2,000 year selection of the British: http://www.unz.com/gnxp/the-2000-year-selection-of-the-british/

Detection of human adaptation during the past 2,000 years: http://www.biorxiv.org/content/early/2016/05/07/052084

The key idea is that recent selection distorts the ancestral genealogy of sampled haplotypes at a selected site. In particular, the terminal (tip) branches of the genealogy tend to be shorter for the favored allele than for the disfavored allele, and hence, haplotypes carrying the favored allele will tend to carry fewer singleton mutations (Fig. 1A-C and SOM).

To capture this effect, we use the sum of distances to the nearest singleton in each direction from a test SNP as a summary statistic (Fig. 1D).
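As a toy illustration of that summary statistic, the function below sums the distances from a test SNP to the nearest singleton on each side. The function name, positions, and sample data are hypothetical and only sketch the idea, not the actual SDS software.

```python
import bisect

def singleton_distance_sum(singleton_positions, snp_pos):
    """Sum of distances from snp_pos to the nearest singleton on each side.

    singleton_positions must be sorted; returns None when there is no
    singleton on one side, where the statistic is undefined.
    """
    i = bisect.bisect_left(singleton_positions, snp_pos)
    if i == 0 or i == len(singleton_positions):
        return None
    left = snp_pos - singleton_positions[i - 1]
    right = singleton_positions[i] - snp_pos
    return left + right

# Hypothetical singleton positions (bp) flanking a test SNP at 600:
singletons = [100, 500, 900]
d = singleton_distance_sum(singletons, 600)  # 100 + 300 = 400
```

The interpretation follows the quoted passage: haplotypes carrying a recently favored allele accumulate fewer nearby singletons, so their distance sums tend to be larger than those of haplotypes carrying the disfavored allele.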

Figure 1. Illustration of the SDS method.

Figure 2. Properties of SDS.

Based on a recent model of European demography [25], we estimate that the mean tip length for a neutral sample of 3,000 individuals is 75 generations, or roughly 2,000 years (Fig. 2A). Since SDS aims to measure changes in tip lengths of the genealogy, we conjectured that it would be most likely to detect selection approximately within this timeframe.

Indeed, in simulated sweep models with samples of 3,000 individuals (Fig. 2B,C and fig. S2), we find that SDS focuses specifically on very recent time scales, and has equal power for hard and soft sweeps within this timeframe. At individual loci, SDS is powered to detect ~2% selection over 100 generations. Moreover, SDS has essentially no power to detect older selection events that stopped >100 generations before the present. In contrast, a commonly-used test for hard sweeps, iHS [12], integrates signal over much longer timescales (>1,000 generations), has no specificity to the more recent history, and has essentially no power for the soft sweep scenarios.

Catching evolution in the act with the Singleton Density Score: http://www.molecularecologist.com/2016/05/catching-evolution-in-the-act-with-the-singleton-density-score/
The Singleton Density Score (SDS) is a measure based on the idea that changes in allele frequencies induced by recent selection can be observed in a sample’s genealogy as differences in the branch length distribution.

You don’t need a weatherman: https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/
You can do a million cool things with this method. Since the effective time scale goes inversely with sample size, you could look at evolution in England over the past 1000 years or the past 500. Differencing, over the period 1-1000 AD. Since you can look at polygenic traits, you can see whether the alleles favoring higher IQs have increased or decreased in frequency over various stretches of time. You can see if Greg Clark’s proposed mechanism really happened. You can (soon) tell if creeping Pinkerization is genetic, or partly genetic.

You could probably find out if the Middle Easterners really have gotten slower, and when it happened.

Looking at IQ alleles, you could not only show whether the Ashkenazi Jews really are biologically smarter but if so, when it happened, which would give you strong hints as to how it happened.

We know that IQ-favoring alleles are going down (slowly) right now (not counting immigration, which of course drastically speeds it up). Soon we will know if this was true while Russia was under the Mongol yoke – we’ll know how smart Periclean Athenians were and when that boost occurred. And so on. And on!

...

“The pace has been so rapid that humans have changed significantly in body and mind over recorded history.”

bicameral mind: https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/#comment-78934

https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/#comment-78939
Chinese, Koreans, Japanese and Ashkenazi Jews all have high levels of myopia. Australian Aborigines have almost none, I think.

https://westhunt.wordpress.com/2016/05/08/you-dont-need-a-weatherman/#comment-79094
I expect that the fall of all great empires is based on long term dysgenic trends. There is no logical reason why so many empires and civilizations throughout history could grow so big and then not simply keep growing, except for dysgenics.
--
I can think of about twenty other possible explanations off the top of my head, but dysgenics is a possible cause.
--
I agree with DataExplorer. The largest factor in the decay of civilizations is dysgenics. The discussion by R. A. Fisher 1930 p. 193 is very cogent on this matter. Soon we will know for sure.
--
Sometimes it can be rapid. Assume that the upper classes are mostly urban, and somewhat sharper than average. Then the Mongols arrive.
sapiens  study  genetics  evolution  hsu  trends  data  visualization  recent-selection  methodology  summary  GWAS  2016  scitariat  britain  commentary  embodied  biodet  todo  control  multi  gnxp  pop-diff  stat-power  mutation  hypothesis-testing  stats  age-generation  QTL  gene-drift  comparison  marginal  aDNA  simulation  trees  time  metrics  density  measurement  conquest-empire  pinker  population-genetics  aphorism  simler  dennett  👽  the-classics  iron-age  mediterranean  volo-avolo  alien-character  russia  medieval  spearhead  gregory-clark  bio  preprint  domestication  MENA  iq  islam  history  poast  west-hunter  scale  behavioral-gen  gotchas  cost-benefit  genomics  bioinformatics  stylized-facts  concept  levers  🌞  pop-structure  nibble  explanation  ideas  usa  dysgenics  list  applicability-prereqs  cohesion  judaism  visuo  correlation  china  asia  japan  korea  civilization  gibbon  rot  roots  fisher  giants  books  old-anglo  selection  agri-mindset  hari-seldon 
august 2016 by nhaliday
