probability   10443


Science’s Inference Problem: When Data Doesn’t Mean What We Think It Does - The New York Times
Persi Diaconis and Brian Skyrms emphasize another possible cause of the so-called replication crisis: the tendency, even among “working scientists,” to equate probability with frequency. Frequency is a measure of how often a certain event occurs; it concerns facts about the empirical world. Probability is a measure of rational degree of belief; it concerns how strongly we should expect a certain event to occur. Linking frequency and probability is hardly an error. (Indeed, the notion that in large enough numbers frequencies can approximate probabilities is Diaconis and Skyrms’s fourth “great idea” about chance.) But failing to distinguish the two concepts when testing hypotheses, they warn, “can have pernicious effects.”

Consider statistical significance, a standard scientists often use to judge the worth of their findings. The goal of an experiment is to make an inductive inference: to determine how confident you should be in a hypothesis, given the data. You suspect a coin is weighted (the hypothesis), so you flip it five times and it comes up heads each time (the data); what is the likelihood that your hypothesis is correct? A notable feature of the methodology of statistical significance is that it does not directly pose this question. To determine statistical significance, you ask something more roundabout: What is the probability of getting the same data as a result of random “noise”? That is, what are the odds of getting five heads in a row assuming the coin is not weighted? If that figure is small enough — less than 5 percent is a commonly used threshold — your finding is judged statistically significant. Since the chance of flipping five heads in a row with a fair coin is only about 3 percent, you have cleared the bar.
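The 3 percent figure above can be checked directly. A minimal sketch in Python, showing the exact fair-coin calculation alongside a simulation (the simulation and its trial count are illustrative additions, not from the article):

```python
import random

# Under the null hypothesis (a fair coin), each flip is an independent
# 1/2 chance of heads, so the exact probability of 5 heads in a row is:
p_value = 0.5 ** 5
print(f"exact p-value: {p_value:.5f}")  # 0.03125, i.e. about 3 percent

# Sanity-check by simulation: flip a fair coin 5 times, many times over,
# and count how often all 5 come up heads.
random.seed(0)
trials = 100_000
all_heads = sum(
    all(random.random() < 0.5 for _ in range(5)) for _ in range(trials)
)
print(f"simulated: {all_heads / trials:.5f}")  # close to 0.03125

# Below the commonly used 5 percent threshold, so the finding is
# judged "statistically significant".
print(p_value < 0.05)  # True
```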

But what have you found? Diaconis and Skyrms caution that if you are not careful, you can fall prey to a kind of bait-and-switch. You may think you are learning the probability of your hypothesis (the claim that the coin is weighted), given the frequency of heads. But in fact you are learning the probability of the frequency of heads, given the so-called null hypothesis (the assumption there is nothing amiss with the coin). The former is the inductive inference you were looking to make; the latter is a deductive inference that, while helpful in indicating how improbable your data are, does not directly address your hypothesis. Flipping five heads in a row gives some evidence the coin is weighted, but it hardly amounts to a discovery that it is. Because too many scientists rely on the “mechanical” use of this technique, Diaconis and Skyrms argue, they fail to appreciate what they have — and have not — found, thereby fostering the publication of weak results.
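The bait-and-switch can be made concrete with Bayes’ theorem. A small sketch, using assumed numbers that are not in the article (a hypothetical weighted coin that lands heads 80 percent of the time, and a 10 percent prior that the coin is weighted):

```python
# P(data | null hypothesis): probability of 5 heads if the coin is fair.
# This is the quantity statistical significance reports.
p_data_given_fair = 0.5 ** 5        # 0.03125

# Assumed, for illustration only: a weighted coin lands heads 80% of the
# time, and before flipping we give "weighted" a 10% prior probability.
p_data_given_weighted = 0.8 ** 5    # 0.32768
prior_weighted = 0.10

# P(hypothesis | data) via Bayes' theorem -- the inductive inference we
# actually wanted, which the significance test does not directly supply.
posterior = (prior_weighted * p_data_given_weighted) / (
    prior_weighted * p_data_given_weighted
    + (1 - prior_weighted) * p_data_given_fair
)
print(f"P(data | fair coin)     = {p_data_given_fair:.4f}")
print(f"P(weighted coin | data) = {posterior:.4f}")  # about 0.54
```

Under these assumed numbers, data significant at the 3 percent level leave the probability that the coin is weighted at only about 54 percent: some evidence that the coin is weighted, but, as the reviewers put it, hardly a discovery that it is.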
probability  frequency  statisticalsignificance 
yesterday by johndodds
Some good "Statistics for programmers" resources
This post is basically a list of books & other resources that teach statistics using programming.
statistics  programming  learning  hypothesis-testing  confidence-intervals  probability  t-tests  normal-distribution  bootstrapping 
5 days ago by rishaanp
Cut The Knot
Website of Alexander Bogomolny.
Math puzzles and philosophy.
Followed by NNT.
math  mathematics  philosophy  puzzles  probability 
5 days ago by drmeme

