nhaliday + bayesian   79

How many laypeople holding a popular opinion are needed to counter an expert opinion?: Thinking & Reasoning: Vol 0, No 0
Although lay opinions and expert opinions have been studied extensively in isolation, the present study examined the relationship between the two by asking how many laypeople are needed to counter an expert opinion. A Bayesian formalisation allowed the prescription of this quantity. Participants were subsequently asked to assess how many laypeople are needed in different situations. The results demonstrate that people are sensitive to the relevant factors identified for determining how many lay opinions are required to counteract a single expert opinion. People's assessments were fairly well in line with Bayesian predictions.
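The paper's formalisation isn't reproduced in the abstract, but the core Bayesian move can be sketched: treat each opinion as conditionally independent evidence with its own likelihood ratio, and solve for the number of lay opinions whose combined evidence matches one expert's. A minimal sketch with hypothetical likelihood ratios (the paper's actual values and parameterisation may differ):

```python
import math

# Hypothetical likelihood ratios P(opinion | H) / P(opinion | not-H);
# chosen for illustration, not taken from the paper.
lr_expert = 9.0   # one expert opinion: strong evidence
lr_lay = 1.5      # one lay opinion: weak evidence

# Under conditional independence, n lay opinions counter one expert
# opinion once lr_lay ** n >= lr_expert.
n = math.ceil(math.log(lr_expert) / math.log(lr_lay))
print(n)  # -> 6 lay opinions under these assumed ratios
```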
study  psychology  social-psych  learning  rationality  epistemic  info-foraging  info-dynamics  expert  bayesian  neurons  expert-experience  decision-making  reason 
october 2017 by nhaliday
trees are harlequins, words are harlequins — bayes: a kinda-sorta masterpost
lol, gwern: https://www.reddit.com/r/slatestarcodex/comments/6ghsxf/biweekly_rational_feed/diqr0rq/
> What sort of person thinks “oh yeah, my beliefs about these coefficients correspond to a Gaussian with variance 2.5”? And what if I do cross-validation, like I always do, and find that variance 200 works better for the problem? Was the other person wrong? But how could they have known?
> ...Even ignoring the mode vs. mean issue, I have never met anyone who could tell whether their beliefs were normally distributed vs. Laplace distributed. Have you?
I must have spent too much time in Bayesland because both those strike me as very easy and I often think them! My beliefs usually are Laplace distributed when it comes to things like genetics (it makes me very sad to see GWASes with flat priors), and my Gaussian coefficients actually have a variance of 0.70 (assuming standardized variables w.l.o.g.) as is consistent with field-wide meta-analyses indicating that d>1 is pretty rare.
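One reason such priors are less exotic than the quoted critique suggests: under MAP estimation a Gaussian prior on standardized coefficients is ridge (L2) regression and a Laplace prior is the lasso (L1), so a prior variance is just a regularization strength that can itself be cross-validated. A minimal sketch assuming scikit-learn; the numbers are illustrative, not gwern's:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
beta = np.zeros(10)
beta[:3] = [2.0, -1.0, 0.5]          # sparse truth, Laplace-flavored
y = X @ beta + rng.standard_normal(200)

sigma2 = 1.0   # noise variance, assumed known here
tau2 = 0.70    # prior variance on standardized coefficients, as in the post
ridge = Ridge(alpha=sigma2 / tau2).fit(X, y)  # MAP under a Gaussian prior
lasso = Lasso(alpha=0.05).fit(X, y)           # MAP under a Laplace prior
print(ridge.coef_.round(2))  # shrinks everything a little
print(lasso.coef_.round(2))  # zeros out the small coefficients
```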
ratty  ssc  core-rats  tumblr  social  explanation  init  philosophy  bayesian  thinking  probability  stats  frequentist  big-yud  lesswrong  synchrony  similarity  critique  intricacy  shalizi  scitariat  selection  mutation  evolution  priors-posteriors  regularization  bias-variance  gwern  reddit  commentary  GWAS  genetics  regression  spock  nitty-gritty  generalization  epistemic  🤖  rationality  poast  multi  best-practices  methodology  data-science 
august 2017 by nhaliday
Stat 260/CS 294: Bayesian Modeling and Inference
Topics
- Priors (conjugate, noninformative, reference); a conjugate-update sketch follows this list
- Hierarchical models, spatial models, longitudinal models, dynamic models, survival models
- Testing
- Model choice
- Inference (importance sampling, MCMC, sequential Monte Carlo)
- Nonparametric models (Dirichlet processes, Gaussian processes, neutral-to-the-right processes, completely random measures)
- Decision theory and frequentist perspectives (complete class theorems, consistency, empirical Bayes)
- Experimental design
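To make the first topic concrete, here is a minimal conjugate-prior update (Beta prior, binomial likelihood); it is a generic textbook example, not taken from the course materials:

```python
import numpy as np  # only needed for posterior sampling; the update is closed-form

# Beta(a, b) prior on a coin's heads probability, binomial likelihood.
# Conjugacy: the posterior is Beta(a + heads, b + tails), no integration needed.
a, b = 2.0, 2.0        # weakly informative prior centered at 0.5
heads, tails = 7, 3    # observed data

a_post, b_post = a + heads, b + tails
posterior_mean = a_post / (a_post + b_post)
print(posterior_mean)  # 9/14 ~= 0.643, pulled toward the prior's 0.5

draws = np.random.default_rng(0).beta(a_post, b_post, 10_000)  # posterior samples
print(draws.mean())
```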
unit  course  berkeley  expert  michael-jordan  machine-learning  acm  bayesian  probability  stats  lecture-notes  priors-posteriors  markov  monte-carlo  frequentist  latent-variables  decision-theory  expert-experience  confidence  sampling 
july 2017 by nhaliday
Unsupervised learning, one notion or many? – Off the convex path
(Task A) Learning a distribution from samples. (Examples: Gaussian mixtures, topic models, variational autoencoders, ...)

(Task B) Understanding latent structure in the data. This is not the same as (Task A); for example, principal component analysis, clustering, manifold learning, etc. identify latent structure but don’t learn a distribution per se.

(Task C) Feature Learning. Learn a mapping from datapoint → feature vector such that classification tasks are easier to carry out on feature vectors rather than datapoints. For example, unsupervised feature learning could help lower the amount of labeled samples needed for learning a classifier, or be useful for domain adaptation.

Task B is often a subcase of Task C, as the intended users of “structure found in data” are humans (scientists) who pore over the representation of data to gain some intuition about its properties, and these “properties” can often be phrased as a classification task.

This post explains the relationship between Tasks A and C, and why they get mixed up in students’ minds. We hope there is also some food for thought here for experts, namely, our discussion about the fragility of the usual “perplexity” definition of unsupervised learning. It explains why Task A doesn’t in practice lead to a good enough solution for Task C. For example, it has been believed for many years that for deep learning, unsupervised pretraining should help supervised training, but this has been hard to show in practice.
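A toy illustration of the A-versus-C distinction, assuming scikit-learn and synthetic data (nothing here is from the post itself): fit a distribution (Task A) with a Gaussian mixture, then reuse its per-component responsibilities as features for a classifier (Task C).

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two blobs; the labels are withheld from the unsupervised stage.
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# Task A: learn a distribution over the data.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Task C: use the learned responsibilities as a feature map.
features = gmm.predict_proba(X)
clf = LogisticRegression().fit(features, y)
print(clf.score(features, y))  # near 1.0 on this easy synthetic problem
```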
acmtariat  org:bleg  nibble  machine-learning  acm  thinking  clarity  unsupervised  conceptual-vocab  concept  explanation  features  bayesian  off-convex  deep-learning  latent-variables  generative  intricacy  distribution  sampling  grokkability-clarity  org:popup 
june 2017 by nhaliday
[1705.03394] That is not dead which can eternal lie: the aestivation hypothesis for resolving Fermi's paradox
If a civilization wants to maximize computation it appears rational to aestivate until the far future in order to exploit the low temperature environment: this can produce a 10^30 multiplier of achievable computation. We hence suggest the "aestivation hypothesis": the reason we are not observing manifestations of alien civilizations is that they are currently (mostly) inactive, patiently waiting for future cosmic eras. This paper analyzes the assumptions going into the hypothesis and how physical law and observational evidence constrain the motivations of aliens compatible with the hypothesis.

http://aleph.se/andart2/space/the-aestivation-hypothesis-popular-outline-and-faq/

simpler explanation (just different math for Drake equation):
Dissolving the Fermi Paradox: http://www.jodrellbank.manchester.ac.uk/media/eps/jodrell-bank-centre-for-astrophysics/news-and-events/2017/uksrn-slides/Anders-Sandberg---Dissolving-Fermi-Paradox-UKSRN.pdf
http://marginalrevolution.com/marginalrevolution/2017/07/fermi-paradox-resolved.html
Overall the argument is that point estimates should not be shoved into a Drake equation and then multiplied by each other, as that requires excess certainty and masks much of the ambiguity of our knowledge about the distributions. Instead, a Bayesian approach should be used, after which the fate of humanity looks much better.
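The gist in code: push log-uniform uncertainty through the Drake factors instead of multiplying point estimates, and a large probability mass lands on a nearly empty galaxy. A minimal Monte Carlo sketch with made-up parameter ranges, not the slides' numbers:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

def log_uniform(lo, hi, size):
    # Draws spread uniformly over orders of magnitude.
    return 10 ** rng.uniform(np.log10(lo), np.log10(hi), size)

# Illustrative ranges; the real argument uses literature-derived distributions.
R  = log_uniform(1, 100, n)      # star formation rate
fp = log_uniform(0.1, 1, n)      # fraction of stars with planets
ne = log_uniform(0.1, 10, n)     # habitable planets per system
fl = log_uniform(1e-30, 1, n)    # abiogenesis probability: huge uncertainty
fi = log_uniform(1e-3, 1, n)     # intelligence
fc = log_uniform(1e-2, 1, n)     # detectable communication
L  = log_uniform(1e2, 1e8, n)    # civilization lifetime in years

N = R * fp * ne * fl * fi * fc * L
print(np.median(N), (N < 1).mean())  # P(empty galaxy) is far from negligible
```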

Life Versus Dark Energy: How An Advanced Civilization Could Resist the Accelerating Expansion of the Universe: https://arxiv.org/abs/1806.05203
The presence of dark energy in our universe is causing space to expand at an accelerating rate. As a result, over the next approximately 100 billion years, all stars residing beyond the Local Group will fall beyond the cosmic horizon and become not only unobservable, but entirely inaccessible, thus limiting how much energy could one day be extracted from them. Here, we consider the likely response of a highly advanced civilization to this situation. In particular, we argue that in order to maximize its access to usable energy, a sufficiently advanced civilization would choose to expand rapidly outward, build Dyson Spheres or similar structures around encountered stars, and use the energy that is harnessed to accelerate those stars away from the approaching horizon and toward the center of the civilization. We find that such efforts will be most effective for stars with masses in the range of M∼(0.2−1)M⊙, and could lead to the harvesting of stars within a region extending out to several tens of Mpc in radius, potentially increasing the total amount of energy that is available to a future civilization by a factor of several thousand. We also discuss the observable signatures of a civilization elsewhere in the universe that is currently in this state of stellar harvesting.
preprint  study  essay  article  bostrom  ratty  anthropic  philosophy  space  xenobio  computation  physics  interdisciplinary  ideas  hmm  cocktail  temperature  thermo  information-theory  bits  🔬  threat-modeling  time  scale  insight  multi  commentary  liner-notes  pdf  slides  error  probability  ML-MAP-E  composition-decomposition  econotariat  marginal-rev  fermi  risk  org:mat  questions  paradox  intricacy  multiplicative  calculation  street-fighting  methodology  distribution  expectancy  moments  bayesian  priors-posteriors  nibble  measurement  existence  technology  geoengineering  magnitude  spatial  density  spreading  civilization  energy-resources  phys-energy  measure  direction  speculation  structure 
may 2017 by nhaliday
probability - Why does a 95% Confidence Interval (CI) not imply a 95% chance of containing the mean? - Cross Validated
The confidence interval is the answer to the request: "Give me an interval that will bracket the true value of the parameter in 100p% of the instances of an experiment that is repeated a large number of times." The credible interval is an answer to the request: "Give me an interval that brackets the true value with probability p given the particular sample I've actually observed." To be able to answer the latter request, we must first adopt either (a) a new concept of the data generating process or (b) a different concept of the definition of probability itself.
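The distinction is easy to check by simulation: a 95% CI procedure brackets the true mean in about 95% of repeated experiments, which is a statement about the procedure, not about any single realized interval. A minimal sketch assuming a known-variance normal model:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 30, 10_000

covered = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, n)
    half = 1.96 * sigma / np.sqrt(n)  # known-variance 95% CI half-width
    covered += (x.mean() - half) <= mu <= (x.mean() + half)
print(covered / reps)  # ~0.95: coverage is a property of the repeated procedure
```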

http://stats.stackexchange.com/questions/139290/a-psychology-journal-banned-p-values-and-confidence-intervals-is-it-indeed-wise

PS. Note that my question is not about the ban itself; it is about the suggested approach. I am not asking about frequentist vs. Bayesian inference either. The Editorial is pretty negative about Bayesian methods too; so it is essentially about using statistics vs. not using statistics at all.

wut

http://stats.stackexchange.com/questions/6966/why-continue-to-teach-and-use-hypothesis-testing-when-confidence-intervals-are
http://stats.stackexchange.com/questions/2356/are-there-any-examples-where-bayesian-credible-intervals-are-obviously-inferior
http://stats.stackexchange.com/questions/2272/whats-the-difference-between-a-confidence-interval-and-a-credible-interval
http://stats.stackexchange.com/questions/6652/what-precisely-is-a-confidence-interval
http://stats.stackexchange.com/questions/1164/why-havent-robust-and-resistant-statistics-replaced-classical-techniques/
http://stats.stackexchange.com/questions/16312/what-is-the-difference-between-confidence-intervals-and-hypothesis-testing
http://stats.stackexchange.com/questions/31679/what-is-the-connection-between-credible-regions-and-bayesian-hypothesis-tests
http://stats.stackexchange.com/questions/11609/clarification-on-interpreting-confidence-intervals
http://stats.stackexchange.com/questions/16493/difference-between-confidence-intervals-and-prediction-intervals
q-n-a  overflow  nibble  stats  data-science  science  methodology  concept  confidence  conceptual-vocab  confusion  explanation  thinking  hypothesis-testing  jargon  multi  meta:science  best-practices  error  discussion  bayesian  frequentist  hmm  publishing  intricacy  wut  comparison  motivation  clarity  examples  robust  metabuch  🔬  info-dynamics  reference  grokkability-clarity 
february 2017 by nhaliday
Predicting with confidence: the best machine learning idea you never heard of | Locklin on science
The advantages of conformal prediction are manifold. These ideas assume very little about the thing you are trying to forecast, the tool you’re using to forecast or how the world works, and they still produce a pretty good confidence interval. Even if you’re an unrepentant Bayesian, using some of the machinery of conformal prediction, you can tell when things have gone wrong with your prior. The learners work online, and with some modifications and considerations, with batch learning. One of the nice things about calculating confidence intervals as a part of your learning process is they can actually lower error rates or be used in semi-supervised learning as well. Honestly, I think this is the best bag of tricks since boosting; everyone should know about and use these ideas.

The essential idea is that a “conformity function” exists. Effectively you are constructing a sort of multivariate cumulative distribution function for your machine learning gizmo using the conformity function. Such CDFs exist for classical stuff like ARIMA and linear regression under the correct circumstances; CP brings the idea to machine learning in general, and to models like ARIMA when the standard parametric confidence intervals won’t work. Within the framework, the conformity function, whatever it may be, when used correctly can be guaranteed to give confidence intervals to within a probabilistic tolerance. The original proofs and treatments of conformal prediction, defined for sequences, are extremely computationally inefficient. The conditions can be relaxed in many cases, and the conformity function is in principle arbitrary, though good ones will produce narrower confidence regions. Somewhat confusingly, these good conformity functions are referred to as “efficient”, though they may not be computationally efficient.
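A minimal split-conformal regression sketch, assuming scikit-learn, with absolute residuals as the conformity function; the marginal coverage guarantee holds however poor the underlying model is, which is the point of the post:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (600, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, 600)

# Split: fit on one half, calibrate conformity scores on the other half.
model = LinearRegression().fit(X[:300], y[:300])
scores = np.abs(y[300:] - model.predict(X[300:]))  # conformity function

alpha = 0.1                                        # target 90% coverage
k = int(np.ceil((len(scores) + 1) * (1 - alpha)))  # conformal quantile index
q = np.sort(scores)[k - 1]

x_new = np.array([[1.0]])
pred = model.predict(x_new)[0]
print(pred - q, pred + q)  # interval with ~90% marginal coverage
```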
techtariat  acmtariat  acm  machine-learning  bayesian  stats  exposition  research  online-learning  probability  decision-theory  frontier  unsupervised  confidence 
february 2017 by nhaliday
What is the difference between inference and learning? - Quora
- basically boils down to latent variables vs. (hyper-)parameters
- so computing p(x_h|x_v,θ) vs. computing p(θ|X_v)
- from a completely Bayesian perspective, no real difference
- described in more detail in [Kevin Murphy, 10.4]
q-n-a  qra  jargon  machine-learning  stats  acm  bayesian  graphical-models  latent-variables  confusion  comparison  nibble 
january 2017 by nhaliday
CS 731 Advanced Artificial Intelligence - Spring 2011
- statistical machine learning
- sparsity in regression
- graphical models
- exponential families
- variational methods
- MCMC
- dimensionality reduction, eg, PCA
- Bayesian nonparametrics
- compressive sensing, matrix completion, and Johnson-Lindenstrauss
course  lecture-notes  yoga  acm  stats  machine-learning  graphical-models  graphs  model-class  bayesian  learning-theory  sparsity  embeddings  markov  monte-carlo  norms  unit  nonparametric  compressed-sensing  matrix-factorization  features 
january 2017 by nhaliday
A Rejection of 'Broken Windows Policing' Over Race Actually Hurts Minority Neighborhoods | Manhattan Institute
https://twitter.com/BookOfTamara/status/778838226983268352
https://archive.is/ETLXf
Late-night slightly controversial criminal justice thread:

Proactive policing and crime control: https://www.nature.com/articles/s41562-017-0227-x
Evidence that curtailing proactive policing can reduce major crime: https://www.nature.com/articles/s41562-017-0211-5

Proactive Policing: Effects on Crime and Communities: http://sites.nationalacademies.org/dbasse/claj/proactive-policing/index.htm
This report from the Committee on Law and Justice finds evidence that a number of proactive policing practices are successful in reducing crime and disorder, at least in the short term, and that most of these strategies do not harm communities’ attitudes towards police.

Is Racial Profiling a Legitimate Strategy in the Fight against Violent Crime?: https://link.springer.com/epdf/10.1007/s11406-018-9945-1?author_access_token=nDM1xCesybebx7yUX2BxZ_e4RwlQNchNByi7wbcMAY6py69jTlOiEGDIgqW0Vv2HrAor6wlMLH695I2ykTiKUxf1RBnu1u_6gjXU-6vgh2gIy6CX2npHD9GR350T20x_TbCcq4MmJUPrxAqsJSe1QA%3D%3D
- Neven Sesardić

Are U.S. Cities Underpoliced?: http://marginalrevolution.com/marginalrevolution/2017/08/u-s-cities-underpoliced.html
Chalfin and McCrary acknowledge the endogeneity problem but they suggest that a more important reason why ordinary regression gives you poor results is that the number of police is poorly measured. Suppose the number of police jumps up and down in the data even when the true number stays constant. Fake variation obviously can’t influence real crime so when your regression “sees” a lot of (fake) variation in police which is not associated with variation in crime it’s naturally going to conclude that the effect of police on crime is small, i.e. attenuation bias.

By comparing two different measures of the number of police, Chalfin and McCrary show that a surprising amount of the ups and downs in the number of police is measurement error. Using their two measures, however, Chalfin and McCrary produce a third measure which is better than either alone. Using this cleaned-up estimate, they find that ordinary regression (with controls) gives you estimates of the effect of police on crime which are plausible and similar to those found using other techniques like natural experiments. Chalfin and McCrary’s estimates, however, are more precise since they use much more of the variation in the data.

Using these new estimates of the effect of police and crime along with estimates of the social cost of crime they conclude (as I have argued before) that U.S. cities are substantially under-policed.
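Attenuation bias itself is easy to demonstrate by simulation: adding noise to the regressor shrinks the OLS slope toward zero by the reliability ratio var(x) / (var(x) + var(error)). A minimal sketch, unrelated to the actual police data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
police = rng.normal(0, 1, n)                 # true (standardized) police levels
crime = -0.5 * police + rng.normal(0, 1, n)  # true effect of police on crime: -0.5

measured = police + rng.normal(0, 1, n)      # measurement error with variance 1

slope = np.cov(measured, crime)[0, 1] / np.var(measured)
print(slope)  # ~ -0.25: attenuated by var(x) / (var(x) + var(err)) = 0.5
```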

Crime Imprisons and Kills: http://marginalrevolution.com/marginalrevolution/2018/01/crime-imprisons-kills.html
…The everyday lived experience of urban poverty has also been transformed. Analyzing rates of violent victimization over time, I found that the poorest Americans today are victimized at about the same rate as the richest Americans were at the start of the 1990s. That means that a poor, unemployed city resident walking the streets of an average city today has about the same chance of being robbed, beaten up, stabbed or shot as a well-off urbanite in 1993. Living in poverty used to mean living with the constant threat of violence. In most of the country, that is no longer true.

http://marginalrevolution.com/marginalrevolution/2015/09/what-was-gary-beckers-biggest-mistake.html
http://www.sentencingproject.org/wp-content/uploads/2016/01/Deterrence-in-Criminal-Justice.pdf
Do parole abolition and Truth-in-Sentencing deter violent crimes in Virginia?: http://link.springer.com.sci-hub.tw/article/10.1007/s00181-017-1332-4

Death penalty: https://offsettingbehaviour.blogspot.com/2011/09/death-penalty.html
And so I revise: the death penalty is wrong, and it also likely has little measurable deterrent effect. There may still be a deterrent effect; we just can't show one given available data.

The effects of DNA databases on the deterrence and detection of offenders: http://jenniferdoleac.com/wp-content/uploads/2015/03/DNA_Denmark.pdf
We exploit a large expansion of Denmark’s DNA database in 2005 to measure the effect of DNA registration on criminal behavior. Using a regression discontinuity strategy, we find that DNA registration reduces recidivism by 43%. Using rich data on the timing of subsequent charges to separate the deterrence and detection effects of DNA databases, we also find that DNA registration increases the probability that repeat offenders get caught, by 4%. We estimate an elasticity of criminal behavior with respect to the probability of detection to be -1.7. We also find suggestive evidence that DNA profiling changes non-criminal behavior: offenders added to the DNA database are more likely to get married, remain in a stable relationship, and live with their children.

Short- and long-term effects of imprisonment on future felony convictions and prison admissions: http://www.pnas.org/content/early/2017/09/26/1701544114.short
https://twitter.com/bswud/status/917354893907779585
Prison isn't criminogenic—offenders have higher rates of re-incarceration because of technical parole violations
news  org:mag  right-wing  policy  criminal-justice  nyc  urban  race  culture-war  rhetoric  org:ngo  sociology  criminology  journos-pundits  multi  law  crime  contrarianism  twitter  discussion  gnon  ratty  albion  wonkish  time-preference  econotariat  marginal-rev  economics  models  map-territory  error  behavioral-econ  intervention  pdf  white-paper  expectancy  microfoundations  big-peeps  piracy  study  org:nat  chart  🎩  hmm  order-disorder  morality  values  data  authoritarianism  genomics  econometrics  europe  nordic  natural-experiment  endo-exo  debate  intricacy  measurement  signal-noise  regression  methodology  summary  explanation  social  commentary  evidence-based  endogenous-exogenous  bounded-cognition  urban-rural  philosophy  essay  article  letters  ethnocentrism  prejudice  ethics  formal-values  africa  pro-rata  bayesian  priors-posteriors  discrimination  civil-liberty  garett-jones 
january 2017 by nhaliday
D-separation
collider: C in A->C<-B
A, B are d-connected given Z iff some path between them is unblocked: every non-collider on the path lies outside Z, and every collider on it is in Z or has a descendant in Z (with Z empty: iff some path between them has no colliders)
A, B are d-separated given Z iff they are not d-connected given Z

http://bayes.cs.ucla.edu/BOOK-2K/d-sep.html
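The collider rule can be checked numerically: A and B are d-separated (independent) marginally, but conditioning on the collider C makes them d-connected (dependent). A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=100_000)
b = rng.normal(size=100_000)
c = a + b + rng.normal(scale=0.1, size=100_000)  # collider: A -> C <- B

print(np.corrcoef(a, b)[0, 1])               # ~0: marginally d-separated
mask = np.abs(c) < 0.1                       # crude conditioning on C ~ 0
print(np.corrcoef(a[mask], b[mask])[0, 1])   # strongly negative: d-connected given C
```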
concept  explanation  causation  bayesian  graphical-models  cmu  org:edu  stats  methodology  tutorial  jargon  graphs  hypothesis-testing  confounding  🔬  direct-indirect  philosophy  definition  volo-avolo  multi  org:junk 
january 2017 by nhaliday
Epistemic learned helplessness - Jackdaws love my big sphinx of quartz
I don’t think I’m overselling myself too much to expect that I could argue circles around the average uneducated person. Like I mean that on most topics, I could demolish their position and make them look like an idiot. Reduce them to some form of “Look, everything you say fits together and I can’t explain why you’re wrong, I just know you are!” Or, more plausibly, “Shut up I don’t want to talk about this!”

And there are people who can argue circles around me. Maybe not on every topic, but on topics where they are experts and have spent their whole lives honing their arguments. When I was young I used to read pseudohistory books; Immanuel Velikovsky’s Ages in Chaos is a good example of the best this genre has to offer. I read it and it seemed so obviously correct, so perfect, that I could barely bring myself to bother to search out rebuttals.

And then I read the rebuttals, and they were so obviously correct, so devastating, that I couldn’t believe I had ever been so dumb as to believe Velikovsky.

And then I read the rebuttals to the rebuttals, and they were so obviously correct that I felt silly for ever doubting.

And so on for several more iterations, until the labyrinth of doubt seemed inescapable. What finally broke me out wasn’t so much the lucidity of the consensus view as it was starting to sample different crackpots. Some were almost as bright and rhetorically gifted as Velikovsky, all presented insurmountable evidence for their theories, and all had mutually exclusive ideas. After all, Noah’s Flood couldn’t have been a cultural memory both of the fall of Atlantis and of a change in the Earth’s orbit, let alone of a lost Ice Age civilization or of megatsunamis from a meteor strike. So given that at least some of those arguments are wrong and all seemed practically proven, I am obviously just gullible in the field of ancient history. Given a total lack of independent intellectual steering power and no desire to spend thirty years building an independent knowledge base of Near Eastern history, I choose to just accept the ideas of the prestigious people with professorships in Archaeology, rather than those of the universally reviled crackpots who write books about Venus being a comet.

You could consider this a form of epistemic learned helplessness, where I know any attempt to evaluate the arguments is just going to be a bad idea so I don’t even try. If you have a good argument that the Early Bronze Age worked completely differently from the way mainstream historians believe, I just don’t want to hear about it. If you insist on telling me anyway, I will nod, say that your argument makes complete sense, and then totally refuse to change my mind or admit even the slightest possibility that you might be right.

(This is the correct Bayesian action: if I know that a false argument sounds just as convincing as a true argument, argument convincingness provides no evidence either way. I should ignore it and stick with my prior.)
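Formally, this is the observation that a likelihood ratio of 1 leaves the prior odds unchanged; a brief gloss in notation Scott does not use:

```latex
\frac{P(H \mid \text{convincing})}{P(\neg H \mid \text{convincing})}
  = \underbrace{\frac{P(\text{convincing} \mid H)}{P(\text{convincing} \mid \neg H)}}_{=\,1}
    \cdot \frac{P(H)}{P(\neg H)}
```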

...

Even the smartest people I know have a commendable tendency not to take certain ideas seriously. Bostrom’s simulation argument, the anthropic doomsday argument, Pascal’s Mugging – I’ve never heard anyone give a coherent argument against any of these, but I’ve also never met anyone who fully accepts them and lives life according to their implications.

A friend tells me of a guy who once accepted fundamentalist religion because of Pascal’s Wager. I will provisionally admit that this person “takes ideas seriously”. Everyone else gets partial credit, at best.

...

Responsible doctors are at the other end of the spectrum from terrorists here. I once heard someone rail against how doctors totally ignored all the latest and most exciting medical studies. The same person, practically in the same breath, then railed against how 50% to 90% of medical studies are wrong. These two observations are not unrelated. Not only are there so many terrible studies, but pseudomedicine (not the stupid homeopathy type, but the type that links everything to some obscure chemical on an out-of-the-way metabolic pathway) has, for me, proven much like pseudohistory – unless I am an expert in that particular subsubfield of medicine, it can sound very convincing even when it’s very wrong.

The medical establishment offers a shiny tempting solution. First, a total unwillingness to trust anything, no matter how plausible it sounds, until it’s gone through an endless cycle of studies and meta-analyses. Second, a bunch of Institutes and Collaborations dedicated to filtering through all these studies and analyses and telling you what lessons you should draw from them.

I’m glad that some people never develop epistemic learned helplessness, or develop only a limited amount of it, or only in certain domains. It seems to me that although these people are more likely to become terrorists or Velikovskians or homeopaths, they’re also the only people who can figure out if something basic and unquestionable is wrong, and make this possibility well-known enough that normal people start becoming willing to consider it.

But I’m also glad epistemic learned helplessness exists. It seems like a pretty useful social safety valve most of the time.
yvain  essay  thinking  rationality  philosophy  reflection  ratty  ssc  epistemic  🤖  2013  minimalism  intricacy  p:null  info-dynamics  truth  reason  s:**  contrarianism  subculture  inference  bayesian  priors-posteriors  debate  rhetoric  pessimism  nihil  spreading  flux-stasis  robust  parsimony  dark-arts  illusion 
october 2016 by nhaliday
A Fervent Defense of Frequentist Statistics - Less Wrong
Short summary. This essay makes many points, each of which I think is worth reading, but if you are only going to understand one point I think it should be “Myth 5” below, which describes the online learning framework as a response to the claim that frequentist methods need to make strong modeling assumptions. Among other things, online learning allows me to perform the following remarkable feat: if I’m betting on horses, and I get to place bets after watching other people bet but before seeing which horse wins the race, then I can guarantee that after a relatively small number of races, I will do almost as well overall as the best other person, even if the number of other people is very large (say, 1 billion), and their performance is correlated in complicated ways.

If you’re only going to understand two points, then also read about the frequentist version of Solomonoff induction, which is described in “Myth 6”.

...

If you are like me from, say, two years ago, you are firmly convinced that Bayesian methods are superior and that you have knockdown arguments in favor of this. If this is the case, then I hope this essay will give you an experience that I myself found life-altering: the experience of having a way of thinking that seemed unquestionably true slowly dissolve into just one of many imperfect models of reality. This experience helped me gain more explicit appreciation for the skill of viewing the world from many different angles, and of distinguishing between a very successful paradigm and reality.

If you are not like me, then you may have had the experience of bringing up one of many reasonable objections to normative Bayesian epistemology, and having it shot down by one of many “standard” arguments that seem wrong but not for easy-to-articulate reasons. I hope to lend some reprieve to those of you in this camp, by providing a collection of “standard” replies to these standard arguments.
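The horse-betting guarantee in the summary above is the multiplicative-weights / Hedge algorithm: regret grows like sqrt(T log N), so even a billion competitors only costs a log factor. A minimal sketch with made-up loss data rather than the essay's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 1000, 50                     # rounds (races) and experts (other bettors)
losses = rng.uniform(0, 1, (T, N))  # loss of each expert in each round

eta = np.sqrt(8 * np.log(N) / T)    # standard Hedge learning rate
w = np.ones(N)
my_loss = 0.0
for t in range(T):
    p = w / w.sum()                 # bet proportionally to current weights
    my_loss += p @ losses[t]
    w *= np.exp(-eta * losses[t])   # exponentially downweight lossy experts

print(my_loss, losses.sum(axis=0).min())  # close to the best expert in hindsight
```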
bayesian  philosophy  stats  rhetoric  advice  debate  critique  expert  lesswrong  commentary  discussion  regularizer  essay  exposition  🤖  aphorism  spock  synthesis  clever-rats  ratty  hi-order-bits  top-n  2014  acmtariat  big-picture  acm  iidness  online-learning  lens  clarity  unit  nibble  frequentist  s:**  expert-experience  subjective-objective  grokkability-clarity 
september 2016 by nhaliday
Dirichlet process - Wikipedia, the free encyclopedia
In probability theory, Dirichlet processes (after Peter Gustav Lejeune Dirichlet) are a family of stochastic processes whose realizations are probability distributions. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution.
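A minimal stick-breaking construction of one Dirichlet process realization (truncated for practicality); each draw is itself a discrete probability distribution, as the definition says. The N(0,1) base measure and the concentration value are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, K = 1.0, 1000   # concentration parameter; truncation level

# Stick-breaking: v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k} (1 - v_j).
v = rng.beta(1, alpha, K)
w = v * np.concatenate(([1.0], np.cumprod(1 - v)[:-1]))
atoms = rng.normal(0, 1, K)   # atom locations drawn from the base measure

print(w.sum())                       # ~1: weights of a random discrete distribution
print(atoms[np.argmax(w)], w.max())  # heaviest atom of this realization
```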
probability  stats  wiki  reference  concept  acm  distribution  bayesian  simplex  nibble 
june 2016 by nhaliday
The News on Auto-tuning – arg min blog
bayesian optimization is not necessarily better than randomized search on all fronts
critique  bayesian  optimization  machine-learning  expert  hmm  liner-notes  rhetoric  debate  acmtariat  ben-recht  mrtz  gwern  random  org:bleg  nibble  expert-experience 
june 2016 by nhaliday
A Variant on “Statistically Controlling for Confounding Constructs is Harder than you Think”
It’s taken me some time to master this formalism, but I now find it quite easy to reason about these kinds of issues thanks to the brevity of graphical models as a notational technique. I’d love to see this approach become more popular in psychology, given that it has already become quite widespread in other fields. Of course, Westfall and Yarkoni are already advocating for something very similar in recommending the use of SEMs, but the graphical approach is strictly more general than SEMs and, in my personal opinion, strictly simpler to reason about.
bayesian  stats  thinking  visualization  study  science  gelman  hmm  methodology  causation  acmtariat  meta:science  graphs  commentary  techtariat  hypothesis-testing  org:bleg  nibble  scitariat  confounding  🔬  info-dynamics  direct-indirect  volo-avolo  endo-exo  endogenous-exogenous  control  graphical-models 
may 2016 by nhaliday
Stan
andrew gelman's language
can this do graphical models (I remember some issues w/ that)?
programming  bayesian  stats  python  ppl  libraries  oss  pls  monte-carlo  r-lang  DSL 
april 2016 by nhaliday
Bayesianism, frequentism, and the planted clique, or do algorithms believe in unicorns? | Windows On Theory
But if you consider probabilities as encoding beliefs, then it’s quite likely that a computationally bounded observer is not certain whether {17} is in the clique or not. After all, finding a maximum clique is a hard computational problem. So if T is much smaller than the time it takes to solve the k-clique problem (which is n^(const·k) as far as we know), then it might make sense for time-T observers to assign a probability between 0 and 1 to this event. Can we come up with a coherent theory of such probabilities?
research  tcs  complexity  probability  stats  algorithms  yoga  speculation  tcstariat  frontier  insight  exposition  rand-approx  synthesis  big-picture  boaz-barak  org:bleg  nibble  frequentist  bayesian  subjective-objective 
april 2016 by nhaliday
