nhaliday + probability   167

 « earlier
Prisoner's dilemma - Wikipedia
caveat to result below:
An extension of the IPD is an evolutionary stochastic IPD, in which the relative abundance of particular strategies is allowed to change, with more successful strategies relatively increasing. This process may be accomplished by having less successful players imitate the more successful strategies, or by eliminating less successful players from the game, while multiplying the more successful ones. It has been shown that unfair ZD strategies are not evolutionarily stable. The key intuition is that an evolutionarily stable strategy must not only be able to invade another population (which extortionary ZD strategies can do) but must also perform well against other players of the same type (which extortionary ZD players do poorly, because they reduce each other's surplus).[14]

Theory and simulations confirm that beyond a critical population size, ZD extortion loses out in evolutionary competition against more cooperative strategies, and as a result, the average payoff in the population increases when the population is bigger. In addition, there are some cases in which extortioners may even catalyze cooperation by helping to break out of a face-off between uniform defectors and win–stay, lose–switch agents.[8]

https://alfanl.com/2018/04/12/defection/
Nature boils down to a few simple concepts.

Haters will point out that I oversimplify. The haters are wrong. I am good at saying a lot with few words. Nature indeed boils down to a few simple concepts.

In life, you can either cooperate or defect.

Used to be that defection was the dominant strategy, say in the time when the Roman empire started to crumble. Everybody complained about everybody and in the end nothing got done. Then came Jesus, who told people to be loving and cooperative, and boom: 1800 years later we get the industrial revolution.

Because of Jesus we now find ourselves in a situation where cooperation is the dominant strategy. A normie engages in a ton of cooperation: with the tax collector who wants more and more of his money, with schools who want more and more of his kid’s time, with media who wants him to repeat more and more party lines, with the Zeitgeist of the Collective Spirit of the People’s Progress Towards a New Utopia. Essentially, our normie is cooperating himself into a crumbling Western empire.

Turns out that if everyone blindly cooperates, parasites sprout up like weeds until defection once again becomes the standard.

The point of a post-Christian religion is to once again create conditions for the kind of cooperation that led to the industrial revolution. This necessitates throwing out undead Christianity: you do not blindly cooperate. You cooperate with people that cooperate with you, you defect on people that defect on you. Christianity mixed with Darwinism. God and Gnon meet.

This also means we re-establish spiritual hierarchy, which, like regular hierarchy, is a prerequisite for cooperation. It is this hierarchical cooperation that turns a household into a force to be reckoned with, that allows a group of men to unite as a front against their enemies, that allows a tribe to conquer the world. Remember: Scientology bullied the Cathedral’s tax department into submission.

With a functioning hierarchy, men still gossip, lie and scheme, but they will do so in whispers behind closed doors. In your face they cooperate and contribute to the group’s wellbeing because incentives are thus that contributing to group wellbeing heightens status.

Without a functioning hierarchy, men gossip, lie and scheme, but they do so in your face, and they tell you that you are positively deluded for accusing them of gossiping, lying and scheming. Seeds will not sprout in such ground.

Spiritual dominance is established in the same way any sort of dominance is established: fought for, taken. But the fight is ritualistic. You can’t force spiritual dominance if no one listens, or if you are silenced the ritual is not allowed to happen.

If one of our priests is forbidden from establishing spiritual dominance, that is a sure sign an enemy priest is in better control and has vested interest in preventing you from establishing spiritual dominance..

They defect on you, you defect on them. Let them suffer the consequences of enemy priesthood, among others characterized by the annoying tendency that very little is said with very many words.

https://contingentnotarbitrary.com/2018/04/14/rederiving-christianity/
To recap, we started with a secular definition of Logos and noted that its telos is existence. Given human nature, game theory and the power of cooperation, the highest expression of that telos is freely chosen universal love, tempered by constant vigilance against defection while maintaining compassion for the defectors and forgiving those who repent. In addition, we must know the telos in order to fulfill it.

In Christian terms, looks like we got over half of the Ten Commandments (know Logos for the First, don’t defect or tempt yourself to defect for the rest), the importance of free will, the indestructibility of evil (group cooperation vs individual defection), loving the sinner and hating the sin (with defection as the sin), forgiveness (with conditions), and love and compassion toward all, assuming only secular knowledge and that it’s good to exist.

Iterated Prisoner's Dilemma is an Ultimatum Game: http://infoproc.blogspot.com/2012/07/iterated-prisoners-dilemma-is-ultimatum.html
The history of IPD shows that bounded cognition prevented the dominant strategies from being discovered for over over 60 years, despite significant attention from game theorists, computer scientists, economists, evolutionary biologists, etc. Press and Dyson have shown that IPD is effectively an ultimatum game, which is very different from the Tit for Tat stories told by generations of people who worked on IPD (Axelrod, Dawkins, etc., etc.).

...

For evolutionary biologists: Dyson clearly thinks this result has implications for multilevel (group vs individual selection):
... Cooperation loses and defection wins. The ZD strategies confirm this conclusion and make it sharper. ... The system evolved to give cooperative tribes an advantage over non-cooperative tribes, using punishment to give cooperation an evolutionary advantage within the tribe. This double selection of tribes and individuals goes way beyond the Prisoners' Dilemma model.

implications for fractionalized Europe vis-a-vis unified China?

and more broadly does this just imply we're doomed in the long run RE: cooperation, morality, the "good society", so on...? war and group-selection is the only way to get a non-crab bucket civilization?

Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent:
http://www.pnas.org/content/109/26/10409.full
http://www.pnas.org/content/109/26/10409.full.pdf
https://www.edge.org/conversation/william_h_press-freeman_dyson-on-iterated-prisoners-dilemma-contains-strategies-that

https://en.wikipedia.org/wiki/Ultimatum_game

analogy for ultimatum game: the state gives the demos a bargain take-it-or-leave-it, and...if the demos refuses...violence?

The nature of human altruism: http://sci-hub.tw/https://www.nature.com/articles/nature02043
- Ernst Fehr & Urs Fischbacher

Some of the most fundamental questions concerning our evolutionary origins, our social relations, and the organization of society are centred around issues of altruism and selfishness. Experimental evidence indicates that human altruism is a powerful force and is unique in the animal world. However, there is much individual heterogeneity and the interaction between altruists and selfish individuals is vital to human cooperation. Depending on the environment, a minority of altruists can force a majority of selfish individuals to cooperate or, conversely, a few egoists can induce a large number of altruists to defect. Current gene-based evolutionary theories cannot explain important patterns of human altruism, pointing towards the importance of both theories of cultural evolution as well as gene–culture co-evolution.

...

Why are humans so unusual among animals in this respect? We propose that quantitatively, and probably even qualitatively, unique patterns of human altruism provide the answer to this question. Human altruism goes far beyond that which has been observed in the animal world. Among animals, fitness-reducing acts that confer fitness benefits on other individuals are largely restricted to kin groups; despite several decades of research, evidence for reciprocal altruism in pair-wise repeated encounters4,5 remains scarce6–8. Likewise, there is little evidence so far that individual reputation building affects cooperation in animals, which contrasts strongly with what we find in humans. If we randomly pick two human strangers from a modern society and give them the chance to engage in repeated anonymous exchanges in a laboratory experiment, there is a high probability that reciprocally altruistic behaviour will emerge spontaneously9,10.

However, human altruism extends far beyond reciprocal altruism and reputation-based cooperation, taking the form of strong reciprocity11,12. Strong reciprocity is a combination of altruistic rewarding, which is a predisposition to reward others for cooperative, norm-abiding behaviours, and altruistic punishment, which is a propensity to impose sanctions on others for norm violations. Strong reciprocators bear the cost of rewarding or punishing even if they gain no individual economic benefit whatsoever from their acts. In contrast, reciprocal altruists, as they have been defined in the biological literature4,5, reward and punish only if this is in their long-term self-interest. Strong reciprocity thus constitutes a powerful incentive for cooperation even in non-repeated interactions and when reputation gains are absent, because strong reciprocators will reward those who cooperate and punish those who defect.

...

We will show that the interaction between selfish and strongly reciprocal … [more]
concept  conceptual-vocab  wiki  reference  article  models  GT-101  game-theory  anthropology  cultural-dynamics  trust  cooperate-defect  coordination  iteration-recursion  sequential  axelrod  discrete  smoothness  evolution  evopsych  EGT  economics  behavioral-econ  sociology  new-religion  deep-materialism  volo-avolo  characterization  hsu  scitariat  altruism  justice  group-selection  decision-making  tribalism  organizing  hari-seldon  theory-practice  applicability-prereqs  bio  finiteness  multi  history  science  social-science  decision-theory  commentary  study  summary  giants  the-trenches  zero-positive-sum  🔬  bounded-cognition  info-dynamics  org:edge  explanation  exposition  org:nat  eden  retention  long-short-run  darwinian  markov  equilibrium  linear-algebra  nitty-gritty  competition  war  explanans  n-factor  europe  the-great-west-whale  occident  china  asia  sinosphere  orient  decentralized  markets  market-failure  cohesion  metabuch  stylized-facts  interdisciplinary  physics  pdf  pessimism  time  insight  the-basilisk  noblesse-oblige  the-watchers  ideas  l
march 2018 by nhaliday
Stein's example - Wikipedia
Stein's example (or phenomenon or paradox), in decision theory and estimation theory, is the phenomenon that when three or more parameters are estimated simultaneously, there exist combined estimators more accurate on average (that is, having lower expected mean squared error) than any method that handles the parameters separately. It is named after Charles Stein of Stanford University, who discovered the phenomenon in 1955.[1]

An intuitive explanation is that optimizing for the mean-squared error of a combined estimator is not the same as optimizing for the errors of separate estimators of the individual parameters. In practical terms, if the combined error is in fact of interest, then a combined estimator should be used, even if the underlying parameters are independent; this occurs in channel estimation in telecommunications, for instance (different factors affect overall channel performance). On the other hand, if one is instead interested in estimating an individual parameter, then using a combined estimator does not help and is in fact worse.

...

Many simple, practical estimators achieve better performance than the ordinary estimator. The best-known example is the James–Stein estimator, which works by starting at X and moving towards a particular point (such as the origin) by an amount inversely proportional to the distance of X from that point.
nibble  concept  levers  wiki  reference  acm  stats  probability  decision-theory  estimate  distribution  atoms
february 2018 by nhaliday
'P' Versus 'Q': Differences and Commonalities between the Two Areas of Quantitative Finance by Attilio Meucci :: SSRN
There exist two separate branches of finance that require advanced quantitative techniques: the "Q" area of derivatives pricing, whose task is to "extrapolate the present"; and the "P" area of quantitative risk and portfolio management, whose task is to "model the future."

We briefly trace the history of these two branches of quantitative finance, highlighting their different goals and challenges. Then we provide an overview of their areas of intersection: the notion of risk premium; the stochastic processes used, often under different names and assumptions in the Q and in the P world; the numerical methods utilized to simulate those processes; hedging; and statistical arbitrage.
study  essay  survey  ORFE  finance  investing  probability  measure  stochastic-processes  outcome-risk
december 2017 by nhaliday
multivariate analysis - Is it possible to have a pair of Gaussian random variables for which the joint distribution is not Gaussian? - Cross Validated
The bivariate normal distribution is the exception, not the rule!

It is important to recognize that "almost all" joint distributions with normal marginals are not the bivariate normal distribution. That is, the common viewpoint that joint distributions with normal marginals that are not the bivariate normal are somehow "pathological", is a bit misguided.

Certainly, the multivariate normal is extremely important due to its stability under linear transformations, and so receives the bulk of attention in applications.

note: there is a multivariate central limit theorem, so those such applications have no problem
nibble  q-n-a  overflow  stats  math  acm  probability  distribution  gotchas  intricacy  characterization  structure  composition-decomposition  counterexample  limits  concentration-of-measure
october 2017 by nhaliday
Section 10 Chi-squared goodness-of-fit test.
- pf that chi-squared statistic for Pearson's test (multinomial goodness-of-fit) actually has chi-squared distribution asymptotically
- the gotcha: terms Z_j in sum aren't independent
- solution:
- compute the covariance matrix of the terms to be E[Z_iZ_j] = -sqrt(p_ip_j)
- note that an equivalent way of sampling the Z_j is to take a random standard Gaussian and project onto the plane orthogonal to (sqrt(p_1), sqrt(p_2), ..., sqrt(p_r))
- that is equivalent to just sampling a Gaussian w/ 1 less dimension (hence df=r-1)
QED
pdf  nibble  lecture-notes  mit  stats  hypothesis-testing  acm  probability  methodology  proofs  iidness  distribution  limits  identity  direction  lifts-projections
october 2017 by nhaliday
Variance of product of multiple random variables - Cross Validated
prod_i (var[X_i] + (E[X_i])^2) - prod_i (E[X_i])^2

two variable case: var[X] var[Y] + var[X] (E[Y])^2 + (E[X])^2 var[Y]
nibble  q-n-a  overflow  stats  probability  math  identity  moments  arrows  multiplicative  iidness  dependence-independence
october 2017 by nhaliday
Lecture 14: When's that meteor arriving
- Meteors as a random process
- Limiting approximations
- Derivation of the Exponential distribution
- Derivation of the Poisson distribution
- A "Poisson process"
nibble  org:junk  org:edu  exposition  lecture-notes  physics  mechanics  space  earth  probability  stats  distribution  stochastic-processes  closure  additive  limits  approximation  tidbits  acm  binomial  multiplicative
september 2017 by nhaliday
Kelly criterion - Wikipedia
In probability theory and intertemporal portfolio choice, the Kelly criterion, Kelly strategy, Kelly formula, or Kelly bet, is a formula used to determine the optimal size of a series of bets. In most gambling scenarios, and some investing scenarios under some simplifying assumptions, the Kelly strategy will do better than any essentially different strategy in the long run (that is, over a span of time in which the observed fraction of bets that are successful equals the probability that any given bet will be successful). It was described by J. L. Kelly, Jr, a researcher at Bell Labs, in 1956.[1] The practical use of the formula has been demonstrated.[2][3][4]

The Kelly Criterion is to bet a predetermined fraction of assets and can be counterintuitive. In one study,[5][6] each participant was given \$25 and asked to bet on a coin that would land heads 60% of the time. Participants had 30 minutes to play, so could place about 300 bets, and the prizes were capped at \$250. Behavior was far from optimal. "Remarkably, 28% of the participants went bust, and the average payout was just \$91. Only 21% of the participants reached the maximum. 18 of the 61 participants bet everything on one toss, while two-thirds gambled on tails at some stage in the experiment." Using the Kelly criterion and based on the odds in the experiment, the right approach would be to bet 20% of the pot on each throw (see first example in Statement below). If losing, the size of the bet gets cut; if winning, the stake increases.
nibble  betting  investing  ORFE  acm  checklists  levers  probability  algorithms  wiki  reference  atoms  extrema  parsimony  tidbits  decision-theory  decision-making  street-fighting  mental-math  calculation
august 2017 by nhaliday
trees are harlequins, words are harlequins — bayes: a kinda-sorta masterpost
> What sort of person thinks “oh yeah, my beliefs about these coefficients correspond to a Gaussian with variance 2.5″? And what if I do cross-validation, like I always do, and find that variance 200 works better for the problem? Was the other person wrong? But how could they have known?
> ...Even ignoring the mode vs. mean issue, I have never met anyone who could tell whether their beliefs were normally distributed vs. Laplace distributed. Have you?
I must have spent too much time in Bayesland because both those strike me as very easy and I often think them! My beliefs usually are Laplace distributed when it comes to things like genetics (it makes me very sad to see GWASes with flat priors), and my Gaussian coefficients are actually a variance of 0.70 (assuming standardized variables w.l.o.g.) as is consistent with field-wide meta-analyses indicating that d>1 is pretty rare.
ratty  ssc  core-rats  tumblr  social  explanation  init  philosophy  bayesian  thinking  probability  stats  frequentist  big-yud  lesswrong  synchrony  similarity  critique  intricacy  shalizi  scitariat  selection  mutation  evolution  priors-posteriors  regularization  bias-variance  gwern  reddit  commentary  GWAS  genetics  regression  spock  nitty-gritty  generalization  epistemic  🤖  rationality  poast  multi  best-practices  methodology  data-science
august 2017 by nhaliday
Stat 260/CS 294: Bayesian Modeling and Inference
Topics
- Priors (conjugate, noninformative, reference)
- Hierarchical models, spatial models, longitudinal models, dynamic models, survival models
- Testing
- Model choice
- Inference (importance sampling, MCMC, sequential Monte Carlo)
- Nonparametric models (Dirichlet processes, Gaussian processes, neutral-to-the-right processes, completely random measures)
- Decision theory and frequentist perspectives (complete class theorems, consistency, empirical Bayes)
- Experimental design
unit  course  berkeley  expert  michael-jordan  machine-learning  acm  bayesian  probability  stats  lecture-notes  priors-posteriors  markov  monte-carlo  frequentist  latent-variables  decision-theory  expert-experience  confidence  sampling
july 2017 by nhaliday
[1705.03394] That is not dead which can eternal lie: the aestivation hypothesis for resolving Fermi's paradox
If a civilization wants to maximize computation it appears rational to aestivate until the far future in order to exploit the low temperature environment: this can produce a 10^30 multiplier of achievable computation. We hence suggest the "aestivation hypothesis": the reason we are not observing manifestations of alien civilizations is that they are currently (mostly) inactive, patiently waiting for future cosmic eras. This paper analyzes the assumptions going into the hypothesis and how physical law and observational evidence constrain the motivations of aliens compatible with the hypothesis.

http://aleph.se/andart2/space/the-aestivation-hypothesis-popular-outline-and-faq/

simpler explanation (just different math for Drake equation):
Overall the argument is that point estimates should not be shoved into a Drake equation and then multiplied by each, as that requires excess certainty and masks much of the ambiguity of our knowledge about the distributions. Instead, a Bayesian approach should be used, after which the fate of humanity looks much better. Here is one part of the presentation:

Life Versus Dark Energy: How An Advanced Civilization Could Resist the Accelerating Expansion of the Universe: https://arxiv.org/abs/1806.05203
The presence of dark energy in our universe is causing space to expand at an accelerating rate. As a result, over the next approximately 100 billion years, all stars residing beyond the Local Group will fall beyond the cosmic horizon and become not only unobservable, but entirely inaccessible, thus limiting how much energy could one day be extracted from them. Here, we consider the likely response of a highly advanced civilization to this situation. In particular, we argue that in order to maximize its access to useable energy, a sufficiently advanced civilization would chose to expand rapidly outward, build Dyson Spheres or similar structures around encountered stars, and use the energy that is harnessed to accelerate those stars away from the approaching horizon and toward the center of the civilization. We find that such efforts will be most effective for stars with masses in the range of M∼(0.2−1)M⊙, and could lead to the harvesting of stars within a region extending out to several tens of Mpc in radius, potentially increasing the total amount of energy that is available to a future civilization by a factor of several thousand. We also discuss the observable signatures of a civilization elsewhere in the universe that is currently in this state of stellar harvesting.
preprint  study  essay  article  bostrom  ratty  anthropic  philosophy  space  xenobio  computation  physics  interdisciplinary  ideas  hmm  cocktail  temperature  thermo  information-theory  bits  🔬  threat-modeling  time  scale  insight  multi  commentary  liner-notes  pdf  slides  error  probability  ML-MAP-E  composition-decomposition  econotariat  marginal-rev  fermi  risk  org:mat  questions  paradox  intricacy  multiplicative  calculation  street-fighting  methodology  distribution  expectancy  moments  bayesian  priors-posteriors  nibble  measurement  existence  technology  geoengineering  magnitude  spatial  density  spreading  civilization  energy-resources  phys-energy  measure  direction  speculation  structure
may 2017 by nhaliday
Asking the question | West Hunter
Sometimes simply asking the question in the first place is a key step, even when it takes a genius to actually solve the problem. So, even though he couldn’t calculate his way out of a paper bag, Antoine Gombaud, Chevalier de Méré , played an important role in birthing probability theory – by asking Pascal and Fermat to solve the the problem of points – how to divide the stakes of an unfinished series of games. Of course asking the right people is also part of the goodness.

Franciszek Pokorny, who headed the Polish General Staff’s Cipher bureau after World War I, was the first to realize that cryptography and cryptanalysis are essentially mathematical in nature – and that you therefore want to hire mathematicians, rather than classical scholars or members of the band of the battleship California. He recruited Marian Rejewski, Henryk Zygalski and Jerzy Różycki: they weren’t considered world-beaters by other Polish mathematicians – not like Arne Beurling – but they broke Enigma.
west-hunter  scitariat  discussion  history  mostly-modern  science  innovation  discovery  the-trenches  curiosity  info-dynamics  ideas  individualism-collectivism  stories  early-modern  eastern-europe  crypto  probability  low-hanging  alt-inst  organizing  creative
may 2017 by nhaliday
A Unified Theory of Randomness | Quanta Magazine
Beyond the one-dimensional random walk, there are many other kinds of random shapes. There are varieties of random paths, random two-dimensional surfaces, random growth models that approximate, for example, the way a lichen spreads on a rock. All of these shapes emerge naturally in the physical world, yet until recently they’ve existed beyond the boundaries of rigorous mathematical thought. Given a large collection of random paths or random two-dimensional shapes, mathematicians would have been at a loss to say much about what these random objects shared in common.

Yet in work over the past few years, Sheffield and his frequent collaborator, Jason Miller, a professor at the University of Cambridge, have shown that these random shapes can be categorized into various classes, that these classes have distinct properties of their own, and that some kinds of random objects have surprisingly clear connections with other kinds of random objects. Their work forms the beginning of a unified theory of geometric randomness.
news  org:mag  org:sci  math  research  probability  profile  structure  geometry  random  popsci  nibble  emergent  org:inst
february 2017 by nhaliday
More on Multivariate Gaussians
Fact #1: mean and covariance uniquely determine distribution
Fact #3: closure under sum, marginalizing, and conditioning
covariance of conditional distribution is given by a Schur complement (independent of x_B. is that obvious?)
pdf  exposition  lecture-notes  stanford  nibble  distribution  acm  machine-learning  probability  levers  calculation  ground-up  characterization  rigidity  closure  nitty-gritty  linear-algebra  properties
february 2017 by nhaliday
Hoeffding’s Inequality
basic idea of standard pf: bound e^{tX} by line segment (convexity) then use Taylor expansion (in p = b/(b-a), the fraction of range to right of 0) of logarithm
pdf  lecture-notes  exposition  nibble  concentration-of-measure  estimate  proofs  ground-up  acm  probability  series  s:null
february 2017 by nhaliday
Mixing (mathematics) - Wikipedia
One way to describe this is that strong mixing implies that for any two possible states of the system (realizations of the random variable), when given a sufficient amount of time between the two states, the occurrence of the states is independent.

Mixing coefficient is
α(n) = sup{|P(A∪B) - P(A)P(B)| : A in σ(X_0, ..., X_{t-1}), B in σ(X_{t+n}, ...), t >= 0}
for σ(...) the sigma algebra generated by those r.v.s.

So it's a notion of total variational distance between the true distribution and the product distribution.
concept  math  acm  physics  probability  stochastic-processes  definition  mixing  iidness  wiki  reference  nibble  limits  ergodic  math.DS  measure  dependence-independence
february 2017 by nhaliday
st.statistics - Lower bound for sum of binomial coefficients? - MathOverflow
- basically approximate w/ geometric sum (which scales as final term) and you can get it up to O(1) factor
- not good enough for many applications (want 1+o(1) approx.)
- Stirling can also give bound to constant factor precision w/ more calculation I believe
- tighter bound at Section 7.3 here: http://webbuild.knu.ac.kr/~trj/Combin/matousek-vondrak-prob-ln.pdf
q-n-a  overflow  nibble  math  math.CO  estimate  tidbits  magnitude  concentration-of-measure  stirling  binomial  metabuch  tricki  multi  tightness  pdf  lecture-notes  exposition  probability  probabilistic-method  yoga
february 2017 by nhaliday
Relationships among probability distributions - Wikipedia
- One distribution is a special case of another with a broader parameter space
- Transforms (function of a random variable);
- Combinations (function of several variables);
- Approximation (limit) relationships;
- Compound relationships (useful for Bayesian inference);
- Duality;
- Conjugate priors.
stats  probability  characterization  list  levers  wiki  reference  objektbuch  calculation  distribution  nibble  cheatsheet  closure  composition-decomposition  properties
february 2017 by nhaliday
probability - Variance of maximum of Gaussian random variables - Cross Validated
In full generality it is rather hard to find the right order of magnitude of the variance of a Gaussien supremum since the tools from concentration theory are always suboptimal for the maximum function.

order ~ 1/log n
q-n-a  overflow  stats  probability  acm  orders  tails  bias-variance  moments  concentration-of-measure  magnitude  tidbits  distribution  yoga  structure  extrema  nibble
february 2017 by nhaliday
bounds - What is the variance of the maximum of a sample? - Cross Validated
- sum of variances is always a bound
- can't do better even for iid Bernoulli
- looks like nice argument from well-known probabilist (using E[(X-Y)^2] = 2Var X), but not clear to me how he gets to sum_i instead of sum_{i,j} in the union bound?
edit: argument is that, for j = argmax_k Y_k, we have r < X_i - Y_j <= X_i - Y_i for all i, including i = argmax_k X_k
- different proof here (later pages): http://www.ism.ac.jp/editsec/aism/pdf/047_1_0185.pdf
Var(X_n:n) <= sum Var(X_k:n) + 2 sum_{i < j} Cov(X_i:n, X_j:n) = Var(sum X_k:n) = Var(sum X_k) = nσ^2
why are the covariances nonnegative? (are they?). intuitively seems true.
- for that, see https://pinboard.in/u:nhaliday/b:ed4466204bb1
- note that this proof shows more generally that sum Var(X_k:n) <= sum Var(X_k)
- apparently that holds for dependent X_k too? http://mathoverflow.net/a/96943/20644
q-n-a  overflow  stats  acm  distribution  tails  bias-variance  moments  estimate  magnitude  probability  iidness  tidbits  concentration-of-measure  multi  orders  levers  extrema  nibble  bonferroni  coarse-fine  expert  symmetry  s:*  expert-experience  proofs
february 2017 by nhaliday
per page:    204080120160