
Fitting a Structural Equation Model
seems rather unrigorous: nonlinear optimization, the possibility of nonconvergence, and no mention of local vs. global optimality...
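a minimal sketch of the worry (illustrative, not from the linked lecture -- the toy objective below just stands in for an SEM negative log-likelihood): refit from several random starting points and compare the converged losses to detect local optima.

```python
import numpy as np
from scipy.optimize import minimize

# Toy nonconvex objective standing in for an SEM negative log-likelihood
# (hypothetical -- real SEM fitting optimizes over path coefficients etc.).
def neg_log_lik(theta):
    return np.sin(3 * theta[0]) + 0.1 * theta[0] ** 2 + (theta[1] - 1) ** 2

rng = np.random.default_rng(0)
fits = [minimize(neg_log_lik, rng.uniform(-5, 5, size=2)) for _ in range(20)]
ok = [f for f in fits if f.success]
best = sorted(f.fun for f in ok)
# Differing converged losses across restarts = distinct local optima,
# exactly the failure mode single-start SEM software can silently hit.
print(f"{len(ok)}/{len(fits)} converged; best objective values: {best[:5]}")
```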
pdf  slides  lectures  acm  stats  hypothesis-testing  graphs  graphical-models  latent-variables  model-class  optimization  nonlinearity  gotchas  nibble  ML-MAP-E  iteration-recursion  convergence 
november 2017 by nhaliday
Atrocity statistics from the Roman Era
Christian Martyrs
Gibbon, Decline & Fall v.2 ch.XVI: fewer than 2,000 killed under Roman persecution.
Ludwig Hertling ("Die Zahl der Märtyrer bis 313", 1944) estimated 100,000 Christians killed between 30 and 313 CE. (cited -- unfavorably -- by David Henige, Numbers From Nowhere, 1998)
Catholic Encyclopedia, "Martyr": number of Christian martyrs under the Romans unknown, unknowable. Origen says not many. Eusebius says thousands.

...

General population decline during The Fall of Rome: 7,000,000
- Colin McEvedy, The New Penguin Atlas of Medieval History (1992)
- From 2nd Century CE to 4th Century CE: Empire's population declined from 45M to 36M [i.e. 9M]
- From 400 CE to 600 CE: Empire's population declined by 20% [i.e. 7.2M]
- Paul Bairoch, Cities and economic development: from the dawn of history to the present, p.111
- "The population of Europe except Russia, then, having apparently reached a high point of some 40-55 million people by the start of the third century [ca.200 C.E.], seems to have fallen by the year 500 to about 30-40 million, bottoming out at about 20-35 million around 600." [i.e. ca.20M]
- Francois Crouzet, A History of the European Economy, 1000-2000 (University Press of Virginia: 2001) p.1.
- "The population of Europe (west of the Urals) in c. AD 200 has been estimated at 36 million; by 600, it had fallen to 26 million; another estimate (excluding ‘Russia’) gives a more drastic fall, from 44 to 22 million." [i.e. 10M or 22M]

also:
The geometric mean of these two extremes would come to 4½ per day, which is a credible daily rate for the really bad years.

why geometric mean? can you get it as the MLE given min{X_1, ..., X_n} and max{X_1, ..., X_n} for {X_i} i.i.d. Poisson? some kinda limit? think it might just be a rule of thumb.

yeah, it's a rule of thumb. found it in his book (epub).
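for concreteness (my numbers, using the martyr estimates above rather than White's daily rates): the geometric mean is the natural rule-of-thumb average when two bracketing estimates differ by orders of magnitude, since it takes the midpoint on a log scale.

```python
import math

# Geometric mean of two bracketing estimates: the midpoint on a log scale,
# appropriate when the uncertainty is multiplicative rather than additive.
def geo_mean(lo, hi):
    return math.sqrt(lo * hi)

# e.g. Gibbon's < 2,000 vs. Hertling's 100,000 martyrs:
print(geo_mean(2_000, 100_000))   # ~14,142
print((2_000 + 100_000) / 2)      # arithmetic mean, 51,000 -- dominated by the high end
```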
org:junk  data  let-me-see  scale  history  iron-age  mediterranean  the-classics  death  nihil  conquest-empire  war  peace-violence  gibbon  trivia  multi  todo  AMT  expectancy  heuristic  stats  ML-MAP-E  data-science  estimate  magnitude  population  demographics  database  list  religion  christianity  leviathan 
september 2017 by nhaliday
[1705.03394] That is not dead which can eternal lie: the aestivation hypothesis for resolving Fermi's paradox
If a civilization wants to maximize computation it appears rational to aestivate until the far future in order to exploit the low temperature environment: this can produce a 10^30 multiplier of achievable computation. We hence suggest the "aestivation hypothesis": the reason we are not observing manifestations of alien civilizations is that they are currently (mostly) inactive, patiently waiting for future cosmic eras. This paper analyzes the assumptions going into the hypothesis and how physical law and observational evidence constrain the motivations of aliens compatible with the hypothesis.
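the 10^30 figure is basically a Landauer-limit scaling: erasing a bit costs at least kT ln 2, so bit operations per joule scale as 1/T. a back-of-envelope version (the temperatures are my illustrative values, not the paper's exact numbers):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K

# Landauer limit: minimum energy to erase one bit is k_B * T * ln(2),
# so a fixed energy budget buys ~1/T bit erasures.
def bits_per_joule(T):
    return 1.0 / (k_B * T * math.log(2))

T_now = 3.0      # ~ current CMB temperature, K
T_far = 3.0e-30  # ~ far-future de Sitter horizon temperature (illustrative), K

print(bits_per_joule(T_far) / bits_per_joule(T_now))  # ~1e30, the quoted multiplier
```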

http://aleph.se/andart2/space/the-aestivation-hypothesis-popular-outline-and-faq/

simpler explanation (just different math for Drake equation):
Dissolving the Fermi Paradox: http://www.jodrellbank.manchester.ac.uk/media/eps/jodrell-bank-centre-for-astrophysics/news-and-events/2017/uksrn-slides/Anders-Sandberg---Dissolving-Fermi-Paradox-UKSRN.pdf
http://marginalrevolution.com/marginalrevolution/2017/07/fermi-paradox-resolved.html
Overall the argument is that point estimates should not be shoved into a Drake equation and then multiplied by each other, as that requires excess certainty and masks much of the ambiguity of our knowledge about the distributions. Instead, a Bayesian approach should be used, after which the fate of humanity looks much better.
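a toy Monte Carlo version of that point (my priors, not Sandberg et al.'s fitted distributions): draw each Drake factor from a wide log-uniform range instead of multiplying point estimates, and look at P(N < 1).

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000

# Log-uniform draws for each Drake factor (ranges are illustrative guesses,
# not the presentation's actual distributions).
def log_uniform(lo, hi, size):
    return 10 ** rng.uniform(np.log10(lo), np.log10(hi), size)

R  = log_uniform(1, 100, n)      # star formation rate (stars/yr)
fp = log_uniform(0.1, 1, n)      # fraction of stars with planets
ne = log_uniform(0.1, 1, n)      # habitable planets per such star
fl = log_uniform(1e-30, 1, n)    # abiogenesis probability -- the huge unknown
fi = log_uniform(1e-3, 1, n)     # fraction developing intelligence
fc = log_uniform(1e-2, 1, n)     # fraction that communicate
L  = log_uniform(1e2, 1e8, n)    # civilization lifetime (yr)

N = R * fp * ne * fl * fi * fc * L  # detectable civilizations per realization

# The product of point estimates can look large while most probability mass
# still sits at N < 1 -- an empty sky is then unsurprising, not a paradox.
print("median N:", np.median(N))
print("P(N < 1):", np.mean(N < 1))
```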

Life Versus Dark Energy: How An Advanced Civilization Could Resist the Accelerating Expansion of the Universe: https://arxiv.org/abs/1806.05203
The presence of dark energy in our universe is causing space to expand at an accelerating rate. As a result, over the next approximately 100 billion years, all stars residing beyond the Local Group will fall beyond the cosmic horizon and become not only unobservable, but entirely inaccessible, thus limiting how much energy could one day be extracted from them. Here, we consider the likely response of a highly advanced civilization to this situation. In particular, we argue that in order to maximize its access to useable energy, a sufficiently advanced civilization would choose to expand rapidly outward, build Dyson Spheres or similar structures around encountered stars, and use the energy that is harnessed to accelerate those stars away from the approaching horizon and toward the center of the civilization. We find that such efforts will be most effective for stars with masses in the range of M ∼ (0.2−1) M⊙, and could lead to the harvesting of stars within a region extending out to several tens of Mpc in radius, potentially increasing the total amount of energy that is available to a future civilization by a factor of several thousand. We also discuss the observable signatures of a civilization elsewhere in the universe that is currently in this state of stellar harvesting.
preprint  study  essay  article  bostrom  ratty  anthropic  philosophy  space  xenobio  computation  physics  interdisciplinary  ideas  hmm  cocktail  temperature  thermo  information-theory  bits  🔬  threat-modeling  time  scale  insight  multi  commentary  liner-notes  pdf  slides  error  probability  ML-MAP-E  composition-decomposition  econotariat  marginal-rev  fermi  risk  org:mat  questions  paradox  intricacy  multiplicative  calculation  street-fighting  methodology  distribution  expectancy  moments  bayesian  priors-posteriors  nibble  measurement  existence  technology  geoengineering  magnitude  spatial  density  spreading  civilization  energy-resources  phys-energy  measure  direction  speculation  structure 
may 2017 by nhaliday
interpretation - How to understand degrees of freedom? - Cross Validated
From Wikipedia, there are three interpretations of the degrees of freedom of a statistic:

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom (df). In general, the degrees of freedom of an estimate of a parameter is equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself (which, in sample variance, is one, since the sample mean is the only intermediate step).

Mathematically, degrees of freedom is the dimension of the domain of a random vector, or essentially the number of 'free' components: how many components need to be known before the vector is fully determined.
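A concrete instance of the last interpretation (my example, not from the answer): the n residuals about the sample mean satisfy one linear constraint, so only n − 1 of them are free, which is where the n − 1 in the sample variance comes from.

```python
import numpy as np

# The residual vector (x - mean) is constrained to sum to zero: knowing any
# n-1 of its components determines the last, hence n-1 degrees of freedom.
rng = np.random.default_rng(1)
x = rng.normal(size=10)
resid = x - x.mean()
print(resid.sum())                    # ~0 up to float error: one constraint
print(resid[:-1].sum(), -resid[-1])   # last residual recoverable from the rest
# This is also why the unbiased sample variance divides by n-1:
print(x.var(ddof=1), (resid @ resid) / (len(x) - 1))
```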

...

This is a subtle question. It takes a thoughtful person not to understand those quotations! Although they are suggestive, it turns out that none of them is exactly or generally correct. I haven't the time (and there isn't the space here) to give a full exposition, but I would like to share one approach and an insight that it suggests.

Where does the concept of degrees of freedom (DF) arise? The contexts in which it's found in elementary treatments are:

- The Student t-test and its variants such as the Welch or Satterthwaite solutions to the Behrens-Fisher problem (where two populations have different variances).
- The Chi-squared distribution (defined as a sum of squares of independent standard Normals), which is implicated in the sampling distribution of the variance.
- The F-test (of ratios of estimated variances).
- The Chi-squared test, comprising its uses in (a) testing for independence in contingency tables and (b) testing for goodness of fit of distributional estimates.

In spirit, these tests run a gamut from being exact (the Student t-test and F-test for Normal variates) to being good approximations (the Student t-test and the Welch/Satterthwaite tests for not-too-badly-skewed data) to being based on asymptotic approximations (the Chi-squared test). An interesting aspect of some of these is the appearance of non-integral "degrees of freedom" (the Welch/Satterthwaite tests and, as we will see, the Chi-squared test). This is of especial interest because it is the first hint that DF is not any of the things claimed of it.
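As a concrete case of non-integral DF (my example data, not the answer's): the Welch-Satterthwaite approximation for the unequal-variance t-test generically yields a non-integer.

```python
import numpy as np
from scipy import stats

# Welch-Satterthwaite: approximate DF for the two-sample t-test with
# unequal variances.
a = np.array([2.1, 2.5, 2.9, 3.2, 2.7])
b = np.array([3.8, 4.1, 3.5, 4.4, 3.9, 4.2, 3.6])

va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
nu = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
print("Welch DF:", nu)  # non-integral, ~6.9 for these data

# scipy's Welch t-test uses the same nu internally:
print(stats.ttest_ind(a, b, equal_var=False))
```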

...

Having been alerted by these potential ambiguities, let's hold up the Chi-squared goodness of fit test for examination, because (a) it's simple, (b) it's one of the common situations where people really do need to know about DF to get the p-value right and (c) it's often used incorrectly. Here's a brief synopsis of the least controversial application of this test:

...

This, many authorities tell us, should have (to a very close approximation) a Chi-squared distribution. But there's a whole family of such distributions. They are differentiated by a parameter ν often referred to as the "degrees of freedom." The standard reasoning about how to determine ν goes like this:

I have k counts. That's k pieces of data. But there are (functional) relationships among them. To start with, I know in advance that the sum of the counts must equal n. That's one relationship. I estimated two (or p, generally) parameters from the data. That's two (or p) additional relationships, giving p+1 total relationships. Presuming they (the parameters) are all (functionally) independent, that leaves only k−p−1 (functionally) independent "degrees of freedom": that's the value to use for ν.

The problem with this reasoning (which is the sort of calculation the quotations in the question are hinting at) is that it's wrong except when some special additional conditions hold. Moreover, those conditions have nothing to do with independence (functional or statistical), with numbers of "components" of the data, with the numbers of parameters, nor with anything else referred to in the original question.

...

Things went wrong because I violated two requirements of the Chi-squared test:

1. You must use the Maximum Likelihood estimate of the parameters. (This requirement can, in practice, be slightly violated.)
2. You must base that estimate on the counts, not on the actual data! (This is crucial.)

...

The point of this comparison--which I hope you have seen coming--is that the correct DF to use for computing the p-values depends on many things other than dimensions of manifolds, counts of functional relationships, or the geometry of Normal variates. There is a subtle, delicate interaction between certain functional dependencies, as found in mathematical relationships among quantities, and distributions of the data, their statistics, and the estimators formed from them. Accordingly, it cannot be the case that DF is adequately explainable in terms of the geometry of multivariate normal distributions, or in terms of functional independence, or as counts of parameters, or anything else of this nature.

We are led to see, then, that "degrees of freedom" is merely a heuristic that suggests what the sampling distribution of a (t, Chi-squared, or F) statistic ought to be, but it is not dispositive. Belief that it is dispositive leads to egregious errors. (For instance, the top hit on Google when searching "chi squared goodness of fit" is a Web page from an Ivy League university that gets most of this completely wrong! In particular, a simulation based on its instructions shows that the chi-squared value it recommends as having 7 DF actually has 9 DF.)
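A quick way to see the failure mode (my simulation sketch, in the spirit of the answer's closing remark, not whuber's exact code): estimate the Normal parameters by ML from the raw data, violating requirement 2 above, and check the statistic's mean against k − p − 1.

```python
import numpy as np
from scipy import stats

# Bin Normal samples into k = 10 equal-probability bins, estimating mu and
# sigma by ML from the RAW data, and see whether the goodness-of-fit
# statistic really behaves like chi^2 with k - p - 1 = 7 DF.
rng = np.random.default_rng(0)
k, n, reps = 10, 200, 5000
chi2_vals = np.empty(reps)
for i in range(reps):
    x = rng.normal(size=n)
    mu, sigma = x.mean(), x.std()               # ML estimates from raw data
    u = stats.norm.cdf(x, loc=mu, scale=sigma)  # probability integral transform
    observed = np.bincount(np.minimum((u * k).astype(int), k - 1), minlength=k)
    expected = n / k                            # equal-probability bins
    chi2_vals[i] = np.sum((observed - expected) ** 2 / expected)

# A chi^2(nu) variable has mean nu, so if 7 DF were right the average would
# be ~7. It lands above 7, between chi^2(7) and chi^2(9): the
# Chernoff-Lehmann effect the answer is gesturing at.
print("simulated mean:", chi2_vals.mean())
```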
q-n-a  overflow  stats  data-science  concept  jargon  explanation  methodology  things  nibble  degrees-of-freedom  clarity  curiosity  manifolds  dimensionality  ground-up  intricacy  hypothesis-testing  examples  list  ML-MAP-E  gotchas 
january 2017 by nhaliday
