nhaliday + q-n-a   916

What is "vectorization"? - Stack Overflow
Many CPUs have "vector" or "SIMD" instruction sets which apply the same operation simultaneously to two, four, or more pieces of data. Modern x86 chips have the SSE instructions, many PPC chips have the "Altivec" instructions, and even some ARM chips have a vector instruction set, called NEON.

"Vectorization" (simplified) is the process of rewriting a loop so that instead of processing a single element of an array N times, it processes (say) 4 elements of the array simultaneously N/4 times.

(I chose 4 because it's what modern hardware is most likely to directly support; the term "vectorization" is also used to describe a higher level software transformation where you might just abstract away the loop altogether and just describe operating on arrays instead of the elements that comprise them)
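
note: a conceptual sketch of the loop restructuring described above, in plain Python. It only mimics the shape of the transformation (N iterations become roughly N/4 wider ones, plus a remainder loop); real SIMD is emitted by the compiler or written with intrinsics. The names `add_scalar`, `add_chunked` and `width` are made up for illustration.

```python
def add_scalar(a, b):
    """Baseline: one element per iteration, N iterations."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out


def add_chunked(a, b, width=4):
    """'Vectorized' shape: about N/width iterations, each covering `width` elements."""
    n = len(a)
    out = [0.0] * n
    main = n - n % width  # largest multiple of `width` not exceeding n
    for i in range(0, main, width):
        # On real hardware this whole slice-wide add would be one SIMD instruction.
        out[i:i + width] = [x + y for x, y in zip(a[i:i + width], b[i:i + width])]
    for i in range(main, n):
        # Scalar remainder loop for the leftover elements.
        out[i] = a[i] + b[i]
    return out
```

Both functions return the same list; only the iteration structure differs.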
q-n-a  stackex  programming  systems  performance  concurrency  numerics 
7 hours ago by nhaliday
quality - Is the average number of bugs per loc the same for different programming languages? - Software Engineering Stack Exchange
Contrary to intuition, the number of errors per 1000 lines of code does seem to be relatively constant, regardless of the specific language involved. Steve McConnell, author of Code Complete and Software Estimation: Demystifying the Black Art, goes over this area in some detail.

I don't have my copies readily to hand - they're sitting on my bookshelf at work - but a quick Google found a relevant quote:

Industry Average: "about 15 - 50 errors per 1000 lines of delivered code."
(Steve) further says this is usually representative of code that has some level of structured programming behind it, but probably includes a mix of coding techniques.

Quoted from Code Complete, found here: http://mayerdan.com/ruby/2012/11/11/bugs-per-line-of-code-ratio/

If memory serves correctly, Steve goes into a thorough discussion of this, showing that the figures are constant across languages (C, C++, Java, Assembly and so on) and despite difficulties (such as defining what "line of code" means).

Most importantly he has lots of citations for his sources - he's not offering unsubstantiated opinions, but has the references to back them up.
q-n-a  stackex  programming  engineering  nitty-gritty  error  flux-stasis  books  recommendations  software  checking  debugging  pro-rata  pls  comparison  parsimony  measure 
2 days ago by nhaliday
coding style - C++ code in header files - Stack Overflow
There is occasionally some merit to putting code in the header: it can allow more clever inlining by the compiler. But at the same time, it can destroy your compile times, since all the code has to be processed by the compiler every time it is included.

Finally, it is often annoying to have circular object relationships (sometimes desired) when all the code is in the headers.

Bottom line, you were right, he is wrong.

EDIT: I have been thinking about your question. There is one case where what he says is true: templates. Many newer "modern" libraries such as Boost make heavy use of templates and are often "header only." However, this should only be done when dealing with templates, as it is the only way to do it when dealing with them.
q-n-a  stackex  programming  best-practices  c(pp)  pls  compilers  types 
5 days ago by nhaliday
c - What REALLY happens when you don't free after malloc? - Stack Overflow
keep this stuff in mind when writing competition stuff, can usually just omit deletes/frees unless you're really running up against the memory limit:
Just about every modern operating system will recover all the allocated memory space after a program exits.

...

On the other hand, the similar admonition to close your files on exit has a much more concrete result - if you don't, the data you wrote to them might not get flushed, or if they're a temp file, they might not get deleted when you're done. Also, database handles should have their transactions committed and then closed when you're done with them. Similarly, if you're using an object-oriented language like C++ or Objective-C, not freeing an object when you're done with it will mean the destructor will never get called, and any resources the class is responsible for might not get cleaned up.

--

I really consider this answer wrong. One should always deallocate resources after one is done with them, be it file handles/memory/mutexes. By having that habit, one will not make that sort of mistake when building servers. Some servers are expected to run 24x7. In those cases, any leak of any sort means that your server will eventually run out of that resource and hang/crash in some way. A short utility program, ya a leak isn't that bad. Any server, any leak is death. Do yourself a favor. Clean up after yourself. It's a good habit.

--

Allocation Myth 4: Non-garbage-collected programs should always deallocate all memory they allocate.

The Truth: Omitted deallocations in frequently executed code cause growing leaks. They are rarely acceptable. But programs that retain most allocated memory until program exit often perform better without any intervening deallocation. Malloc is much easier to implement if there is no free.

In most cases, deallocating memory just before program exit is pointless. The OS will reclaim it anyway. Free will touch and page in the dead objects; the OS won't.

Consequence: Be careful with "leak detectors" that count allocations. Some "leaks" are good!
q-n-a  stackex  programming  memory-management  performance  systems  c(pp)  oly-programming 
14 days ago by nhaliday
macos - AutoHotkey Equivalent for OS X? - Ask Different
hammerspoon looks like best option in that it's scriptable (but probably less featureful than the paid "Keyboard Maestro")
q-n-a  stackex  apple  osx  desktop  yak-shaving  integration-extension  tools 
22 days ago by nhaliday
ellipsis - Why is the subject omitted in sentences like "Thought you'd never ask"? - English Language & Usage Stack Exchange
This is due to a phenomenon that occurs in intimate conversational spoken English called "Conversational Deletion". It was discussed and exemplified quite thoroughly in a 1974 PhD dissertation in linguistics at the University of Michigan that I had the honor of directing.

Thrasher, Randolph H. Jr. 1974. Shouldn't Ignore These Strings: A Study of Conversational Deletion, Ph.D. Dissertation, Linguistics, University of Michigan, Ann Arbor

...

"The phenomenon can be viewed as erosion of the beginning of sentences, deleting (some, but not all) articles, dummies, auxiliaries, possessives, conditional if, and [most relevantly for this discussion -jl] subject pronouns. But it only erodes up to a point, and only in some cases.

"Whatever is exposed (in sentence initial position) can be swept away. If erosion of the first element exposes another vulnerable element, this too may be eroded. The process continues until a hard (non-vulnerable) element is encountered." [ibidem p.9]
q-n-a  stackex  anglo  language  writing  speaking  linguistics  thesis 
7 weeks ago by nhaliday
Applications of computational learning theory in the cognitive sciences - Psychology & Neuroscience Stack Exchange
1. Gold's theorem on the unlearnability in the limit of certain sets of languages, among them context-free ones.

2. Ronald de Wolf's master's thesis on the impossibility to PAC-learn context-free languages.

The first made quite a stir in the poverty-of-the-stimulus debate, and the second has gone unnoticed by cognitive science.
q-n-a  stackex  psychology  cog-psych  learning  learning-theory  machine-learning  PAC  lower-bounds  no-go  language  linguistics  models  fall-2015 
8 weeks ago by nhaliday
soft question - What are good non-English languages for mathematicians to know? - MathOverflow
I'm with Deane here: I think learning foreign languages is not a very mathematically productive thing to do; of course, there are lots of good reasons to learn foreign languages, but doing mathematics is not one of them. Not only are there few modern mathematics papers written in languages other than English, but the primary other language they are written in (French) is pretty easy to read without actually knowing it.

Even though I've been to France several times, my spoken French mostly consists of "merci," "s'il vous plaît," "d'accord" and some food words; I've still skimmed 100-page-long papers in French without a lot of trouble.

If nothing else, think of reading a paper in French as a good opportunity to teach Google Translate some mathematical French.
q-n-a  overflow  math  academia  learning  foreign-lang  publishing  science  french  soft-question  math.AG  nibble  quixotic 
10 weeks ago by nhaliday
Vladimir Novakovski's answer to What financial advice would you give to a 21-year-old? - Quora
Learn economics and see that investment and consumption levels (as percentages) depend only marginally on age and existing net worth and mostly on your risk preferences and utility function.
qra  q-n-a  oly  advice  reflection  personal-finance  ORFE  outcome-risk  investing  time-preference  age-generation  dependence-independence  economics 
11 weeks ago by nhaliday
Does left-handedness occur more in certain ethnic groups than others?
Yes. There are some aboriginal tribes in Australia who have about 70% of their population being left-handed. It’s also more than 50% for some South American tribes.

The reason is the same in both cases: a recent past of extreme aggression with other tribes. Left-handedness is caused by recessive genes, but being left-handed is a boost in hand-to-hand combat with a right-handed opponent, who has usually trained extensively with other right-handers (right-handedness is genetically dominant, so right-handers are the majority in most human populations) and so lacks experience against left-handers. Should a particular tribe go through too many periods of war, its proportion of left-handers will naturally rise. As the enemy tribe’s proportion of left-handed people rises as well, there is a point at which the natural advantage in fighting dissipates; the proportion can only climb higher if the tribe continuously finds new groups to fight that are also majority right-handed.

...

So the natural question is: given their advantages in 1-on-1 combat, why doesn’t the percentage grow all the way up to 50% or slightly higher? Because there are COSTS associated with being left-handed, as apparently our neural network is pre-wired towards right-handedness - showing as a reduced life expectancy for lefties. So a mathematical model was proposed to explain their distribution among different societies

THE FIGHTING HYPOTHESIS: STABILITY OF POLYMORPHISM IN HUMAN HANDEDNESS

http://gepv.univ-lille1.fr/downl...

Further, it appears the average left-handedness for humans (~10%) hasn’t changed in thousands of years (judging by the paintings of hands on caves)

Frequency-dependent maintenance of left handedness in humans.

Handedness frequency over more than 10,000 years

[ed.: Compare with Julius Evola's "left-hand path".]
q-n-a  qra  trivia  cocktail  farmers-and-foragers  history  antiquity  race  demographics  bio  EEA  evolution  context  peace-violence  war  ecology  EGT  unintended-consequences  game-theory  equilibrium  anthropology  cultural-dynamics  sapiens  data  database  trends  cost-benefit  strategy  time-series  art  archaeology  measurement  oscillation  pro-rata  iteration-recursion  gender  male-variability  cliometrics  roots  explanation  explanans  correlation  causation  branches 
july 2018 by nhaliday
etymology - What does "no love lost" mean and where does it come from? - English Language & Usage Stack Exchange
Searching Google books, I find that what the phrase originally meant in the 17th and 18th centuries was that "A loves B just as much as B loves A"; the amount of love is balanced, so there is no love lost. In other words, unrequited love was considered to be "lost". This could be used to say they both love each other equally, or they both hate each other equally. The idiom has now come to mean only the second possibility.

--

If two people love each other, then fall out (because of an argument or other reason), then there was love lost between them. But if two people don't care much for each other, then have a falling out, then there really was no love lost between them.

Interestingly, when it was originated in the 1500s, until about 1800, it could indicate either extreme love or extreme hate.
q-n-a  stackex  anglo  language  aphorism  jargon  emotion  sociality  janus  love-hate  literature  history  early-modern  quotes  roots  intricacy  britain  poetry  writing  europe  the-great-west-whale  paradox  parallax  duty  lexical 
april 2018 by nhaliday
Is the human brain analog or digital? - Quora
The brain is neither analog nor digital, but works using a signal processing paradigm that has some properties in common with both.
 
Unlike a digital computer, the brain does not use binary logic or binary addressable memory, and it does not perform binary arithmetic. Information in the brain is represented in terms of statistical approximations and estimations rather than exact values. The brain is also non-deterministic and cannot replay instruction sequences with error-free precision. So in all these ways, the brain is definitely not "digital".
 
At the same time, the signals sent around the brain are "either-or" states that are similar to binary. A neuron fires or it does not. These all-or-nothing pulses are the basic language of the brain. So in this sense, the brain is computing using something like binary signals. Instead of 1s and 0s, or "on" and "off", the brain uses "spike" or "no spike" (referring to the firing of a neuron).
q-n-a  qra  expert-experience  neuro  neuro-nitgrit  analogy  deep-learning  nature  discrete  smoothness  IEEE  bits  coding-theory  communication  trivia  bio  volo-avolo  causation  random  order-disorder  ems  models  methodology  abstraction  nitty-gritty  computation  physics  electromag  scale  coarse-fine 
april 2018 by nhaliday
"Really six people present": origin of phrase commonly attributed to William James - English Language & Usage Stack Exchange
Whenever two people meet, there are really six people present. There is each man as he sees himself, each man as the other person sees him, and each man as he really is.

...

Here's a graph of the number of references of the phrase "really six people present". Click on the first range (1800-1017) and you'll see this, which attributes this statement to Oliver Wendell Holmes. What's perhaps relevant is the reference to "John and James"--I'm guessing two placeholder names.
q-n-a  stackex  quotes  aphorism  law  big-peeps  old-anglo  illusion  truth  anthropology  psychology  cog-psych  social-psych  realness  dennett  biases  neurons  rationality  within-without  theory-of-mind  subjective-objective  forms-instances  parallax  the-self 
march 2018 by nhaliday
orbit - Best approximation for Sun's trajectory around galactic center? - Astronomy Stack Exchange
The Sun orbits in the Galactic potential. The motion is complex; it takes about 230 million years to make a circuit with an orbital speed of around 220 km/s, but at the same time it oscillates up and down with respect to the Galactic plane every ∼70 million years and also wobbles in and out every ∼150 million years (this is called epicyclic motion). The spatial amplitudes of these oscillations are around 100 pc vertically and 300 pc in the radial direction inwards and outwards around an average orbital radius (I am unable to locate a precise figure for the latter).
nibble  q-n-a  overflow  space  oscillation  time  cycles  spatial  trivia  manifolds 
december 2017 by nhaliday
light - Why doesn't the moon twinkle? - Astronomy Stack Exchange
As you mention, when light enters our atmosphere, it goes through several parcels of gas with varying density, temperature, pressure, and humidity. These differences make the refractive index of the parcels different, and since they move around (the scientific term for air moving around is "wind"), the light rays take slightly different paths through the atmosphere.

Stars are point sources
…the Moon is not
nibble  q-n-a  overflow  space  physics  trivia  cocktail  navigation  sky  visuo  illusion  measure  random  electromag  signal-noise  flux-stasis  explanation  explanans  magnitude  atmosphere  roots 
december 2017 by nhaliday
galaxy - How do astronomers estimate the total mass of dust in clouds and galaxies? - Astronomy Stack Exchange
Dust absorbs stellar light (primarily in the ultraviolet), and is heated up. Subsequently it cools by emitting infrared, "thermal" radiation. Assuming a dust composition and grain size distribution, the amount of emitted IR light per unit dust mass can be calculated as a function of temperature. Observing the object at several different IR wavelengths, a Planck curve can be fitted to the data points, yielding the dust temperature. The more UV light incident on the dust, the higher the temperature.

The result is somewhat sensitive to the assumptions, and thus the uncertainties are sometimes quite large. The more IR data points obtained, the better. If only one IR point is available, the temperature cannot be calculated. Then there's a degeneracy between incident UV light and the amount of dust, and the mass can only be estimated to within some orders of magnitude (I think).
nibble  q-n-a  overflow  space  measurement  measure  estimate  physics  electromag  visuo  methodology 
december 2017 by nhaliday
How do you measure the mass of a star? (Beginner) - Curious About Astronomy? Ask an Astronomer
Measuring the mass of stars in binary systems is easy. Binary systems are sets of two or more stars in orbit about each other. By measuring the size of the orbit, the stars' orbital speeds, and their orbital periods, we can determine exactly what the masses of the stars are. We can take that knowledge and then apply it to similar stars not in multiple systems.
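
note: a minimal sketch of the binary-star calculation described above, assuming the Newtonian form of Kepler's third law in convenient units (semi-major axis in AU, period in years, mass in solar masses). The function names are hypothetical, and real measurements also have to account for distance and orbital inclination, which are ignored here.

```python
def binary_total_mass(semi_major_axis_au, period_years):
    """Total mass M1 + M2 in solar masses, from Kepler's third law: a^3 / P^2."""
    return semi_major_axis_au ** 3 / period_years ** 2


def split_masses(total_mass, a1, a2):
    """Split the total using each star's distance from the centre of mass.

    The centre-of-mass condition M1 * a1 = M2 * a2 means the star with the
    smaller orbit is the more massive one.
    """
    m1 = total_mass * a2 / (a1 + a2)
    m2 = total_mass * a1 / (a1 + a2)
    return m1, m2
```

Sanity check: an orbit of 1 AU with a period of 1 year gives a total of 1 solar mass, which is the Earth-Sun case.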

We also can easily measure the luminosity and temperature of any star. A plot of luminosity versus temperature for a set of stars is called a Hertzsprung-Russell (H-R) diagram, and it turns out that most stars lie along a thin band in this diagram known as the Main Sequence. Stars arrange themselves by mass on the Main Sequence, with massive stars being hotter and brighter than their small-mass brethren. If a star falls on the Main Sequence, we therefore immediately know its mass.

In addition to these methods, we also have an excellent understanding of how stars work. Our models of stellar structure are excellent predictors of the properties and evolution of stars. As it turns out, the mass of a star determines its life history from day 1, for all times thereafter, not only when the star is on the Main Sequence. So actually, the position of a star on the H-R diagram is a good indicator of its mass, regardless of whether it's on the Main Sequence or not.
nibble  q-n-a  org:junk  org:edu  popsci  space  physics  electromag  measurement  mechanics  gravity  cycles  oscillation  temperature  visuo  plots  correlation  metrics  explanation  measure  methodology 
december 2017 by nhaliday
microeconomics - Partial vs. general equilibrium - Economics Stack Exchange
The main difference between partial and general equilibrium models is that partial equilibrium models assume that what happens on the market one wants to analyze has no effect on other markets.
q-n-a  stackex  explanation  jargon  comparison  concept  models  economics  micro  macro  equilibrium  supply-demand  markets  methodology  competition 
november 2017 by nhaliday
parsing - lexers vs parsers - Stack Overflow
Yes, they are very different in theory, and in implementation.

Lexers are used to recognize "words" that make up language elements, because the structure of such words is generally simple. Regular expressions are extremely good at handling this simpler structure, and there are very high-performance regular-expression matching engines used to implement lexers.

Parsers are used to recognize the "structure" of a language's phrases. Such structure is generally far beyond what "regular expressions" can recognize, so one needs "context sensitive" parsers to extract such structure. Context-sensitive parsers are hard to build, so the engineering compromise is to use "context-free" grammars and add hacks to the parsers ("symbol tables", etc.) to handle the context-sensitive part.

Neither lexing nor parsing technology is likely to go away soon.

They may be unified by deciding to use "parsing" technology to recognize "words", as is currently explored by so-called scannerless GLR parsers. That has a runtime cost, as you are applying more general machinery to what is often a problem that doesn't need it, and usually you pay for that in overhead. Where you have lots of free cycles, that overhead may not matter. If you process a lot of text, then the overhead does matter and classical regular expression parsers will continue to be used.
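
note: a minimal sketch of the split described above, assuming a toy arithmetic language (integers, +, *, parentheses). The lexer is a single regular expression; the nested parentheses are what the regex cannot handle, so a small recursive-descent parser for a context-free grammar takes over. All names (`lex`, `evaluate`, the token kinds) are made up for illustration, and the parser evaluates directly rather than building a tree.

```python
import re

# Lexer: the "words" are simple enough for one regular expression.
TOKEN_RE = re.compile(r"(?P<NUM>\d+)|(?P<OP>[+*()])|(?P<BAD>\S)")


def lex(text):
    """Turn a string into a list of (kind, lexeme) tokens; whitespace is skipped."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        if match.lastgroup == "BAD":
            raise SyntaxError(f"unexpected character {match.group()!r}")
        tokens.append((match.lastgroup, match.group()))
    return tokens


def evaluate(tokens):
    """Recursive-descent parser for:
         expr   := term ('+' term)*
         term   := factor ('*' factor)*
         factor := NUM | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        value = term()
        while peek() == ("OP", "+"):
            advance()
            value += term()
        return value

    def term():
        value = factor()
        while peek() == ("OP", "*"):
            advance()
            value *= factor()
        return value

    def factor():
        kind, lexeme = advance()
        if kind == "NUM":
            return int(lexeme)
        if (kind, lexeme) == ("OP", "("):
            value = expr()  # recursion handles arbitrarily deep nesting
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return value
        raise SyntaxError(f"unexpected token {lexeme!r}")

    result = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("trailing input")
    return result
```

Usage: `evaluate(lex("2 + 3 * (4 + 1)"))` walks the token list once and respects both precedence and nesting.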
q-n-a  stackex  programming  compilers  automata  explanation  comparison  jargon  strings 
november 2017 by nhaliday
general relativity - What if the universe is rotating as a whole? - Physics Stack Exchange
To find out whether the universe is rotating, in principle the most straightforward test is to watch the motion of a gyroscope relative to the distant galaxies. If it rotates at an angular velocity -ω relative to them, then the universe is rotating at angular velocity ω. In practice, we do not have mechanical gyroscopes with small enough random and systematic errors to put a very low limit on ω. However, we can use the entire solar system as a kind of gyroscope. Solar-system observations put a model-independent upper limit of 10^-7 radians/year on the rotation,[Clemence 1957] which is an order of magnitude too lax to rule out the Gödel metric.
nibble  q-n-a  overflow  physics  relativity  gedanken  direction  absolute-relative  big-picture  space  experiment  measurement  volo-avolo 
november 2017 by nhaliday
What is the connection between special and general relativity? - Physics Stack Exchange
Special relativity is the "special case" of general relativity where spacetime is flat. The speed of light is essential to both.
nibble  q-n-a  overflow  physics  relativity  explanation  synthesis  hi-order-bits  ground-up  gravity  summary  aphorism  differential  geometry 
november 2017 by nhaliday
What is the difference between general and special relativity? - Quora
General Relativity is, quite simply, needed to explain gravity.

Special Relativity is the special case of GR, when the metric is flat — which means no gravity.

You need General Relativity when the metric gets all curvy, and when things start to experience gravitation.
nibble  q-n-a  qra  explanation  physics  relativity  synthesis  hi-order-bits  ground-up  gravity  summary  aphorism  differential  geometry 
november 2017 by nhaliday
python - Short Description of the Scoping Rules? - Stack Overflow
Actually, a concise rule for Python scope resolution, from Learning Python, 3rd Ed. (These rules are specific to variable names, not attributes. If you reference a name without a period, these rules apply.)

LEGB Rule.

L, Local — Names assigned in any way within a function (def or lambda), and not declared global in that function.

E, Enclosing-function locals — Name in the local scope of any and all statically enclosing functions (def or lambda), from inner to outer.

G, Global (module) — Names assigned at the top-level of a module file, or by executing a global statement in a def within the file.

B, Built-in (Python) — Names preassigned in the built-in names module : open,range,SyntaxError,...

As a caveat to Global access - reading a global variable can happen without explicit declaration, but writing to it without first declaring it with a global statement (global var_name) will instead create a new local variable.

--

Essentially, the only thing in Python that introduces a new scope is a function definition. Classes are a bit of a special case in that anything defined directly in the body is placed in the class's namespace, but they are not directly accessible from within the methods (or nested classes) they contain.
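
note: a small sketch exercising the rules above. All names are made up; each function shows one lookup path (enclosing, local shadowing, `global` rebinding, built-in), and the class shows that names in a class body are not visible as bare names inside its methods.

```python
x = "global"


def read_global():
    return x  # G: no local or enclosing x, so the module-level one is found


def outer():
    x = "enclosing"

    def inner():
        return x  # E: found in the enclosing function's scope

    return inner()


def shadow():
    x = "local"  # L: assignment without `global` creates a new local
    return x


def rebind():
    global x  # now assignment targets the module-level name
    x = "rebound"


def use_builtin():
    return len("abc")  # B: `len` is found last, in the built-in scope


class Config:
    level = 3

    def via_attribute(self):
        return Config.level  # works: explicit attribute access

    def via_bare_name(self):
        return level  # NameError: the class body is not an enclosing scope
```

Calling `shadow()` leaves the module-level `x` untouched, while `rebind()` changes what `read_global()` returns afterwards.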
q-n-a  stackex  programming  intricacy  gotchas  python  pls  objektbuch  cheatsheet 
november 2017 by nhaliday
Homebrew: List only installed top level formulas - Stack Overflow
Use brew leaves: show installed formulae that are not dependencies of another installed formula.
q-n-a  stackex  howto  yak-shaving  programming  osx  terminal  network-structure  graphs  trivia  tip-of-tongue  workflow  build-packaging 
november 2017 by nhaliday
awk - Assigning system command's output to variable - Stack Overflow
awk 'BEGIN {"date" | getline mydate; close("date"); print "returns", mydate}'
q-n-a  stackex  howto  yak-shaving  terminal  programming  gotchas 
november 2017 by nhaliday
functions - What are the use cases for different scoping constructs? - Mathematica Stack Exchange
As you mentioned there are many things to consider and a detailed discussion is possible. But here are some rules of thumb that I apply the majority of the time:

Module[{x}, ...] is the safest and may be needed if either

There are existing definitions for x that you want to avoid breaking during the evaluation of the Module, or
There is existing code that relies on x being undefined (for example code like Integrate[..., x]).
Module is also the only choice for creating and returning a new symbol. In particular, Module is sometimes needed in advanced Dynamic programming for this reason.

If you are confident there aren't important existing definitions for x or any code relying on it being undefined, then Block[{x}, ...] is often faster. (Note that, in a project entirely coded by you, being confident of these conditions is a reasonable "encapsulation" standard that you may wish to enforce anyway, and so Block is often a sound choice in these situations.)

With[{x = ...}, expr] is the only scoping construct that injects the value of x inside Hold[...]. This is useful and important. With can be either faster or slower than Block depending on expr and the particular evaluation path that is taken. With is less flexible, however, since you can't change the definition of x inside expr.
q-n-a  stackex  programming  CAS  trivia  howto  best-practices  checklists 
november 2017 by nhaliday
gn.general topology - Pair of curves joining opposite corners of a square must intersect---proof? - MathOverflow
In his 'Ordinary Differential Equations' (sec. 1.2) V.I. Arnold says "... every pair of curves in the square joining different pairs of opposite corners must intersect".

This is obvious geometrically but I was wondering how one could go about proving this rigorously. I have thought of a proof using Brouwer's Fixed Point Theorem which I describe below. I would greatly appreciate the group's comments on whether this proof is right and if a simpler proof is possible.

...

Since the full Jordan curve theorem is quite subtle, it might be worth pointing out that the theorem in question reduces to the Jordan curve theorem for polygons, which is easier.

Suppose on the contrary that the curves A, B joining opposite corners do not meet. Since A, B are closed sets, their minimum distance apart is some ε > 0. By compactness, each of A, B can be partitioned into finitely many arcs, each of which lies in a disk of diameter < ε/3. Then, by a homotopy inside each disk we can replace A, B by polygonal paths A′, B′ that join the opposite corners of the square and are still disjoint.

Also, we can replace A′, B′ by simple polygonal paths A″, B″ by omitting loops. Now we can close A″ to a polygon, and B″ goes from its "inside" to "outside" without meeting it, contrary to the Jordan curve theorem for polygons.

- John Stillwell
nibble  q-n-a  overflow  math  geometry  topology  tidbits  intricacy  intersection  proofs  gotchas  oly  mathtariat  fixed-point  math.AT  manifolds  intersection-connectedness 
october 2017 by nhaliday
multivariate analysis - Is it possible to have a pair of Gaussian random variables for which the joint distribution is not Gaussian? - Cross Validated
The bivariate normal distribution is the exception, not the rule!

It is important to recognize that "almost all" joint distributions with normal marginals are not the bivariate normal distribution. That is, the common viewpoint that joint distributions with normal marginals that are not the bivariate normal are somehow "pathological", is a bit misguided.

Certainly, the multivariate normal is extremely important due to its stability under linear transformations, and so receives the bulk of attention in applications.

note: there is a multivariate central limit theorem, so such applications have no problem
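
note: a standard counterexample, sketched with the standard library only. Take X standard normal and Y = S·X with S an independent fair random sign: Y is again standard normal by symmetry, but X + Y is exactly 0 about half the time, which cannot happen if (X, Y) is bivariate normal (then X + Y would be normal or constant). Function names, the seed, and the sample size are arbitrary choices for illustration.

```python
import random


def sample_pair(rng):
    """Draw (X, Y) with X ~ N(0, 1) and Y = S * X for an independent random sign S."""
    x = rng.gauss(0.0, 1.0)
    sign = rng.choice((-1.0, 1.0))
    return x, sign * x


def fraction_sum_zero(n=10_000, seed=0):
    """Estimate P(X + Y == 0); this is about 1/2 here, but 0 for a non-degenerate bivariate normal."""
    rng = random.Random(seed)
    zeros = 0
    for _ in range(n):
        x, y = sample_pair(rng)
        if x + y == 0.0:  # exact: y is either x or -x
            zeros += 1
    return zeros / n
```

Both marginals are Gaussian, yet the sum has an atom at zero, so the joint distribution is not Gaussian.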
nibble  q-n-a  overflow  stats  math  acm  probability  distribution  gotchas  intricacy  characterization  structure  composition-decomposition  counterexample  limits  concentration-of-measure 
october 2017 by nhaliday
Karl Pearson and the Chi-squared Test
Pearson's paper of 1900 introduced what subsequently became known as the chi-squared test of goodness of fit. The terminology and allusions of 80 years ago create a barrier for the modern reader, who finds that the interpretation of Pearson's test procedure and the assessment of what he achieved are less than straightforward, notwithstanding the technical advances made since then. An attempt is made here to surmount these difficulties by exploring Pearson's relevant activities during the first decade of his statistical career, and by describing the work by his contemporaries and predecessors which seem to have influenced his approach to the problem. Not all the questions are answered, and others remain for further study.

original paper: http://www.economics.soton.ac.uk/staff/aldrich/1900.pdf

How did Karl Pearson come up with the chi-squared statistic?: https://stats.stackexchange.com/questions/97604/how-did-karl-pearson-come-up-with-the-chi-squared-statistic
He proceeds by working with the multivariate normal, and the chi-square arises as a sum of squared standardized normal variates.

You can see from the discussion on p160-161 he's clearly discussing applying the test to multinomial distributed data (I don't think he uses that term anywhere). He apparently understands the approximate multivariate normality of the multinomial (certainly he knows the margins are approximately normal - that's a very old result - and knows the means, variances and covariances, since they're stated in the paper); my guess is that most of that stuff is already old hat by 1900. (Note that the chi-squared distribution itself dates back to work by Helmert in the mid-1870s.)

Then by the bottom of p163 he derives a chi-square statistic as "a measure of goodness of fit" (the statistic itself appears in the exponent of the multivariate normal approximation).

He then goes on to discuss how to evaluate the p-value, and then he correctly gives the upper tail area of a χ² with 12 d.f. beyond 43.87 as 0.000016. [You should keep in mind, however, that he didn't correctly understand how to adjust degrees of freedom for parameter estimation at that stage, so some of the examples in his papers use too high a d.f.]
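
note: a short check of the tail area quoted above (12 degrees of freedom, beyond 43.87). This is a sketch using the closed form of the chi-squared upper tail for even degrees of freedom, exp(-x/2) · Σ_{j<df/2} (x/2)^j / j!, not Pearson's own computation; the function names are made up.

```python
import math


def pearson_statistic(observed, expected):
    """Goodness-of-fit statistic: sum over cells of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_upper_tail_even_df(x, df):
    """P(chi-squared with `df` d.f. > x), valid only for even, positive df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form used here requires even, positive df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total
```

`chi2_upper_tail_even_df(43.87, 12)` comes out at roughly 1.6e-5, in line with the figure attributed to Pearson.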
nibble  papers  acm  stats  hypothesis-testing  methodology  history  mostly-modern  pre-ww2  old-anglo  giants  science  the-trenches  stories  multi  q-n-a  overflow  explanation  summary  innovation  discovery  distribution  degrees-of-freedom  limits 
october 2017 by nhaliday
self study - Looking for a good and complete probability and statistics book - Cross Validated
I never had the opportunity to take a stats course taught by a math faculty. I am looking for a probability theory and statistics book that is complete and self-sufficient. By complete I mean that it contains all the proofs and does not just state results.
nibble  q-n-a  overflow  data-science  stats  methodology  books  recommendations  list  top-n  confluence  proofs  rigor  reference  accretion 
october 2017 by nhaliday

