Ask HN: What's your speciality, and what's your "FizzBuzz" equivalent? | Hacker News

hn discussion q-n-a tech programming recruiting checking short-circuit analogy lens init ground-up interdisciplinary cs IEEE electromag math probability finance ORFE marketing dbs audio writing data-science stats hypothesis-testing devops debugging security networking web frontend javascript chemistry gedanken examples fourier acm linear-algebra matrix-factorization iterative-methods embedded multi human-capital

6 days ago by nhaliday

Mathematics for Machine Learning | Companion webpage to the book “Mathematics for Machine Learning”. Copyright 2019 by Marc Peter Deisenroth, A Aldo Faisal, and Cheng Soon Ong. To be published by Cambridge University Press.

unit books draft math acm machine-learning data-science ground-up init encyclopedic yoga

4 weeks ago by nhaliday

Rational Sines of Rational Multiples of π

july 2019 by nhaliday

For which rational multiples of π is the sine rational? We have the three trivial cases

sin(0) = 0, sin(π/6) = 1/2, sin(π/2) = 1,

and we wish to show that these are essentially the only distinct rational sines of rational multiples of π.
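[ed.: A quick numerical sanity check, a brute-force float sweep rather than a proof, is consistent with the claim: scanning sin(aπ/b) over small denominators turns up no rational values other than 0, ±1/2, ±1.]

```python
from fractions import Fraction
import math

# Brute-force sweep: for denominators b <= 24, test whether sin(a*pi/b)
# is (numerically) a rational with small denominator.
rational_sines = set()
for b in range(1, 25):
    for a in range(0, 2 * b + 1):
        s = math.sin(math.pi * a / b)
        f = Fraction(s).limit_denominator(1000)
        if abs(float(f) - s) < 1e-12:  # treat as rational if a tiny fraction fits
            rational_sines.add(f)

assert rational_sines == {Fraction(-1), Fraction(-1, 2), Fraction(0),
                          Fraction(1, 2), Fraction(1)}
```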

The assertion about rational sines of rational multiples of π follows from two fundamental lemmas. The first is

Lemma 1: For any rational number q, sin(qπ) is a root of a monic polynomial with integer coefficients.

[Pf uses some ideas unfamiliar to me: the similarity parameter of Möbius (linear fractional) transformations, and finding a polynomial for a desired root by constructing a Möbius transformation of finite period.]

...

Lemma 2: Any root of a monic polynomial f(x) with integer coefficients must either be an integer or irrational.

[Gauss's Lemma, cf Dummit-Foote.]

...

nibble
tidbits
org:junk
analysis
trivia
math
algebra
polynomials
fields
characterization
direction
math.CA
math.CV
ground-up

july 2019 by nhaliday

Bareiss algorithm - Wikipedia

june 2019 by nhaliday

During the execution of the Bareiss algorithm, every integer that is computed is the determinant of a submatrix of the input matrix. This allows the size of these integers to be bounded using the Hadamard inequality. Otherwise, the Bareiss algorithm may be viewed as a variant of Gaussian elimination, and it needs roughly the same number of arithmetic operations.
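[ed.: A minimal integer-determinant sketch of the recurrence (no pivoting, so it assumes nonzero leading pivots; real implementations swap rows):]

```python
def bareiss_det(matrix):
    # Fraction-free elimination: every intermediate entry is itself the
    # determinant of a submatrix, so each // division below is exact.
    a = [row[:] for row in matrix]
    n = len(a)
    prev = 1
    for k in range(n - 1):
        if a[k][k] == 0:
            raise ZeroDivisionError("zero pivot; a full version would pivot")
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return a[-1][-1]

assert bareiss_det([[2, 3], [4, 5]]) == -2
assert bareiss_det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]) == -3
```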

nibble
ground-up
cs
tcs
algorithms
complexity
linear-algebra
numerics
sci-comp
fields
calculation
nitty-gritty
june 2019 by nhaliday

What every computer scientist should know about floating-point arithmetic

may 2019 by nhaliday

Floating-point arithmetic is considered an esoteric subject by many people. This is rather surprising, because floating-point is ubiquitous in computer systems: almost every language has a floating-point datatype; computers from PCs to supercomputers have floating-point accelerators; most compilers will be called upon to compile floating-point algorithms from time to time; and virtually every operating system must respond to floating-point exceptions such as overflow. This paper presents a tutorial on the aspects of floating point that have a direct impact on designers of computer systems. It begins with background on floating-point representation and rounding error, continues with a discussion of the IEEE floating-point standard, and concludes with examples of how computer system builders can better support floating point.

https://stackoverflow.com/questions/2729637/does-epsilon-really-guarantees-anything-in-floating-point-computations

"you must use an epsilon when dealing with floats" is a knee-jerk reaction of programmers with a superficial understanding of floating-point computations, for comparisons in general (not only to zero).

This is usually unhelpful because it doesn't tell you how to minimize the propagation of rounding errors, it doesn't tell you how to avoid cancellation or absorption problems, and even when your problem is indeed related to the comparison of two floats, it doesn't tell you what value of epsilon is right for what you are doing.
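[ed.: A small Python illustration of the point; the tolerances are arbitrary demonstration values, not recommendations.]

```python
import math

# Absorption: at 2**53 the spacing between adjacent doubles is 2.0,
# so adding 1.0 changes nothing.
assert 2.0 ** 53 + 1.0 == 2.0 ** 53

# Rounding: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
x = 0.1 + 0.2
assert x != 0.3

# A fixed absolute epsilon "works" here only because the operands are near 1...
assert abs(x - 0.3) < 1e-9

# ...but it calls two *adjacent* doubles near 1e20 different, while a
# relative test (math.isclose) handles both scales.
y = math.nextafter(1e20, math.inf)  # smallest double above 1e20
assert abs(y - 1e20) > 1e-9
assert math.isclose(y, 1e20, rel_tol=1e-12)
assert math.isclose(x, 0.3, rel_tol=1e-9)
```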

...

Regarding the propagation of rounding errors, there exist specialized analyzers that can help you estimate it, because it is a tedious thing to do by hand.

https://www.di.ens.fr/~cousot/projects/DAEDALUS/synthetic_summary/CEA/Fluctuat/index.html

This was part of HW1 of CS24:

https://en.wikipedia.org/wiki/Kahan_summation_algorithm

In particular, simply summing n numbers in sequence has a worst-case error that grows proportional to n, and a root mean square error that grows as √n for random inputs (the roundoff errors form a random walk).[2] With compensated summation, the worst-case error bound is independent of n, so a large number of values can be summed with an error that only depends on the floating-point precision.[2]
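[ed.: Kahan's compensated summation fits in a few lines:]

```python
def kahan_sum(xs):
    # Compensated summation: c carries the low-order bits that each
    # addition to the running total would otherwise discard.
    total = 0.0
    c = 0.0
    for x in xs:
        y = x - c            # apply the stored correction
        t = total + y
        c = (t - total) - y  # what was actually added, minus what we meant to add
        total = t
    return total

# Ten copies of 0.1 don't sum to 1.0 naively, but do with compensation.
assert sum([0.1] * 10) != 1.0
assert kahan_sum([0.1] * 10) == 1.0
```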

cf:

https://en.wikipedia.org/wiki/Pairwise_summation

In numerical analysis, pairwise summation, also called cascade summation, is a technique to sum a sequence of finite-precision floating-point numbers that substantially reduces the accumulated round-off error compared to naively accumulating the sum in sequence.[1] Although there are other techniques such as Kahan summation that typically have even smaller round-off errors, pairwise summation is nearly as good (differing only by a logarithmic factor) while having much lower computational cost—it can be implemented so as to have nearly the same cost (and exactly the same number of arithmetic operations) as naive summation.

In particular, pairwise summation of a sequence of n numbers x_n works by recursively breaking the sequence into two halves, summing each half, and adding the two sums: a divide and conquer algorithm. Its worst-case roundoff errors grow asymptotically as at most O(ε log n), where ε is the machine precision (assuming a fixed condition number, as discussed below).[1] In comparison, the naive technique of accumulating the sum in sequence (adding each x_i one at a time for i = 1, ..., n) has roundoff errors that grow at worst as O(εn).[1] Kahan summation has a worst-case error of roughly O(ε), independent of n, but requires several times more arithmetic operations.[1] If the roundoff errors are random, and in particular have random signs, then they form a random walk and the error growth is reduced to an average of O(ε √(log n)) for pairwise summation.[2]
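[ed.: The recursion is a one-liner sketch; production versions switch to a plain loop below some block size rather than recursing all the way down:]

```python
def pairwise_sum(xs):
    # Divide and conquer: sum each half recursively, then add the halves.
    # Roundoff grows O(eps log n) versus O(eps n) for a sequential loop.
    n = len(xs)
    if n == 0:
        return 0.0
    if n == 1:
        return xs[0]
    mid = n // 2
    return pairwise_sum(xs[:mid]) + pairwise_sum(xs[mid:])

assert pairwise_sum([1.0, 2.0, 3.0, 4.0, 5.0]) == 15.0
```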

A very similar recursive structure of summation is found in many fast Fourier transform (FFT) algorithms, and is responsible for the same slow roundoff accumulation of those FFTs.[2][3]

https://eng.libretexts.org/Bookshelves/Electrical_Engineering/Book%3A_Fast_Fourier_Transforms_(Burrus)/10%3A_Implementing_FFTs_in_Practice/10.8%3A_Numerical_Accuracy_in_FFTs

However, these encouraging error-growth rates only apply if the trigonometric “twiddle” factors in the FFT algorithm are computed very accurately. Many FFT implementations, including FFTW and common manufacturer-optimized libraries, therefore use precomputed tables of twiddle factors calculated by means of standard library functions (which compute trigonometric constants to roughly machine precision). The other common method to compute twiddle factors is to use a trigonometric recurrence formula—this saves memory (and cache), but almost all recurrences have errors that grow as O(√n), O(n), or even O(n²), which lead to corresponding errors in the FFT.

...

There are, in fact, trigonometric recurrences with the same logarithmic error growth as the FFT, but these seem more difficult to implement efficiently; they require that a table of Θ(log n) values be stored and updated as the recurrence progresses. Instead, in order to gain at least some of the benefits of a trigonometric recurrence (reduced memory pressure at the expense of more arithmetic), FFTW includes several ways to compute a much smaller twiddle table, from which the desired entries can be computed accurately on the fly using a bounded number (usually < 3) of complex multiplications. For example, instead of a twiddle table with n entries ω_n^k, FFTW can use two tables with Θ(√n) entries each, so that ω_n^k is computed by multiplying an entry in one table (indexed with the low-order bits of k) by an entry in the other table (indexed with the high-order bits of k).
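[ed.: The two-table trick can be sketched in a few lines. This is a toy version for n a perfect square, not FFTW's actual tables or indexing:]

```python
import cmath

n = 256                      # assume n = r*r for this sketch
r = 16
# t1 holds w^k for the low-order digit of k; t2 holds w^(r*k) for the high digit.
t1 = [cmath.exp(-2j * cmath.pi * k / n) for k in range(r)]
t2 = [cmath.exp(-2j * cmath.pi * (r * k) / n) for k in range(r)]

for k in range(n):
    exact = cmath.exp(-2j * cmath.pi * k / n)
    rebuilt = t1[k % r] * t2[k // r]   # one complex multiply per twiddle
    assert abs(exact - rebuilt) < 1e-14
```

Storage drops from n complex entries to 2√n at the cost of one multiply per lookup.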

[ed.: Nicholas Higham's "Accuracy and Stability of Numerical Algorithms" seems like a good reference for this kind of analysis.]

nibble
pdf
papers
programming
systems
numerics
nitty-gritty
intricacy
approximation
accuracy
types
sci-comp
multi
q-n-a
stackex
hmm
oly-programming
accretion
formal-methods
yak-shaving
wiki
reference
algorithms
yoga
ground-up
divide-and-conquer
fourier
books
tidbits
chart
caltech
nostalgia

may 2019 by nhaliday

Teach debugging

may 2019 by nhaliday

A friend of mine and I couldn't understand why some people were having so much trouble; the material seemed like common sense. The Feynman Method was the only tool we needed.

1. Write down the problem

2. Think real hard

3. Write down the solution

The Feynman Method failed us on the last project: the design of a divider, a real-world-scale project an order of magnitude more complex than anything we'd been asked to tackle before. On the day he assigned the project, the professor exhorted us to begin early. Over the next few weeks, we heard rumors that some of our classmates worked day and night without making progress.

...

And then, just after midnight, a number of our newfound buddies from dinner reported successes. Half of those who started from scratch had working designs. Others were despondent, because their design was still broken in some subtle, non-obvious way. As I talked with one of those students, I began poring over his design. And after a few minutes, I realized that the Feynman method wasn't the only way forward: it should be possible to systematically apply a mechanical technique repeatedly to find the source of our problems. Beneath all the abstractions, our projects consisted purely of NAND gates (woe to those who dug around our toolbox enough to uncover dynamic logic), each of which outputs a 0 only when both inputs are 1. If the correct output is 0, both inputs should be 1. The input that isn't is in error, an error that is, itself, the output of a NAND gate where at least one input is 0 when it should be 1. We applied this method recursively, finding the source of all the problems in both our designs in under half an hour.
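[ed.: The recursive trace-back is mechanical enough to write down. A toy sketch with a hypothetical netlist format (gate name -> two input names, NAND only; anything not in the netlist is a primary input), comparing a faulty evaluation against a known-good one:]

```python
def evaluate(netlist, inputs, broken=None):
    values = dict(inputs)
    def val(name):
        if name not in values:
            a, b = netlist[name]
            out = 1 - (val(a) & val(b))   # NAND: 0 only when both inputs are 1
            if name == broken:
                out = 1 - out             # inject a stuck-wrong fault
            values[name] = out
        return values[name]
    for name in netlist:
        val(name)
    return values

def localize(netlist, good, bad, node):
    # Walk back from a wrong output through any fan-in wire that also
    # disagrees; when every input agrees but the output doesn't, the gate
    # itself is the culprit.
    while node in netlist:
        wrong = [w for w in netlist[node] if good[w] != bad[w]]
        if not wrong:
            return node
        node = wrong[0]
    return node

# XOR built from four NANDs, with a fault injected into gate n2.
xor = {'n1': ('a', 'b'), 'n2': ('a', 'n1'), 'n3': ('b', 'n1'), 'out': ('n2', 'n3')}
good = evaluate(xor, {'a': 1, 'b': 1})
bad = evaluate(xor, {'a': 1, 'b': 1}, broken='n2')
assert (good['out'], bad['out']) == (0, 1)
assert localize(xor, good, bad, 'out') == 'n2'
```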

How To Debug Any Program: https://www.blinddata.com/blog/how-to-debug-any-program-9

May 8th 2019 by Saketh Are

Start by Questioning Everything

...

When a program is behaving unexpectedly, our attention tends to be drawn first to the most complex portions of the code. However, mistakes can come in all forms. I've personally been guilty of rushing to debug sophisticated portions of my code when the real bug was that I forgot to read in the input file. In the following section, we'll discuss how to reliably focus our attention on the portions of the program that need correction.

Then Question as Little as Possible

Suppose that we have a program and some input on which its behavior doesn’t match our expectations. The goal of debugging is to narrow our focus to as small a section of the program as possible. Once our area of interest is small enough, the value of the incorrect output that is being produced will typically tell us exactly what the bug is.

In order to catch the point at which our program diverges from expected behavior, we must inspect the intermediate state of the program. Suppose that we select some point during execution of the program and print out all values in memory. We can inspect the results manually and decide whether they match our expectations. If they don't, we know for a fact that we can focus on the first half of the program. It either contains a bug, or our expectations of what it should produce were misguided. If the intermediate state does match our expectations, we can focus on the second half of the program. It either contains a bug, or our understanding of what input it expects was incorrect.
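[ed.: This halving argument is literally a binary search over execution. A toy sketch, with hypothetical `steps`/`reference` names, under the assumption that we can replay the program and know the correct state after each step:]

```python
def first_bad_step(steps, reference):
    # reference[k] = correct state after the first k steps, so reference[0]
    # is the initial state; invariant: replay(lo) matches, replay(hi) doesn't.
    def replay(k):
        s = reference[0]
        for f in steps[:k]:
            s = f(s)
        return s
    lo, hi = 0, len(steps)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if replay(mid) == reference[mid]:
            lo = mid          # first half checks out: bug is later
        else:
            hi = mid          # intermediate state already wrong: bug is earlier
    return lo                 # steps[lo] produced the first wrong state

# A four-step pipeline where step 2 subtracts 4 instead of 3.
correct = [lambda s: s + 1, lambda s: s * 2, lambda s: s - 3, lambda s: s ** 2]
buggy = [lambda s: s + 1, lambda s: s * 2, lambda s: s - 4, lambda s: s ** 2]
reference = [5]
for f in correct:
    reference.append(f(reference[-1]))
assert first_bad_step(buggy, reference) == 2
```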

Question Things Efficiently

For practical purposes, inspecting intermediate state usually doesn't involve a complete memory dump. We'll typically print a small number of variables and check whether they have the properties we expect of them. Verifying the behavior of a section of code involves:

1. Before it runs, inspecting all values in memory that may influence its behavior.

2. Reasoning about the expected behavior of the code.

3. After it runs, inspecting all values in memory that may be modified by the code.

Reasoning about expected behavior is typically the easiest step to perform even in the case of highly complex programs. Practically speaking, it's time-consuming and mentally strenuous to write debug output into your program and to read and decipher the resulting values. It is therefore advantageous to structure your code into functions and sections that pass a relatively small amount of information between themselves, minimizing the number of values you need to inspect.

...

Finding the Right Question to Ask

We’ve assumed so far that we have available a test case on which our program behaves unexpectedly. Sometimes, getting to that point can be half the battle. There are a few different approaches to finding a test case on which our program fails. It is reasonable to attempt them in the following order:

1. Verify correctness on the sample inputs.

2. Test additional small cases generated by hand.

3. Adversarially construct corner cases by hand.

4. Re-read the problem to verify understanding of input constraints.

5. Design large cases by hand and write a program to construct them.

6. Write a generator to construct large random cases and a brute force oracle to verify outputs.
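[ed.: Step 6 is worth showing concretely. A sketch with a hypothetical target problem (maximum subarray sum): random small cases, with a slow-but-obviously-correct oracle checking every output.]

```python
import random

def max_subarray_fast(xs):
    # Candidate O(n) solution under test (Kadane's algorithm).
    best = cur = xs[0]
    for x in xs[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

def max_subarray_oracle(xs):
    # Brute-force O(n^2) oracle: slow but obviously correct.
    return max(sum(xs[i:j]) for i in range(len(xs))
                            for j in range(i + 1, len(xs) + 1))

# Any mismatch prints the offending case, which becomes a small reproducer.
random.seed(0)
for _ in range(500):
    xs = [random.randint(-10, 10) for _ in range(random.randint(1, 8))]
    assert max_subarray_fast(xs) == max_subarray_oracle(xs), xs
```

Small random cases fail faster and are easier to minimize than large ones, which is why they come before large stress tests in the list above.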

techtariat
dan-luu
engineering
programming
debugging
IEEE
reflection
stories
education
higher-ed
checklists
iteration-recursion
divide-and-conquer
thinking
ground-up
nitty-gritty
giants
feynman
error
input-output
structure
composition-decomposition
abstraction
systematic-ad-hoc
reduction
teaching
state
correctness
multi
oly
oly-programming
metabuch
neurons
problem-solving
wire-guided
marginal
strategy
tactics
methodology
simplification-normalization

may 2019 by nhaliday

Performance Engineering of Software Systems | Electrical Engineering and Computer Science | MIT OpenCourseWare

unit mit ocw programming engineering nitty-gritty ground-up cs algorithms distributed concurrency performance caching systems c(pp) jvm video lectures slides sci-comp advanced metal-to-virtual

may 2019 by nhaliday

Hyperbolic angle - Wikipedia

november 2017 by nhaliday

A unit circle x² + y² = 1 has a circular sector with an area half of the circular angle in radians. Analogously, a unit hyperbola x² − y² = 1 has a hyperbolic sector with an area half of the hyperbolic angle.
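[ed.: The hyperbolic half can be checked numerically by approximating the sector with a fan of thin triangles from the origin to points (cosh t, sinh t):]

```python
import math

def hyperbolic_sector_area(u, n=100_000):
    # Fan of triangles (origin, p_i, p_{i+1}) along the arc t in [0, u];
    # each triangle's area is (x0*y1 - x1*y0) / 2.
    total = 0.0
    for i in range(n):
        t0, t1 = u * i / n, u * (i + 1) / n
        x0, y0 = math.cosh(t0), math.sinh(t0)
        x1, y1 = math.cosh(t1), math.sinh(t1)
        total += 0.5 * (x0 * y1 - x1 * y0)
    return total

u = 1.7  # an arbitrary hyperbolic angle
assert abs(hyperbolic_sector_area(u) - u / 2) < 1e-6
```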

nibble
math
trivia
wiki
reference
physics
relativity
concept
atoms
geometry
ground-up
characterization
measure
definition
plots
calculation
nitty-gritty
direction
metrics
manifolds
november 2017 by nhaliday

What is the connection between special and general relativity? - Physics Stack Exchange

november 2017 by nhaliday

Special relativity is the "special case" of general relativity where spacetime is flat. The speed of light is essential to both.

nibble
q-n-a
overflow
physics
relativity
explanation
synthesis
hi-order-bits
ground-up
gravity
summary
aphorism
differential
geometry
november 2017 by nhaliday

What is the difference between general and special relativity? - Quora

november 2017 by nhaliday

General Relativity is, quite simply, needed to explain gravity.

Special Relativity is the special case of GR, when the metric is flat — which means no gravity.

You need General Relativity when the metric gets all curvy, and when things start to experience gravitation.

nibble
q-n-a
qra
explanation
physics
relativity
synthesis
hi-order-bits
ground-up
gravity
summary
aphorism
differential
geometry

november 2017 by nhaliday

Genetics: CHROMOSOMAL MAPS AND MAPPING FUNCTIONS

october 2017 by nhaliday

Any particular gene has a specific location (its "locus") on a particular chromosome. For any two genes (or loci) alpha and beta, we can ask "What is the recombination frequency between them?" If the genes are on different chromosomes, the answer is 50% (independent assortment). If the two genes are on the same chromosome, the recombination frequency will be somewhere in the range from 0 to 50%. The "map unit" (1 cM) is the genetic map distance that corresponds to a recombination frequency of 1%. In large chromosomes, the cumulative map distance may be much greater than 50cM, but the maximum recombination frequency is 50%. Why? In large chromosomes, there is enough length to allow for multiple cross-overs, so we have to ask what result we expect for random multiple cross-overs.

1. How is it that random multiple cross-overs give the same result as independent assortment?

Figure 5.12 shows how the various double cross-over possibilities add up, resulting in gamete genotype percentages that are indistinguishable from independent assortment (50% parental type, 50% non-parental type). This is a very important figure. It provides the explanation for why genes that are far apart on a very large chromosome sort out in crosses just as if they were on separate chromosomes.

2. Is there a way to measure how close together two crossovers can occur involving the same two chromatids? That is, how could we measure whether there is spatial "interference"?

Figure 5.13 shows how a measurement of the gamete frequencies resulting from a "three point cross" can answer this question. If we were to get a "lower than expected" occurrence of recombinant genotypes aCb and AcB, it would suggest that there is some hindrance to the two cross-overs occurring this close together. Crosses of this type in Drosophila have shown that, in this organism, double cross-overs do not occur at distances of less than about 10 cM between the two cross-over sites. (Textbook, page 196.)

3. How does all of this lead to the "mapping function", the mathematical (graphical) relation between the observed recombination frequency (percent non-parental gametes) and the cumulative genetic distance in map units?

Figure 5.14 shows the result for the two extremes of "complete interference" and "no interference". The situation for real chromosomes in real organisms is somewhere between these extremes, such as the curve labelled "interference decreasing with distance".
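[ed.: The "no interference" extreme in Figure 5.14 can be sketched with Haldane's mapping function, r = (1 − e^(−2d))/2 with d in Morgans; the excerpt doesn't name the model, so this is an illustration rather than the textbook's exact curve.]

```python
import math

def recombination_frequency(d_cM):
    # Haldane mapping function (no interference): map distance in
    # centimorgans -> recombination frequency, saturating at 50%.
    d = d_cM / 100.0  # convert to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))

assert recombination_frequency(1) < 0.01                 # short range: r ~ d
assert abs(recombination_frequency(300) - 0.5) < 0.005   # long range: indistinguishable
                                                         # from independent assortment
```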

org:junk
org:edu
explanation
faq
nibble
genetics
genomics
bio
ground-up
magnitude
data
flux-stasis
homo-hetero
measure
orders
metric-space
limits
measurement

october 2017 by nhaliday

Best Topology Olympiad ***EVER*** - Affine Mess - Quora

october 2017 by nhaliday

Most people take courses in topology, algebraic topology, knot theory, differential topology and what have you without once doing anything with a finite topological space. There may have been some quirky questions about such spaces early on in a point-set topology course, but most of us come out of these courses thinking that finite topological spaces are either discrete or only useful as an exotic counterexample to some standard separation property. The mere idea of calculating the fundamental group for a 4-point space seems ludicrous.

Only it’s not. This is a genuine question, not a joke, and I find it both hilarious and super educational. DO IT!!

nibble
qra
announcement
math
geometry
topology
puzzles
rec-math
oly
links
math.AT
ground-up
finiteness
math.GN
october 2017 by nhaliday

Can You Pass Harvard's 1869 Entrance Exam? - Business Insider

september 2017 by nhaliday

hard classics + basicish math

news
org:lite
org:biz
history
pre-ww2
early-modern
usa
education
higher-ed
dysgenics
the-classics
canon
iron-age
mediterranean
war
quiz
psychometrics
math
geometry
ground-up
calculation
foreign-lang
comparison
big-peeps
multiplicative
old-anglo
harvard
elite
ranking
measurement
september 2017 by nhaliday

Resonance in a Pendulum - YouTube

september 2017 by nhaliday

The vibration of any given washer is able to transmit its energy only to another washer with exactly the same frequency. Since the length of a pendulum determines its frequency of vibration, each pendulum can only set another pendulum vibrating if it has the same length.
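Length determines frequency via the small-angle simple-pendulum formula f = (1/2π)√(g/L); a minimal numeric sketch (lengths here are arbitrary examples):

```python
import math

g = 9.81  # m/s^2

def pendulum_freq(length_m):
    """Natural frequency (Hz) of a simple pendulum, small-angle approximation."""
    return math.sqrt(g / length_m) / (2.0 * math.pi)

# Equal lengths -> equal frequencies -> resonance; a pendulum of a different
# length cannot efficiently absorb the energy.
f1 = pendulum_freq(0.25)  # quarter the length...
f2 = pendulum_freq(1.00)
print(f1, f2)  # ...twice the frequency
```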

nibble
video
social
physics
mechanics
waves
oscillation
synchrony
flux-stasis
increase-decrease
concrete
ground-up
dirty-hands
phys-energy
frequency
spreading
september 2017 by nhaliday

Power of a point - Wikipedia

september 2017 by nhaliday

The power of point P (see in Figure 1) can be defined equivalently as the product of distances from the point P to the two intersection points of any ray emanating from P.
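A numeric check of the invariance: for a line through P hitting the circle, the product of the signed distances to the two intersection points always equals d² − r² (the circle, point, and directions below are arbitrary examples):

```python
import math, random

def line_circle_params(px, py, dx, dy, cx, cy, r):
    """Signed distances t from P along unit direction (dx, dy) to the two
    intersections of the line with the circle (assumes the line crosses it)."""
    ox, oy = px - cx, py - cy
    b = 2 * (ox * dx + oy * dy)
    c = ox * ox + oy * oy - r * r      # note: c is exactly the power of P
    s = math.sqrt(b * b - 4 * c)
    return (-b - s) / 2, (-b + s) / 2

random.seed(0)
cx, cy, r = 0.0, 0.0, 2.0
px, py = 5.0, 1.0                      # a point outside the circle
power = (px - cx) ** 2 + (py - cy) ** 2 - r ** 2  # d^2 - r^2

for _ in range(5):
    # aim at a random interior point so the line is guaranteed to cross the circle
    ang, rad = random.uniform(0, 2 * math.pi), random.uniform(0, 0.9 * r)
    tx, ty = cx + rad * math.cos(ang), cy + rad * math.sin(ang)
    d = math.hypot(tx - px, ty - py)
    t1, t2 = line_circle_params(px, py, (tx - px) / d, (ty - py) / d, cx, cy, r)
    assert abs(t1 * t2 - power) < 1e-9  # product of distances is the same ray-independent constant
```

(The product of the roots of the quadratic t² + bt + c is c, which is exactly the power, so the invariance falls out of Vieta's formulas.)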

nibble
math
geometry
spatial
ground-up
concept
metrics
invariance
identity
atoms
wiki
reference
measure
yoga
calculation
september 2017 by nhaliday

Physics 152: Gravity, Fluids, Waves, Heat

september 2017 by nhaliday

lots of good lecture notes with pictures, worked examples, and simulations

unit
org:edu
org:junk
course
physics
mechanics
gravity
tidbits
symmetry
calculation
examples
lecture-notes
simulation
dynamic
dynamical
visualization
visual-understanding
ground-up
fluid
waves
oscillation
thermo
stat-mech
p:whenever
accretion
math.CA
hi-order-bits
nitty-gritty
linearity
spatial
space
entropy-like
temperature
proofs
yoga
plots
september 2017 by nhaliday

Centers of gravity in non-uniform fields - Wikipedia

september 2017 by nhaliday

In physics, a center of gravity of a material body is a point that may be used for a summary description of gravitational interactions. In a uniform gravitational field, the center of mass serves as the center of gravity. This is a very good approximation for smaller bodies near the surface of Earth, so there is no practical need to distinguish "center of gravity" from "center of mass" in most applications, such as engineering and medicine.

In a non-uniform field, gravitational effects such as potential energy, force, and torque can no longer be calculated using the center of mass alone. In particular, a non-uniform gravitational field can produce a torque on an object, even about an axis through the center of mass. The center of gravity seeks to explain this effect. Formally, a center of gravity is an application point of the resultant gravitational force on the body. Such a point may not exist, and if it exists, it is not unique. One can further define a unique center of gravity by approximating the field as either parallel or spherically symmetric.

The concept of a center of gravity as distinct from the center of mass is rarely used in applications, even in celestial mechanics, where non-uniform fields are important. Since the center of gravity depends on the external field, its motion is harder to determine than the motion of the center of mass. The common method to deal with gravitational torques is a field theory.

nibble
wiki
reference
physics
mechanics
intricacy
atoms
expectancy
spatial
direction
ground-up
concept
existence
uniqueness
homo-hetero
gravity
gotchas
september 2017 by nhaliday

Drude model - Wikipedia

september 2017 by nhaliday

The Drude model of electrical conduction was proposed in 1900[1][2] by Paul Drude to explain the transport properties of electrons in materials (especially metals). The model, which is an application of kinetic theory, assumes that the microscopic behavior of electrons in a solid may be treated classically and looks much like _a pinball machine_, with a sea of constantly jittering electrons bouncing and re-bouncing off heavier, relatively immobile positive ions.

The two most significant results of the Drude model are an electronic equation of motion,

d<p(t)>/dt = q(E + 1/m <p(t)> x B) - <p(t)>/τ

and a linear relationship between current density J and electric field E,

J = (nq^2τ/m) E

latter is Ohm's law
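Plugging rough order-of-magnitude numbers into σ = nq²τ/m (the carrier density and relaxation time below are assumed textbook-style values for copper, not precise data):

```python
# Rough numbers for copper (assumed, order-of-magnitude only):
n   = 8.5e28     # conduction electrons per m^3
q   = 1.602e-19  # electron charge, C
m   = 9.109e-31  # electron mass, kg
tau = 2.5e-14    # mean time between collisions, s

sigma = n * q**2 * tau / m  # DC conductivity from the Drude result J = sigma E
print(f"sigma = {sigma:.2e} S/m")  # comes out in the ~1e7 S/m range typical of good metals
```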

nibble
physics
electromag
models
local-global
stat-mech
identity
atoms
wiki
reference
ground-up
cartoons
september 2017 by nhaliday

Is it possible to recover Classical Mechanics from Schrödinger's equation? - Physics Stack Exchange

august 2017 by nhaliday

Classical limit of quantum mechanics: https://physics.stackexchange.com/questions/32112/classical-limit-of-quantum-mechanics

https://physics.stackexchange.com/questions/108222/from-quantum-mechanics-to-classical-mechanics

Classical Limit of Quantum Mechanics: https://mathoverflow.net/questions/102313/classical-limit-of-quantum-mechanics

How/when does quantum mechanics become classical mechanics?: https://www.quora.com/How-when-does-quantum-mechanics-become-classical-mechanics

Remarks concerning the status & some ramifications of EHRENFEST’S THEOREM: http://www.reed.edu/physics/faculty/wheeler/documents/Quantum%20Mechanics/Miscellaneous%20Essays/Ehrenfest's%20Theorem.pdf

nibble
q-n-a
overflow
physics
mechanics
quantum
scale
approximation
lens
limits
multi
synthesis
hi-order-bits
big-picture
ground-up
qra
magnitude
pdf
essay
papers
august 2017 by nhaliday

Introduction to Scaling Laws

august 2017 by nhaliday

https://betadecay.wordpress.com/2009/10/02/the-physics-of-scaling-laws-and-dimensional-analysis/

http://galileo.phys.virginia.edu/classes/304/scaling.pdf

Galileo’s Discovery of Scaling Laws: https://www.mtholyoke.edu/~mpeterso/classes/galileo/scaling8.pdf

Days 1 and 2 of Two New Sciences

An example of such an insight is “the surface of a small solid is comparatively greater than that of a large one” because the surface goes like the square of a linear dimension, but the volume goes like the cube. Thus as one scales down macroscopic objects, forces on their surfaces like viscous drag become relatively more important, and bulk forces like weight become relatively less important. Galileo uses this idea on the First Day in the context of resistance in free fall, as an explanation for why similar objects of different size do not fall exactly together, but the smaller one lags behind.
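The square-cube point in one line of arithmetic (a cube is used here just as a convenient example shape):

```python
# Surface ~ L^2 but volume ~ L^3, so surface/volume ~ 1/L:
# halving the size doubles the relative weight of surface forces (drag)
# against bulk forces (weight).
def cube_surface_to_volume(L):
    return (6 * L**2) / (L**3)  # = 6 / L

for L in (8.0, 4.0, 2.0, 1.0):
    print(L, cube_surface_to_volume(L))  # ratio doubles each time L halves
```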

nibble
org:junk
exposition
lecture-notes
physics
mechanics
street-fighting
problem-solving
scale
magnitude
estimate
fermi
mental-math
calculation
nitty-gritty
multi
scitariat
org:bleg
lens
tutorial
guide
ground-up
tricki
skeleton
list
cheatsheet
identity
levers
hi-order-bits
yoga
metabuch
pdf
article
essay
history
early-modern
europe
the-great-west-whale
science
the-trenches
discovery
fluid
architecture
oceans
giants
tidbits
elegance
august 2017 by nhaliday

Inscribed angle - Wikipedia

august 2017 by nhaliday

pf:

- for triangle w/ one side = a diameter, draw isosceles triangle and use supplementary angle identities

- otherwise draw second triangle w/ side = a diameter, and use above result twice

nibble
math
geometry
spatial
ground-up
wiki
reference
proofs
identity
levers
yoga
august 2017 by nhaliday

Analysis of variance - Wikipedia

july 2017 by nhaliday

Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among group means and their associated procedures (such as "variation" among and between groups), developed by statistician and evolutionary biologist Ronald Fisher. In the ANOVA setting, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups. ANOVAs are useful for comparing (testing) three or more means (groups or variables) for statistical significance. It is conceptually similar to multiple two-sample t-tests, but is more conservative (results in less type I error) and is therefore suited to a wide range of practical problems.

good pic: https://en.wikipedia.org/wiki/Analysis_of_variance#Motivating_example

tutorial by Gelman: http://www.stat.columbia.edu/~gelman/research/published/econanova3.pdf

so one way to think of partitioning the variance:

y_ij = alpha_i + beta_j + eps_ij

Var(y_ij) = Var(alpha_i) + Var(beta_j) + Cov(alpha_i, beta_j) + Var(eps_ij)

and alpha_i, beta_j are independent, so Cov(alpha_i, beta_j) = 0

can you make this work w/ interaction effects?
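A quick simulation of the additive (no-interaction) decomposition above, checking that the component variances add up (effect sizes and grid dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J = 200, 200
alpha = rng.normal(0, 2.0, size=(I, 1))  # row effects, sd 2
beta  = rng.normal(0, 1.0, size=(1, J))  # column effects, sd 1
eps   = rng.normal(0, 0.5, size=(I, J))  # noise, sd 0.5
y = alpha + beta + eps                   # y_ij = alpha_i + beta_j + eps_ij

total = y.var()
parts = alpha.var() + beta.var() + eps.var()
# The alpha-beta cross term vanishes exactly over the full grid (the grid mean
# of alpha_i * beta_j factors), and the cross terms with eps are O(1/sqrt(IJ)).
print(total, parts)  # both near 4 + 1 + 0.25
```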

data-science
stats
methodology
hypothesis-testing
variance-components
concept
conceptual-vocab
thinking
wiki
reference
nibble
multi
visualization
visual-understanding
pic
pdf
exposition
lecture-notes
gelman
scitariat
tutorial
acm
ground-up
yoga
july 2017 by nhaliday

co.combinatorics - Classification of Platonic solids - MathOverflow

july 2017 by nhaliday

My question is very basic: where can I find a complete (and hopefully self-contained) proof of the classification of Platonic solids? In all the references that I found, they use Euler's formula v − e + f = 2 to show that there are exactly five possible triples (v, e, f). But of course this is not a complete proof because it does not rule out the possibility of different configurations or deformations. Has anyone ever written up a complete proof of this statement?!

...

This is a classical question. Here is my reading of it: Why is there a unique polytope with given combinatorics of faces, which are all regular polygons? Of course, for simple polytopes (tetrahedron, cube, dodecahedron) this is clear, but for the octahedron and icosahedron this is less clear.

The answer lies in Cauchy's theorem. It was Legendre who, while writing his Elements of Geometry and Trigonometry, noticed that Euclid's proof in the Elements is incomplete. Curiously, Euclid finds both radii of inscribed and circumscribed spheres (correctly) without ever explaining why they exist. Cauchy worked out a proof while still a student in 1813, more or less specifically for this purpose. The proof also had a technical gap which was found and patched up by Steinitz in the 1920s.

The complete (corrected) proof can be found in the celebrated Proofs from the Book, or in Marcel Berger's Geometry. My book gives a bit more of historical context and some soft arguments (ch. 19). It's worth comparing this proof with (an erroneous) pre-Steinitz exposition, say in Hadamard's Leçons de Géométrie Elémentaire II, or with an early post-Steinitz correct but tedious proof given in (otherwise, excellent) Alexandrov's monograph (see also ch.26 in my book which compares all the approaches).

P.S. Note that Coxeter in Regular Polytopes can completely avoid this issue by taking a different (modern) definition of the regular polytopes (which are symmetric under group actions). For a modern exposition and the state of the art of this approach, see McMullen and Schulte's Abstract Regular Polytopes.

https://en.wikipedia.org/wiki/Platonic_solid#Classification

https://mathoverflow.net/questions/46502/on-the-number-of-archimedean-solids
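The combinatorial half of the argument (the part Euler's formula does give you) can be enumerated directly: with p-gon faces and q faces at each vertex, pf = 2e = qv and v − e + f = 2 force 1/p + 1/q = 1/2 + 1/e, leaving exactly five Schläfli symbols. This sketch shows only that step; uniqueness/rigidity still needs Cauchy's theorem, as the answer explains.

```python
from fractions import Fraction

solids = []
for p in range(3, 20):       # p-gon faces
    for q in range(3, 20):   # q faces meeting at each vertex
        excess = Fraction(1, p) + Fraction(1, q) - Fraction(1, 2)
        if excess > 0:       # 1/p + 1/q > 1/2 is required since 1/e > 0
            e = int(1 / excess)          # edges, from 1/p + 1/q = 1/2 + 1/e
            v, f = 2 * e // q, 2 * e // p
            solids.append(((p, q), (v, e, f)))

for s in solids:
    print(s)
# exactly five: {3,3} tetrahedron, {3,4} octahedron, {3,5} icosahedron,
# {4,3} cube, {5,3} dodecahedron
```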

q-n-a
overflow
math
topology
geometry
math.CO
history
iron-age
mediterranean
the-classics
multi
curiosity
clarity
proofs
nibble
wiki
reference
characterization
uniqueness
list
ground-up
grokkability-clarity
july 2017 by nhaliday

Harmonic mean - Wikipedia

july 2017 by nhaliday

The harmonic mean is a Schur-concave function, and dominated by the minimum of its arguments, in the sense that for any positive set of arguments, min(x_1, …, x_n) ≤ H(x_1, …, x_n) ≤ n · min(x_1, …, x_n). Thus, the harmonic mean cannot be made arbitrarily large by changing some values to bigger ones (while having at least one value unchanged).

more generally, for the weighted mean w/ Pr(x_i)=t_i, H(x1,...,xn) <= x_i/t_i
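A randomized sanity check of the min ≤ H ≤ n·min sandwich (the upper bound follows from Σ 1/x_i ≥ 1/min):

```python
import random

def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)

random.seed(1)
for _ in range(1000):
    xs = [random.uniform(0.1, 100.0) for _ in range(5)]
    h, m = harmonic_mean(xs), min(xs)
    assert m <= h <= len(xs) * m  # dominated by n * min, bounded below by min
print("bounds hold")
```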

nibble
math
properties
estimate
concept
definition
wiki
reference
extrema
magnitude
expectancy
metrics
ground-up
july 2017 by nhaliday

Lecture 6: Nash Equilibrum Existence

june 2017 by nhaliday

pf:

- For mixed strategy profile p ∈ Δ(A), let g_ij(p) = gain for player i to switch to pure strategy j.

- Define y: Δ(A) -> Δ(A) by y_ij(p) ∝ p_ij + g_ij(p) (normalizing constant = 1 + ∑_k g_ik(p)).

- Look at fixed point of y.
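A minimal sketch of the gain map for a 2x2 game (matching pennies is used here as an example; the lecture's argument is the general existence proof via Brouwer). At the known mixed equilibrium all gains g_ij vanish, so the profile is a fixed point of y:

```python
# Matching pennies: player 0 wants to match, player 1 wants to mismatch.
payoff = {
    0: lambda a0, a1: 1.0 if a0 == a1 else -1.0,
    1: lambda a0, a1: -1.0 if a0 == a1 else 1.0,
}

def expected(i, p):
    """Expected payoff to player i under mixed profile p = (p0, p1)."""
    return sum(p[0][a0] * p[1][a1] * payoff[i](a0, a1)
               for a0 in (0, 1) for a1 in (0, 1))

def gain(i, j, p):
    """g_ij(p): player i's gain from switching to pure strategy j (clipped at 0)."""
    pure = [list(p[0]), list(p[1])]
    pure[i] = [1.0 if a == j else 0.0 for a in (0, 1)]
    return max(0.0, expected(i, pure) - expected(i, p))

def improve(p):
    """One application of the map y_ij(p) = (p_ij + g_ij(p)) / (1 + sum_k g_ik(p))."""
    out = []
    for i in (0, 1):
        z = 1.0 + sum(gain(i, k, p) for k in (0, 1))
        out.append([(p[i][j] + gain(i, j, p)) / z for j in (0, 1)])
    return out

uniform = [[0.5, 0.5], [0.5, 0.5]]
print(improve(uniform))  # unchanged: the mixed equilibrium is a fixed point of y
```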

pdf
nibble
lecture-notes
exposition
acm
game-theory
proofs
math
topology
existence
fixed-point
simplex
equilibrium
ground-up
june 2017 by nhaliday

Snp With More Than One Base ?

may 2017 by nhaliday

https://biology.stackexchange.com/questions/35260/what-is-the-difference-between-single-nucleotide-polymorphism-snp-mutation-an

Novel multi-nucleotide polymorphisms in the human genome characterized by whole genome and exome sequencing: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2952858/

MAC: identifying and correcting annotation for multi-nucleotide variations: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1779-7

A Novel Type of Sequence Variation: Multiple-Nucleotide Length Polymorphisms Discovered in the Bovine Genome: http://www.genetics.org/content/176/1/403

Technologic Issues in GWAS and Follow-up Studies: https://www.genome.gov/pages/about/od/opg/multi-ic_symposia/may2007/techissues.pdf

Types of Polymorphisms

Understanding SNPs and INDELs in microbial genomes: http://thegenomefactory.blogspot.com/2013/10/understanding-snps-and-indels-in.html

https://f1000research.com/posters/2090

cf: https://pinboard.in/u:nhaliday/b:7ec826f19ade

q-n-a
stackex
bio
genetics
genomics
population-genetics
explanation
mutation
pop-structure
multi
bioinformatics
study
curiosity
org:nat
nature
methodology
QTL
concept
conceptual-vocab
structure
ground-up
🌞
chart
pdf
slides
presentation
nibble
may 2017 by nhaliday

Fourier transform - Wikipedia

april 2017 by nhaliday

https://en.wikipedia.org/wiki/Fourier_transform#Properties_of_the_Fourier_transform

https://en.wikipedia.org/wiki/Fourier_transform#Tables_of_important_Fourier_transforms

nibble
math
acm
math.CA
fourier
list
identity
duality
math.CV
wiki
reference
multi
objektbuch
cheatsheet
calculation
nitty-gritty
concept
examples
integral
AMT
ground-up
IEEE
properties
april 2017 by nhaliday

Why Momentum Really Works

acmtariat techtariat org:bleg nibble machine-learning acm optimization gradient-descent exposition explanation yoga dynamic visualization visual-understanding better-explained linear-algebra iterative-methods iteration-recursion polynomials dynamical metabuch let-me-see ground-up oscillation fourier curvature convexity-curvature analysis concept atoms org:popup

april 2017 by nhaliday


Understanding statistics through interactive visualizations

explanation list visualization gotchas paradox stats methodology hypothesis-testing visual-understanding better-explained links regression-to-mean metabuch examples data-science street-fighting intuition ground-up nitty-gritty

march 2017 by nhaliday


Fundamental Theorems of Evolution: The American Naturalist: Vol 0, No 0

march 2017 by nhaliday

I suggest that the most fundamental theorem of evolution is the Price equation, both because of its simplicity and broad scope and because it can be used to derive four other familiar results that are similarly fundamental: Fisher’s average-excess equation, Robertson’s secondary theorem of natural selection, the breeder’s equation, and Fisher’s fundamental theorem. These derivations clarify both the relationships behind these results and their assumptions. Slightly less fundamental results include those for multivariate evolution and social selection. A key feature of fundamental theorems is that they have great simplicity and scope, which are often achieved by sacrificing perfect accuracy. Quantitative genetics has been more productive of fundamental theorems than population genetics, probably because its empirical focus on unknown genotypes freed it from the tyranny of detail and allowed it to focus on general issues.

study
essay
bio
evolution
population-genetics
fisher
selection
EGT
dynamical
exposition
methodology
🌞
big-picture
levers
list
nibble
article
chart
explanation
clarity
trees
ground-up
ideas
grokkability-clarity
march 2017 by nhaliday

Linear Algebra Review

february 2017 by nhaliday

slightly modified?: http://cs229.stanford.edu/section/cs229-linalg.pdf

unit
linear-algebra
ground-up
math
acm
machine-learning
optimization
differential
video
lectures
lecture-notes
init
tutorial
calculation
nitty-gritty
multi
stanford
linearity
p:null
february 2017 by nhaliday

More on Multivariate Gaussians

february 2017 by nhaliday

Fact #1: mean and covariance uniquely determine distribution

Fact #3: closure under sum, marginalizing, and conditioning

covariance of conditional distribution is given by a Schur complement (independent of x_B. is that obvious?)
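On the parenthetical question: yes, it's visible from the formula — the conditional covariance Σ_AA − Σ_AB Σ_BB⁻¹ Σ_BA contains no x_B. A numeric cross-check against the block-inverse identity (the conditional covariance equals the inverse of the AA block of the precision matrix); the covariance here is a random example:

```python
import numpy as np

rng = np.random.default_rng(0)
# Random symmetric positive-definite covariance, split into blocks A and B.
M = rng.normal(size=(5, 5))
Sigma = M @ M.T + 5 * np.eye(5)
A, B = slice(0, 2), slice(2, 5)

# Conditional covariance of x_A given x_B: the Schur complement of Sigma_BB.
schur = Sigma[A, A] - Sigma[A, B] @ np.linalg.inv(Sigma[B, B]) @ Sigma[B, A]

# Cross-check: equals the inverse of the AA block of the precision matrix.
Lam = np.linalg.inv(Sigma)
print(np.allclose(schur, np.linalg.inv(Lam[A, A])))  # True
```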

pdf
exposition
lecture-notes
stanford
nibble
distribution
acm
machine-learning
probability
levers
calculation
ground-up
characterization
rigidity
closure
nitty-gritty
linear-algebra
properties
february 2017 by nhaliday

Hoeffding’s Inequality

february 2017 by nhaliday

basic idea of standard pf: bound e^{tX} by line segment (convexity) then use Taylor expansion (in p = b/(b-a), the fraction of range to right of 0) of logarithm
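An empirical check of the one-sided bound P(X̄ − μ ≥ t) ≤ exp(−2nt²) for coin flips bounded in [0, 1] (sample sizes here are arbitrary):

```python
import math, random

random.seed(0)
n, t, trials = 200, 0.1, 5000
bound = math.exp(-2 * n * t * t)  # Hoeffding for X_i in [0, 1]: exp(-2 n t^2)

exceed = 0
for _ in range(trials):
    mean = sum(random.random() < 0.5 for _ in range(n)) / n  # fair coin, mu = 1/2
    if mean - 0.5 >= t:
        exceed += 1

print(exceed / trials, "<=", bound)  # empirical tail comfortably under the bound
```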

pdf
lecture-notes
exposition
nibble
concentration-of-measure
estimate
proofs
ground-up
acm
probability
series
s:null
february 2017 by nhaliday

CS229 Supplemental Lecture notes: Hoeffding’s inequality

february 2017 by nhaliday

- weaker by a constant factor

- uses symmetrization instead of Taylor series

pdf
lecture-notes
exposition
nibble
proofs
concentration-of-measure
estimate
machine-learning
acm
probability
math
moments
ground-up
stanford
symmetry
curvature
convexity-curvature
february 2017 by nhaliday

Do grad school students remember everything they were taught in college all the time? - Quora

q-n-a qra grad-school learning synthesis hi-order-bits neurons physics lens analogy cartoons links 🎓 scholar gowers mathtariat feynman giants quotes games nibble thinking zooming retention meta:research big-picture skeleton s:** p:whenever wire-guided narrative intuition lesswrong commentary ground-up limits examples problem-solving info-dynamics knowledge studying ideas the-trenches chart

february 2017 by nhaliday


Einstein's Most Famous Thought Experiment

february 2017 by nhaliday

When Einstein abandoned an emission theory of light, he had also to abandon the hope that electrodynamics could be made to conform to the principle of relativity by the normal sorts of modifications to electrodynamic theory that occupied the theorists of the second half of the 19th century. Instead Einstein knew he must resort to extraordinary measures. He was willing to seek realization of his goal in a re-examination of our basic notions of space and time. Einstein concluded his report on his youthful thought experiment:

"One sees that in this paradox the germ of the special relativity theory is already contained. Today everyone knows, of course, that all attempts to clarify this paradox satisfactorily were condemned to failure as long as the axiom of the absolute character of time, or of simultaneity, was rooted unrecognized in the unconscious. To recognize clearly this axiom and its arbitrary character already implies the essentials of the solution of the problem."

einstein
giants
physics
history
stories
gedanken
exposition
org:edu
electromag
relativity
nibble
innovation
novelty
the-trenches
synchrony
discovery
🔬
org:junk
science
absolute-relative
visuo
explanation
ground-up
clarity
state
causation
intuition
ideas
mostly-modern
pre-ww2
marginal
grokkability-clarity
february 2017 by nhaliday

A VERY BRIEF REVIEW OF MEASURE THEORY

february 2017 by nhaliday

A brief philosophical discussion:

Measure theory, as much as any branch of mathematics, is an area where it is important to be acquainted with the basic notions and statements, but not desperately important to be acquainted with the detailed proofs, which are often rather unilluminating. One should always have in mind a place where one could go and look if one ever did need to understand a proof: for me, that place is Rudin’s Real and Complex Analysis (Rudin’s “red book”).

gowers
pdf
math
math.CA
math.FA
philosophy
measure
exposition
synthesis
big-picture
hi-order-bits
ergodic
ground-up
summary
roadmap
mathtariat
proofs
nibble
unit
integral
zooming
p:whenever
february 2017 by nhaliday

Cauchy-Schwarz inequality and Hölder's inequality - Mathematics Stack Exchange

january 2017 by nhaliday

- Cauchy-Schwarz (special case of Hölder's inequality where p=q=2) implies Hölder's inequality

- pith: define potential F(t) = int f^{pt} g^{q(1-t)}, show log F is midpoint-convex hence convex, then apply convexity between F(0) and F(1) for F(1/p) = ||fg||_1
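A randomized finite-dimensional check of the inequality ||fg||_1 ≤ ||f||_p ||g||_q for conjugate exponents (p = 3 here is an arbitrary choice; p = q = 2 recovers Cauchy-Schwarz):

```python
import random

def norm(v, p):
    """l^p norm of a finite sequence."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

random.seed(0)
p = 3.0
q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1
for _ in range(1000):
    f = [random.uniform(-1, 1) for _ in range(10)]
    g = [random.uniform(-1, 1) for _ in range(10)]
    lhs = sum(abs(x * y) for x, y in zip(f, g))  # ||fg||_1
    assert lhs <= norm(f, p) * norm(g, q) + 1e-12
print("Holder's inequality holds on all samples")
```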

q-n-a
overflow
math
estimate
proofs
ground-up
math.FA
inner-product
tidbits
norms
duality
nibble
integral
january 2017 by nhaliday

interpretation - How to understand degrees of freedom? - Cross Validated

january 2017 by nhaliday

From Wikipedia, there are three interpretations of the degrees of freedom of a statistic:

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom (df). In general, the degrees of freedom of an estimate of a parameter is equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself (which, in sample variance, is one, since the sample mean is the only intermediate step).

Mathematically, degrees of freedom is the dimension of the domain of a random vector, or essentially the number of 'free' components: how many components need to be known before the vector is fully determined.

...

This is a subtle question. It takes a thoughtful person not to understand those quotations! Although they are suggestive, it turns out that none of them is exactly or generally correct. I haven't the time (and there isn't the space here) to give a full exposition, but I would like to share one approach and an insight that it suggests.

Where does the concept of degrees of freedom (DF) arise? The contexts in which it's found in elementary treatments are:

- The Student t-test and its variants such as the Welch or Satterthwaite solutions to the Behrens-Fisher problem (where two populations have different variances).

- The Chi-squared distribution (defined as a sum of squares of independent standard Normals), which is implicated in the sampling distribution of the variance.

- The F-test (of ratios of estimated variances).

- The Chi-squared test, comprising its uses in (a) testing for independence in contingency tables and (b) testing for goodness of fit of distributional estimates.

In spirit, these tests run a gamut from being exact (the Student t-test and F-test for Normal variates) to being good approximations (the Student t-test and the Welch/Satterthwaite tests for not-too-badly-skewed data) to being based on asymptotic approximations (the Chi-squared test). An interesting aspect of some of these is the appearance of non-integral "degrees of freedom" (the Welch/Satterthwaite tests and, as we will see, the Chi-squared test). This is of especial interest because it is the first hint that DF is not any of the things claimed of it.

...

Having been alerted by these potential ambiguities, let's hold up the Chi-squared goodness of fit test for examination, because (a) it's simple, (b) it's one of the common situations where people really do need to know about DF to get the p-value right and (c) it's often used incorrectly. Here's a brief synopsis of the least controversial application of this test:

...

This, many authorities tell us, should have (to a very close approximation) a Chi-squared distribution. But there's a whole family of such distributions. They are differentiated by a parameter ν often referred to as the "degrees of freedom." The standard reasoning about how to determine ν goes like this:

I have k counts. That's k pieces of data. But there are (functional) relationships among them. To start with, I know in advance that the sum of the counts must equal n. That's one relationship. I estimated two (or p, generally) parameters from the data. That's two (or p) additional relationships, giving p+1 total relationships. Presuming they (the parameters) are all (functionally) independent, that leaves only k−p−1 (functionally) independent "degrees of freedom": that's the value to use for ν.
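In the simplest case where this accounting does hold (k cells, cell probabilities fully specified in advance, so no estimated parameters and ν = k − 1), a Monte Carlo check is easy, since a chi-squared_ν variate has mean ν. A pure-stdlib sketch (all names hypothetical):

```python
import random

# Goodness-of-fit statistic for a fair k-sided die: probabilities are
# known in advance (p = 0 estimated parameters), so nu = k - 1, and
# the statistic should average about nu over many repetitions.
random.seed(1)
k, n, trials = 5, 200, 2000
expected = [n / k] * k

stats = []
for _ in range(trials):
    counts = [0] * k
    for _ in range(n):
        counts[random.randrange(k)] += 1
    stats.append(sum((o - e) ** 2 / e for o, e in zip(counts, expected)))

mean_stat = sum(stats) / trials
assert abs(mean_stat - (k - 1)) < 0.5  # close to nu = 4
```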

The problem with this reasoning (which is the sort of calculation the quotations in the question are hinting at) is that it's wrong except when some special additional conditions hold. Moreover, those conditions have nothing to do with independence (functional or statistical), with numbers of "components" of the data, with the numbers of parameters, nor with anything else referred to in the original question.

...

Things went wrong because I violated two requirements of the Chi-squared test:

1. You must use the Maximum Likelihood estimate of the parameters. (This requirement can, in practice, be slightly violated.)

2. You must base that estimate on the counts, not on the actual data! (This is crucial.)

...

The point of this comparison--which I hope you have seen coming--is that the correct DF to use for computing the p-values depends on many things other than dimensions of manifolds, counts of functional relationships, or the geometry of Normal variates. There is a subtle, delicate interaction between certain functional dependencies, as found in mathematical relationships among quantities, and distributions of the data, their statistics, and the estimators formed from them. Accordingly, it cannot be the case that DF is adequately explainable in terms of the geometry of multivariate normal distributions, or in terms of functional independence, or as counts of parameters, or anything else of this nature.

We are led to see, then, that "degrees of freedom" is merely a heuristic that suggests what the sampling distribution of a (t, Chi-squared, or F) statistic ought to be, but it is not dispositive. Belief that it is dispositive leads to egregious errors. (For instance, the top hit on Google when searching "chi squared goodness of fit" is a Web page from an Ivy League university that gets most of this completely wrong! In particular, a simulation based on its instructions shows that the chi-squared value it recommends as having 7 DF actually has 9 DF.)

q-n-a
overflow
stats
data-science
concept
jargon
explanation
methodology
things
nibble
degrees-of-freedom
clarity
curiosity
manifolds
dimensionality
ground-up
intricacy
hypothesis-testing
examples
list
ML-MAP-E
gotchas
grokkability-clarity

teaching - Intuitive explanation for dividing by $n-1$ when calculating standard deviation? - Cross Validated

january 2017 by nhaliday

The standard deviation calculated with a divisor of n-1 is a standard deviation calculated from the sample as an estimate of the standard deviation of the population from which the sample was drawn. Because the observed values fall, on average, closer to the sample mean than to the population mean, the standard deviation which is calculated using deviations from the sample mean underestimates the desired standard deviation of the population. Using n-1 instead of n as the divisor corrects for that by making the result a little bit bigger.

Note that the correction has a larger proportional effect when n is small than when it is large, which is what we want because when n is larger the sample mean is likely to be a good estimator of the population mean.

...

A common one is that the definition of variance (of a distribution) is the second moment recentered around a known, definite mean, whereas the estimator uses an estimated mean. This loss of a degree of freedom (given the mean, you can reconstitute the dataset with knowledge of just n−1 of the data values) requires the use of n−1 rather than n to "adjust" the result.
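The bias is easy to see by simulation: averaging the divide-by-n estimator over many samples from a known population undershoots the true variance by a factor of (n−1)/n, while the divide-by-(n−1) version centers on the truth. An illustrative sketch:

```python
import random

# Population: standard normal, true variance 1. With samples of size n,
# E[divide-by-n estimate] = (n - 1)/n, E[divide-by-(n-1) estimate] = 1.
random.seed(2)
n, trials = 5, 50000

biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    biased_sum += ss / n
    unbiased_sum += ss / (n - 1)

biased = biased_sum / trials      # near (n - 1)/n = 0.8
unbiased = unbiased_sum / trials  # near 1.0
assert abs(biased - (n - 1) / n) < 0.05
assert abs(unbiased - 1.0) < 0.05
```

Note the correction factor matters most at small n, echoing the point above: at n = 5 the uncorrected estimator is off by 20% on average.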

q-n-a
overflow
stats
acm
intuition
explanation
bias-variance
methodology
moments
nibble
degrees-of-freedom
sampling-bias
generalization
dimensionality
ground-up
intricacy

linear algebra - What's an intuitive way to think about the determinant? - Mathematics Stack Exchange

january 2017 by nhaliday

goes through the standard volume of parallelepiped/multilinear alternating map formulations
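A tiny 2D instance of those formulations (`det2` is an illustrative helper): |det| of a 2×2 matrix is the area of the parallelogram spanned by its rows, the sign flips under a row swap (alternating), and it scales linearly in each row (multilinear).

```python
def det2(u, v):
    # determinant of the 2x2 matrix with rows u and v
    return u[0] * v[1] - u[1] * v[0]

u, v = (3, 0), (1, 2)
assert det2(u, v) == 6        # parallelogram: base 3, height 2, area 6
assert det2(v, u) == -6       # alternating: swapping rows flips the sign
assert det2((6, 0), v) == 12  # multilinear: doubling a row doubles det
```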

q-n-a
overflow
intuition
math
linear-algebra
ground-up
explanation
characterization
spatial
measure
nibble
identity
