nhaliday + grokkability-clarity   101

How is definiteness expressed in languages with no definite article, clitic or affix? - Linguistics Stack Exchange
All languages, as far as we know, do something to mark information status. Basically this means that when you refer to an X, you have to do something to indicate the answer to questions like:
1. Do you have a specific X in mind?
2. If so, do you think your hearer is familiar with the X you're talking about?
3. If so, have you already been discussing that X for a while, or is it new to the conversation?
4. If you've been discussing the X for a while, has it been the main topic of conversation?

Question #2 is more or less what we mean by "definiteness."
...

But there are lots of other information-status-marking strategies that don't directly involve definiteness marking. For example:
...
q-n-a  stackex  language  foreign-lang  linguistics  lexical  syntax  concept  conceptual-vocab  thinking  things  span-cover  direction  degrees-of-freedom  communication  anglo  japan  china  asia  russia  mediterranean  grokkability-clarity  intricacy  uniqueness  number  universalism-particularism  whole-partial-many  usa  latin-america  farmers-and-foragers  nordic  novelty  trivia  duplication  dependence-independence  spanish  context  orders  water  comparison 
6 weeks ago by nhaliday
selenium - What is the difference between cssSelector & Xpath and which is better with respect to performance for cross browser testing? - Stack Overflow
CSS selectors perform far better than XPath, and this is well documented in the Selenium community. Here are some reasons:
- XPath engines are different in each browser, which makes them inconsistent
- IE does not have a native XPath engine, therefore Selenium injects its own XPath engine for compatibility of its API. Hence we lose the advantage of using native browser features that WebDriver inherently promotes.
- XPath expressions tend to become complex and hence hard to read, in my opinion
However there are some situations where you need to use XPath, for example searching for a parent element or searching for an element by its text (I wouldn't recommend the latter).
--
I’m going to hold the opinion, unpopular under the SO selenium tag, that XPath is preferable to CSS in the longer run.

This long post has two sections - first I'll put a back-of-the-napkin proof that the performance difference between the two is 0.1-0.3 milliseconds (yes, that's a few hundred microseconds), and then I'll share my opinion why XPath is more powerful.

...

With the performance out of the picture, why do I think xpath is better? Simple – versatility, and power.

Xpath is a language developed for working with XML documents; as such, it allows for much more powerful constructs than css.
For example, navigation in every direction in the tree – find an element, then go to its grandparent and search for a child of it having certain properties.
It allows embedded boolean conditions – cond1 and not(cond2 or not(cond3 and cond4)); embedded selectors – "find a div having these children with these attributes, and then navigate according to it".
XPath allows searching based on a node's value (its text) – however frowned upon this practice is, it does come in handy especially in badly structured documents (no definite attributes to step on, like dynamic ids and classes - locate the element by its text content).

Getting started with css is definitely easier – one can start writing selectors in a matter of minutes; but after a couple of days of usage, the power and possibilities xpath offers quickly overcome css.
And purely subjective – a complex css selector is much harder to read than a complex xpath expression.
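For concreteness, here is a rough sketch (mine, not from either answer) of the same element located both ways with Selenium's Python bindings; the page, ids and names are hypothetical, and the last two locators use XPath features CSS cannot express:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/login")  # hypothetical page

# Same element, two locator strategies:
css_field = driver.find_element(By.CSS_SELECTOR, "form#login input[name='email']")
xpath_field = driver.find_element(By.XPATH, "//form[@id='login']//input[@name='email']")

# Things CSS cannot do: walk back up the tree, or match on text content.
row = driver.find_element(By.XPATH, "//input[@name='email']/ancestor::div[1]")
link = driver.find_element(By.XPATH, "//a[text()='Forgot password?']")

driver.quit()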
q-n-a  stackex  comparison  best-practices  programming  yak-shaving  python  javascript  web  frontend  performance  DSL  debate  grokkability  trees  grokkability-clarity 
august 2019 by nhaliday
testing - Is there a reason that tests aren't written inline with the code that they test? - Software Engineering Stack Exchange
The only advantage I can think of for inline tests would be reducing the number of files to be written. With modern IDEs this really isn't that big a deal.

There are, however, a number of obvious drawbacks to inline testing:
- It violates separation of concerns. This may be debatable, but to me testing functionality is a different responsibility than implementing it.
- You'd either have to introduce new language features to distinguish between tests/implementation, or you'd risk blurring the line between the two.
- Larger source files are harder to work with: harder to read, harder to understand, you're more likely to have to deal with source control conflicts.
- I think it would make it harder to put your "tester" hat on, so to speak. If you're looking at the implementation details, you'll be more tempted to skip implementing certain tests.
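For concreteness, one existing form of inline tests is Python's doctest, where the tests live in the docstring next to the implementation (my illustration, not from the answer):

def slugify(title: str) -> str:
    """Lowercase a title and replace runs of whitespace with hyphens.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Already-slugged  ")
    'already-slugged'
    """
    return "-".join(title.strip().lower().split())

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs the docstring examples above as tests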
q-n-a  stackex  programming  engineering  best-practices  debate  correctness  checking  code-organizing  composition-decomposition  coupling-cohesion  psychology  cog-psych  attention  thinking  neurons  contiguity-proximity  grokkability  grokkability-clarity 
august 2019 by nhaliday
An Eye Tracking Study on camelCase and under_score Identifier Styles - IEEE Conference Publication
One main difference is that subjects were trained mainly in the underscore style and were all programmers. While results indicate no difference in accuracy between the two styles, subjects recognize identifiers in the underscore style more quickly.

To CamelCase or Under_score: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.9499
An empirical study of 135 programmers and non-programmers was conducted to better understand the impact of identifier style on code readability. The experiment builds on past work of others who study how readers of natural language perform such tasks. Results indicate that camel casing leads to higher accuracy among all subjects regardless of training, and those trained in camel casing are able to recognize identifiers in the camel case style faster than identifiers in the underscore style.

https://en.wikipedia.org/wiki/Camel_case#Readability_studies
A 2009 study comparing snake case to camel case found that camel case identifiers could be recognised with higher accuracy among both programmers and non-programmers, and that programmers already trained in camel case were able to recognise those identifiers faster than underscored snake-case identifiers.[35]

A 2010 follow-up study, under the same conditions but using an improved measurement method with use of eye-tracking equipment, indicates: "While results indicate no difference in accuracy between the two styles, subjects recognize identifiers in the underscore style more quickly."[36]
study  psychology  cog-psych  hci  programming  best-practices  stylized-facts  null-result  multi  wiki  reference  concept  empirical  evidence-based  efficiency  accuracy  time  code-organizing  grokkability  protocol-metadata  form-design  grokkability-clarity 
july 2019 by nhaliday
history - Why are UNIX/POSIX system call namings so illegible? - Unix & Linux Stack Exchange
It's due to the technical constraints of the time. The POSIX standard was created in the 1980s and referred to UNIX, which was born in the 1970s. Several C compilers at that time were limited to identifiers that were 6 or 8 characters long, so that settled the standard for the length of variable and function names.

http://neverworkintheory.org/2017/11/26/abbreviated-full-names.html
We carried out a family of controlled experiments to investigate whether the use of abbreviated identifier names, with respect to full-word identifier names, affects fault fixing in C and Java source code. This family consists of an original (or baseline) controlled experiment and three replications. We involved 100 participants with different backgrounds and experiences in total. Overall results suggested that there is no difference in terms of effort, effectiveness, and efficiency to fix faults, when source code contains either only abbreviated or only full-word identifier names. We also conducted a qualitative study to understand the values, beliefs, and assumptions that inform and shape fault fixing when identifier names are either abbreviated or full-word. We involved in this qualitative study six professional developers with 1--3 years of work experience. A number of insights emerged from this qualitative study and can be considered a useful complement to the quantitative results from our family of experiments. One of the most interesting insights is that developers, when working on source code with abbreviated identifier names, adopt a more methodical approach to identify and fix faults by extending their focus point and only in a few cases do they expand abbreviated identifiers.
q-n-a  stackex  trivia  programming  os  systems  legacy  legibility  ux  libraries  unix  linux  hacker  cracker-prog  multi  evidence-based  empirical  expert-experience  engineering  study  best-practices  comparison  quality  debugging  efficiency  time  code-organizing  grokkability  grokkability-clarity 
july 2019 by nhaliday
The Existential Risk of Math Errors - Gwern.net
How big is this upper bound? Mathematicians have often made errors in proofs. But it’s rarer for ideas to be accepted for a long time and then rejected. We can divide errors into 2 basic cases corresponding to type I and type II errors:

1. Mistakes where the theorem is still true, but the proof was incorrect (type I)
2. Mistakes where the theorem was false, and the proof was also necessarily incorrect (type II)

Before someone comes up with a final answer, a mathematician may have many levels of intuition in formulating & working on the problem, but we’ll consider the final end-product where the mathematician feels satisfied that he has solved it. Case 1 is perhaps the most common case, with innumerable examples; this is sometimes due to mistakes in the proof that anyone would accept is a mistake, but many of these cases are due to changing standards of proof. For example, when David Hilbert discovered errors in Euclid’s proofs which no one noticed before, the theorems were still true, and the gaps more due to Hilbert being a modern mathematician thinking in terms of formal systems (which of course Euclid did not think in). (David Hilbert himself turns out to be a useful example of the other kind of error: his famous list of 23 problems was accompanied by definite opinions on the outcome of each problem and sometimes timings, several of which were wrong or questionable5.) Similarly, early calculus used ‘infinitesimals’ which were sometimes treated as being 0 and sometimes treated as an indefinitely small non-zero number; this was incoherent and strictly speaking, practically all of the calculus results were wrong because they relied on an incoherent concept - but of course the results were some of the greatest mathematical work ever conducted6 and when later mathematicians put calculus on a more rigorous footing, they immediately re-derived those results (sometimes with important qualifications), and doubtless as modern math evolves other fields have sometimes needed to go back and clean up the foundations and will in the future.7

...

Isaac Newton, incidentally, gave two proofs of the same solution to a problem in probability, one via enumeration and the other more abstract; the enumeration was correct, but the other proof was totally wrong, and this was not noticed for a long time, leading Stigler to remark:

...

TYPE I > TYPE II?
“Lefschetz was a purely intuitive mathematician. It was said of him that he had never given a completely correct proof, but had never made a wrong guess either.”
- Gian-Carlo Rota13

Case 2 is disturbing, since it is a case in which we wind up with false beliefs and also false beliefs about our beliefs (we no longer know that we don’t know). Case 2 could lead to extinction.

...

Except, errors do not seem to be evenly & randomly distributed between case 1 and case 2. There seem to be far more case 1s than case 2s, as already mentioned in the early calculus example: far more than 50% of the early calculus results were correct when checked more rigorously. Richard Hamming attributes to Ralph Boas a comment, from his time editing Mathematical Reviews, that “of the new results in the papers reviewed most are true but the corresponding proofs are perhaps half the time plain wrong”.

...

Gian-Carlo Rota gives us an example with Hilbert:

...

Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis, you will find it in a volume of the Mathematische Annalen of the early thirties.

...

Leslie Lamport advocates for machine-checked proofs and a more rigorous style of proofs similar to natural deduction, noting that a mathematician acquaintance guesses at a broad error rate of 1/3 and that he routinely found mistakes in his own proofs and, worse, believed false conjectures.

[more on these "structured proofs":
https://academia.stackexchange.com/questions/52435/does-anyone-actually-publish-structured-proofs
https://mathoverflow.net/questions/35727/community-experiences-writing-lamports-structured-proofs
]
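As a taste of what “machine-checked” means in practice, here is a deliberately trivial example in Lean 4 (my illustration; Lamport's own structured proofs are TLA+-flavoured prose outlines, not Lean):

-- Every step is verified by the proof checker; an incorrect proof simply fails to compile.
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b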

We can probably add software to that list: early software engineering work found that, dismayingly, bug rates seem to be simply a function of lines of code, and one would expect diseconomies of scale. So one would expect that in going from the ~4,000 lines of code of the Microsoft DOS operating system kernel to the ~50,000,000 lines of code in Windows Server 2003 (with full systems of applications and libraries being even larger: the comprehensive Debian repository in 2007 contained ~323,551,126 lines of code) that the number of active bugs at any time would be… fairly large. Mathematical software is hopefully better, but practitioners still run into issues (eg Durán et al 2014, Fonseca et al 2017) and I don’t know of any research pinning down how buggy key mathematical systems like Mathematica are or how much published mathematics may be erroneous due to bugs. This general problem led to predictions of doom and spurred much research into automated proof-checking, static analysis, and functional languages31.

[related:
https://mathoverflow.net/questions/11517/computer-algebra-errors
I don't know any interesting bugs in symbolic algebra packages but I know a true, enlightening and entertaining story about something that looked like a bug but wasn't.

Define sinc(x) = (sin x)/x.

Someone found the following result in an algebra package: ∫_0^∞ sinc(x) dx = π/2
They then found the following results:

...

So of course when they got:

∫_0^∞ sinc(x) sinc(x/3) sinc(x/5) ⋯ sinc(x/15) dx = (467807924713440738696537864469/935615849440640907310521750000) π

hmm:
Which means that nobody knows Fourier analysis nowadays. Very sad and discouraging story... – fedja Jan 29 '10 at 18:47
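The base sinc integral, at least, is easy to reproduce with an open-source CAS today; a quick SymPy sketch (my addition, not from the thread):

import sympy as sp

x = sp.symbols('x', positive=True)

# The base case from the story; SymPy evaluates it symbolically to pi/2.
print(sp.integrate(sp.sin(x) / x, (x, 0, sp.oo)))
# The longer sinc products (x/3, x/5, ..., x/15) are the expensive part, and the point
# of the anecdote is that the surprising coefficient is correct, not a bug.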

--

Because the most popular systems are all commercial, they tend to guard their bug database rather closely -- making them public would seriously cut their sales. For example, for the open source project Sage (which is quite young), you can get a list of all the known bugs from this page. 1582 known issues on Feb.16th 2010 (which includes feature requests, problems with documentation, etc).

That is an order of magnitude less than the commercial systems. And it's not because it is better, it is because it is younger and smaller. It might be better, but until SAGE does a lot of analysis (about 40% of CAS bugs are there) and a fancy user interface (another 40%), it is too hard to compare.

I once ran a graduate course whose core topic was studying the fundamental disconnect between the algebraic nature of CAS and the analytic nature of what it is mostly used for. There are issues of logic -- CASes work more or less in an intensional logic, while most of analysis is stated in a purely extensional fashion. There is no well-defined 'denotational semantics' for expressions-as-functions, which strongly contributes to the deeper bugs in CASes.]

...

Should such widely-believed conjectures as P≠NP or the Riemann hypothesis turn out to be false, then because they are assumed by so many existing proofs, a far larger math holocaust would ensue38 - and our previous estimates of error rates will turn out to have been substantial underestimates. But it may be a cloud with a silver lining, if it doesn’t come at a time of danger.

https://mathoverflow.net/questions/338607/why-doesnt-mathematics-collapse-down-even-though-humans-quite-often-make-mista

more on formal methods in programming:
https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/
https://intelligence.org/2014/03/02/bob-constable/

https://softwareengineering.stackexchange.com/questions/375342/what-are-the-barriers-that-prevent-widespread-adoption-of-formal-methods
Update: measured effort
In the October 2018 issue of Communications of the ACM there is an interesting article about Formally verified software in the real world with some estimates of the effort.

Interestingly (based on OS development for military equipment), it seems that producing formally proved software requires 3.3 times more effort than with traditional engineering techniques. So it's really costly.

On the other hand, it requires 2.3 times less effort to get high security software this way than with traditionally engineered software if you add the effort to make such software certified at a high security level (EAL 7). So if you have high reliability or security requirements there is definitely a business case for going formal.

WHY DON'T PEOPLE USE FORMAL METHODS?: https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/
You can see examples of how all of these look at Let’s Prove Leftpad. HOL4 and Isabelle are good examples of “independent theorem” specs, SPARK and Dafny have “embedded assertion” specs, and Coq and Agda have “dependent type” specs.6

If you squint a bit it looks like these three forms of code spec map to the three main domains of automated correctness checking: tests, contracts, and types. This is not a coincidence. Correctness is a spectrum, and formal verification is one extreme of that spectrum. As we reduce the rigour (and effort) of our verification we get simpler and narrower checks, whether that means limiting the explored state space, using weaker types, or pushing verification to the runtime. Any means of total specification then becomes a means of partial specification, and vice versa: many consider Cleanroom a formal verification technique, which primarily works by pushing code review far beyond what’s humanly possible.
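To make the “embedded assertion” flavour concrete, here is a rough Python sketch of leftpad with contract-style runtime checks (my example, a partial specification in the sense above; nothing like the machine-checked SPARK/Dafny versions at Let's Prove Leftpad):

def leftpad(c: str, n: int, s: str) -> str:
    """Pad s on the left with the single character c up to length max(n, len(s))."""
    assert len(c) == 1  # precondition
    result = c * max(n - len(s), 0) + s
    # Postconditions: a runtime, partial analogue of the formal leftpad spec.
    assert len(result) == max(n, len(s))
    assert result.endswith(s)
    assert set(result[: len(result) - len(s)]) <= {c}
    return result

print(leftpad("0", 5, "42"))  # 00042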

...

The question, then: “is 90/95/99% correct significantly cheaper than 100% correct?” The answer is very yes. We all are comfortable saying that a codebase we’ve well-tested and well-typed is mostly correct modulo a few fixes in prod, and we’re even writing more than four lines of code a day. In fact, the vast… [more]
ratty  gwern  analysis  essay  realness  truth  correctness  reason  philosophy  math  proofs  formal-methods  cs  programming  engineering  worse-is-better/the-right-thing  intuition  giants  old-anglo  error  street-fighting  heuristic  zooming  risk  threat-modeling  software  lens  logic  inference  physics  differential  geometry  estimate  distribution  robust  speculation  nonlinearity  cost-benefit  convexity-curvature  measure  scale  trivia  cocktail  history  early-modern  europe  math.CA  rigor  news  org:mag  org:sci  miri-cfar  pdf  thesis  comparison  examples  org:junk  q-n-a  stackex  pragmatic  tradeoffs  cracker-prog  techtariat  invariance  DSL  chart  ecosystem  grokkability  heavyweights  CAS  static-dynamic  lower-bounds  complexity  tcs  open-problems  big-surf  ideas  certificates-recognition  proof-systems  PCP  mediterranean  SDP  meta:prediction  epistemic  questions  guessing  distributed  overflow  nibble  soft-question  track-record  big-list  hmm  frontier  state-of-art  move-fast-(and-break-things)  grokkability-clarity  technical-writing  trust 
july 2019 by nhaliday
Interview with Donald Knuth | Interview with Donald Knuth | InformIT
Andrew Binstock and Donald Knuth converse on the success of open source, the problem with multicore architecture, the disappointing lack of interest in literate programming, the menace of reusable code, and that urban legend about winning a programming contest with a single compilation.

Reusable vs. re-editable code: https://hal.archives-ouvertes.fr/hal-01966146/document
- Konrad Hinsen

https://www.johndcook.com/blog/2008/05/03/reusable-code-vs-re-editable-code/
I think whether code should be editable or in “an untouchable black box” depends on the number of developers involved, as well as their talent and motivation. Knuth is a highly motivated genius working in isolation. Most software is developed by large teams of programmers with varying degrees of motivation and talent. I think the further you move away from Knuth along these three axes the more important black boxes become.
nibble  interview  giants  expert-experience  programming  cs  software  contrarianism  carmack  oss  prediction  trends  linux  concurrency  desktop  comparison  checking  debugging  stories  engineering  hmm  idk  algorithms  books  debate  flux-stasis  duplication  parsimony  best-practices  writing  documentation  latex  intricacy  structure  hardware  caching  workflow  editors  composition-decomposition  coupling-cohesion  exposition  technical-writing  thinking  cracker-prog  code-organizing  grokkability  multi  techtariat  commentary  pdf  reflection  essay  examples  python  data-science  libraries  grokkability-clarity 
june 2019 by nhaliday
c++ - Why is the code in most STL implementations so convoluted? - Stack Overflow
Similar questions have been posed previously:

Is there a readable implementation of the STL

Why STL implementation is so unreadable? How C++ could have been improved here?

--

Neil Butterworth, now listed as "anon", provided a useful link in his answer to the SO question "Is there a readable implementation of the STL?". Quoting his answer there:

There is a book The C++ Standard Template Library, co-authored by the original STL designers Stepanov & Lee (together with P.J. Plauger and David Musser), which describes a possible implementation, complete with code - see http://www.amazon.co.uk/C-Standard-Template-Library/dp/0134376331.

See also the other answers in that thread.

Anyway, most of the STL code (by STL I here mean the STL-like subset of the C++ standard library) is template code, and as such must be header-only, and since it's used in almost every program it pays to have that code as short as possible.

Thus, the natural trade-off point between conciseness and readability is much farther over on the conciseness end of the scale than with "normal" code.

--

About the variable names, library implementors must use "crazy" naming conventions, such as names starting with an underscore followed by an uppercase letter, because such names are reserved for them. They cannot use "normal" names, because those may have been redefined by a user macro.

Section 17.6.3.3.2 "Global names" §1 states:

Certain sets of names and function signatures are always reserved to the implementation:

Each name that contains a double underscore or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.

Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

(Note that these rules forbid header guards like __MY_FILE_H which I have seen quite often.)

--

Implementations vary. libc++ for example, is much easier on the eyes. There's still a bit of underscore noise though. As others have noted, the leading underscores are unfortunately required. Here's the same function in libc++:
q-n-a  stackex  programming  engineering  best-practices  c(pp)  systems  pls  nitty-gritty  libraries  code-organizing  grokkability  grokkability-clarity 
may 2019 by nhaliday
Language Log: French syntax is (in)corruptible
One of the most striking ideologies of linguistic uniqueness is the belief that French exactly mirrors the inner language of logical thought. A few minutes of research led me to the conclusion that the source of this meme, or at least its earliest example, is an essay by Antoine de Rivarol, "L'Universalité de la langue française". In 1783, the Berlin Academy held a competition for essays on the subject of the widespread usage of French, and its prospects for continuing as the lingua franca of European intellectuals. Apparently nine submissions argued that French would continue; nine that it would be replaced by German; and one that Russian would win out. (English got no votes.) Antoine de Rivarol shared the prize with Johann Christoph Schwab.

De Rivarol's essay is the source of the often-quoted phrase Ce qui n'est pas clair n'est pas français ("What is not clear is not French"). My (doubtless faulty) translation of the relevant passage is below the jump.
org:junk  commentary  quotes  europe  gallic  language  history  early-modern  essay  aphorism  lol  clarity  open-closed  enlightenment-renaissance-restoration-reformation  alien-character  linguistics  grokkability-clarity  french  uniqueness  org:edu 
july 2017 by nhaliday
The Iran-Saudi Arabia Conflict Explained in Three Maps – LATE EMPIRE
As you can see, a high percentage of Saudi oil happens to be in the Shi’a-populated areas of their country. Saudi Arabia’s rulers are ultraconservative religious authoritarians of the Sunni conviction who unabashedly and colorfully display their interpretation of the Islamic religion without compromise. Naturally, such a ruling caste maintains a persistent paranoia that their Shi’a citizens will defect, taking the oil under their feet with them, with the assistance of the equally ultraconservative, ambitious, and revolutionary Shi’a Iran.

The American “restructuring” of the Iraqi political map and subsequent formation of a Shi’a-led government in Baghdad has created a launching pad for the projection of Iranian influence into Saudi territory.

So that’s the conflict in a nutshell.
gnon  🐸  MENA  iran  politics  foreign-policy  geopolitics  religion  islam  explanation  clarity  summary  chart  roots  energy-resources  history  mostly-modern  usa  maps  big-picture  impetus  geography  grokkability-clarity  backup 
july 2017 by nhaliday
co.combinatorics - Classification of Platonic solids - MathOverflow
My question is very basic: where can I find a complete (and hopefully self-contained) proof of the classification of Platonic solids? In all the references that I found, they use Euler's formula v − e + f = 2 to show that there are exactly five possible triples (v, e, f). But of course this is not a complete proof because it does not rule out the possibility of different configurations or deformations. Has anyone ever written up a complete proof of this statement?!
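(Not part of the question, but as a reminder of the counting step it alludes to: if every face is a regular p-gon and q faces meet at each vertex, then pf = 2e = qv, and substituting v = 2e/q and f = 2e/p into v − e + f = 2 gives 1/p + 1/q = 1/2 + 1/e > 1/2. With p, q ≥ 3 the only solutions are (p,q) = (3,3), (4,3), (3,4), (5,3), (3,5), i.e. exactly five candidate triples; the question is why each candidate is realized by an essentially unique solid.)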

...

This is a classical question. Here is my reading of it: Why is there a unique polytope with given combinatorics of faces, which are all regular polygons? Of course, for simple polytopes (tetrahedron, cube, dodecahedron) this is clear, but for the octahedron and icosahedron this is less clear.

The answer lies in Cauchy's theorem. It was Legendre who, while writing his Elements of Geometry and Trigonometry, noticed that Euclid's proof in the Elements is incomplete. Curiously, Euclid finds both radii of inscribed and circumscribed spheres (correctly) without ever explaining why they exist. Cauchy worked out a proof while still a student in 1813, more or less specifically for this purpose. The proof also had a technical gap which was found and patched up by Steinitz in the 1920s.

The complete (corrected) proof can be found in the celebrated Proofs from the Book, or in Marcel Berger's Geometry. My book gives a bit more of historical context and some soft arguments (ch. 19). It's worth comparing this proof with (an erroneous) pre-Steinitz exposition, say in Hadamard's Leçons de Géométrie Elémentaire II, or with an early post-Steinitz correct but tedious proof given in (otherwise, excellent) Alexandrov's monograph (see also ch.26 in my book which compares all the approaches).

P.S. Note that Coxeter in Regular Polytopes can completely avoid this issue by taking a different (modern) definition of the regular polytopes (which are symmetric under group actions). For a modern exposition and the state of art of this approach, see McMullen and Schulte's Abstract Regular Polytopes.

https://en.wikipedia.org/wiki/Platonic_solid#Classification
https://mathoverflow.net/questions/46502/on-the-number-of-archimedean-solids
q-n-a  overflow  math  topology  geometry  math.CO  history  iron-age  mediterranean  the-classics  multi  curiosity  clarity  proofs  nibble  wiki  reference  characterization  uniqueness  list  ground-up  grokkability-clarity 
july 2017 by nhaliday
Unsupervised learning, one notion or many? – Off the convex path
(Task A) Learning a distribution from samples. (Examples: gaussian mixtures, topic models, variational autoencoders,..)

(Task B) Understanding latent structure in the data. This is not the same as (a); for example principal component analysis, clustering, manifold learning etc. identify latent structure but don’t learn a distribution per se.

(Task C) Feature Learning. Learn a mapping from datapoint → feature vector such that classification tasks are easier to carry out on feature vectors rather than datapoints. For example, unsupervised feature learning could help lower the amount of labeled samples needed for learning a classifier, or be useful for domain adaptation.

Task B is often a subcase of Task C, as the intended users of “structure found in data” are humans (scientists) who pore over the representation of data to gain some intuition about its properties, and these “properties” can often be phrased as a classification task.

This post explains the relationship between Tasks A and C, and why they get mixed up in students’ minds. We hope there is also some food for thought here for experts, namely, our discussion about the fragility of the usual “perplexity” definition of unsupervised learning. It explains why Task A doesn’t in practice lead to a good enough solution for Task C. For example, it has been believed for many years that for deep learning, unsupervised pretraining should help supervised training, but this has been hard to show in practice.
acmtariat  org:bleg  nibble  machine-learning  acm  thinking  clarity  unsupervised  conceptual-vocab  concept  explanation  features  bayesian  off-convex  deep-learning  latent-variables  generative  intricacy  distribution  sampling  grokkability-clarity  org:popup 
june 2017 by nhaliday
A Little More Nuance
economics lowest, though still increasing (also most successful frankly, wonder why? :P):
In this view, the trajectories of the disciplines relative to one another are sharpened. I have to say that if you’d asked me ex ante to rank fields by nuance I would have come up with an ordering much like the one visible at the end of the trend lines. But it also seems that social science fields were not differentiated in this way until comparatively recently. Note that the negative trend line for Economics is relative not to the rate of nuance within field itself—which is going up, as it is everywhere—but rather with respect to the base rate. The trend line for Philosophy is also worth remarking on. It differs quite markedly from the others, as it has a very high nuance rate in the first few decades of the twentieth century, which then sharply declines, and rejoins the upward trend in the 1980s. I have not looked at the internal structure of this trend any further, but it is very tempting to read it as the post-WWI positivists bringing the hammer down on what they saw as nonsense in their own field. That’s putting it much too sharply, of course, but then again that’s partly why we’re here in the first place.

https://twitter.com/GabrielRossman/status/879698510077059074
hmm: https://kieranhealy.org/files/papers/fuck-nuance.pdf
scitariat  social-science  sociology  philosophy  history  letters  psychology  sapiens  anthropology  polisci  economics  anglo  language  data  trends  pro-rata  visualization  mostly-modern  academia  intricacy  vague  parsimony  clarity  multi  twitter  social  commentary  jargon  pdf  study  essay  rhetoric  article  time-series  lexical  grokkability-clarity 
june 2017 by nhaliday
Book Review: Peter Turchin – War and Peace and War
I think Turchin’s book is a good introductory text to the new science of cliodynamics, one he himself did much to found (along with Nefedov and Korotayev). However, though readable – mostly, I suspect, because I am interested in the subject – it is not well-written. The text was too thick, there were too many awkward grammatical constructions, and the quotes are far, far too long.

More importantly, 1) the theory is not internally well-integrated and 2) there isn’t enough emphasis on the fundamental differences separating agrarian from industrial societies. For instance, Turchin makes a lot of the idea that the Italians’ low level of asabiya (“amoral familism”) was responsible for its only becoming politically unified in the late 19th century. But why then was it the same for Germany, the bloody frontline for the religious wars of the 17th century? And why was France able to build a huge empire under Napoleon, when it had lost all its “meta-ethnic frontiers” / marches by 1000 AD? For answers to these questions about the genesis of the modern nation-state, one would be much better off looking at more conventional explanations by the likes of Benedict Anderson, Charles Tilly, or Gabriel Ardant.

Nowadays, modern political technologies – the history textbook, the Monument to the Unknown Soldier, the radio and Internet – have long displaced the meta-ethnic frontier as the main drivers behind the formation of asabiya. Which is certainly not to say that meta-ethnic frontiers are unimportant – they are, especially in the case of Dar al-Islam, which feels itself to be under siege on multiple fronts (the “bloody borders” of clash-of-civilizations-speak), which according to Turchin’s theory should promote a stronger Islamic identity. But their intrinsic importance has been diluted by the influence of modern media.
gnon  books  review  summary  turchin  broad-econ  anthropology  sapiens  cohesion  malthus  cliometrics  deep-materialism  inequality  cycles  oscillation  martial  europe  germanic  history  mostly-modern  world-war  efficiency  islam  MENA  group-selection  cultural-dynamics  leviathan  civilization  big-picture  frontier  conquest-empire  iron-age  mediterranean  usa  pre-ww2  elite  disease  parasites-microbiome  nihil  peace-violence  order-disorder  gallic  early-modern  medieval  EU  general-survey  data  poll  critique  comparison  society  eastern-europe  asia  china  india  korea  sinosphere  anglosphere  latin-america  japan  media  institutions  🎩  vampire-squid  winner-take-all  article  nationalism-globalism  🌞  microfoundations  hari-seldon  grokkability  grokkability-clarity 
june 2017 by nhaliday
There Is No Such Thing as Decreasing Returns to Scale — Confessions of a Supply-Side Liberal
Besides pedagogical inertia—enforced to some extent by textbook publishers—I am not quite sure what motivates the devotion in so many economics curricula to U-shaped average cost curves. Let me make one guess: there is a desire to explain why firms are the size they are rather than larger or smaller. To my mind, such an explanation should proceed in one of three ways, appropriate to three different situations.
econotariat  economics  micro  plots  scale  marginal  industrial-org  business  econ-productivity  efficiency  cost-benefit  explanation  critique  clarity  intricacy  curvature  convexity-curvature  nonlinearity  input-output  grokkability-clarity 
may 2017 by nhaliday
None So Blind | West Hunter
There have been several articles in the literature claiming that the gene frequency of the 35delG allele of connexin-26, the most common allele causing deafness in Europeans, has doubled in the past 200 years, as a result of relaxed selection and assortative mating over that period.

That’s fucking ridiculous. I see people referencing this in journal articles and books. It’s mentioned in OMIM. But it’s pure nonsense.

https://westhunt.wordpress.com/2013/03/05/none-so-blind/#comment-10483
The only way you’re going to see such a high frequency of an effectively lethal recessive in a continental population is if it conferred a reproductive advantage in heterozygotes. The required advantage must have been as large as its gene frequency, something around 1-2%.

So it’s like sickle-cell.

Now, if you decreased the bad reproductive consequences of deafness, what would you expect to happen? Gradual increase, at around 1 or 2% a generation, if the carrier advantage held – but it probably didn’t. It was probably a defense against some infectious disease, and those have become much less important. If there was no longer any carrier advantage, the frequency wouldn’t change at all.

In order to double in 200 years, you would need a carrier advantage > 9%.

Assortative mating, deaf people marrying other deaf people, would not make much difference. Even if deaf people substantially out-reproduced normals, which they don’t, only ~1-2% of the copies of 35delG reside in deaf people.
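(Spelling out the doubling arithmetic, assuming roughly 25 years per generation: doubling over 200 years is about 8 generations of compounding, so the required per-generation carrier advantage s satisfies (1+s)^8 = 2, i.e. s = 2^(1/8) − 1 ≈ 0.09, just over 9% per generation.)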
west-hunter  scitariat  rant  critique  thinking  realness  being-right  clarity  evolution  genetics  population-genetics  recent-selection  null-result  sapiens  tradeoffs  disease  parasites-microbiome  embodied  multi  poast  ideas  grokkability-clarity 
may 2017 by nhaliday
Is Economic Activity Really “Distributed Less Evenly” Than It Used To Be?
http://xenocrypt.github.io/CountyIncomeHistory.html

First, imagine if you had a bar chart with every county in the United States sorted from lowest to highest by wages per capita, with the width of each bar proportional to the population of the county.

In fact, whenever anyone talks about “clustering” and “even distributions”, they’re mostly really talking about ways of comparing monotonic curves with integral one, whether they realize it or not.
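A minimal sketch of that construction (my addition, with made-up county numbers), showing how the sorted, population-weighted bar chart becomes a monotonic curve with integral one:

import numpy as np

# Hypothetical inputs: per-county wages per capita and populations.
wages = np.array([30_000, 45_000, 52_000, 61_000, 80_000], dtype=float)
pop = np.array([120_000, 80_000, 300_000, 50_000, 10_000], dtype=float)

order = np.argsort(wages)            # counties sorted from lowest to highest wages per capita
w, p = wages[order], pop[order]
x = np.cumsum(p) / p.sum()           # bar widths proportional to population (cumulative share)
f = w / np.average(w, weights=p)     # normalize by the population-weighted mean,
# so the step function f over x integrates to one, making different years comparable.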
org:med  wonkish  unaffiliated  debate  critique  trends  economics  commentary  douthatish  urban  distribution  inequality  polarization  malaise  regularizer  clarity  usa  history  mostly-modern  data  analysis  spock  nitty-gritty  compensation  vague  data-science  visual-understanding  🎩  thinking  plots  ssc  multi  tools  dynamic  money  class  class-warfare  left-wing  time-series  density  realness  geography  urban-rural  grokkability-clarity 
may 2017 by nhaliday
'Capital in the Twenty-First Century' by Thomas Piketty, reviewed | New Republic
by Robert Solow (positive)

The data then exhibit a clear pattern. In France and Great Britain, national capital stood fairly steadily at about seven times national income from 1700 to 1910, then fell sharply from 1910 to 1950, presumably as a result of wars and depression, reaching a low of 2.5 in Britain and a bit less than 3 in France. The capital-income ratio then began to climb in both countries, and reached slightly more than 5 in Britain and slightly less than 6 in France by 2010. The trajectory in the United States was slightly different: it started at just above 3 in 1770, climbed to 5 in 1910, fell slightly in 1920, recovered to a high between 5 and 5.5 in 1930, fell to below 4 in 1950, and was back to 4.5 in 2010.

The wealth-income ratio in the United States has always been lower than in Europe. The main reason in the early years was that land values bulked less in the wide open spaces of North America. There was of course much more land, but it was very cheap. Into the twentieth century and onward, however, the lower capital-income ratio in the United States probably reflects the higher level of productivity: a given amount of capital could support a larger production of output than in Europe. It is no surprise that the two world wars caused much less destruction and dissipation of capital in the United States than in Britain and France. The important observation for Piketty’s argument is that, in all three countries, and elsewhere as well, the wealth-income ratio has been increasing since 1950, and is almost back to nineteenth-century levels. He projects this increase to continue into the current century, with weighty consequences that will be discussed as we go on.

...

Now if you multiply the rate of return on capital by the capital-income ratio, you get the share of capital in the national income. For example, if the rate of return is 5 percent a year and the stock of capital is six years worth of national income, income from capital will be 30 percent of national income, and so income from work will be the remaining 70 percent. At last, after all this preparation, we are beginning to talk about inequality, and in two distinct senses. First, we have arrived at the functional distribution of income—the split between income from work and income from wealth. Second, it is always the case that wealth is more highly concentrated among the rich than income from labor (although recent American history looks rather odd in this respect); and this being so, the larger the share of income from wealth, the more unequal the distribution of income among persons is likely to be. It is this inequality across persons that matters most for good or ill in a society.
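(In symbols, this is the accounting identity Piketty calls the first fundamental law of capitalism: the capital share of national income α equals the rate of return r times the capital-income ratio β, so in the example above α = r × β = 0.05 × 6 = 30%.)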

...

The data are complicated and not easily comparable across time and space, but here is the flavor of Piketty’s summary picture. Capital is indeed very unequally distributed. Currently in the United States, the top 10 percent own about 70 percent of all the capital, half of that belonging to the top 1 percent; the next 40 percent—who compose the “middle class”—own about a quarter of the total (much of that in the form of housing), and the remaining half of the population owns next to nothing, about 5 percent of total wealth. Even that amount of middle-class property ownership is a new phenomenon in history. The typical European country is a little more egalitarian: the top 1 percent own 25 percent of the total capital, and the middle class 35 percent. (A century ago the European middle class owned essentially no wealth at all.) If the ownership of wealth in fact becomes even more concentrated during the rest of the twenty-first century, the outlook is pretty bleak unless you have a taste for oligarchy.

Income from wealth is probably even more concentrated than wealth itself because, as Piketty notes, large blocks of wealth tend to earn a higher return than small ones. Some of this advantage comes from economies of scale, but more may come from the fact that very big investors have access to a wider range of investment opportunities than smaller investors. Income from work is naturally less concentrated than income from wealth. In Piketty’s stylized picture of the United States today, the top 1 percent earns about 12 percent of all labor income, the next 9 percent earn 23 percent, the middle class gets about 40 percent, and the bottom half about a quarter of income from work. Europe is not very different: the top 10 percent collect somewhat less and the other two groups a little more.

You get the picture: modern capitalism is an unequal society, and the rich-get-richer dynamic strongly suggest that it will get more so. But there is one more loose end to tie up, already hinted at, and it has to do with the advent of very high wage incomes. First, here are some facts about the composition of top incomes. About 60 percent of the income of the top 1 percent in the United States today is labor income. Only when you get to the top tenth of 1 percent does income from capital start to predominate. The income of the top hundredth of 1 percent is 70 percent from capital. The story for France is not very different, though the proportion of labor income is a bit higher at every level. Evidently there are some very high wage incomes, as if you didn’t know.

This is a fairly recent development. In the 1960s, the top 1 percent of wage earners collected a little more than 5 percent of all wage incomes. This fraction has risen pretty steadily until nowadays, when the top 1 percent of wage earners receive 10–12 percent of all wages. This time the story is rather different in France. There the share of total wages going to the top percentile was steady at 6 percent until very recently, when it climbed to 7 percent. The recent surge of extreme inequality at the top of the wage distribution may be primarily an American development. Piketty, who with Emmanuel Saez has made a careful study of high-income tax returns in the United States, attributes this to the rise of what he calls “supermanagers.” The very highest income class consists to a substantial extent of top executives of large corporations, with very rich compensation packages. (A disproportionate number of these, but by no means all of them, come from the financial services industry.) With or without stock options, these large pay packages get converted to wealth and future income from wealth. But the fact remains that much of the increased income (and wealth) inequality in the United States is driven by the rise of these supermanagers.

and Deirdre McCloskey (p critical): https://ejpe.org/journal/article/view/170
nice discussion of empirical economics, economic history, market failures and statism, etc., with several bon mots

Piketty’s great splash will undoubtedly bring many young economically interested scholars to devote their lives to the study of the past. That is good, because economic history is one of the few scientifically quantitative branches of economics. In economic history, as in experimental economics and a few other fields, the economists confront the evidence (as they do not for example in most macroeconomics or industrial organization or international trade theory nowadays).

...

Piketty gives a fine example of how to do it. He does not get entangled as so many economists do in the sole empirical tool they are taught, namely, regression analysis on someone else’s “data” (one of the problems is the word data, meaning “things given”: scientists should deal in capta, “things seized”). Therefore he does not commit one of the two sins of modern economics, the use of meaningless “tests” of statistical significance (he occasionally refers to “statistically insignificant” relations between, say, tax rates and growth rates, but I am hoping he does not suppose that a large coefficient is “insignificant” because R. A. Fisher in 1925 said it was). Piketty constructs or uses statistics of aggregate capital and of inequality and then plots them out for inspection, which is what physicists, for example, also do in dealing with their experiments and observations. Nor does he commit the other sin, which is to waste scientific time on existence theorems. Physicists, again, don’t. If we economists are going to persist in physics envy let us at least learn what physicists actually do. Piketty stays close to the facts, and does not, for example, wander into the pointless worlds of non-cooperative game theory, long demolished by experimental economics. He also does not have recourse to non-computable general equilibrium, which never was of use for quantitative economic science, being a branch of philosophy, and a futile one at that. On both points, bravissimo.

...

Since those founding geniuses of classical economics, a market-tested betterment (a locution to be preferred to “capitalism”, with its erroneous implication that capital accumulation, not innovation, is what made us better off) has enormously enriched large parts of a humanity now seven times larger in population than in 1800, and bids fair in the next fifty years or so to enrich everyone on the planet. [Not SSA or MENA...]

...

Then economists, many on the left but some on the right, in quick succession from 1880 to the present—at the same time that market-tested betterment was driving real wages up and up and up—commenced worrying about, to name a few of the pessimisms concerning “capitalism” they discerned: greed, alienation, racial impurity, workers’ lack of bargaining strength, workers’ bad taste in consumption, immigration of lesser breeds, monopoly, unemployment, business cycles, increasing returns, externalities, under-consumption, monopolistic competition, separation of ownership from control, lack of planning, post-War stagnation, investment spillovers, unbalanced growth, dual labor markets, capital insufficiency (William Easterly calls it “capital fundamentalism”), peasant irrationality, capital-market imperfections, public … [more]
news  org:mag  big-peeps  econotariat  economics  books  review  capital  capitalism  inequality  winner-take-all  piketty  wealth  class  labor  mobility  redistribution  growth-econ  rent-seeking  history  mostly-modern  trends  compensation  article  malaise  🎩  the-bones  whiggish-hegelian  cjones-like  multi  mokyr-allen-mccloskey  expert  market-failure  government  broad-econ  cliometrics  aphorism  lens  gallic  clarity  europe  critique  rant  optimism  regularizer  pessimism  ideology  behavioral-econ  authoritarianism  intervention  polanyi-marx  politics  left-wing  absolute-relative  regression-to-mean  legacy  empirical  data-science  econometrics  methodology  hypothesis-testing  physics  iron-age  mediterranean  the-classics  quotes  krugman  world  entrepreneurialism  human-capital  education  supply-demand  plots  manifolds  intersection  markets  evolution  darwinian  giants  old-anglo  egalitarianism-hierarchy  optimate  morality  ethics  envy  stagnation  nl-and-so-can-you  expert-experience  courage  stats  randy-ayndy  reason  intersection-connectedness  detail-architect 
april 2017 by nhaliday
How Transparency Kills Information Aggregation: Theory and Experiment
We investigate the potential of transparency to influence committee decision-making. We present a model in which career concerned committee members receive private information of different type-dependent accuracy, deliberate and vote. We study three levels of transparency under which career concerns are predicted to affect behavior differently, and test the model’s key predictions in a laboratory experiment. The model’s predictions are largely borne out – transparency negatively affects information aggregation at the deliberation and voting stages, leading to sharply different committee error rates than under secrecy. This occurs despite subjects revealing more information under transparency than theory predicts.
study  economics  micro  decision-making  decision-theory  collaboration  coordination  info-econ  info-dynamics  behavioral-econ  field-study  clarity  ethics  civic  integrity  error  unintended-consequences  🎩  org:ngo  madisonian  regularizer  enlightenment-renaissance-restoration-reformation  white-paper  microfoundations  open-closed  composition-decomposition  organizing  grokkability-clarity 
april 2017 by nhaliday
There is no fitness but fitness, and the lineage is its bearer | Philosophical Transactions of the Royal Society B: Biological Sciences
We argue that this understanding of inclusive fitness based on gene lineages provides the most illuminating and accurate picture and avoids pitfalls in interpretation and empirical applications of inclusive fitness theory.
study  bio  evolution  selection  concept  explanation  clarity  deep-materialism  org:nat  🌞  population-genetics  genetics  nibble  grokkability-clarity 
march 2017 by nhaliday
Fundamental Theorems of Evolution: The American Naturalist: Vol 0, No 0
I suggest that the most fundamental theorem of evolution is the Price equation, both because of its simplicity and broad scope and because it can be used to derive four other familiar results that are similarly fundamental: Fisher’s average-excess equation, Robertson’s secondary theorem of natural selection, the breeder’s equation, and Fisher’s fundamental theorem. These derivations clarify both the relationships behind these results and their assumptions. Slightly less fundamental results include those for multivariate evolution and social selection. A key feature of fundamental theorems is that they have great simplicity and scope, which are often achieved by sacrificing perfect accuracy. Quantitative genetics has been more productive of fundamental theorems than population genetics, probably because its empirical focus on unknown genotypes freed it from the tyranny of detail and allowed it to focus on general issues.
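(Not in the abstract, but for reference, the Price equation in its standard form, with z̄ the mean trait value, w_i the fitness of individual i, and w̄ the mean fitness: w̄ Δz̄ = Cov(w_i, z_i) + E(w_i Δz_i), where the first term is the selection component and the second the transmission component.)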
study  essay  bio  evolution  population-genetics  fisher  selection  EGT  dynamical  exposition  methodology  🌞  big-picture  levers  list  nibble  article  chart  explanation  clarity  trees  ground-up  ideas  grokkability-clarity 
march 2017 by nhaliday
probability - Why does a 95% Confidence Interval (CI) not imply a 95% chance of containing the mean? - Cross Validated
The confidence interval is the answer to the request: "Give me an interval that will bracket the true value of the parameter in 100p% of the instances of an experiment that is repeated a large number of times." The credible interval is an answer to the request: "Give me an interval that brackets the true value with probability p given the particular sample I've actually observed." To be able to answer the latter request, we must first adopt either (a) a new concept of the data generating process or (b) a different concept of the definition of probability itself.
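A small simulation of the first reading (my sketch, not from the answer): repeat the experiment many times, build a 95% t-interval each time, and check how often it brackets the true mean.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_mu, sigma, n, reps = 10.0, 2.0, 30, 10_000
t_crit = stats.t.ppf(0.975, df=n - 1)

covered = 0
for _ in range(reps):
    sample = rng.normal(true_mu, sigma, n)
    m = sample.mean()
    se = sample.std(ddof=1) / np.sqrt(n)
    covered += (m - t_crit * se) <= true_mu <= (m + t_crit * se)

print(covered / reps)  # ~0.95: a guarantee about the long-run procedure, not about any single interval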

http://stats.stackexchange.com/questions/139290/a-psychology-journal-banned-p-values-and-confidence-intervals-is-it-indeed-wise

PS. Note that my question is not about the ban itself; it is about the suggested approach. I am not asking about frequentist vs. Bayesian inference either. The Editorial is pretty negative about Bayesian methods too; so it is essentially about using statistics vs. not using statistics at all.

wut

http://stats.stackexchange.com/questions/6966/why-continue-to-teach-and-use-hypothesis-testing-when-confidence-intervals-are
http://stats.stackexchange.com/questions/2356/are-there-any-examples-where-bayesian-credible-intervals-are-obviously-inferior
http://stats.stackexchange.com/questions/2272/whats-the-difference-between-a-confidence-interval-and-a-credible-interval
http://stats.stackexchange.com/questions/6652/what-precisely-is-a-confidence-interval
http://stats.stackexchange.com/questions/1164/why-havent-robust-and-resistant-statistics-replaced-classical-techniques/
http://stats.stackexchange.com/questions/16312/what-is-the-difference-between-confidence-intervals-and-hypothesis-testing
http://stats.stackexchange.com/questions/31679/what-is-the-connection-between-credible-regions-and-bayesian-hypothesis-tests
http://stats.stackexchange.com/questions/11609/clarification-on-interpreting-confidence-intervals
http://stats.stackexchange.com/questions/16493/difference-between-confidence-intervals-and-prediction-intervals
q-n-a  overflow  nibble  stats  data-science  science  methodology  concept  confidence  conceptual-vocab  confusion  explanation  thinking  hypothesis-testing  jargon  multi  meta:science  best-practices  error  discussion  bayesian  frequentist  hmm  publishing  intricacy  wut  comparison  motivation  clarity  examples  robust  metabuch  🔬  info-dynamics  reference  grokkability-clarity 
february 2017 by nhaliday
The Art of Not Being Governed - Wikipedia
For two thousand years the disparate groups that now reside in Zomia (a mountainous region the size of Europe that consists of portions of seven Asian countries) have fled the projects of the nation state societies that surround them—slavery, conscription, taxes, corvée, epidemics, and warfare.[1][2] This book, essentially an “anarchist history,” is the first-ever examination of the huge literature on nation-building whose author evaluates why people would deliberately and reactively remain stateless.

Among the strategies employed by the people of Zomia to remain stateless are physical dispersion in rugged terrain; agricultural practices that enhance mobility; pliable ethnic identities; devotion to prophetic, millenarian leaders; and maintenance of a largely oral culture that allows them to reinvent their histories and genealogies as they move between and around states.

Scott admits to making "bold claims" in his book[3] but credits many other scholars, including the French anthropologist Pierre Clastres and the American historian Owen Lattimore, as influences.[3]
books  anthropology  history  contrarianism  asia  developing-world  world  farmers-and-foragers  leviathan  coordination  wiki  pseudoE  ethnography  emergent  sociology  civilization  order-disorder  legibility  domestication  decentralized  allodium  madisonian  broad-econ  apollonian-dionysian  grokkability  grokkability-clarity 
february 2017 by nhaliday
Einstein's Most Famous Thought Experiment
When Einstein abandoned an emission theory of light, he had also to abandon the hope that electrodynamics could be made to conform to the principle of relativity by the normal sorts of modifications to electrodynamic theory that occupied the theorists of the second half of the 19th century. Instead Einstein knew he must resort to extraordinary measures. He was willing to seek realization of his goal in a re-examination of our basic notions of space and time. Einstein concluded his report on his youthful thought experiment:

"One sees that in this paradox the germ of the special relativity theory is already contained. Today everyone knows, of course, that all attempts to clarify this paradox satisfactorily were condemned to failure as long as the axiom of the absolute character of time, or of simultaneity, was rooted unrecognized in the unconscious. To recognize clearly this axiom and its arbitrary character already implies the essentials of the solution of the problem."
einstein  giants  physics  history  stories  gedanken  exposition  org:edu  electromag  relativity  nibble  innovation  novelty  the-trenches  synchrony  discovery  🔬  org:junk  science  absolute-relative  visuo  explanation  ground-up  clarity  state  causation  intuition  ideas  mostly-modern  pre-ww2  marginal  grokkability-clarity 
february 2017 by nhaliday
interpretation - How to understand degrees of freedom? - Cross Validated
From Wikipedia, there are three interpretations of the degrees of freedom of a statistic:

In statistics, the number of degrees of freedom is the number of values in the final calculation of a statistic that are free to vary.

Estimates of statistical parameters can be based upon different amounts of information or data. The number of independent pieces of information that go into the estimate of a parameter is called the degrees of freedom (df). In general, the degrees of freedom of an estimate of a parameter is equal to the number of independent scores that go into the estimate minus the number of parameters used as intermediate steps in the estimation of the parameter itself (which, in sample variance, is one, since the sample mean is the only intermediate step).

Mathematically, degrees of freedom is the dimension of the domain of a random vector, or essentially the number of 'free' components: how many components need to be known before the vector is fully determined.

...

This is a subtle question. It takes a thoughtful person not to understand those quotations! Although they are suggestive, it turns out that none of them is exactly or generally correct. I haven't the time (and there isn't the space here) to give a full exposition, but I would like to share one approach and an insight that it suggests.

Where does the concept of degrees of freedom (DF) arise? The contexts in which it's found in elementary treatments are:

- The Student t-test and its variants such as the Welch or Satterthwaite solutions to the Behrens-Fisher problem (where two populations have different variances).
- The Chi-squared distribution (defined as a sum of squares of independent standard Normals), which is implicated in the sampling distribution of the variance.
- The F-test (of ratios of estimated variances).
- The Chi-squared test, comprising its uses in (a) testing for independence in contingency tables and (b) testing for goodness of fit of distributional estimates.

In spirit, these tests run a gamut from being exact (the Student t-test and F-test for Normal variates) to being good approximations (the Student t-test and the Welch/Satterthwaite tests for not-too-badly-skewed data) to being based on asymptotic approximations (the Chi-squared test). An interesting aspect of some of these is the appearance of non-integral "degrees of freedom" (the Welch/Satterthwaite tests and, as we will see, the Chi-squared test). This is of especial interest because it is the first hint that DF is not any of the things claimed of it.

...

Having been alerted by these potential ambiguities, let's hold up the Chi-squared goodness of fit test for examination, because (a) it's simple, (b) it's one of the common situations where people really do need to know about DF to get the p-value right and (c) it's often used incorrectly. Here's a brief synopsis of the least controversial application of this test:

...

This, many authorities tell us, should have (to a very close approximation) a Chi-squared distribution. But there's a whole family of such distributions. They are differentiated by a parameter ν often referred to as the "degrees of freedom." The standard reasoning about how to determine ν goes like this:

I have k counts. That's k pieces of data. But there are (functional) relationships among them. To start with, I know in advance that the sum of the counts must equal n. That's one relationship. I estimated two (or p, generally) parameters from the data. That's two (or p) additional relationships, giving p+1 total relationships. Presuming they (the parameters) are all (functionally) independent, that leaves only k−p−1 (functionally) independent "degrees of freedom": that's the value to use for ν.

The problem with this reasoning (which is the sort of calculation the quotations in the question are hinting at) is that it's wrong except when some special additional conditions hold. Moreover, those conditions have nothing to do with independence (functional or statistical), with numbers of "components" of the data, with the numbers of parameters, nor with anything else referred to in the original question.

...

Things went wrong because I violated two requirements of the Chi-squared test:

1. You must use the Maximum Likelihood estimate of the parameters. (This requirement can, in practice, be slightly violated.)
2. You must base that estimate on the counts, not on the actual data! (This is crucial.)

...

The point of this comparison--which I hope you have seen coming--is that the correct DF to use for computing the p-values depends on many things other than dimensions of manifolds, counts of functional relationships, or the geometry of Normal variates. There is a subtle, delicate interaction between certain functional dependencies, as found in mathematical relationships among quantities, and distributions of the data, their statistics, and the estimators formed from them. Accordingly, it cannot be the case that DF is adequately explainable in terms of the geometry of multivariate normal distributions, or in terms of functional independence, or as counts of parameters, or anything else of this nature.

We are led to see, then, that "degrees of freedom" is merely a heuristic that suggests what the sampling distribution of a (t, Chi-squared, or F) statistic ought to be, but it is not dispositive. Belief that it is dispositive leads to egregious errors. (For instance, the top hit on Google when searching "chi squared goodness of fit" is a Web page from an Ivy League university that gets most of this completely wrong! In particular, a simulation based on its instructions shows that the chi-squared value it recommends as having 7 DF actually has 9 DF.)
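
The simulation check mentioned at the end is easy to reproduce in spirit. A minimal sketch (mine, not whuber's), assuming numpy/scipy: simulate the goodness-of-fit statistic under a fully specified null, where theory does say the DF is k − 1, and compare candidate chi-squared distributions against it. The same kind of simulation is what exposes the wrong DF once parameters get estimated from the raw data.

# Rough check of the null distribution of a chi-squared goodness-of-fit
# statistic by simulation.  No parameters are estimated here, so the
# statistic should follow chi2(k - 1); KS p-values are approximate because
# the statistic is slightly discrete.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
k, n, reps = 6, 300, 5000
probs = np.full(k, 1 / k)          # fully specified null: uniform over k cells
expected = n * probs

stat = np.empty(reps)
for i in range(reps):
    counts = rng.multinomial(n, probs)
    stat[i] = np.sum((counts - expected) ** 2 / expected)

# Compare simulated statistics to candidate chi-squared distributions.
for df in (k - 2, k - 1):
    ks = stats.kstest(stat, stats.chi2(df).cdf)
    print(f"df={df}: KS statistic {ks.statistic:.3f}, p-value {ks.pvalue:.3g}")
# Expect a good fit at df = k - 1 = 5 and a poor one at df = 4.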
q-n-a  overflow  stats  data-science  concept  jargon  explanation  methodology  things  nibble  degrees-of-freedom  clarity  curiosity  manifolds  dimensionality  ground-up  intricacy  hypothesis-testing  examples  list  ML-MAP-E  gotchas  grokkability-clarity 
january 2017 by nhaliday
Information Processing: How Brexit was won, and the unreasonable effectiveness of physicists
‘If you don’t get this elementary, but mildly unnatural, mathematics of elementary probability into your repertoire, then you go through a long life like a one-legged man in an ass-kicking contest. You’re giving a huge advantage to everybody else. One of the advantages of a fellow like Buffett … is that he automatically thinks in terms of decision trees and the elementary math of permutations and combinations… It’s not that hard to learn. What is hard is to get so you use it routinely almost everyday of your life. The Fermat/Pascal system is dramatically consonant with the way that the world works. And it’s fundamental truth. So you simply have to have the technique…

‘One of the things that influenced me greatly was studying physics… If I were running the world, people who are qualified to do physics would not be allowed to elect out of taking it. I think that even people who aren’t [expecting to] go near physics and engineering learn a thinking system in physics that is not learned so well anywhere else… The tradition of always looking for the answer in the most fundamental way available – that is a great tradition.’ --- Charlie Munger, Warren Buffet’s partner.

...

If you want to make big improvements in communication, my advice is – hire physicists, not communications people from normal companies, and never believe what advertising companies tell you about ‘data’ unless you can independently verify it. Physics, mathematics, and computer science are domains in which there are real experts, unlike macro-economic forecasting which satisfies neither of the necessary conditions – 1) enough structure in the information to enable good predictions, 2) conditions for good fast feedback and learning. Physicists and mathematicians regularly invade other fields but other fields do not invade theirs so we can see which fields are hardest for very talented people. It is no surprise that they can successfully invade politics and devise things that rout those who wrongly think they know what they are doing. Vote Leave paid very close attention to real experts. ...

More important than technology is the mindset – the hard discipline of obeying Richard Feynman’s advice: ‘The most important thing is not to fool yourself and you are the easiest person to fool.’ They were a hard floor on ‘fooling yourself’ and I empowered them to challenge everybody including me. They saved me from many bad decisions even though they had zero experience in politics and they forced me to change how I made important decisions like what got what money. We either operated scientifically or knew we were not, which is itself very useful knowledge. (One of the things they did was review the entire literature to see what reliable studies have been done on ‘what works’ in politics and what numbers are reliable.) Charlie Munger is one half of the most successful investment partnership in world history. He advises people – hire physicists. It works and the real prize is not the technology but a culture of making decisions in a rational way and systematically avoiding normal ways of fooling yourself as much as possible. This is very far from normal politics.
albion  hsu  scitariat  politics  strategy  tactics  recruiting  stories  reflection  britain  brexit  data-science  physics  interdisciplinary  impact  arbitrage  spock  discipline  clarity  lens  thick-thin  quotes  commentary  tetlock  meta:prediction  wonkish  complex-systems  intricacy  systematic-ad-hoc  realness  current-events  info-dynamics  unaffiliated  grokkability-clarity 
january 2017 by nhaliday
soft question - Thinking and Explaining - MathOverflow
- good question from Bill Thurston
- great answers by Terry Tao, fedja, Minhyong Kim, gowers, etc.

Terry Tao:
- symmetry as blurring/vibrating/wobbling, scale invariance
- anthropomorphization, adversarial perspective for estimates/inequalities/quantifiers, spending/economy

fedja walks through his thought process from another answer

Minhyong Kim: anthropology of mathematical philosophizing

Per Vognsen: normality as isotropy
comment: conjugate subgroup gHg^-1 ~ "H but somewhere else in G"

gowers: hidden things in basic mathematics/arithmetic
comment by Ryan Budney: x sin(x) via x -> (x, sin(x)), (x, y) -> xy
I kinda get what he's talking about but needed to use Mathematica to get the initial visualization down.
To remind myself later:
- xy can be easily visualized by juxtaposing the two parabolae x^2 and -x^2 diagonally
- x sin(x) can be visualized along that surface by moving your finger along the line (x, 0) but adding some oscillations in y direction according to sin(x)
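
A rough matplotlib stand-in for that Mathematica picture (my sketch; the ranges and styling are arbitrary): draw the surface z = xy and the curve x -> (x, sin(x), x·sin(x)) lying on it.

# Surface for (x, y) -> xy, with the x*sin(x) curve riding on it.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-10, 10, 200)
X, Y = np.meshgrid(x, np.linspace(-1.5, 1.5, 100))
Z = X * Y                                # the multiplication surface

t = np.linspace(-10, 10, 800)
curve = (t, np.sin(t), t * np.sin(t))    # x -> (x, sin x), then multiplied

ax = plt.figure().add_subplot(projection="3d")
ax.plot_surface(X, Y, Z, alpha=0.3)
ax.plot(*curve, color="red", linewidth=2)
ax.set_xlabel("x"); ax.set_ylabel("sin(x)"); ax.set_zlabel("x*sin(x)")
plt.show()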
q-n-a  soft-question  big-list  intuition  communication  teaching  math  thinking  writing  thurston  lens  overflow  synthesis  hi-order-bits  👳  insight  meta:math  clarity  nibble  giants  cartoons  gowers  mathtariat  better-explained  stories  the-trenches  problem-solving  homogeneity  symmetry  fedja  examples  philosophy  big-picture  vague  isotropy  reflection  spatial  ground-up  visual-understanding  polynomials  dimensionality  math.GR  worrydream  scholar  🎓  neurons  metabuch  yoga  retrofit  mental-math  metameta  wisdom  wordlessness  oscillation  operational  adversarial  quantifiers-sums  exposition  explanation  tricki  concrete  s:***  manifolds  invariance  dynamical  info-dynamics  cool  direction  elegance  heavyweights  analysis  guessing  grokkability-clarity  technical-writing 
january 2017 by nhaliday
The Mythological Machinations of The Compensation/Productivity Gap – New River Investments – Medium
We in fact find that the surplus productivity growth did not flow to the corporate sector, or even to the very rich, but in fact flowed to the very poor.

https://twitter.com/pseudoerasmus/status/852585347980636160
@PatrickIber an old chestnut. statistical illusion. wages deflated by CPI. productivity/output deflated by PPI or output deflator
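
Toy arithmetic for the deflator point (my numbers, purely illustrative): identical nominal growth in pay and in output per hour still produces an apparent "gap" if pay is deflated by a faster-growing price index than output.

# Same nominal growth, different deflators, spurious gap.
years = 40
nominal_growth = 1.04          # same nominal growth for both series
cpi_growth = 1.025             # hypothetical consumer-price inflation
ppi_growth = 1.015             # hypothetical output-deflator inflation

real_wage = nominal_growth**years / cpi_growth**years
real_productivity = nominal_growth**years / ppi_growth**years

print(f"real wage index:         {real_wage:.2f}")
print(f"real productivity index: {real_productivity:.2f}")
print(f"apparent 'gap':          {real_productivity / real_wage:.2f}x")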

pseudoerasmus: "would have voted for Bernie were I an american"
org:med  data  analysis  contrarianism  econ-productivity  economics  trends  macro  supply-demand  compensation  distribution  inequality  winner-take-all  malaise  regularizer  money  stagnation  multi  econotariat  pseudoE  broad-econ  twitter  social  commentary  discussion  debate  critique  growth-econ  econometrics  econ-metrics  history  mostly-modern  labor  intricacy  clarity  politics  2016-election  piketty  stylized-facts  vampire-squid  gotchas  time-series  grokkability-clarity 
january 2017 by nhaliday
Overcoming Bias : Chip Away At Hard Problems
One of the most common ways that wannabe academics fail is by failing to sufficiently focus on a few topics of interest to academia. Many of them become amateur intellectuals, people who think and write more as a hobby, and less to gain professional rewards via institutions like academia, media, and business. Such amateurs are often just as smart and hard-working as professionals, and they can more directly address the topics that interest them. Professionals, in contrast, must specialize more, have less freedom to pick topics, and must try harder to impress others, which encourages the use of more difficult robust/rigorous methods.

You might think their added freedom would result in amateurs contributing more to intellectual progress, but in fact they contribute less. Yes, amateurs can and do make more initial progress when new topics arise suddenly far from topics where established expert institutions have specialized. But then over time amateurs blow their lead by focusing less and relying on easier more direct methods. They rely more on informal conversation as analysis method, they prefer personal connections over open competitions in choosing people, and they rely more on a perceived consensus among a smaller group of fellow enthusiasts. As a result, their contributions just don’t appeal as widely or as long.
ratty  postrat  culture  academia  science  epistemic  hanson  frontier  contrarianism  thick-thin  long-term  regularizer  strategy  impact  essay  subculture  meta:rhetoric  aversion  discipline  curiosity  rigor  rationality  rat-pack  🤖  success  2016  farmers-and-foragers  exploration-exploitation  low-hanging  clarity  vague  🦉  optimate  systematic-ad-hoc  metameta  s:***  discovery  focus  info-dynamics  hari-seldon  grokkability-clarity 
december 2016 by nhaliday
Fact Posts: How and Why
The most useful thinking skill I've taught myself, which I think should be more widely practiced, is writing what I call "fact posts." I write a bunch of these on my blog. (I write fact posts about pregnancy and childbirth here.)

To write a fact post, you start with an empirical question, or a general topic. Something like "How common are hate crimes?" or "Are epidurals really dangerous?" or "What causes manufacturing job loss?"

It's okay if this is a topic you know very little about. This is an exercise in original seeing and showing your reasoning, not finding the official last word on a topic or doing the best analysis in the world.

Then you open up a Google doc and start taking notes.

You look for quantitative data from conventionally reliable sources. CDC data for incidences of diseases and other health risks in the US; WHO data for global health issues; Bureau of Labor Statistics data for US employment; and so on. Published scientific journal articles, especially from reputable journals and large randomized studies.

You explicitly do not look for opinion, even expert opinion. You avoid news, and you're wary of think-tank white papers. You're looking for raw information. You are taking a sola scriptura approach, for better and for worse.

And then you start letting the data show you things.

You see things that are surprising or odd, and you note that.

You see facts that seem to be inconsistent with each other, and you look into the data sources and methodology until you clear up the mystery.

You orient towards the random, the unfamiliar, the things that are totally unfamiliar to your experience. One of the major exports of Germany is valves? When was the last time I even thought about valves? Why valves, what do you use valves in? OK, show me a list of all the different kinds of machine parts, by percent of total exports.

And so, you dig in a little bit, to this part of the world that you hadn't looked at before. You cultivate the ability to spin up a lightweight sort of fannish obsessive curiosity when something seems like it might be a big deal.

And you take casual notes and impressions (though keeping track of all the numbers and their sources in your notes).

You do a little bit of arithmetic to compare things to familiar reference points. How does this source of risk compare to the risk of smoking or going horseback riding? How does the effect size of this drug compare to the effect size of psychotherapy?

You don't really want to do statistics. You might take percents, means, standard deviations, maybe a Cohen's d here and there, but nothing fancy. You're just trying to figure out what's going on.
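
For reference, the kind of throwaway calculation meant here (a Cohen's d, say) is only a few lines; this helper and its numbers are hypothetical, not from the post.

# Back-of-the-envelope standardized mean difference.
import numpy as np

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

print(cohens_d([5.1, 4.8, 5.6, 5.0], [4.2, 4.5, 4.1, 4.7]))  # made-up numbers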

It's often a good idea to rank things by raw scale. What is responsible for the bulk of deaths, the bulk of money moved, etc? What is big? Then pay attention more to things, and ask more questions about things, that are big. (Or disproportionately high-impact.)

You may find that this process gives you contrarian beliefs, but often you won't, you'll just have a strongly fact-based assessment of why you believe the usual thing.
ratty  lesswrong  essay  rhetoric  meta:rhetoric  epistemic  thinking  advice  street-fighting  scholar  checklists  🤖  spock  writing  2016  info-foraging  rat-pack  clarity  systematic-ad-hoc  bounded-cognition  info-dynamics  let-me-see  nitty-gritty  core-rats  evidence-based  truth  grokkability-clarity 
december 2016 by nhaliday
Information Processing: Search results for compressed sensing
https://www.unz.com/jthompson/the-hsu-boundary/
http://infoproc.blogspot.com/2017/09/phase-transitions-and-genomic.html
Added: Here are comments from "Donoho-Student":
Donoho-Student says:
September 14, 2017 at 8:27 pm GMT • 100 Words

The Donoho-Tanner transition describes the noise-free (h2=1) case, which has a direct analog in the geometry of polytopes.

The n = 30s result from Hsu et al. (specifically the value of the coefficient, 30, when p is the appropriate number of SNPs on an array and h2 = 0.5) is obtained via simulation using actual genome matrices, and is original to them. (There is no simple formula that gives this number.) The D-T transition had only been established in the past for certain classes of matrices, like random matrices with specific distributions. Those results cannot be immediately applied to genomes.

The estimate that s is (order of magnitude) 10k is also a key input.

I think Hsu refers to n = 1 million instead of 30 * 10k = 300k because the effective SNP heritability of IQ might be less than h2 = 0.5 — there is noise in the phenotype measurement, etc.

Donoho-Student says:
September 15, 2017 at 11:27 am GMT • 200 Words

Lasso is a common statistical method but most people who use it are not familiar with the mathematical theorems from compressed sensing. These results give performance guarantees and describe phase transition behavior, but because they are rigorous theorems they only apply to specific classes of sensor matrices, such as simple random matrices. Genomes have correlation structure, so the theorems do not directly apply to the real world case of interest, as is often true.

What the Hsu paper shows is that the exact D-T phase transition appears in the noiseless (h2 = 1) problem using genome matrices, and a smoothed version appears in the problem with realistic h2. These are new results, as is the prediction for how much data is required to cross the boundary. I don’t think most gwas people are familiar with these results. If they did understand the results they would fund/design adequately powered studies capable of solving lots of complex phenotypes, medical conditions as well as IQ, that have significant h2.

Most people who use lasso, as opposed to people who prove theorems, are not even aware of the D-T transition. Even most people who prove theorems have followed the Candes-Tao line of attack (restricted isometry property) and don’t think much about D-T. Although D eventually proved some things about the phase transition using high dimensional geometry, it was initially discovered via simulation using simple random matrices.
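
For intuition, a toy sparse-recovery run with scikit-learn's Lasso on a simple random design (my sketch: toy values of p, s, and alpha, noiseless y standing in for h2 = 1, and nothing like a real correlated genome matrix).

# Recovery error drops sharply once n is a modest multiple of the sparsity s,
# which is the flavor of the Donoho-Tanner phase transition.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
p, s = 2000, 20                      # predictors and true sparsity (toy values)
beta = np.zeros(p)
beta[rng.choice(p, s, replace=False)] = rng.normal(0, 1, s)

for n in (5 * s, 15 * s, 30 * s):    # sample sizes as multiples of s
    X = rng.normal(size=(n, p))
    y = X @ beta                      # noiseless case
    fit = Lasso(alpha=0.01, max_iter=50000).fit(X, y)
    err = np.linalg.norm(fit.coef_ - beta) / np.linalg.norm(beta)
    print(f"n = {n:4d}: relative recovery error {err:.3f}")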
hsu  list  stream  genomics  genetics  concept  stats  methodology  scaling-up  scitariat  sparsity  regression  biodet  bioinformatics  norms  nibble  compressed-sensing  applications  search  ideas  multi  albion  behavioral-gen  iq  state-of-art  commentary  explanation  phase-transition  measurement  volo-avolo  regularization  levers  novelty  the-trenches  liner-notes  clarity  random-matrices  innovation  high-dimension  linear-models  grokkability-clarity 
november 2016 by nhaliday
The Third Sex | West Hunter
- eusociality and humans
- chromosome morphs
- green-beards and asabiya
- MHC/HLA and mating
- white-throated sparrows

Strategies: https://westhunt.wordpress.com/2017/12/27/strategies/
There is a qualitative difference between being XY and X0 (Turner’s syndrome). Being XY, a guy, is the physical embodiment of an evolutionary strategy: a certain genetic pattern that has a way of making more copies of itself. It’s a complex strategy, but that’s what it is. X0 people are sterile: they don’t generate more X0 individuals. Not every XY individual succeeds in reproducing, any more than every maple seed starts a new maple tree – but on average, it works. An X0 individual is the result of noise, errors in meiosis: Turner’s syndrome isn’t a strategy. In the same way, someone with Down’s syndrome isn’t Nature’s way of producing more people with Down’s syndrome.

Parenthetically, being a guy that tries to reproduce is a strategy. Actually reproducing is a successful outcome of that strategy. Similarly, being an alpha dude in a polygynous species like elephant seals is not a strategy: trying to be an alpha dude is the strategy. I see people confuse those two things all the time.

...

Natural selection tends to make physical embodiments of a successful reproductive strategy common. So stuff like Down’s syndrome, Turner’s syndrome, androgen insensitivity, etc, are all rare. Successful evolutionary strategies usually involve actually getting things done: so there is a tendency for natural selection to develop and optimize various useful abilities, like walking and talking and thinking. All part of the strategy. Many non-strategies [like Downs or Fragile X] mess up those abilities

...

Is there any evidence for alternate evolutionary strategies in humans, other than just male and female? Not really, so far. For example, schizophrenia looks more like noise, sand in the gears. Not much of the schiz genetic variance shows up in GWAS samples: it looks like it’s mostly driven by rare variants – genetic load. There may actually be some truth to the notion that happy families are all alike.

So, is sex a spectrum in humans? No: obviously not. Two basic strategies, plus errors.

Why would a geneticist be unable to make the distinction between an evolutionary strategy and an error of development (i.e. caused by replication errors of pathogens)? Well, the average geneticist doesn’t know much evolutionary biology. And being embedded in a university, the current replacement for old-fashioned booby hatches, he’s subject to pressures that reward him for saying stupid things. and of course some people are pre-adapted to saying stupid things.

https://westhunt.wordpress.com/2017/12/27/strategies/#comment-98964
My whole point was that you can explain the qualitative difference without being in any way teleological. You’d do better to think about positive igon-values than Aristotle.
bio  sapiens  speculation  evolution  society  insight  west-hunter  🌞  big-picture  sexuality  sex  immune  scitariat  ideas  s:*  cohesion  CRISPR  genetics  genomics  parasites-microbiome  deep-materialism  scifi-fantasy  tribalism  us-them  organizing  multi  signal-noise  error  sanctity-degradation  aphorism  rant  higher-ed  academia  social-science  westminster  truth  equilibrium  strategy  disease  psychiatry  genetic-load  QTL  poast  volo-avolo  explanation  philosophy  the-classics  big-peeps  thinking  clarity  distribution  selection  intricacy  conceptual-vocab  biotech  enhancement  new-religion  coordination  cooperate-defect  axelrod  EGT  population-genetics  interests  darwinian  telos-atelos  sociality  metabuch  essence-existence  forms-instances  game-theory  grokkability-clarity  number  uniqueness  degrees-of-freedom 
october 2016 by nhaliday
Can It Be Wrong To Crystallize Patterns? | Slate Star Codex
So provisionally I’m not sure there’s such a thing as crystallizing a pattern and being wrong to do so. You can crystallize patterns in such a way that it ends out misleading people who were already at risk of being misled – like the “ley lines” and “international Jewry” examples – and in practice this is a HUGE HUGE problem. But it seems to me that if you’re good enough at sorting through connotations to handle it that crystallization is usually a good idea.
thinking  rationality  yvain  ssc  essay  reflection  things  insight  lesswrong  critique  ratty  epistemic  map-territory  clarity  error  vague  systematic-ad-hoc  info-dynamics  grokkability-clarity 
october 2016 by nhaliday
A Fervent Defense of Frequentist Statistics - Less Wrong
Short summary. This essay makes many points, each of which I think is worth reading, but if you are only going to understand one point I think it should be “Myth 5″ below, which describes the online learning framework as a response to the claim that frequentist methods need to make strong modeling assumptions. Among other things, online learning allows me to perform the following remarkable feat: if I’m betting on horses, and I get to place bets after watching other people bet but before seeing which horse wins the race, then I can guarantee that after a relatively small number of races, I will do almost as well overall as the best other person, even if the number of other people is very large (say, 1 billion), and their performance is correlated in complicated ways.
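
The guarantee behind that horse-betting claim comes from algorithms like exponential weights ("Hedge"). A minimal sketch of it (my code, not the essay's, with made-up losses): after T rounds its total loss trails the best of N experts by roughly sqrt(T·ln N), with no probabilistic assumptions on how the losses are generated.

# Exponential-weights forecaster with its standard regret bound.
import numpy as np

def hedge(loss_matrix, eta):
    """loss_matrix[t, i] = loss of expert i in round t, values in [0, 1]."""
    T, N = loss_matrix.shape
    weights = np.ones(N)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()          # bet proportionally to weights
        total_loss += probs @ loss_matrix[t]     # expected loss this round
        weights *= np.exp(-eta * loss_matrix[t]) # downweight bad experts
    return total_loss

rng = np.random.default_rng(0)
T, N = 2000, 50
losses = rng.uniform(size=(T, N))                # arbitrary loss sequence
eta = np.sqrt(8 * np.log(N) / T)
regret = hedge(losses, eta) - losses.sum(axis=0).min()
print(f"regret vs best expert: {regret:.1f}  (bound ~ {np.sqrt(T * np.log(N) / 2):.1f})")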

If you’re only going to understand two points, then also read about the frequentist version of Solomonoff induction, which is described in “Myth 6″.

...

If you are like me from, say, two years ago, you are firmly convinced that Bayesian methods are superior and that you have knockdown arguments in favor of this. If this is the case, then I hope this essay will give you an experience that I myself found life-altering: the experience of having a way of thinking that seemed unquestionably true slowly dissolve into just one of many imperfect models of reality. This experience helped me gain more explicit appreciation for the skill of viewing the world from many different angles, and of distinguishing between a very successful paradigm and reality.

If you are not like me, then you may have had the experience of bringing up one of many reasonable objections to normative Bayesian epistemology, and having it shot down by one of many “standard” arguments that seem wrong but not for easy-to-articulate reasons. I hope to lend some reprieve to those of you in this camp, by providing a collection of “standard” replies to these standard arguments.
bayesian  philosophy  stats  rhetoric  advice  debate  critique  expert  lesswrong  commentary  discussion  regularizer  essay  exposition  🤖  aphorism  spock  synthesis  clever-rats  ratty  hi-order-bits  top-n  2014  acmtariat  big-picture  acm  iidness  online-learning  lens  clarity  unit  nibble  frequentist  s:**  expert-experience  subjective-objective  grokkability-clarity 
september 2016 by nhaliday
Thoughts on graduate school | Secret Blogging Seminar
I’ll organize my thoughts around the following ideas.

- Prioritize reading readable sources
- Build narratives
- Study other mathematicians' taste
- Do one early side project
- Find a clump of other graduate students
- Cast a wide net when looking for an advisor
- Don’t just work on one thing
- Don’t graduate until you have to
reflection  math  grad-school  phd  advice  expert  strategy  long-term  growth  🎓  aphorism  learning  scholar  hi-order-bits  tactics  mathtariat  metabuch  org:bleg  nibble  the-trenches  big-picture  narrative  meta:research  info-foraging  skeleton  studying  prioritizing  s:*  info-dynamics  chart  expert-experience  explore-exploit  meta:reading  grokkability  grokkability-clarity 
september 2016 by nhaliday
natural language processing blog: Debugging machine learning
I've been thinking, mostly in the context of teaching, about how to specifically teach debugging of machine learning. Personally I find it very helpful to break things down in terms of the usual error terms: Bayes error (how much error is there in the best possible classifier), approximation error (how much do you pay for restricting to some hypothesis class), estimation error (how much do you pay because you only have finite samples), optimization error (how much do you pay because you didn't find a global optimum to your optimization problem). I've generally found that trying to isolate errors to one of these pieces, and then debugging that piece in particular (eg., pick a better optimizer versus pick a better hypothesis class) has been useful.
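
A rough way to act on that decomposition (my sketch, assuming scikit-learn): compare a small and a large hypothesis class on training vs. held-out data. A large train-to-test gap points at estimation error; high error even on the training set for the richer class points at approximation (or optimization) error.

# Toy error-attribution comparison across hypothesis classes.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for name, model in [("linear (small class)", LogisticRegression(max_iter=2000)),
                    ("forest (big class)  ", RandomForestClassifier(random_state=0))]:
    model.fit(X_tr, y_tr)
    print(f"{name}  train err {1 - model.score(X_tr, y_tr):.3f}  "
          f"test err {1 - model.score(X_te, y_te):.3f}")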
machine-learning  debugging  checklists  best-practices  pragmatic  expert  init  system-design  data-science  acmtariat  error  engineering  clarity  intricacy  model-selection  org:bleg  nibble  noise-structure  signal-noise  knowledge  accuracy  expert-experience  checking  grokkability-clarity  methodology 
september 2016 by nhaliday
The Elephant in the Brain: Hidden Motives in Everday Life
https://www.youtube.com/watch?v=V84_F1QWdeU

A Book Response Prediction: https://www.overcomingbias.com/2017/03/a-book-response-prediction.html
I predict that one of the most common responses will be something like “extraordinary claims require extraordinary evidence.” While the evidence we offer is suggestive, for claims as counterintuitive as ours on topics as important as these, evidence should be held to a higher standard than the one our book meets. We should shut up until we can prove our claims.

I predict that another of the most common responses will be something like “this is all well known.” Wise observers have known and mentioned such things for centuries. Perhaps foolish technocrats who only read in their narrow literatures are ignorant of such things, but our book doesn’t add much to what true scholars and thinkers have long known.

https://nintil.com/2018/01/16/this-review-is-not-about-reviewing-the-elephant-in-the-brain/
http://www.overcomingbias.com/2018/01/a-long-review-of-elephant-in-the-brain.html
https://nintil.com/2018/01/28/ad-hoc-explanations-a-rejoinder-to-hanson/

Elephant in the Brain on Religious Hypocrisy:
http://econlog.econlib.org/archives/2018/01/elephant_in_the.html
http://www.overcomingbias.com/2018/01/caplan-critiques-our-religion-chapter.html
books  postrat  simler  hanson  impro  anthropology  insight  todo  X-not-about-Y  signaling  🦀  new-religion  psychology  contrarianism  👽  ratty  rationality  hidden-motives  2017  s:**  p:null  ideas  impetus  multi  video  presentation  unaffiliated  review  summary  education  higher-ed  human-capital  propaganda  nationalism-globalism  civic  domestication  medicine  meta:medicine  healthcare  economics  behavioral-econ  supply-demand  roots  questions  charity  hypocrisy  peter-singer  big-peeps  philosophy  morality  ethics  formal-values  cog-psych  evopsych  thinking  conceptual-vocab  intricacy  clarity  accuracy  truth  is-ought  realness  religion  theos  christianity  islam  cultural-dynamics  within-without  neurons  EEA  analysis  article  links  meta-analysis  survey  judaism  compensation  labor  correlation  endogenous-exogenous  causation  critique  politics  government  polisci  political-econ  emotion  health  study  list  class  art  status  effective-altruism  evidence-based  epistemic  error  contradiction  prediction  culture  aphorism  quotes  discovery  no 
august 2016 by nhaliday