nhaliday + intuition   97

The Existential Risk of Math Errors - Gwern.net
How big is this upper bound? Mathematicians have often made errors in proofs. But it’s rarer for ideas to be accepted for a long time and then rejected. We can divide errors into 2 basic cases corresponding to type I and type II errors:

1. Mistakes where the theorem is still true, but the proof was incorrect (type I)
2. Mistakes where the theorem was false, and the proof was also necessarily incorrect (type II)

Before someone comes up with a final answer, a mathematician may have many levels of intuition in formulating & working on the problem, but we’ll consider the final end-product where the mathematician feels satisfied that he has solved it. Case 1 is perhaps the most common case, with innumerable examples; this is sometimes due to mistakes in the proof that anyone would accept as a mistake, but many of these cases are due to changing standards of proof. For example, when David Hilbert discovered errors in Euclid’s proofs which no one noticed before, the theorems were still true, and the gaps were more due to Hilbert being a modern mathematician thinking in terms of formal systems (which of course Euclid did not think in). (David Hilbert himself turns out to be a useful example of the other kind of error: his famous list of 23 problems was accompanied by definite opinions on the outcome of each problem and sometimes timings, several of which were wrong or questionable5.) Similarly, early calculus used ‘infinitesimals’ which were sometimes treated as being 0 and sometimes treated as an indefinitely small non-zero number; this was incoherent and strictly speaking, practically all of the calculus results were wrong because they relied on an incoherent concept - but of course the results were some of the greatest mathematical work ever conducted6 and when later mathematicians put calculus on a more rigorous footing, they immediately re-derived those results (sometimes with important qualifications), and doubtless, as modern math has evolved, other fields have sometimes needed to go back and clean up their foundations and will need to in the future.7

...

Isaac Newton, incidentally, gave two proofs of the same solution to a problem in probability, one via enumeration and the other more abstract; the enumeration was correct, but the other proof was totally wrong, and this was not noticed for a long time, leading Stigler to remark:

...

TYPE I > TYPE II?
“Lefschetz was a purely intuitive mathematician. It was said of him that he had never given a completely correct proof, but had never made a wrong guess either.”
- Gian-Carlo Rota13

Case 2 is disturbing, since it is a case in which we wind up with false beliefs and also false beliefs about our beliefs (we no longer know that we don’t know). Case 2 could lead to extinction.

...

Except, errors do not seem to be evenly & randomly distributed between case 1 and case 2. There seem to be far more case 1s than case 2s, as already mentioned in the early calculus example: far more than 50% of the early calculus results were correct when checked more rigorously. Richard Hamming attributes to Ralph Boas the comment that, while editing Mathematical Reviews, “of the new results in the papers reviewed most are true but the corresponding proofs are perhaps half the time plain wrong”.

...

Gian-Carlo Rota gives us an example with Hilbert:

...

Olga labored for three years; it turned out that all mistakes could be corrected without any major changes in the statement of the theorems. There was one exception, a paper Hilbert wrote in his old age, which could not be fixed; it was a purported proof of the continuum hypothesis, you will find it in a volume of the Mathematische Annalen of the early thirties.

...

Leslie Lamport advocates for machine-checked proofs and a more rigorous style of proofs similar to natural deduction, noting that a mathematician acquaintance guesses at a broad error rate of 1/3 and that he routinely found mistakes in his own proofs and, worse, believed false conjectures30.

[more on these "structured proofs":
https://academia.stackexchange.com/questions/52435/does-anyone-actually-publish-structured-proofs
https://mathoverflow.net/questions/35727/community-experiences-writing-lamports-structured-proofs
]

We can probably add software to that list: early software engineering work found that, dismayingly, bug rates seem to be simply a function of lines of code, and one would expect diseconomies of scale. So one would expect that in going from the ~4,000 lines of code of the Microsoft DOS operating system kernel to the ~50,000,000 lines of code in Windows Server 2003 (with full systems of applications and libraries being even larger: the comprehensive Debian repository in 2007 contained ~323,551,126 lines of code) that the number of active bugs at any time would be… fairly large. Mathematical software is hopefully better, but practitioners still run into issues (eg Durán et al 2014, Fonseca et al 2017) and I don’t know of any research pinning down how buggy key mathematical systems like Mathematica are or how much published mathematics may be erroneous due to bugs. This general problem led to predictions of doom and spurred much research into automated proof-checking, static analysis, and functional languages31.

[related:
https://mathoverflow.net/questions/11517/computer-algebra-errors
I don't know any interesting bugs in symbolic algebra packages but I know a true, enlightening and entertaining story about something that looked like a bug but wasn't.

Define sinc(x) = (sin x)/x.

Someone found the following result in an algebra package: ∫₀^∞ sinc(x) dx = π/2
They then found the following results:

...

So of course when they got:

∫₀^∞ sinc(x) sinc(x/3) sinc(x/5) ⋯ sinc(x/15) dx = (467807924713440738696537864469/935615849440640907310521750000) π

hmm:
Which means that nobody knows Fourier analysis nowadays. Very sad and discouraging story... – fedja Jan 29 '10 at 18:47

--

Because the most popular systems are all commercial, they tend to guard their bug database rather closely -- making them public would seriously cut their sales. For example, for the open source project Sage (which is quite young), you can get a list of all the known bugs from this page. 1582 known issues on Feb.16th 2010 (which includes feature requests, problems with documentation, etc).

That is an order of magnitude less than the commercial systems. And it's not because it is better, it is because it is younger and smaller. It might be better, but until SAGE does a lot of analysis (about 40% of CAS bugs are there) and a fancy user interface (another 40%), it is too hard to compare.

I once ran a graduate course whose core topic was studying the fundamental disconnect between the algebraic nature of CAS and the analytic nature of what it is mostly used for. There are issues of logic -- CASes work more or less in an intensional logic, while most of analysis is stated in a purely extensional fashion. There is no well-defined 'denotational semantics' for expressions-as-functions, which strongly contributes to the deeper bugs in CASes.]
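
[quick numerical check of why the sinc pattern above breaks at sinc(x/15) -- my own sketch, Python standard library only; the "sum of reciprocals <= 1" criterion is the standard Fourier/convolution explanation of these Borwein integrals, not something spelled out in the thread:

from fractions import Fraction

# The integral of sinc(x)*sinc(x/3)*...*sinc(x/(2k+1)) over [0, inf) equals pi/2
# exactly as long as 1/3 + 1/5 + ... + 1/(2k+1) <= 1, and dips just below pi/2
# once that sum exceeds 1 -- which first happens when 1/15 joins the product.
total = Fraction(0)
for d in range(3, 17, 2):  # denominators 3, 5, ..., 15
    total += Fraction(1, d)
    verdict = "<= 1, integral is still exactly pi/2" if total <= 1 else "> 1, integral falls just short of pi/2"
    print(f"1/3 + ... + 1/{d} = {float(total):.6f}  ({verdict})")
]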

...

Should such widely-believed conjectures as P≠NP or the Riemann hypothesis turn out to be false, then because they are assumed by so many existing proofs, a far larger math holocaust would ensue38 - and our previous estimates of error rates will turn out to have been substantial underestimates. But it may be a cloud with a silver lining, if it doesn’t come at a time of danger.

https://mathoverflow.net/questions/338607/why-doesnt-mathematics-collapse-down-even-though-humans-quite-often-make-mista

more on formal methods in programming:
https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/
https://intelligence.org/2014/03/02/bob-constable/

https://softwareengineering.stackexchange.com/questions/375342/what-are-the-barriers-that-prevent-widespread-adoption-of-formal-methods
Update: measured effort
In the October 2018 issue of Communications of the ACM there is an interesting article about Formally verified software in the real world with some estimates of the effort.

Interestingly (based on OS development for military equipment), it seems that producing formally proved software requires 3.3 times more effort than with traditional engineering techniques. So it's really costly.

On the other hand, it requires 2.3 times less effort to get high security software this way than with traditionally engineered software if you add the effort to make such software certified at a high security level (EAL 7). So if you have high reliability or security requirements there is definitely a business case for going formal.

WHY DON'T PEOPLE USE FORMAL METHODS?: https://www.hillelwayne.com/post/why-dont-people-use-formal-methods/
You can see examples of how all of these look at Let’s Prove Leftpad. HOL4 and Isabelle are good examples of “independent theorem” specs, SPARK and Dafny have “embedded assertion” specs, and Coq and Agda have “dependent type” specs.6

If you squint a bit it looks like these three forms of code spec map to the three main domains of automated correctness checking: tests, contracts, and types. This is not a coincidence. Correctness is a spectrum, and formal verification is one extreme of that spectrum. As we reduce the rigour (and effort) of our verification we get simpler and narrower checks, whether that means limiting the explored state space, using weaker types, or pushing verification to the runtime. Any means of total specification then becomes a means of partial specification, and vice versa: many consider Cleanroom a formal verification technique, which primarily works by pushing code review far beyond what’s humanly possible.
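
[a toy illustration of the "embedded assertion" style of spec mentioned above -- my own Python sketch, not one of the verified Leftpad solutions; the three asserted properties are the ones the Leftpad exercise is usually taken to require:

def leftpad(c: str, n: int, s: str) -> str:
    """Pad s on the left with the character c up to total length n (no-op if s is already long enough)."""
    assert len(c) == 1, "pad character must be a single character"   # precondition
    out = c * max(n - len(s), 0) + s
    # postconditions -- the spec a prover would be asked to discharge statically:
    assert len(out) == max(n, len(s))
    assert out.endswith(s)
    assert set(out[: len(out) - len(s)]) <= {c}
    return out

assert leftpad("0", 5, "42") == "00042"
]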

...

The question, then: “is 90/95/99% correct significantly cheaper than 100% correct?” The answer is very yes. We all are comfortable saying that a codebase we’ve well-tested and well-typed is mostly correct modulo a few fixes in prod, and we’re even writing more than four lines of code a day. In fact, the vast… [more]
ratty  gwern  analysis  essay  realness  truth  correctness  reason  philosophy  math  proofs  formal-methods  cs  programming  engineering  worse-is-better/the-right-thing  intuition  giants  old-anglo  error  street-fighting  heuristic  zooming  risk  threat-modeling  software  lens  logic  inference  physics  differential  geometry  estimate  distribution  robust  speculation  nonlinearity  cost-benefit  convexity-curvature  measure  scale  trivia  cocktail  history  early-modern  europe  math.CA  rigor  news  org:mag  org:sci  miri-cfar  pdf  thesis  comparison  examples  org:junk  q-n-a  stackex  pragmatic  tradeoffs  cracker-prog  techtariat  invariance  DSL  chart  ecosystem  grokkability  heavyweights  CAS  static-dynamic  lower-bounds  complexity  tcs  open-problems  big-surf  ideas  certificates-recognition  proof-systems  PCP  mediterranean  SDP  meta:prediction  epistemic  questions  guessing  distributed  overflow  nibble  soft-question  track-record  big-list  hmm  frontier  state-of-art  move-fast-(and-break-things)  grokkability-clarity  technical-writing  trust 
july 2019 by nhaliday
Lateralization of brain function - Wikipedia
Language
Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right handed individuals.[3] While language production is left-lateralized in up to 90% of right-handers, it is more bilateral, or even right-lateralized, in approximately 50% of left-handers.[4]

Broca's area and Wernicke's area, two areas associated with the production of speech, are located in the left cerebral hemisphere for about 95% of right-handers, but about 70% of left-handers.[5]:69

Auditory and visual processing
The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally.[4] Numerical estimation, comparison and online calculation depend on bilateral parietal regions[6][7] while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing.[6][7]

...

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes".

Chaos and Order; the right and left hemispheres: https://orthosphere.wordpress.com/2018/05/23/chaos-and-order-the-right-and-left-hemispheres/
In The Master and His Emissary, Iain McGilchrist writes that a creature like a bird needs two types of consciousness simultaneously. It needs to be able to focus on something specific, such as pecking at food, while it also needs to keep an eye out for predators which requires a more general awareness of environment.

These are quite different activities. The Left Hemisphere (LH) is adapted for a narrow focus. The Right Hemisphere (RH) for the broad. The brains of human beings have the same division of function.

The LH governs the right side of the body, the RH, the left side. With birds, the left eye (RH) looks for predators, the right eye (LH) focuses on food and specifics. Since danger can take many forms and is unpredictable, the RH has to be very open-minded.

The LH is for narrow focus, the explicit, the familiar, the literal, tools, mechanism/machines and the man-made. The broad focus of the RH is necessarily more vague and intuitive and handles the anomalous, novel, metaphorical, the living and organic. The LH is high resolution but narrow, the RH low resolution but broad.

The LH exhibits unrealistic optimism and self-belief. The RH has a tendency towards depression and is much more realistic about a person’s own abilities. LH has trouble following narratives because it has a poor sense of “wholes.” In art it favors flatness, abstract and conceptual art, black and white rather than color, simple geometric shapes and multiple perspectives all shoved together, e.g., cubism. Particularly RH paintings emphasize vistas with great depth of field and thus space and time,[1] emotion, figurative painting and scenes related to the life world. In music, LH likes simple, repetitive rhythms. The RH favors melody, harmony and complex rhythms.

...

Schizophrenia is a disease of extreme LH emphasis. Since empathy is RH and the ability to notice emotional nuance facially, vocally and bodily expressed, schizophrenics tend to be paranoid and are often convinced that the real people they know have been replaced by robotic imposters. This is at least partly because they lose the ability to intuit what other people are thinking and feeling – hence they seem robotic and suspicious.

Oswald Spengler’s The Decline of the West as well as McGilchrist characterize the West as awash in phenomena associated with an extreme LH emphasis. Spengler argues that Western civilization was originally much more RH (to use McGilchrist’s categories) and that all its most significant artistic (in the broadest sense) achievements were triumphs of RH accentuation.

The RH is where novel experiences and the anomalous are processed and where mathematical, and other, problems are solved. The RH is involved with the natural, the unfamiliar, the unique, emotions, the embodied, music, humor, understanding intonation and emotional nuance of speech, the metaphorical, nuance, and social relations. It has very little speech, but the RH is necessary for processing all the nonlinguistic aspects of speaking, including body language. Understanding what someone means by vocal inflection and facial expressions is an intuitive RH process rather than explicit.

...

RH is very much the center of lived experience; of the life world with all its depth and richness. The RH is “the master” from the title of McGilchrist’s book. The LH ought to be no more than the emissary; the valued servant of the RH. However, in the last few centuries, the LH, which has tyrannical tendencies, has tried to become the master. The LH is where the ego is predominantly located. In split brain patients where the LH and the RH are surgically divided (this is done sometimes in the case of epileptic patients) one hand will sometimes fight with the other. In one man’s case, one hand would reach out to hug his wife while the other pushed her away. One hand reached for one shirt, the other another shirt. Or a patient will be driving a car and one hand will try to turn the steering wheel in the opposite direction. In these cases, the “naughty” hand is usually the left hand (RH), while the patient tends to identify herself with the right hand governed by the LH. The two hemispheres have quite different personalities.

The connection between LH and ego can also be seen in the fact that the LH is competitive, contentious, and agonistic. It wants to win. It is the part of you that hates to lose arguments.

Using the metaphor of Chaos and Order, the RH deals with Chaos – the unknown, the unfamiliar, the implicit, the emotional, the dark, danger, mystery. The LH is connected with Order – the known, the familiar, the rule-driven, the explicit, and light of day. Learning something means to take something unfamiliar and making it familiar. Since the RH deals with the novel, it is the problem-solving part. Once understood, the results are dealt with by the LH. When learning a new piece on the piano, the RH is involved. Once mastered, the result becomes a LH affair. The muscle memory developed by repetition is processed by the LH. If errors are made, the activity returns to the RH to figure out what went wrong; the activity is repeated until the correct muscle memory is developed in which case it becomes part of the familiar LH.

Science is an attempt to find Order. It would not be necessary if people lived in an entirely orderly, explicit, known world. The lived context of science implies Chaos. Theories are reductive and simplifying and help to pick out salient features of a phenomenon. They are always partial truths, though some are more partial than others. The alternative to a certain level of reductionism or partialness would be to simply reproduce the world which of course would be both impossible and unproductive. The test for whether a theory is sufficiently non-partial is whether it is fit for purpose and whether it contributes to human flourishing.

...

Analytic philosophers pride themselves on trying to do away with vagueness. To do so, they tend to jettison context which cannot be brought into fine focus. However, in order to understand things and discern their meaning, it is necessary to have the big picture, the overview, as well as the details. There is no point in having details if the subject does not know what they are details of. Such philosophers also tend to leave themselves out of the picture even when what they are thinking about has reflexive implications. John Locke, for instance, tried to banish the RH from reality. All phenomena having to do with subjective experience he deemed unreal and once remarked about metaphors, a RH phenomenon, that they are “perfect cheats.” Analytic philosophers tend to check the logic of the words on the page and not to think about what those words might say about them. The trick is for them to recognize that they and their theories, which exist in minds, are part of reality too.

The RH test for whether someone actually believes something can be found by examining his actions. If he finds that he must regard his own actions as free, and, in order to get along with other people, must also attribute free will to them and treat them as free agents, then he effectively believes in free will – no matter his LH theoretical commitments.

...

We do not know the origin of life. We do not know how or even if consciousness can emerge from matter. We do not know the nature of 96% of the matter of the universe. Clearly all these things exist. They can provide the subject matter of theories but they continue to exist as theorizing ceases or theories change. Not knowing how something is possible is irrelevant to its actual existence. An inability to explain something is ultimately neither here nor there.

If thought begins and ends with the LH, then thinking has no content – content being provided by experience (RH), and skepticism and nihilism ensue. The LH spins its wheels self-referentially, never referring back to experience. Theory assumes such primacy that it will simply outlaw experiences and data inconsistent with it; a profoundly wrong-headed approach.

...

Gödel’s Theorem proves that not everything true can be proven to be true. This means there is an ineradicable role for faith, hope and intuition in every moderately complex human intellectual endeavor. There is no one set of consistent axioms from which all other truths can be derived.

Alan Turing’s proof of the halting problem proves that there is no effective procedure for finding effective procedures. Without a mechanical decision procedure, (LH), when it comes to … [more]
gnon  reflection  books  summary  review  neuro  neuro-nitgrit  things  thinking  metabuch  order-disorder  apollonian-dionysian  bio  examples  near-far  symmetry  homo-hetero  logic  inference  intuition  problem-solving  analytical-holistic  n-factor  europe  the-great-west-whale  occident  alien-character  detail-architecture  art  theory-practice  philosophy  being-becoming  essence-existence  language  psychology  cog-psych  egalitarianism-hierarchy  direction  reason  learning  novelty  science  anglo  anglosphere  coarse-fine  neurons  truth  contradiction  matching  empirical  volo-avolo  curiosity  uncertainty  theos  axioms  intricacy  computation  analogy  essay  rhetoric  deep-materialism  new-religion  knowledge  expert-experience  confidence  biases  optimism  pessimism  realness  whole-partial-many  theory-of-mind  values  competition  reduction  subjective-objective  communication  telos-atelos  ends-means  turing  fiction  increase-decrease  innovation  creative  thick-thin  spengler  multi  ratty  hanson  complex-systems  structure  concrete  abstraction  network-s 
september 2018 by nhaliday
The Hanson-Yudkowsky AI-Foom Debate - Machine Intelligence Research Institute
How Deviant Recent AI Progress Lumpiness?: http://www.overcomingbias.com/2018/03/how-deviant-recent-ai-progress-lumpiness.html
I seem to disagree with most people working on artificial intelligence (AI) risk. While with them I expect rapid change once AI is powerful enough to replace most all human workers, I expect this change to be spread across the world, not concentrated in one main localized AI system. The efforts of AI risk folks to design AI systems whose values won’t drift might stop global AI value drift if there is just one main AI system. But doing so in a world of many AI systems at similar abilities levels requires strong global governance of AI systems, which is a tall order anytime soon. Their continued focus on preventing single system drift suggests that they expect a single main AI system.

The main reason that I understand to expect relatively local AI progress is if AI progress is unusually lumpy, i.e., arriving in unusually fewer larger packages rather than in the usual many smaller packages. If one AI team finds a big lump, it might jump way ahead of the other teams.

However, we have a vast literature on the lumpiness of research and innovation more generally, which clearly says that usually most of the value in innovation is found in many small innovations. We have also so far seen this in computer science (CS) and AI. Even if there have been historical examples where much value was found in particular big innovations, such as nuclear weapons or the origin of humans.

Apparently many people associated with AI risk, including the star machine learning (ML) researchers that they often idolize, find it intuitively plausible that AI and ML progress is exceptionally lumpy. Such researchers often say, “My project is ‘huge’, and will soon do it all!” A decade ago my ex-co-blogger Eliezer Yudkowsky and I argued here on this blog about our differing estimates of AI progress lumpiness. He recently offered Alpha Go Zero as evidence of AI lumpiness:

...

In this post, let me give another example (beyond two big lumps in a row) of what could change my mind. I offer a clear observable indicator, for which data should be available now: deviant citation lumpiness in recent ML research. One standard measure of research impact is citations; bigger lumpier developments gain more citations than smaller ones. And it turns out that the lumpiness of citations is remarkably constant across research fields! See this March 3 paper in Science:

I Still Don’t Get Foom: http://www.overcomingbias.com/2014/07/30855.html
All of which makes it look like I’m the one with the problem; everyone else gets it. Even so, I’m gonna try to explain my problem again, in the hope that someone can explain where I’m going wrong. Here goes.

“Intelligence” just means an ability to do mental/calculation tasks, averaged over many tasks. I’ve always found it plausible that machines will continue to do more kinds of mental tasks better, and eventually be better at pretty much all of them. But what I’ve found it hard to accept is a “local explosion.” This is where a single machine, built by a single project using only a tiny fraction of world resources, goes in a short time (e.g., weeks) from being so weak that it is usually beat by a single human with the usual tools, to so powerful that it easily takes over the entire world. Yes, smarter machines may greatly increase overall economic growth rates, and yes such growth may be uneven. But this degree of unevenness seems implausibly extreme. Let me explain.

If we count by economic value, humans now do most of the mental tasks worth doing. Evolution has given us a brain chock-full of useful well-honed modules. And the fact that most mental tasks require the use of many modules is enough to explain why some of us are smarter than others. (There’d be a common “g” factor in task performance even with independent module variation.) Our modules aren’t that different from those of other primates, but because ours are different enough to allow lots of cultural transmission of innovation, we’ve out-competed other primates handily.

We’ve had computers for over seventy years, and have slowly built up libraries of software modules for them. Like brains, computers do mental tasks by combining modules. An important mental task is software innovation: improving these modules, adding new ones, and finding new ways to combine them. Ideas for new modules are sometimes inspired by the modules we see in our brains. When an innovation team finds an improvement, they usually sell access to it, which gives them resources for new projects, and lets others take advantage of their innovation.

...

In Bostrom’s graph above the line for an initially small project and system has a much higher slope, which means that it becomes in a short time vastly better at software innovation. Better than the entire rest of the world put together. And my key question is: how could it plausibly do that? Since the rest of the world is already trying the best it can to usefully innovate, and to abstract to promote such innovation, what exactly gives one small project such a huge advantage to let it innovate so much faster?

...

In fact, most software innovation seems to be driven by hardware advances, instead of innovator creativity. Apparently, good ideas are available but must usually wait until hardware is cheap enough to support them.

Yes, sometimes architectural choices have wider impacts. But I was an artificial intelligence researcher for nine years, ending twenty years ago, and I never saw an architecture choice make a huge difference, relative to other reasonable architecture choices. For most big systems, overall architecture matters a lot less than getting lots of detail right. Researchers have long wandered the space of architectures, mostly rediscovering variations on what others found before.

Some hope that a small project could be much better at innovation because it specializes in that topic, and much better understands new theoretical insights into the basic nature of innovation or intelligence. But I don’t think those are actually topics where one can usefully specialize much, or where we’ll find much useful new theory. To be much better at learning, the project would instead have to be much better at hundreds of specific kinds of learning. Which is very hard to do in a small project.

What does Bostrom say? Alas, not much. He distinguishes several advantages of digital over human minds, but all software shares those advantages. Bostrom also distinguishes five paths: better software, brain emulation (i.e., ems), biological enhancement of humans, brain-computer interfaces, and better human organizations. He doesn’t think interfaces would work, and sees organizations and better biology as only playing supporting roles.

...

Similarly, while you might imagine someday standing in awe in front of a super intelligence that embodies all the power of a new age, superintelligence just isn’t the sort of thing that one project could invent. As “intelligence” is just the name we give to being better at many mental tasks by using many good mental modules, there’s no one place to improve it. So I can’t see a plausible way one project could increase its intelligence vastly faster than could the rest of the world.

Takeoff speeds: https://sideways-view.com/2018/02/24/takeoff-speeds/
Futurists have argued for years about whether the development of AGI will look more like a breakthrough within a small group (“fast takeoff”), or a continuous acceleration distributed across the broader economy or a large firm (“slow takeoff”).

I currently think a slow takeoff is significantly more likely. This post explains some of my reasoning and why I think it matters. Mostly the post lists arguments I often hear for a fast takeoff and explains why I don’t find them compelling.

(Note: this is not a post about whether an intelligence explosion will occur. That seems very likely to me. Quantitatively I expect it to go along these lines. So e.g. while I disagree with many of the claims and assumptions in Intelligence Explosion Microeconomics, I don’t disagree with the central thesis or with most of the arguments.)
ratty  lesswrong  subculture  miri-cfar  ai  risk  ai-control  futurism  books  debate  hanson  big-yud  prediction  contrarianism  singularity  local-global  speed  speedometer  time  frontier  distribution  smoothness  shift  pdf  economics  track-record  abstraction  analogy  links  wiki  list  evolution  mutation  selection  optimization  search  iteration-recursion  intelligence  metameta  chart  analysis  number  ems  coordination  cooperate-defect  death  values  formal-values  flux-stasis  philosophy  farmers-and-foragers  malthus  scale  studying  innovation  insight  conceptual-vocab  growth-econ  egalitarianism-hierarchy  inequality  authoritarianism  wealth  near-far  rationality  epistemic  biases  cycles  competition  arms  zero-positive-sum  deterrence  war  peace-violence  winner-take-all  technology  moloch  multi  plots  research  science  publishing  humanity  labor  marginal  urban-rural  structure  composition-decomposition  complex-systems  gregory-clark  decentralized  heavy-industry  magnitude  multiplicative  endogenous-exogenous  models  uncertainty  decision-theory  time-prefer 
april 2018 by nhaliday
The Function of Reason | Edge.org
https://www.edge.org/conversation/hugo_mercier-the-argumentative-theory

How Social Is Reason?: http://www.overcomingbias.com/2017/08/how-social-is-reason.html

https://gnxp.nofe.me/2017/07/02/open-thread-732017/
Reading The Enigma of Reason. Pretty good so far. Not incredibly surprising to me so far. To be clear, their argument is somewhat orthogonal to the whole ‘rationality’ debate you may be familiar with from Daniel Kahneman and Amos Tversky’s work (e.g., see Heuristics and Biases).

One of the major problems in analysis is that rationality, reflection and ratiocination, are slow and error prone. To get a sense of that, just read ancient Greek science. Eratosthenes may have calculated to within 1% of the true circumference of the world, but Aristotle’s speculations on the nature of reproduction were rather off.

You may be as clever as Eratosthenes, but most people are not. But you probably accept that the world is round and 24,901 miles around. If you are not American you probably are vague on miles anyway. But you know what the social consensus is, and you accept it because it seems reasonable.

One of the points in cultural evolution work is that a lot of the time rather than relying on your own intuition and or reason, it is far more effective and cognitively cheaper to follow social norms of your ingroup. I only bring this up because unfortunately many pathologies of our political and intellectual world today are not really pathologies. That is, they’re not bugs, but features.

https://gnxp.nofe.me/2017/07/23/open-thread-07232017/
Finished The Enigma of Reason. The basic thesis that reasoning is a way to convince people after you’ve already come to a conclusion, that is, rationalization, was already one I shared. That makes sense since one of the coauthors, Dan Sperber, has been influential in the “naturalistic” school of anthropology. If you’ve read books like In Gods We Trust, The Enigma of Reason goes fast. But it is important to note that the cognitive anthropology perspective is useful in things besides religion. I’m thinking in particular of politics.

https://gnxp.nofe.me/2017/07/30/the-delusion-of-reasons-empire/
My point here is that many of our beliefs are arrived at in an intuitive manner, and we find reasons to justify those beliefs. One of the core insights you’ll get from The Enigma of Reason is that rationalization isn’t that big of a misfire or abuse of our capacities. It’s probably just a natural outcome for what and how we use reason in our natural ecology.

Mercier and Sperber contrast their “interactionist” model of what reason is for with an “intellectualist” model. The intellectualist model is rather straightforward. It is one where individual reasoning capacities exist so that one may make correct inferences about the world around us, often using methods that mimic those in abstract elucidated systems such as formal logic or Bayesian reasoning. When reasoning doesn’t work right, it’s because people aren’t using it for its right reasons. It can be entirely solitary because the tools don’t rely on social input or opinion.

The interactionist model holds that reasoning exists because it is a method of persuasion within social contexts. It is important here to note that the authors do not believe that reasoning is simply a tool for winning debates. That is, increasing your status in a social game. Rather, their overall thesis seems to be in alignment with the idea that cognition of reasoning properly understood is a social process. In this vein they offer evidence of how juries may be superior to judges, and the general examples you find in the “wisdom of the crowds” literature. Overall the authors make a strong case for the importance of diversity of good-faith viewpoints, because they believe that the truth on the whole tends to win out in dialogic formats (that is, if there is a truth; they are rather unclear and muddy about normative disagreements and how those can be resolved).

The major issues tend to crop up when reasoning is used outside of its proper context. One of the literature examples, which you are surely familiar with, in The Enigma of Reason is a psychological experiment where there are two conditions, and the researchers vary the conditions and note wide differences in behavior. In particular, the experiment where psychologists put subjects into a room where someone out of view is screaming for help. When they are alone, they quite often go to see what is wrong immediately. In contrast, when there is a confederate of the psychologists in the room who ignores the screaming, people also tend to ignore the screaming.

The researchers know the cause of the change in behavior. It’s the introduction of the confederate and that person’s behavior. But the subjects when interviewed give a wide range of plausible and possible answers. In other words, they are rationalizing their behavior when called to justify it in some way. This is not entirely unexpected; we all know that people are very good at coming up with answers to explain their behavior (often in the best light possible). But that doesn’t mean they truly understand their internal reasons, which seem to be more about intuition.

But much of The Enigma of Reason also recounts how bad people are at coming up with coherent and well thought out rationalizations. That is, their “reasons” tend to be ad hoc and weak. We’re not very good at formal logic or even simple syllogistic reasoning. The explanation for this seems to be two-fold.

...

At this point we need to address the elephant in the room: some humans seem extremely good at reasoning in a classical sense. I’m talking about individuals such as Blaise Pascal, Carl Friedrich Gauss, and John von Neumann. Early on in The Enigma of Reason the authors point out the power of reason by alluding to Eratosthenes’s calculation of the circumference of the earth, which was only off by one percent. Myself, I would have mentioned Archimedes, who I suspect was a genius on the same level as the ones mentioned above.

Mercier and Sperber state near the end of the book that math in particular is special and a powerful way to reason. We all know this. In math the axioms are clear, and agreed upon. And one can inspect the chain of propositions in a very transparent manner. Mathematics has guard-rails for any human who attempts to engage in reasoning. By reducing the ability of humans to enter into unforced errors math is the ideal avenue for solitary individual reasoning. But it is exceptional.

Second, though it is not discussed in The Enigma of Reason, there does seem to be variation in general and domain specific intelligence within the human population. People who flourish in mathematics usually have high general intelligences, but they also often exhibit a tendency to be able to engage in high levels of visual-spatial conceptualization.

On the whole, the more intelligent you are the better you are able to reason. But that does not mean that those with high intelligence are immune from the traps of motivated reasoning or faulty logic. Mercier and Sperber give many examples. Here are two. Linus Pauling was indisputably brilliant, but by the end of his life he was consistently pushing Vitamin C quackery (in part through a very selective interpretation of the scientific literature).* They also point out that much of Isaac Newton’s prodigious intellectual output turns out to have been focused on alchemy and esoteric exegesis which is totally impenetrable. Newton undoubtedly had a first class mind, but if the domain it was applied to was garbage, then the output was also garbage.

...

Overall, the take-homes are:

Reasoning exists to persuade in a group context through dialogue, not individual ratiocination.
Reasoning can give rise to storytelling when prompted, even if the reasons have no relationship to the underlying causality.
Motivated reasoning emerges because we are not skeptical of the reasons we proffer, but highly skeptical of reasons which refute our own.
The “wisdom of the crowds” is not just a curious phenomenon, but one of the primary reasons that humans have become more socially complex and our brains have grown larger.
Ultimately, if you want to argue someone out of their beliefs…well, good luck with that. But you should read The Enigma of Reason to understand the best strategies (many of them are common sense, and I’ve come to them independently simply through 15 years of having to engage with people of diverse viewpoints).

* R. A. Fisher, who was one of the pioneers of both evolutionary genetics and statistics, famously did not believe there was a connection between smoking and cancer. He himself smoked a pipe regularly.

** From what we know about Blaise Pascal and Isaac Newton, their personalities were such that they’d probably be killed or expelled from a hunter-gatherer band.
books  summary  psychology  social-psych  cog-psych  anthropology  rationality  biases  epistemic  thinking  neurons  realness  truth  info-dynamics  language  speaking  persuasion  dark-arts  impro  roots  ideas  speculation  hypocrisy  intelligence  eden  philosophy  multi  review  critique  ratty  hanson  org:edge  video  interview  communication  insight  impetus  hidden-motives  X-not-about-Y  signaling  🤖  metameta  metabuch  dennett  meta:rhetoric  gnxp  scitariat  open-things  giants  fisher  old-anglo  history  iron-age  mediterranean  the-classics  reason  religion  theos  noble-lie  intuition  instinct  farmers-and-foragers  egalitarianism-hierarchy  early-modern  britain  europe  gallic  hari-seldon  theory-of-mind  parallax  darwinian  evolution  telos-atelos  intricacy  evopsych  chart  traces 
august 2017 by nhaliday
Einstein's Most Famous Thought Experiment
When Einstein abandoned an emission theory of light, he had also to abandon the hope that electrodynamics could be made to conform to the principle of relativity by the normal sorts of modifications to electrodynamic theory that occupied the theorists of the second half of the 19th century. Instead Einstein knew he must resort to extraordinary measures. He was willing to seek realization of his goal in a re-examination of our basic notions of space and time. Einstein concluded his report on his youthful thought experiment:

"One sees that in this paradox the germ of the special relativity theory is already contained. Today everyone knows, of course, that all attempts to clarify this paradox satisfactorily were condemned to failure as long as the axiom of the absolute character of time, or of simultaneity, was rooted unrecognized in the unconscious. To recognize clearly this axiom and its arbitrary character already implies the essentials of the solution of the problem."
einstein  giants  physics  history  stories  gedanken  exposition  org:edu  electromag  relativity  nibble  innovation  novelty  the-trenches  synchrony  discovery  🔬  org:junk  science  absolute-relative  visuo  explanation  ground-up  clarity  state  causation  intuition  ideas  mostly-modern  pre-ww2  marginal  grokkability-clarity 
february 2017 by nhaliday
general topology - What should be the intuition when working with compactness? - Mathematics Stack Exchange
http://math.stackexchange.com/questions/485822/why-is-compactness-so-important

The situation with compactness is sort of like the above. It turns out that finiteness, which you think of as one concept (in the same way that you think of "Foo" as one concept above), is really two concepts: discreteness and compactness. You've never seen these concepts separated before, though. When people say that compactness is like finiteness, they mean that compactness captures part of what it means to be finite in the same way that shortness captures part of what it means to be Foo.

--

As many have said, compactness is sort of a topological generalization of finiteness. And this is true in a deep sense, because topology deals with open sets, and this means that we often "care about how something behaves on an open set", and for compact spaces this means that there are only finitely many possible behaviors.

--

Compactness does for continuous functions what finiteness does for functions in general.

If a set A is finite then every function f:A→R has a max and a min, and every function f:A→R^n is bounded. If A is compact, then every continuous function from A to R has a max and a min and every continuous function from A to R^n is bounded.

If A is finite then every sequence of members of A has a subsequence that is eventually constant, and "eventually constant" is the only kind of convergence you can talk about without talking about a topology on the set. If A is compact, then every sequence of members of A has a convergent subsequence.
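
[a concrete contrast -- my own stock example, not from the thread -- of what compactness buys for continuous functions:

\[
f(x) = \tfrac{1}{x}: \qquad
\sup_{x \in (0,1)} f(x) = \infty
\quad \text{(continuous on a bounded but non-compact domain: no max),}
\]
\[
\max_{x \in [\varepsilon,\,1]} f(x) = \tfrac{1}{\varepsilon},
\quad
\min_{x \in [\varepsilon,\,1]} f(x) = 1
\quad \text{(the same function on a compact domain attains both).}
\]
]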
q-n-a  overflow  math  topology  math.GN  concept  finiteness  atoms  intuition  oly  mathtariat  multi  discrete  gowers  motivation  synthesis  hi-order-bits  soft-question  limits  things  nibble  definition  convergence  abstraction  span-cover 
january 2017 by nhaliday
teaching - Intuitive explanation for dividing by $n-1$ when calculating standard deviation? - Cross Validated
The standard deviation calculated with a divisor of n-1 is a standard deviation calculated from the sample as an estimate of the standard deviation of the population from which the sample was drawn. Because the observed values fall, on average, closer to the sample mean than to the population mean, the standard deviation which is calculated using deviations from the sample mean underestimates the desired standard deviation of the population. Using n-1 instead of n as the divisor corrects for that by making the result a little bit bigger.

Note that the correction has a larger proportional effect when n is small than when it is large, which is what we want because when n is larger the sample mean is likely to be a good estimator of the population mean.

...

A common one is that the definition of variance (of a distribution) is the second moment recentered around a known, definite mean, whereas the estimator uses an estimated mean. This loss of a degree of freedom (given the mean, you can reconstitute the dataset with knowledge of just n−1 of the data values) requires the use of n−1 rather than n to "adjust" the result.
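
[a quick simulation of the point above -- my own sketch in Python, standard library only: draw many small samples from a population of known variance and compare the average of the n-divisor and (n-1)-divisor estimates:

import random

random.seed(0)
n, trials = 5, 100_000           # small samples from a standard normal (true variance = 1)
sum_biased = sum_unbiased = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    sum_biased += ss / n         # divide by n: underestimates on average
    sum_unbiased += ss / (n - 1) # Bessel's correction: unbiased
print(sum_biased / trials)       # ~ 0.8, i.e. (n-1)/n times the true variance
print(sum_unbiased / trials)     # ~ 1.0
]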
q-n-a  overflow  stats  acm  intuition  explanation  bias-variance  methodology  moments  nibble  degrees-of-freedom  sampling-bias  generalization  dimensionality  ground-up  intricacy 
january 2017 by nhaliday
Dvoretzky's theorem - Wikipedia
In mathematics, Dvoretzky's theorem is an important structural theorem about normed vector spaces proved by Aryeh Dvoretzky in the early 1960s, answering a question of Alexander Grothendieck. In essence, it says that every sufficiently high-dimensional normed vector space will have low-dimensional subspaces that are approximately Euclidean. Equivalently, every high-dimensional bounded symmetric convex set has low-dimensional sections that are approximately ellipsoids.

http://mathoverflow.net/questions/143527/intuitive-explanation-of-dvoretzkys-theorem
http://mathoverflow.net/questions/46278/unexpected-applications-of-dvoretzkys-theorem
math  math.FA  inner-product  levers  characterization  geometry  math.MG  concentration-of-measure  multi  q-n-a  overflow  intuition  examples  proofs  dimensionality  gowers  mathtariat  tcstariat  quantum  quantum-info  norms  nibble  high-dimension  wiki  reference  curvature  convexity-curvature  tcs 
january 2017 by nhaliday
"Surely You're Joking, Mr. Feynman!": Adventures of a Curious Character ... - Richard P. Feynman - Google Books
Actually, there was a certain amount of genuine quality to my guesses. I had a scheme, which I still use today when somebody is explaining something that I’m trying to understand: I keep making up examples. For instance, the mathematicians would come in with a terrific theorem, and they’re all excited. As they’re telling me the conditions of the theorem, I construct something which fits all the conditions. You know, you have a set (one ball)—disjoint (two balls). Then the balls turn colors, grow hairs, or whatever, in my head as they put more conditions on. Finally they state the theorem, which is some dumb thing about the ball which isn’t true for my hairy green ball thing, so I say, “False!"
physics  math  feynman  thinking  empirical  examples  lens  intuition  operational  stories  metabuch  visual-understanding  thurston  hi-order-bits  geometry  topology  cartoons  giants  👳  nibble  the-trenches  metameta  meta:math  s:**  quotes  gbooks  elegance 
january 2017 by nhaliday
pr.probability - What is convolution intuitively? - MathOverflow
I remember as a graduate student that Ingrid Daubechies frequently referred to convolution by a bump function as "blurring" - its effect on images is similar to what a short-sighted person experiences when taking off his or her glasses (and, indeed, if one works through the geometric optics, convolution is not a bad first approximation for this effect). I found this to be very helpful, not just for understanding convolution per se, but as a lesson that one should try to use physical intuition to model mathematical concepts whenever one can.

More generally, if one thinks of functions as fuzzy versions of points, then convolution is the fuzzy version of addition (or sometimes multiplication, depending on the context). The probabilistic interpretation is one example of this (where the fuzz is a probability distribution), but one can also have signed, complex-valued, or vector-valued fuzz, of course.
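
["fuzzy addition" made concrete -- a minimal Python sketch of my own, not from the thread: the distribution of the sum of two independent dice is the convolution of their individual distributions:

def convolve(p, q):
    # discrete convolution of two probability vectors
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

die = [1 / 6] * 6                 # P(face) for faces 1..6
two_dice = convolve(die, die)     # P(sum) for sums 2..12
print([round(p, 3) for p in two_dice])   # triangular distribution, peaking at 7
]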
q-n-a  overflow  math  concept  atoms  intuition  motivation  gowers  visual-understanding  aphorism  soft-question  tidbits  👳  mathtariat  cartoons  ground-up  metabuch  analogy  nibble  yoga  neurons  retrofit  optics  concrete  s:*  multiplicative  fourier 
january 2017 by nhaliday
soft question - Thinking and Explaining - MathOverflow
- good question from Bill Thurston
- great answers by Terry Tao, fedja, Minhyong Kim, gowers, etc.

Terry Tao:
- symmetry as blurring/vibrating/wobbling, scale invariance
- anthropomorphization, adversarial perspective for estimates/inequalities/quantifiers, spending/economy

fedja walks through his thought-process from another answer

Minhyong Kim: anthropology of mathematical philosophizing

Per Vognsen: normality as isotropy
comment: conjugate subgroup gHg^-1 ~ "H but somewhere else in G"

gowers: hidden things in basic mathematics/arithmetic
comment by Ryan Budney: x sin(x) via x -> (x, sin(x)), (x, y) -> xy
I kinda get what he's talking about but needed to use Mathematica to get the initial visualization down.
To remind myself later:
- xy can be easily visualized by juxtaposing the two parabolae x^2 and -x^2 diagonally
- x sin(x) can be visualized along that surface by moving your finger along the line (x, 0) but adding some oscillations in y direction according to sin(x)
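
[a minimal matplotlib sketch of my own, standing in for the Mathematica session mentioned above (assumes matplotlib with 3-D support): draw the surface z = x*y and, on top of it, the curve (x, sin x, x*sin x), i.e. x*sin(x) traced out along that surface:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-8, 8, 400)
y = np.linspace(-2, 2, 100)
X, Y = np.meshgrid(x, y)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(X, Y, X * Y, alpha=0.4)            # the surface z = x*y
ax.plot(x, np.sin(x), x * np.sin(x), color="red")  # the curve (x, sin x, x sin x) lying on it
ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("z")
plt.show()
]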
q-n-a  soft-question  big-list  intuition  communication  teaching  math  thinking  writing  thurston  lens  overflow  synthesis  hi-order-bits  👳  insight  meta:math  clarity  nibble  giants  cartoons  gowers  mathtariat  better-explained  stories  the-trenches  problem-solving  homogeneity  symmetry  fedja  examples  philosophy  big-picture  vague  isotropy  reflection  spatial  ground-up  visual-understanding  polynomials  dimensionality  math.GR  worrydream  scholar  🎓  neurons  metabuch  yoga  retrofit  mental-math  metameta  wisdom  wordlessness  oscillation  operational  adversarial  quantifiers-sums  exposition  explanation  tricki  concrete  s:***  manifolds  invariance  dynamical  info-dynamics  cool  direction  elegance  heavyweights  analysis  guessing  grokkability-clarity  technical-writing 
january 2017 by nhaliday
Shtetl-Optimized » Blog Archive » Why I Am Not An Integrated Information Theorist (or, The Unconscious Expander)
In my opinion, how to construct a theory that tells us which physical systems are conscious and which aren’t—giving answers that agree with “common sense” whenever the latter renders a verdict—is one of the deepest, most fascinating problems in all of science. Since I don’t know a standard name for the problem, I hereby call it the Pretty-Hard Problem of Consciousness. Unlike with the Hard Hard Problem, I don’t know of any philosophical reason why the Pretty-Hard Problem should be inherently unsolvable; but on the other hand, humans seem nowhere close to solving it (if we had solved it, then we could reduce the abortion, animal rights, and strong AI debates to “gentlemen, let us calculate!”).

Now, I regard IIT as a serious, honorable attempt to grapple with the Pretty-Hard Problem of Consciousness: something concrete enough to move the discussion forward. But I also regard IIT as a failed attempt on the problem. And I wish people would recognize its failure, learn from it, and move on.

In my view, IIT fails to solve the Pretty-Hard Problem because it unavoidably predicts vast amounts of consciousness in physical systems that no sane person would regard as particularly “conscious” at all: indeed, systems that do nothing but apply a low-density parity-check code, or other simple transformations of their input data. Moreover, IIT predicts not merely that these systems are “slightly” conscious (which would be fine), but that they can be unboundedly more conscious than humans are.

To justify that claim, I first need to define Φ. Strikingly, despite the large literature about Φ, I had a hard time finding a clear mathematical definition of it—one that not only listed formulas but fully defined the structures that the formulas were talking about. Complicating matters further, there are several competing definitions of Φ in the literature, including ΦDM (discrete memoryless), ΦE (empirical), and ΦAR (autoregressive), which apply in different contexts (e.g., some take time evolution into account and others don’t). Nevertheless, I think I can define Φ in a way that will make sense to theoretical computer scientists. And crucially, the broad point I want to make about Φ won’t depend much on the details of its formalization anyway.

We consider a discrete system in a state x = (x_1, …, x_n) ∈ S^n, where S is a finite alphabet (the simplest case is S = {0,1}). We imagine that the system evolves via an “updating function” f: S^n → S^n. Then the question that interests us is whether the x_i’s can be partitioned into two sets A and B, of roughly comparable size, such that the updates to the variables in A don’t depend very much on the variables in B and vice versa. If such a partition exists, then we say that the computation of f does not involve “global integration of information,” which on Tononi’s theory is a defining aspect of consciousness.
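
[a toy, brute-force version of that question -- my own formalization and example, not Tononi's actual Φ: given an update function f on {0,1}^n, look for a partition (A, B) of the coordinates such that the A-outputs ignore the B-inputs and vice versa; if one exists, there is no "global integration of information":

import itertools

n = 4
def f(x):
    # example update rule: two independent 2-bit subsystems, so it should decompose
    return (x[0] ^ x[1], x[0], x[2] ^ x[3], x[2])

points = list(itertools.product([0, 1], repeat=n))

def depends_only_on(out_coords, in_coords):
    # do the output bits in out_coords depend only on the input bits in in_coords?
    seen = {}
    for x in points:
        key = tuple(x[i] for i in in_coords)
        val = tuple(f(x)[i] for i in out_coords)
        if seen.setdefault(key, val) != val:
            return False
    return True

for size in range(1, n // 2 + 1):
    for A in itertools.combinations(range(n), size):
        B = tuple(i for i in range(n) if i not in A)
        if depends_only_on(A, A) and depends_only_on(B, B):
            print("decomposes across", A, "|", B)   # finds the split (0, 1) | (2, 3) (and its mirror)
]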
aaronson  tcstariat  philosophy  dennett  interdisciplinary  critique  nibble  org:bleg  within-without  the-self  neuro  psychology  cog-psych  metrics  nitty-gritty  composition-decomposition  complex-systems  cybernetics  bits  information-theory  entropy-like  forms-instances  empirical  walls  arrows  math.DS  structure  causation  quantitative-qualitative  number  extrema  optimization  abstraction  explanation  summary  degrees-of-freedom  whole-partial-many  network-structure  systematic-ad-hoc  tcs  complexity  hardness  no-go  computation  measurement  intricacy  examples  counterexample  coding-theory  linear-algebra  fields  graphs  graph-theory  expanders  math  math.CO  properties  local-global  intuition  error  definition  coupling-cohesion 
january 2017 by nhaliday
soft question - Why does Fourier analysis of Boolean functions "work"? - Theoretical Computer Science Stack Exchange
Here is my point of view, which I learned from Guy Kindler, though someone more experienced can probably give a better answer: Consider the linear space of functions f: {0,1}^n -> R and consider a linear operator of the form σ_w (for w in {0,1}^n), that maps a function f(x) as above to the function f(x+w). In many of the questions of TCS, there is an underlying need to analyze the effects that such operators have on certain functions.

Now, the point is that the Fourier basis is the basis that diagonalizes all those operators at the same time, which makes the analysis of those operators much simpler. More generally, the Fourier basis diagonalizes the convolution operator, which also underlies many of those questions. Thus, Fourier analysis is likely to be effective whenever one needs to analyze those operators.
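
[a tiny numerical check of that claim -- my own sketch with numpy, not from the answer: the parity characters chi_S(x) = (-1)^<S,x> are simultaneous eigenvectors of every shift operator sigma_w, with eigenvalue (-1)^<S,w>:

import itertools
import numpy as np

n = 3
points = list(itertools.product([0, 1], repeat=n))
index = {p: i for i, p in enumerate(points)}

def chi(S):
    # parity character indexed by S in {0,1}^n
    return np.array([(-1) ** sum(s * x for s, x in zip(S, p)) for p in points], dtype=float)

def sigma(w, f_vals):
    # (sigma_w f)(x) = f(x XOR w), acting on the vector of values of f
    return np.array([f_vals[index[tuple(a ^ b for a, b in zip(p, w))]] for p in points])

w = (1, 0, 1)
for S in points:
    eig = (-1) ** sum(s * b for s, b in zip(S, w))
    assert np.allclose(sigma(w, chi(S)), eig * chi(S))   # chi_S is an eigenvector of sigma_w
print("all 2^n parity characters diagonalize sigma_w simultaneously")
]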
q-n-a  math  tcs  synthesis  boolean-analysis  fourier  👳  tidbits  motivation  intuition  linear-algebra  overflow  hi-order-bits  insight  curiosity  ground-up  arrows  nibble  s:*  elegance  guessing 
december 2016 by nhaliday
gt.geometric topology - Intuitive crutches for higher dimensional thinking - MathOverflow
Terry Tao:
I can't help you much with high-dimensional topology - it's not my field, and I've not picked up the various tricks topologists use to get a grip on the subject - but when dealing with the geometry of high-dimensional (or infinite-dimensional) vector spaces such as R^n, there are plenty of ways to conceptualise these spaces that do not require visualising more than three dimensions directly.

For instance, one can view a high-dimensional vector space as a state space for a system with many degrees of freedom. A megapixel image, for instance, is a point in a million-dimensional vector space; by varying the image, one can explore the space, and various subsets of this space correspond to various classes of images.

One can similarly interpret sound waves, a box of gases, an ecosystem, a voting population, a stream of digital data, trials of random variables, the results of a statistical survey, a probabilistic strategy in a two-player game, and many other concrete objects as states in a high-dimensional vector space, and various basic concepts such as convexity, distance, linearity, change of variables, orthogonality, or inner product can have very natural meanings in some of these models (though not in all).

It can take a bit of both theory and practice to merge one's intuition for these things with one's spatial intuition for vectors and vector spaces, but it can be done eventually (much as after one has enough exposure to measure theory, one can start merging one's intuition regarding cardinality, mass, length, volume, probability, cost, charge, and any number of other "real-life" measures).

For instance, the fact that most of the mass of a unit ball in high dimensions lurks near the boundary of the ball can be interpreted as a manifestation of the law of large numbers, using the interpretation of a high-dimensional vector space as the state space for a large number of trials of a random variable.

More generally, many facts about low-dimensional projections or slices of high-dimensional objects can be viewed from a probabilistic, statistical, or signal processing perspective.
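A quick numerical companion to the remark about mass near the boundary (a sketch, not Tao's; it uses the standard Gaussian-direction sampler): the fraction of the unit ball's volume within radius r is r^n, so for large n essentially every uniform sample lands near norm 1.

```python
# In high dimensions almost all of the unit ball's volume sits near its
# boundary: the fraction inside radius r is r^n.
import numpy as np

rng = np.random.default_rng(0)

def sample_ball(n, m):
    """m points uniform in the unit ball of R^n: Gaussian direction, radius ~ U^(1/n)."""
    g = rng.standard_normal((m, n))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    radii = rng.uniform(size=(m, 1)) ** (1.0 / n)
    return directions * radii

for n in [2, 10, 100, 500]:
    pts = sample_ball(n, 10_000)
    frac = np.mean(np.linalg.norm(pts, axis=1) > 0.9)
    print(f"n={n:4d}: fraction with norm > 0.9 ~ {frac:.3f}  (exact: {1 - 0.9**n:.3f})")
```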

Scott Aaronson:
Here are some of the crutches I've relied on. (Admittedly, my crutches are probably much more useful for theoretical computer science, combinatorics, and probability than they are for geometry, topology, or physics. On a related note, I personally have a much easier time thinking about R^n than about, say, R^4 or R^5!)

1. If you're trying to visualize some 4D phenomenon P, first think of a related 3D phenomenon P', and then imagine yourself as a 2D being who's trying to visualize P'. The advantage is that, unlike with the 4D vs. 3D case, you yourself can easily switch between the 3D and 2D perspectives, and can therefore get a sense of exactly what information is being lost when you drop a dimension. (You could call this the "Flatland trick," after the most famous literary work to rely on it.)
2. As someone else mentioned, discretize! Instead of thinking about R^n, think about the Boolean hypercube {0,1}^n, which is finite and usually easier to get intuition about. (When working on problems, I often find myself drawing {0,1}^4 on a sheet of paper by drawing two copies of {0,1}^3 and then connecting the corresponding vertices.)
3. Instead of thinking about a subset S⊆R^n, think about its characteristic function f:R^n→{0,1}. I don't know why that trivial perspective switch makes such a big difference, but it does ... maybe because it shifts your attention to the process of computing f, and makes you forget about the hopeless task of visualizing S!
4. One of the central facts about R^n is that, while it has "room" for only n orthogonal vectors, it has room for exp(n) almost-orthogonal vectors (see the numerical sketch after this list). Internalize that one fact, and so many other properties of R^n (for example, that the n-sphere resembles a "ball with spikes sticking out," as someone mentioned before) will suddenly seem non-mysterious. In turn, one way to internalize the fact that R^n has so many almost-orthogonal vectors is to internalize Shannon's theorem that there exist good error-correcting codes.
5. To get a feel for some high-dimensional object, ask questions about the behavior of a process that takes place on that object. For example: if I drop a ball here, which local minimum will it settle into? How long does this random walk on {0,1}^n take to mix?
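
A numerical sketch of point 4 above (an illustration under arbitrary parameter choices, not Aaronson's): random ±1/√n sign vectors are pairwise almost orthogonal, with inner products of order √(log m / n), which is why exponentially many of them can coexist in R^n before the near-orthogonality breaks down.

```python
# Random +-1/sqrt(n) sign vectors are unit vectors whose pairwise inner
# products concentrate near 0 at scale ~ sqrt(log m / n), far below 1 even
# when there are many more vectors than dimensions.
import numpy as np

rng = np.random.default_rng(1)

n, m = 400, 2000                                   # dimension, number of vectors (m >> n)
V = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(n)   # rows are unit vectors

G = V @ V.T                                        # Gram matrix of inner products
off_diag = np.abs(G[~np.eye(m, dtype=bool)])
print(f"{m} unit vectors in R^{n}")
print(f"max |<v_i, v_j>|, i != j : {off_diag.max():.3f}")   # well below 1
print(f"mean |<v_i, v_j>|        : {off_diag.mean():.3f}")
```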

Gil Kalai:
This is a slightly different point, but Vitali Milman, who works in high-dimensional convexity, likes to draw high-dimensional convex bodies in a non-convex way. This is to convey the point that if you take the convex hull of a few points on the unit sphere of R^n, then for large n very little of the measure of the convex body is anywhere near the corners, so in a certain sense the body is a bit like a small sphere with long thin "spikes".
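A rough numerical companion to Milman's picture (a sketch; drawing hull points with weights uniform on the simplex is only a crude proxy for the uniform measure on the hull): the vertices sit at norm 1, but a typical point of the hull has norm on the order of √(2/k), so almost none of the body is anywhere near the spiky corners.

```python
# Convex hull of a few random points on the unit sphere of R^n: the vertices
# have norm 1, but "typical" hull points (weights drawn uniformly from the
# simplex -- a crude proxy for uniform measure on the hull) have small norm,
# so the body looks like a small ball with long thin spikes.
import numpy as np

rng = np.random.default_rng(2)

n, k, m = 1000, 50, 10_000                         # dimension, vertices, samples

verts = rng.standard_normal((k, n))
verts /= np.linalg.norm(verts, axis=1, keepdims=True)   # k points on the unit sphere

weights = rng.dirichlet(np.ones(k), size=m)        # (m, k) convex-combination weights
samples = weights @ verts                          # m points of the hull
norms = np.linalg.norm(samples, axis=1)
print(f"typical hull point norm ~ {norms.mean():.3f}, max over samples {norms.max():.3f}")
```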
q-n-a  intuition  math  visual-understanding  list  discussion  thurston  tidbits  aaronson  tcs  geometry  problem-solving  yoga  👳  big-list  metabuch  tcstariat  gowers  mathtariat  acm  overflow  soft-question  levers  dimensionality  hi-order-bits  insight  synthesis  thinking  models  cartoons  coding-theory  information-theory  probability  concentration-of-measure  magnitude  linear-algebra  boolean-analysis  analogy  arrows  lifts-projections  measure  markov  sampling  shannon  conceptual-vocab  nibble  degrees-of-freedom  worrydream  neurons  retrofit  oscillation  paradox  novelty  tricki  concrete  high-dimension  s:***  manifolds  direction  curvature  convexity-curvature  elegance  guessing 
december 2016 by nhaliday
Overcoming Bias : Two Kinds Of Status
prestige and dominance

More here. I was skeptical at first, but now am convinced: humans see two kinds of status, and approve of prestige-status much more than domination-status. I’ll have much more to say about this in the coming days, but it is far from clear to me that prestige-status is as much better than domination-status as people seem to think. Efforts to achieve prestige-status also have serious negative side-effects.

Two Ways to the Top: Evidence That Dominance and Prestige Are Distinct Yet Viable Avenues to Social Rank and Influence: https://henrich.fas.harvard.edu/files/henrich/files/cheng_et_al_2013.pdf
Dominance (the use of force and intimidation to induce fear) and Prestige (the sharing of expertise or know-how to gain respect)

...

According to the model, Dominance initially arose in evolutionary history as a result of agonistic contests for material resources and mates that were common among nonhuman species, but continues to exist in contemporary human societies, largely in the form of psychological intimidation, coercion, and wielded control over costs and benefits (e.g., access to resources, mates, and well-being). In both humans and nonhumans, Dominance hierarchies are thought to emerge to help maintain patterns of submission directed from subordinates to Dominants, thereby minimizing agonistic battles and incurred costs.

In contrast, Prestige is likely unique to humans, because it is thought to have emerged from selection pressures to preferentially attend to and acquire cultural knowledge from highly skilled or successful others, a capacity considered to be less developed in other animals (Boyd & Richerson, 1985; Laland & Galef, 2009). In this view, social learning (i.e., copying others) evolved in humans as a low-cost fitness-maximizing, information-gathering mechanism (Boyd & Richerson, 1985). Once it became adaptive to copy skilled others, a preference for social models with better than average information would have emerged. This would promote competition for access to the highest quality models, and deference toward these models in exchange for copying and learning opportunities. Consequently, selection likely favored Prestige differentiation, with individuals possessing high-quality information or skills elevated to the top of the hierarchy. Meanwhile, other individuals may reach the highest ranks of their group’s hierarchy by wielding threat of force, regardless of the quality of their knowledge or skills. Thus, Dominance and Prestige can be thought of as coexisting avenues to attaining rank and influence within social groups, despite being underpinned by distinct motivations and behavioral patterns, and resulting in distinct patterns of imitation and deference from subordinates.

Importantly, both Dominance and Prestige are best conceptualized as cognitive and behavioral strategies (i.e., suites of subjective feelings, cognitions, motivations, and behavioral patterns that together produce certain outcomes) deployed in certain situations, and can be used (with more or less success) by any individual within a group. They are not types of individuals, or even, necessarily, traits within individuals. Instead, we assume that all situated dyadic relationships contain differential degrees of both Dominance and Prestige, such that each person is simultaneously Dominant and Prestigious to some extent, to some other individual. Thus, it is possible that a high degree of Dominance and a high degree of Prestige may be found within the same individual, and may depend on who is doing the judging. For example, by controlling students’ access to rewards and punishments, school teachers may exert Dominance in their relationships with some students, but simultaneously enjoy Prestige with others, if they are respected and deferred to for their competence and wisdom. Indeed, previous studies have shown that based on both self- and peer ratings, Dominance and Prestige are largely independent (mean r = -.03; Cheng et al., 2010).

Status Hypocrisy: https://www.overcomingbias.com/2017/01/status-hypocrisy.html
Today we tend to say that our leaders have prestige, while their leaders have dominance. That is, their leaders hold power via personal connections and the threat and practice of violence, bribes, sex, gossip, and conformity pressures. Our leaders, instead, mainly just have whatever abilities follow from our deepest respect and admiration regarding their wisdom and efforts on serious topics that matter for us all. Their leaders more seek power, while ours more have leadership thrust upon them. Because of this us/them split, we tend to try to use persuasion on us, but force on them, when seeking to change behaviors.

...

Clearly, while there is some fact of the matter about how much a person gains their status via licit or illicit means, there is also a lot of impression management going on. We like to give others the impression that we personally mainly want prestige in ourselves and our associates, and that we only grant others status via the prestige they have earned. But let me suggest that, compared to this ideal, we actually want more dominance in ourselves and our associates than we like to admit, and we submit more often to dominance.

Cads, Dads, Doms: https://www.overcomingbias.com/2010/07/cads-dads-doms.html
"The proper dichotomy is not “virile vs. wimpy” as has been supposed, but “exciting vs. drab,” with the former having the two distinct sub-groups “macho man vs. pretty boy.” Another way to see that this is the right dichotomy is to look around the world: wherever girls really dig macho men, they also dig the peacocky musician type too, finding safe guys a bit boring. And conversely, where devoted dads do the best, it’s more difficult for macho men or in-town-for-a-day rockstars to make out like bandits. …

Whatever it is about high-pathogen-load areas that selects for greater polygynous behavior … will result in an increase in both gorilla-like and peacock-like males, since they’re two viable ways to pursue a polygynous mating strategy."

This fits with there being two kinds of status: dominance and prestige. Macho men, such as CEOs and athletes, have dominance, while musicians and artists have prestige. But women seek both short and long term mates. Since both kinds of status suggest good genes, both attract women seeking short term mates. This happens more when women are younger and richer, and when there is more disease. Foragers pretend they don’t respect dominance as much as they do, so prestigious men get more overt attention, while dominant men get more covert attention.

Women seeking long term mates also consider a man’s ability to supply resources, and may settle for poorer genes to get more resources. Dominant men tend to have more resources than prestigious men, so such men are more likely to fill both roles, being long term mates for some women and short term mates for others. Men who can offer only prestige must accept worse long term mates, while men who can offer only resources must accept few short term mates. Those low in prestige, resources, or dominance must accept no mates. A man who had prestige, dominance, and resources would get the best short and long term mates – what men are these?

Stories are biased toward dramatic events, and so are biased toward events with risky men; it is harder to tell a good story about the attraction of a resource-rich man. So stories naturally encourage short term mating. Shouldn’t this make long-term mates wary of strong mate attraction to dramatic stories?

https://www.overcomingbias.com/2010/07/cads-dads-doms.html#comment-518319076
Women want three things: someone to fight for them (the Warrior), someone to provide for them (the Tycoon) and someone to excite their emotions or entertain them (the Wizard).

In this context,

Dom = Warrior
Dad = Tycoon
Cad = Wizard

To repeat:

Dom (Cocky) + Dad (Generous) + Cad (Exciting/Funny) = Laid

https://www.overcomingbias.com/2010/07/cads-dads-doms.html#comment-518318987
There is an old distinction between "proximate" and "ultimate" causes. Evolution is an ultimate cause, physiology (and psychology, here) is a proximate cause. The flower bends to follow the sun because it gathers more light that way, but the immediate mechanism of the bending involves hormones called auxins. I see a lot of speculation about, say, sexual cognitive dimorphism whose ultimate cause is evolutionary, but not so much speculation about the proximate cause - the "how" of the difference, rather than the "why". And here I think a visit to an older mode of explanation like Marsden's - one which is psychological rather than genetic - can sensitize us to the fact that the proximate causes of a behavioral tendency need not be a straightforward matter of being hardwired differently.

This leads to my second point, which is just that we should remember that human beings actually possess consciousness. This means not only that the proximate cause of a behavior may deeply involve subjectivity, self-awareness, and an existential situation. It also means that all of these propositions about what people do are susceptible to change once they have been spelled out and become part of the culture. It is rather like the stock market: once everyone knows (or believes) something, then that information provides no advantage, creating an incentive for novelty.

Finally, the consequences of new beliefs about the how and the why of human nature and human behavior. Right or wrong, theories already begin to have consequences once they are taken up and incorporated into subjectivity. We really need a new Foucault to take on this topic.

The Economics of Social Status: http://www.meltingasphalt.com/the-economics-of-social-status/
Prestige vs. dominance. Joseph Henrich (of WEIRD fame) distinguishes two types of status. Prestige is the kind of status we get from being an impressive human specimen (think Meryl Streep), and it's governed by our 'approach' instincts. Dominance, on the other hand, is … [more]
things  status  hanson  thinking  comparison  len:short  anthropology  farmers-and-foragers  phalanges  ratty  duty  power  humility  hypocrisy  hari-seldon  multi  sex  gender  signaling  🐝  tradeoffs  evopsych  insight  models  sexuality  gender-diff  chart  postrat  yvain  ssc  simler  critique  essay  debate  paying-rent  gedanken  empirical  operational  vague  info-dynamics  len:long  community  henrich  long-short-run  rhetoric  contrarianism  coordination  social-structure  hidden-motives  politics  2016-election  rationality  links  study  summary  list  hive-mind  speculation  coalitions  values  🤖  metabuch  envy  universalism-particularism  egalitarianism-hierarchy  s-factor  unintended-consequences  tribalism  group-selection  justice  inequality  competition  cultural-dynamics  peace-violence  ranking  machiavelli  authoritarianism  strategy  tactics  organizing  leadership  management  n-factor  duplication  thiel  volo-avolo  todo  technocracy  rent-seeking  incentives  econotariat  marginal-rev  civilization  rot  gibbon 
september 2016 by nhaliday