nhaliday + deep-learning   245

Ask HN: What's a promising area to work on? | Hacker News
hn  discussion  q-n-a  ideas  impact  trends  the-bones  speedometer  technology  applications  tech  cs  programming  list  top-n  recommendations  lens  machine-learning  deep-learning  security  privacy  crypto  software  hardware  cloud  biotech  CRISPR  bioinformatics  biohacking  blockchain  cryptocurrency  crypto-anarchy  healthcare  graphics  SIGGRAPH  vr  automation  universalism-particularism  expert-experience  reddit  social  arbitrage  supply-demand  ubiquity  cost-benefit  compensation  chart  career  planning  strategy  long-term  advice  sub-super  commentary  rhetoric  org:com  techtariat  human-capital  prioritizing  tech-infrastructure  working-stiff  data-science 
28 days ago by nhaliday
Why is Google Translate so bad for Latin? A longish answer. : latin
hmm:
> All it does is correlate sequences of up to five consecutive words in texts that have been manually translated into two or more languages.
That sort of system ought to be perfect for a dead language, though. Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.

We're not exactly inundated with brand new Latin to translate.
--
> Dump all the Cicero, Livy, Lucretius, Vergil, and Oxford Latin Course into a database and we're good.
What makes you think that the Google folks haven't done so and used that to create the language models they use?
> That sort of system ought to be perfect for a dead language, though.
Perhaps. But it will be bad at translating novel English sentences to Latin.
foreign-lang  reddit  social  discussion  language  the-classics  literature  dataset  measurement  roots  traces  syntax  anglo  nlp  stackex  links  q-n-a  linguistics  lexical  deep-learning  sequential  hmm  project  arrows  generalization  state-of-art  apollonian-dionysian  machine-learning  google 
june 2019 by nhaliday
classification - ImageNet: what is top-1 and top-5 error rate? - Cross Validated
Now, in the case of top-1 score, you check if the top class (the one having the highest probability) is the same as the target label.

In the case of top-5 score, you check if the target label is one of your top 5 predictions (the 5 ones with the highest probabilities).
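
A minimal NumPy sketch (mine, not from the answer) of how the two metrics differ:

    # Top-k accuracy from raw class scores; k=1 gives top-1, k=5 gives top-5.
    import numpy as np

    def top_k_accuracy(scores, labels, k=5):
        # scores: (n_samples, n_classes), labels: (n_samples,) true class ids
        top_k = np.argsort(scores, axis=1)[:, -k:]      # k highest-scoring classes per sample
        hits = np.any(top_k == labels[:, None], axis=1)
        return hits.mean()

    scores = np.array([[0.1, 0.5, 0.2, 0.15, 0.05],
                       [0.3, 0.1, 0.4, 0.10, 0.10]])
    labels = np.array([1, 0])
    print(top_k_accuracy(scores, labels, k=1))  # 0.5: only sample 0's argmax matches the target
    print(top_k_accuracy(scores, labels, k=5))  # 1.0: the target is always among the top 5 of 5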
nibble  q-n-a  overflow  machine-learning  deep-learning  metrics  comparison  ranking  top-n  classification  computer-vision  benchmarks  dataset  accuracy  error  jargon 
june 2019 by nhaliday
[1803.00085] Chinese Text in the Wild
We introduce Chinese Text in the Wild, a very large dataset of Chinese text in street view images.

...

We give baseline results using several state-of-the-art networks, including AlexNet, OverFeat, Google Inception and ResNet for character recognition, and YOLOv2 for character detection in images. Overall Google Inception has the best performance on recognition with 80.5% top-1 accuracy, while YOLOv2 achieves an mAP of 71.0% on detection. Dataset, source code and trained models will all be publicly available on the website.
nibble  pdf  papers  preprint  machine-learning  deep-learning  deepgoog  state-of-art  china  asia  writing  language  dataset  error  accuracy  computer-vision  pic  ocr  org:mat  benchmarks  questions 
may 2019 by nhaliday
Should I go for TensorFlow or PyTorch?
Honestly, most experts that I know love PyTorch and detest TensorFlow. Karpathy and Justin from Stanford, for example. You can see Karpathy's thoughts, and I've asked Justin personally and the answer was sharp: PYTORCH!!! TF has lots of PR but its API and graph model are horrible and will waste lots of your research time.

--

...

Updated Mar 12
Update after 2019 TF summit:

TL;DR: previously I was in the PyTorch camp, but with TF 2.0 it’s clear that Google is really going to try to reach parity with, or surpass, PyTorch in all the aspects where people voiced concerns (ease of use, debugging, dynamic graphs). They seem to be allocating more resources to development than Facebook, so the longer term currently looks promising for Google. Prior to TF 2.0 I thought the PyTorch team had more momentum. One area where FB/PyTorch is still stronger: Google is a bit more closed and doesn’t seem to release reproducible cutting-edge models such as AlphaGo, whereas FAIR released OpenGo, for instance. Generally you will end up running into models that are implemented in only one framework or the other, so chances are you will end up learning both.
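
For what it's worth, the "dynamic graph" parity mentioned above: both libraries now run define-by-run by default. A rough sketch using the standard public APIs of recent PyTorch and TF 2.x, illustrative only:

    # Same gradient computed eagerly in both frameworks.
    import torch
    import tensorflow as tf

    x = torch.tensor(3.0, requires_grad=True)    # PyTorch records ops as they run
    y = x ** 2 + 2 * x
    y.backward()
    print(x.grad)                                # tensor(8.)

    x_tf = tf.Variable(3.0)                      # TF 2.x is eager by default
    with tf.GradientTape() as tape:
        y_tf = x_tf ** 2 + 2 * x_tf
    print(tape.gradient(y_tf, x_tf))             # tf.Tensor(8.0, shape=(), dtype=float32)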
q-n-a  qra  comparison  software  recommendations  cost-benefit  tradeoffs  python  libraries  machine-learning  deep-learning  data-science  sci-comp  tools  google  facebook  tech  competition  best-practices  trends  debugging  expert-experience  ecosystem  theory-practice  pragmatic  wire-guided  static-dynamic  state  academia  frameworks  open-closed 
may 2019 by nhaliday
Information Processing: Moore's Law and AI
Hint to technocratic planners: invest more in physicists, chemists, and materials scientists. The recent explosion in value from technology has been driven by physical science -- software gets way too much credit. From the former we got a factor of a million or more in compute power, data storage, and bandwidth. From the latter, we gained (perhaps) an order of magnitude or two in effectiveness: how much better are current OSes and programming languages than Unix and C, both of which are ~50 years old now?

...

Of relevance to this discussion: a big chunk of AlphaGo's performance improvement over other Go programs is due to raw compute power (link via Jess Riedel). The vertical axis is ELO score. You can see that without multi-GPU compute, AlphaGo has relatively pedestrian strength.
hsu  scitariat  comparison  software  hardware  performance  sv  tech  trends  ai  machine-learning  deep-learning  deepgoog  google  roots  impact  hard-tech  multiplicative  the-world-is-just-atoms  technology  trivia  cocktail  big-picture  hi-order-bits 
may 2019 by nhaliday
Workshop Abstract | Identifying and Understanding Deep Learning Phenomena
ICML 2019 workshop, June 15th 2019, Long Beach, CA

We solicit contributions that view the behavior of deep nets as natural phenomena, to be investigated with methods inspired from the natural sciences like physics, astronomy, and biology.
unit  workshop  acm  machine-learning  science  empirical  nitty-gritty  atoms  deep-learning  model-class  icml  data-science  rigor  replication  examples  ben-recht  physics 
april 2019 by nhaliday
Lateralization of brain function - Wikipedia
Language
Language functions such as grammar, vocabulary and literal meaning are typically lateralized to the left hemisphere, especially in right handed individuals.[3] While language production is left-lateralized in up to 90% of right-handers, it is more bilateral, or even right-lateralized, in approximately 50% of left-handers.[4]

Broca's area and Wernicke's area, two areas associated with the production of speech, are located in the left cerebral hemisphere for about 95% of right-handers, but about 70% of left-handers.[5]:69

Auditory and visual processing
The processing of visual and auditory stimuli, spatial manipulation, facial perception, and artistic ability are represented bilaterally.[4] Numerical estimation, comparison and online calculation depend on bilateral parietal regions[6][7] while exact calculation and fact retrieval are associated with left parietal regions, perhaps due to their ties to linguistic processing.[6][7]

...

Depression is linked with a hyperactive right hemisphere, with evidence of selective involvement in "processing negative emotions, pessimistic thoughts and unconstructive thinking styles", as well as vigilance, arousal and self-reflection, and a relatively hypoactive left hemisphere, "specifically involved in processing pleasurable experiences" and "relatively more involved in decision-making processes".

Chaos and Order; the right and left hemispheres: https://orthosphere.wordpress.com/2018/05/23/chaos-and-order-the-right-and-left-hemispheres/
In The Master and His Emissary, Iain McGilchrist writes that a creature like a bird needs two types of consciousness simultaneously. It needs to be able to focus on something specific, such as pecking at food, while it also needs to keep an eye out for predators which requires a more general awareness of environment.

These are quite different activities. The Left Hemisphere (LH) is adapted for a narrow focus. The Right Hemisphere (RH) for the broad. The brains of human beings have the same division of function.

The LH governs the right side of the body, the RH, the left side. With birds, the left eye (RH) looks for predators, the right eye (LH) focuses on food and specifics. Since danger can take many forms and is unpredictable, the RH has to be very open-minded.

The LH is for narrow focus, the explicit, the familiar, the literal, tools, mechanism/machines and the man-made. The broad focus of the RH is necessarily more vague and intuitive and handles the anomalous, novel, metaphorical, the living and organic. The LH is high resolution but narrow, the RH low resolution but broad.

The LH exhibits unrealistic optimism and self-belief. The RH has a tendency towards depression and is much more realistic about a person’s own abilities. LH has trouble following narratives because it has a poor sense of “wholes.” In art it favors flatness, abstract and conceptual art, black and white rather than color, simple geometric shapes and multiple perspectives all shoved together, e.g., cubism. RH paintings, by contrast, emphasize vistas with great depth of field and thus space and time,[1] emotion, figurative painting and scenes related to the life world. In music, LH likes simple, repetitive rhythms. The RH favors melody, harmony and complex rhythms.

...

Schizophrenia is a disease of extreme LH emphasis. Since empathy, and the ability to notice emotional nuance expressed facially, vocally and bodily, are RH functions, schizophrenics tend to be paranoid and are often convinced that the real people they know have been replaced by robotic imposters. This is at least partly because they lose the ability to intuit what other people are thinking and feeling – hence other people seem robotic and suspicious to them.

Oswald Spengler’s The Decline of the West as well as McGilchrist characterize the West as awash in phenomena associated with an extreme LH emphasis. Spengler argues that Western civilization was originally much more RH (to use McGilchrist’s categories) and that all its most significant artistic (in the broadest sense) achievements were triumphs of RH accentuation.

The RH is where novel experiences and the anomalous are processed and where mathematical, and other, problems are solved. The RH is involved with the natural, the unfamiliar, the unique, emotions, the embodied, music, humor, understanding intonation and emotional nuance of speech, the metaphorical, nuance, and social relations. It has very little speech, but the RH is necessary for processing all the nonlinguistic aspects of speaking, including body language. Understanding what someone means by vocal inflection and facial expressions is an intuitive RH process rather than explicit.

...

RH is very much the center of lived experience; of the life world with all its depth and richness. The RH is “the master” from the title of McGilchrist’s book. The LH ought to be no more than the emissary; the valued servant of the RH. However, in the last few centuries, the LH, which has tyrannical tendencies, has tried to become the master. The LH is where the ego is predominantly located. In split brain patients where the LH and the RH are surgically divided (this is done sometimes in the case of epileptic patients) one hand will sometimes fight with the other. In one man’s case, one hand would reach out to hug his wife while the other pushed her away. One hand reached for one shirt, the other another shirt. Or a patient will be driving a car and one hand will try to turn the steering wheel in the opposite direction. In these cases, the “naughty” hand is usually the left hand (RH), while the patient tends to identify herself with the right hand governed by the LH. The two hemispheres have quite different personalities.

The connection between LH and ego can also be seen in the fact that the LH is competitive, contentious, and agonistic. It wants to win. It is the part of you that hates to lose arguments.

Using the metaphor of Chaos and Order, the RH deals with Chaos – the unknown, the unfamiliar, the implicit, the emotional, the dark, danger, mystery. The LH is connected with Order – the known, the familiar, the rule-driven, the explicit, and light of day. Learning something means taking something unfamiliar and making it familiar. Since the RH deals with the novel, it is the problem-solving part. Once understood, the results are dealt with by the LH. When learning a new piece on the piano, the RH is involved. Once mastered, the result becomes an LH affair. The muscle memory developed by repetition is processed by the LH. If errors are made, the activity returns to the RH to figure out what went wrong; the activity is repeated until the correct muscle memory is developed, in which case it becomes part of the familiar LH.

Science is an attempt to find Order. It would not be necessary if people lived in an entirely orderly, explicit, known world. The lived context of science implies Chaos. Theories are reductive and simplifying and help to pick out salient features of a phenomenon. They are always partial truths, though some are more partial than others. The alternative to a certain level of reductionism or partialness would be to simply reproduce the world which of course would be both impossible and unproductive. The test for whether a theory is sufficiently non-partial is whether it is fit for purpose and whether it contributes to human flourishing.

...

Analytic philosophers pride themselves on trying to do away with vagueness. To do so, they tend to jettison context which cannot be brought into fine focus. However, in order to understand things and discern their meaning, it is necessary to have the big picture, the overview, as well as the details. There is no point in having details if the subject does not know what they are details of. Such philosophers also tend to leave themselves out of the picture even when what they are thinking about has reflexive implications. John Locke, for instance, tried to banish the RH from reality. All phenomena having to do with subjective experience he deemed unreal and once remarked about metaphors, a RH phenomenon, that they are “perfect cheats.” Analytic philosophers tend to check the logic of the words on the page and not to think about what those words might say about them. The trick is for them to recognize that they and their theories, which exist in minds, are part of reality too.

The RH test for whether someone actually believes something can be found by examining his actions. If he finds that he must regard his own actions as free, and, in order to get along with other people, must also attribute free will to them and treat them as free agents, then he effectively believes in free will – no matter his LH theoretical commitments.

...

We do not know the origin of life. We do not know how or even if consciousness can emerge from matter. We do not know the nature of 96% of the matter of the universe. Clearly all these things exist. They can provide the subject matter of theories but they continue to exist as theorizing ceases or theories change. Not knowing how something is possible is irrelevant to its actual existence. An inability to explain something is ultimately neither here nor there.

If thought begins and ends with the LH, then thinking has no content – content being provided by experience (RH), and skepticism and nihilism ensue. The LH spins its wheels self-referentially, never referring back to experience. Theory assumes such primacy that it will simply outlaw experiences and data inconsistent with it; a profoundly wrong-headed approach.

...

Gödel’s Theorem proves that not everything true can be proven to be true. This means there is an ineradicable role for faith, hope and intuition in every moderately complex human intellectual endeavor. There is no one set of consistent axioms from which all other truths can be derived.

Alan Turing’s proof of the halting problem proves that there is no effective procedure for finding effective procedures. Without a mechanical decision procedure, (LH), when it comes to … [more]
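
(The halting-problem claim is the standard diagonal argument; a toy sketch of it, mine rather than the post's:)

    # Suppose a total, correct halts(program, input) existed; this program breaks it.
    def halts(program, inp):
        # hypothetical oracle: True iff program(inp) eventually halts
        raise NotImplementedError("no such procedure can exist")

    def diagonal(program):
        if halts(program, program):   # do the opposite of whatever the oracle predicts
            while True:
                pass                  # loop forever if told "halts"
        return                        # halt if told "loops"

    # halts(diagonal, diagonal) can be neither True nor False without contradicting
    # diagonal's own behavior, so no such halts can exist.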
gnon  reflection  books  summary  review  neuro  neuro-nitgrit  things  thinking  metabuch  order-disorder  apollonian-dionysian  bio  examples  near-far  symmetry  homo-hetero  logic  inference  intuition  problem-solving  analytical-holistic  n-factor  europe  the-great-west-whale  occident  alien-character  detail-architecture  art  theory-practice  philosophy  being-becoming  essence-existence  language  psychology  cog-psych  egalitarianism-hierarchy  direction  reason  learning  novelty  science  anglo  anglosphere  coarse-fine  neurons  truth  contradiction  matching  empirical  volo-avolo  curiosity  uncertainty  theos  axioms  intricacy  computation  analogy  essay  rhetoric  deep-materialism  new-religion  knowledge  expert-experience  confidence  biases  optimism  pessimism  realness  whole-partial-many  theory-of-mind  values  competition  reduction  subjective-objective  communication  telos-atelos  ends-means  turing  fiction  increase-decrease  innovation  creative  thick-thin  spengler  multi  ratty  hanson  complex-systems  structure  concrete  abstraction  network-s 
september 2018 by nhaliday
Moravec's paradox - Wikipedia
Moravec's paradox is the discovery by artificial intelligence and robotics researchers that, contrary to traditional assumptions, high-level reasoning requires very little computation, but low-level sensorimotor skills require enormous computational resources. The principle was articulated by Hans Moravec, Rodney Brooks, Marvin Minsky and others in the 1980s. As Moravec writes, "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility".[1]

Similarly, Minsky emphasized that the most difficult human skills to reverse engineer are those that are unconscious. "In general, we're least aware of what our minds do best", he wrote, and added "we're more aware of simple processes that don't work well than of complex ones that work flawlessly".[2]

...

One possible explanation of the paradox, offered by Moravec, is based on evolution. All human skills are implemented biologically, using machinery designed by the process of natural selection. In the course of their evolution, natural selection has tended to preserve design improvements and optimizations. The older a skill is, the more time natural selection has had to improve the design. Abstract thought developed only very recently, and consequently, we should not expect its implementation to be particularly efficient.

As Moravec writes:

Encoded in the large, highly evolved sensory and motor portions of the human brain is a billion years of experience about the nature of the world and how to survive in it. The deliberate process we call reasoning is, I believe, the thinnest veneer of human thought, effective only because it is supported by this much older and much more powerful, though usually unconscious, sensorimotor knowledge. We are all prodigious olympians in perceptual and motor areas, so good that we make the difficult look easy. Abstract thought, though, is a new trick, perhaps less than 100 thousand years old. We have not yet mastered it. It is not all that intrinsically difficult; it just seems so when we do it.[3]

A compact way to express this argument would be:

- We should expect the difficulty of reverse-engineering any human skill to be roughly proportional to the amount of time that skill has been evolving in animals.
- The oldest human skills are largely unconscious and so appear to us to be effortless.
- Therefore, we should expect skills that appear effortless to be difficult to reverse-engineer, but skills that require effort may not necessarily be difficult to engineer at all.
concept  wiki  reference  paradox  ai  intelligence  reason  instinct  neuro  psychology  cog-psych  hardness  logic  deep-learning  time  evopsych  evolution  sapiens  the-self  EEA  embodied  embodied-cognition  abstraction  universalism-particularism  gnosis-logos  robotics 
june 2018 by nhaliday
Is the human brain analog or digital? - Quora
The brain is neither analog nor digital, but works using a signal processing paradigm that has some properties in common with both.
 
Unlike a digital computer, the brain does not use binary logic or binary addressable memory, and it does not perform binary arithmetic. Information in the brain is represented in terms of statistical approximations and estimations rather than exact values. The brain is also non-deterministic and cannot replay instruction sequences with error-free precision. So in all these ways, the brain is definitely not "digital".
 
At the same time, the signals sent around the brain are "either-or" states that are similar to binary. A neuron fires or it does not. These all-or-nothing pulses are the basic language of the brain. So in this sense, the brain is computing using something like binary signals. Instead of 1s and 0s, or "on" and "off", the brain uses "spike" or "no spike" (referring to the firing of a neuron).
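
A toy leaky integrate-and-fire neuron (my illustration, not from the answer) shows how a continuous input current becomes an all-or-nothing spike train:

    def simulate_lif(current, steps=50, leak=0.9, threshold=1.0):
        v, spikes = 0.0, []
        for _ in range(steps):
            v = leak * v + current     # membrane potential integrates input and leaks
            if v >= threshold:         # all-or-nothing: fire a spike and reset
                spikes.append(1)
                v = 0.0
            else:
                spikes.append(0)
        return spikes

    print(simulate_lif(0.15))  # weak input: sparse spikes
    print(simulate_lif(0.40))  # stronger input: higher firing rate (rate coding)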
q-n-a  qra  expert-experience  neuro  neuro-nitgrit  analogy  deep-learning  nature  discrete  smoothness  IEEE  bits  coding-theory  communication  trivia  bio  volo-avolo  causation  random  order-disorder  ems  models  methodology  abstraction  nitty-gritty  computation  physics  electromag  scale  coarse-fine 
april 2018 by nhaliday
AI-complete - Wikipedia
In the field of artificial intelligence, the most difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems is equivalent to that of solving the central artificial intelligence problem—making computers as intelligent as people, or strong AI.[1] To call a problem AI-complete reflects an attitude that it would not be solved by a simple specific algorithm.

AI-complete problems are hypothesised to include computer vision, natural language understanding, and dealing with unexpected circumstances while solving any real world problem.[2]

Currently, AI-complete problems cannot be solved with modern computer technology alone, but would also require human computation. This property can be useful, for instance to test for the presence of humans as with CAPTCHAs, and for computer security to circumvent brute-force attacks.[3][4]

...

AI-complete problems are hypothesised to include:

Bongard problems
Computer vision (and subproblems such as object recognition)
Natural language understanding (and subproblems such as text mining, machine translation, and word sense disambiguation[8])
Dealing with unexpected circumstances while solving any real world problem, whether it's navigation or planning or even the kind of reasoning done by expert systems.

...

Current AI systems can solve very simple and/or restricted versions of AI-complete problems, but never in their full generality. When AI researchers attempt to "scale up" their systems to handle more complicated, real world situations, the programs tend to become excessively brittle without commonsense knowledge or a rudimentary understanding of the situation: they fail as unexpected circumstances outside of their original problem context begin to appear. When human beings are dealing with new situations in the world, they are helped immensely by the fact that they know what to expect: they know what all things around them are, why they are there, what they are likely to do and so on. They can recognize unusual situations and adjust accordingly. A machine without strong AI has no other skills to fall back on.[9]
concept  reduction  cs  computation  complexity  wiki  reference  properties  computer-vision  ai  risk  ai-control  machine-learning  deep-learning  language  nlp  order-disorder  tactics  strategy  intelligence  humanity  speculation  crux 
march 2018 by nhaliday
Information Processing: Mathematical Theory of Deep Neural Networks (Princeton workshop)
"Recently, long-past-due theoretical results have begun to emerge. These results, and those that will follow in their wake, will begin to shed light on the properties of large, adaptive, distributed learning architectures, and stand to revolutionize how computer science and neuroscience understand these systems."
hsu  scitariat  commentary  links  research  research-program  workshop  events  princeton  sanjeev-arora  deep-learning  machine-learning  ai  generalization  explanans  off-convex  nibble  frontier  speedometer  state-of-art  big-surf  announcement 
january 2018 by nhaliday
Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
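
A sketch of the collapsing map at the heart of CTC (my paraphrase of the setup; the article covers the full loss): merge repeated labels, then drop blanks, so many per-frame alignments map to one transcription.

    BLANK = "-"

    def ctc_collapse(path):
        out, prev = [], None
        for symbol in path:
            if symbol != prev and symbol != BLANK:   # merge repeats, drop blanks
                out.append(symbol)
            prev = symbol
        return "".join(out)

    print(ctc_collapse("hh-e-ll-lo-"))  # "hello"
    print(ctc_collapse("h-e-l-l-o"))    # also "hello"; CTC loss sums over all such alignments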
acmtariat  techtariat  org:bleg  nibble  better-explained  machine-learning  deep-learning  visual-understanding  visualization  analysis  let-me-see  research  sequential  audio  classification  model-class  exposition  language  acm  approximation  comparison  markov  iteration-recursion  concept  atoms  distribution  orders  DP  heuristic  optimization  trees  greedy  matching  gradient-descent  org:popup 
december 2017 by nhaliday
[1709.06560] Deep Reinforcement Learning that Matters
https://twitter.com/WAWilsonIV/status/912505885565452288
I’ve been experimenting w/ various kinds of value function approaches to RL lately, and its striking how primitive and bad things seem to be
At first I thought it was just that my code sucks, but then I played with the OpenAI baselines and nope, it’s the children that are wrong.
And now, what comes across my desk but this fantastic paper: (link: https://arxiv.org/abs/1709.06560) arxiv.org/abs/1709.06560 How long until the replication crisis hits AI?

https://twitter.com/WAWilsonIV/status/911318326504153088
Seriously I’m not blown away by the PhDs’ records over the last 30 years. I bet you’d get better payoff funding eccentrics and amateurs.
There are essentially zero fundamentally new ideas in AI, the papers are all grotesquely hyperparameter tuned, nobody knows why it works.

Deep Reinforcement Learning Doesn't Work Yet: https://www.alexirpan.com/2018/02/14/rl-hard.html
Once, on Facebook, I made the following claim.

Whenever someone asks me if reinforcement learning can solve their problem, I tell them it can’t. I think this is right at least 70% of the time.
papers  preprint  machine-learning  acm  frontier  speedometer  deep-learning  realness  replication  state-of-art  survey  reinforcement  multi  twitter  social  discussion  techtariat  ai  nibble  org:mat  unaffiliated  ratty  acmtariat  liner-notes  critique  sample-complexity  cost-benefit  todo 
september 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
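
For reference, the bottleneck objective in its standard formulation (not quoted from the article): pick a compressed representation T of the input X by minimizing

    \min_{p(t|x)} \; I(X;T) - \beta \, I(T;Y)

i.e., keep as little information about X as possible (compression) while retaining as much mutual information with the label Y as the trade-off parameter β demands (prediction).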
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Superintelligence Risk Project Update II
https://www.jefftk.com/p/superintelligence-risk-project-update

https://www.jefftk.com/p/conversation-with-michael-littman
For example, I asked him what he thought of the idea that we could get AGI with current techniques, primarily deep neural nets and reinforcement learning, without learning anything new about how intelligence works or how to implement it ("Prosaic AGI" [1]). He didn't think this was possible, and believes there are deep conceptual issues we still need to get a handle on. He's also less impressed with deep learning than he was before he started working in it: in his experience it's a much more brittle technology than he had been expecting. Specifically, when trying to replicate results, he's often found that they depend on a bunch of parameters being in just the right range, and without that the systems don't perform nearly as well.

The bottom line, to him, was that since we are still many breakthroughs away from getting to AGI, we can't productively work on reducing superintelligence risk now.

He told me that he worries that the AI risk community is not solving real problems: they're making deductions and inferences that are self-consistent but not being tested or verified in the world. Since we can't tell if that's progress, it probably isn't. I asked if he was referring to MIRI's work here, and he said their work was an example of the kind of approach he's skeptical about, though he wasn't trying to single them out. [2]

https://www.jefftk.com/p/conversation-with-an-ai-researcher
Earlier this week I had a conversation with an AI researcher [1] at one of the main industry labs as part of my project of assessing superintelligence risk. Here's what I got from them:

They see progress in ML as almost entirely constrained by hardware and data, to the point that if today's hardware and data had existed in the mid 1950s researchers would have gotten to approximately our current state within ten to twenty years. They gave the example of backprop: we saw how to train multi-layer neural nets decades before we had the computing power to actually train these nets to do useful things.

Similarly, people talk about AlphaGo as a big jump, where Go went from being "ten years away" to "done" within a couple years, but they said it wasn't like that. If Go work had stayed in academia, with academia-level budgets and resources, it probably would have taken nearly that long. What changed was a company seeing promising results, realizing what could be done, and putting way more engineers and hardware on the project than anyone had previously done. AlphaGo couldn't have happened earlier because the hardware wasn't there yet, and was only able to be brought forward by massive application of resources.

https://www.jefftk.com/p/superintelligence-risk-project-conclusion
Summary: I'm not convinced that AI risk should be highly prioritized, but I'm also not convinced that it shouldn't. Highly qualified researchers in a position to have a good sense of the field have massively different views on core questions like how capable ML systems are now, how capable they will be soon, and how we can influence their development. I do think these questions are possible to get a better handle on, but I think this would require much deeper ML knowledge than I have.
ratty  core-rats  ai  risk  ai-control  prediction  expert  machine-learning  deep-learning  speedometer  links  research  research-program  frontier  multi  interview  deepgoog  games  hardware  performance  roots  impetus  chart  big-picture  state-of-art  reinforcement  futurism  🤖  🖥  expert-experience  singularity  miri-cfar  empirical  evidence-based  speculation  volo-avolo  clever-rats  acmtariat  robust  ideas  crux  atoms  detail-architecture  software  gradient-descent 
july 2017 by nhaliday
Unsupervised learning, one notion or many? – Off the convex path
(Task A) Learning a distribution from samples. (Examples: gaussian mixtures, topic models, variational autoencoders,..)

(Task B) Understanding latent structure in the data. This is not the same as (a); for example principal component analysis, clustering, manifold learning etc. identify latent structure but don’t learn a distribution per se.

(Task C) Feature Learning. Learn a mapping from datapoint → feature vector such that classification tasks are easier to carry out on feature vectors rather than datapoints. For example, unsupervised feature learning could help lower the amount of labeled samples needed for learning a classifier, or be useful for domain adaptation.

Task B is often a subcase of Task C, as the intended users of “structure found in data” are humans (scientists) who pore over the representation of the data to gain some intuition about its properties, and these “properties” can often be phrased as a classification task.

This post explains the relationship between Tasks A and C, and why they get mixed up in students’ minds. We hope there is also some food for thought here for experts, namely, our discussion about the fragility of the usual “perplexity” definition of unsupervised learning. It explains why Task A doesn’t in practice lead to a good enough solution for Task C. For example, it has been believed for many years that for deep learning, unsupervised pretraining should help supervised training, but this has been hard to show in practice.
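
A miniature of the Task B/C relationship (my sketch, assumes scikit-learn): PCA finds latent structure with no labels, and its output can then serve as features for a downstream classifier.

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    pca = PCA(n_components=16).fit(X_train)             # unsupervised: y never used here
    clf = LogisticRegression(max_iter=1000).fit(pca.transform(X_train), y_train)
    print(clf.score(pca.transform(X_test), y_test))     # supervised step on the learned features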
acmtariat  org:bleg  nibble  machine-learning  acm  thinking  clarity  unsupervised  conceptual-vocab  concept  explanation  features  bayesian  off-convex  deep-learning  latent-variables  generative  intricacy  distribution  sampling  grokkability-clarity  org:popup 
june 2017 by nhaliday
Born Red - The New Yorker
Obama-Xi State Visit: How China's President Defines the Chinese Dream: https://www.theatlantic.com/international/archive/2015/09/xi-jinping-china-book-chinese-dream/406387/
interesting glimpse into Chinese cultural overtones

What’s new on Xi Jinping’s bookshelf this year: https://medium.com/shanghaiist/whats-new-on-xi-jinping-s-bookshelf-this-year-8d913dcc261f

China Moves to Let Xi Stay in Power by Abolishing Term Limit: https://www.nytimes.com/2018/02/25/world/asia/china-xi-jinping.html
news  org:mag  profile  china  asia  politics  government  foreign-policy  authoritarianism  sinosphere  history  mostly-modern  corruption  expansionism  usa  ideology  orient  statesmen  civil-liberty  democracy  obama  leadership  egalitarianism-hierarchy  kinship  communism  cold-war  elite  power  class  class-warfare  organizing  markets  capitalism  noble-lie  anomie  morality  multi  culture  polisci  civilization  expression-survival  individualism-collectivism  diversity  books  review  summary  facebook  barons  current-events  nationalism-globalism  ethnocentrism  identity-politics  great-powers  n-factor  alien-character  org:med  trends  list  speedometer  technology  ai  deep-learning  polanyi-marx  europe  the-great-west-whale  literature  big-peeps  the-classics  military  defense  letters  economics  broad-econ  environment  technocracy  org:rec 
april 2017 by nhaliday
Peter Norvig, the meaning of polynomials, debugging as psychotherapy | Quomodocumque
He briefly showed a demo where, given values of a polynomial, a machine can put together a few lines of code that successfully computes the polynomial. But the code looks weird to a human eye. To compute some quadratic, it nests for-loops and adds things up in a funny way that ends up giving the right output. So has it really ”learned” the polynomial? I think in computer science, you typically feel you’ve learned a function if you can accurately predict its value on a given input. For an algebraist like me, a function determines but isn’t determined by the values it takes; to me, there’s something about that quadratic polynomial the machine has failed to grasp. I don’t think there’s a right or wrong answer here, just a cultural difference to be aware of. Relevant: Norvig’s description of “the two cultures” at the end of this long post on natural language processing (which is interesting all the way through!)
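
A hand-made illustration (not Norvig's actual demo) of the kind of thing described: nested for-loops that "add things up in a funny way" yet compute the quadratic f(n) = n^2 + 3n + 2.

    def f(n):
        total = 0
        for i in range(n + 1):
            for j in range(i + 1):
                total += 2      # 2 * (1 + 2 + ... + (n+1)) = (n+1)(n+2) = n^2 + 3n + 2
        return total

    print([f(n) for n in range(5)])              # [2, 6, 12, 20, 30]
    print([n*n + 3*n + 2 for n in range(5)])     # same values, different "understanding"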
mathtariat  org:bleg  nibble  tech  ai  talks  summary  philosophy  lens  comparison  math  cs  tcs  polynomials  nlp  debugging  psychology  cog-psych  complex-systems  deep-learning  analogy  legibility  interpretability  composition-decomposition  coupling-cohesion  apollonian-dionysian  heavyweights 
march 2017 by nhaliday
How do these "neural network style transfer" tools work? - Julia Evans
When we put an image into the network, it starts out as a vector of numbers (the red/green/blue values for each pixel). At each layer of the network we get another intermediate vector of numbers. There’s no inherent meaning to any of these vectors.

But! If we want to, we could pick one of those vectors arbitrarily and declare “You know, I think that vector represents the content” of the image.

The basic idea is that the further down you get in the network (and the closer towards classifying objects in the network as a “cat” or “house” or whatever”), the more the vector represents the image’s “content”.

In this paper, they designate the “conv4_2” layer as the “content” layer. This seems to be pretty arbitrary – it’s just a layer that’s pretty far down the network.

Defining “style” is a bit more complicated. If I understand correctly, the definition of “style” is actually the major innovation of this paper – they don’t just pick a layer and say “this is the style layer”. Instead, they take all the “feature maps” at a layer (basically there are actually a whole bunch of vectors at the layer, one for each “feature”), and define the “Gram matrix” of all the pairwise inner products between those vectors. This Gram matrix is the style.
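
A minimal sketch of that style representation (assumes PyTorch; the random tensor stands in for one conv layer's activations):

    import torch

    def gram_matrix(feature_maps):
        # feature_maps: (C, H, W) activations from one layer
        c, h, w = feature_maps.shape
        flat = feature_maps.view(c, h * w)     # one vector per feature map
        return flat @ flat.t() / (h * w)       # (C, C) matrix of pairwise inner products

    features = torch.randn(64, 32, 32)         # stand-in for conv-layer activations
    print(gram_matrix(features).shape)         # torch.Size([64, 64])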
techtariat  bangbang  deep-learning  model-class  explanation  art  visuo  machine-learning  acm  SIGGRAPH  init  inner-product  nibble 
february 2017 by nhaliday