nhaliday + model-class   93

Workshop Abstract | Identifying and Understanding Deep Learning Phenomena
ICML 2019 workshop, June 15th 2019, Long Beach, CA

We solicit contributions that view the behavior of deep nets as natural phenomena, to be investigated with methods inspired from the natural sciences like physics, astronomy, and biology.
unit  workshop  acm  machine-learning  science  empirical  nitty-gritty  atoms  deep-learning  model-class  icml  data-science  rigor  replication  examples  ben-recht  physics 
april 2019 by nhaliday
Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
acmtariat  techtariat  org:bleg  nibble  better-explained  machine-learning  deep-learning  visual-understanding  visualization  analysis  let-me-see  research  sequential  audio  classification  model-class  exposition  language  acm  approximation  comparison  markov  iteration-recursion  concept  atoms  distribution  orders  DP  heuristic  optimization  trees  greedy  matching  gradient-descent  org:popup 
december 2017 by nhaliday
Fitting a Structural Equation Model
seems rather unrigorous: nonlinear optimization, possibility of nonconvergence, doesn't even mention local vs. global optimality...
pdf  slides  lectures  acm  stats  hypothesis-testing  graphs  graphical-models  latent-variables  model-class  optimization  nonlinearity  gotchas  nibble  ML-MAP-E  iteration-recursion  convergence 
november 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
Is the U.S. Aggregate Production Function Cobb-Douglas? New Estimates of the Elasticity of Substitution
world-wide: http://www.socsci.uci.edu/~duffy/papers/jeg2.pdf
We find that IPP capital entirely explains the observed decline of the US labor share, which otherwise is secularly constant over the past 65 years for structures and equipment capital. The labor share decline simply reflects the fact that the US economy is undergoing a transition toward a larger IPP sector.
The Fall of the Labor Share and the Rise of Superstar Firms: http://www.nber.org/papers/w23396
The Decline of the U.S. Labor Share: https://www.brookings.edu/wp-content/uploads/2016/07/2013b_elsby_labor_share.pdf
Table 2 has industry disaggregation
Estimating the U.S. labor share: https://www.bls.gov/opub/mlr/2017/article/estimating-the-us-labor-share.htm

Why Workers Are Losing to Capitalists: https://www.bloomberg.com/view/articles/2017-09-20/why-workers-are-losing-to-capitalists
Automation and offshoring may be conspiring to reduce labor's share of income.
pdf  study  economics  growth-econ  econometrics  usa  data  empirical  analysis  labor  capital  econ-productivity  manifolds  magnitude  multi  world  🎩  piketty  econotariat  compensation  inequality  winner-take-all  org:ngo  org:davos  flexibility  distribution  stylized-facts  regularizer  hmm  history  mostly-modern  property-rights  arrows  invariance  industrial-org  trends  wonkish  roots  synthesis  market-power  efficiency  variance-components  business  database  org:gov  article  model-class  models  automation  nationalism-globalism  trade  news  org:mag  org:biz  org:bv  noahpinion  explanation  summary  methodology  density  polarization  map-territory  input-output 
july 2017 by nhaliday
How do these "neural network style transfer" tools work? - Julia Evans
When we put an image into the network, it starts out as a vector of numbers (the red/green/blue values for each pixel). At each layer of the network we get another intermediate vector of numbers. There’s no inherent meaning to any of these vectors.

But! If we want to, we could pick one of those vectors arbitrarily and declare “You know, I think that vector represents the content” of the image.

The basic idea is that the further down you get in the network (and the closer towards classifying objects in the network as a “cat” or “house” or whatever), the more the vector represents the image’s “content”.

In this paper, they designate the “conv4_2” layer as the “content” layer. This seems to be pretty arbitrary – it’s just a layer that’s pretty far down the network.

Defining “style” is a bit more complicated. If I understand correctly, the definition of “style” is actually the major innovation of this paper – they don’t just pick a layer and say “this is the style layer”. Instead, they take all the “feature maps” at a layer (basically there are actually a whole bunch of vectors at the layer, one for each “feature”), and define the “Gram matrix” of all the pairwise inner products between those vectors. This Gram matrix is the style.
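
The Gram matrix described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the three feature maps and their values are made up, and each map is assumed to be already flattened into a vector of activations. The last lines check one reason the Gram matrix plausibly captures "style" rather than "content": reordering the spatial positions (the same way in every map) leaves it unchanged.

```python
def gram_matrix(feature_maps):
    """Return G where G[i][j] is the inner product of feature maps i and j."""
    n = len(feature_maps)
    return [
        [sum(a * b for a, b in zip(feature_maps[i], feature_maps[j]))
         for j in range(n)]
        for i in range(n)
    ]

# Hypothetical data: three feature maps, each flattened to 4 activations.
features = [
    [1.0, 0.0, 2.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [2.0, 1.0, 0.0, 1.0],
]
style = gram_matrix(features)

# Apply the same spatial permutation to every map: the pairwise inner
# products, and therefore the Gram matrix, do not change.
order = [3, 1, 0, 2]
shuffled = [[fmap[k] for k in order] for fmap in features]
shuffled_style = gram_matrix(shuffled)
```

With N feature maps the result is always N x N, regardless of the image size.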
techtariat  bangbang  deep-learning  model-class  explanation  art  visuo  machine-learning  acm  SIGGRAPH  init  inner-product  nibble 
february 2017 by nhaliday
[1604.03640] Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex
We discuss relations between Residual Networks (ResNet), Recurrent Neural Networks (RNNs) and the primate visual cortex. We begin with the observation that a shallow RNN is exactly equivalent to a very deep ResNet with weight sharing among the layers. A direct implementation of such a RNN, although having orders of magnitude fewer parameters, leads to a performance similar to the corresponding ResNet. We propose 1) a generalization of both RNN and ResNet architectures and 2) the conjecture that a class of moderately deep RNNs is a biologically-plausible model of the ventral stream in visual cortex. We demonstrate the effectiveness of the architectures by testing them on the CIFAR-10 dataset.
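
A toy sketch of the equivalence claimed in the abstract, under assumptions of my own: the residual function is taken to be tanh(W h), the 2x2 weight matrix and the input are invented, and the RNN receives its input only as the initial state. None of this is taken from the paper's architectures. It only shows that stacking the same residual layer `depth` times and iterating one recurrent cell `depth` times are the same computation, while the parameter count differs by a factor of `depth` if the layers are not shared.

```python
import math

def residual_step(state, weights):
    """One residual update: h <- h + tanh(W h)."""
    pre = [sum(w * h for w, h in zip(row, state)) for row in weights]
    return [h + math.tanh(p) for h, p in zip(state, pre)]

def deep_resnet(x, layer_weights):
    """Feed x through a stack of residual layers, one weight matrix per layer."""
    for w in layer_weights:
        x = residual_step(x, w)
    return x

def shallow_rnn(x, weights, steps):
    """Iterate a single recurrent cell for a fixed number of time steps."""
    h = x
    for _ in range(steps):
        h = residual_step(h, weights)
    return h

# Hypothetical weights and input, chosen only for illustration.
W = [[0.5, -0.2], [0.1, 0.3]]
x0 = [1.0, -1.0]
depth = 5

resnet_out = deep_resnet(x0, [W] * depth)   # weight sharing across layers
rnn_out = shallow_rnn(x0, W, depth)         # same cell unrolled in time

params_per_layer = sum(len(row) for row in W)
params_unshared_resnet = depth * params_per_layer
params_rnn = params_per_layer
```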
papers  preprint  neuro  biodet  interdisciplinary  deep-learning  model-class  identity  machine-learning  nibble  org:mat  computer-vision 
february 2017 by nhaliday
Performance Trends in AI | Otium
Deep learning has revolutionized the world of artificial intelligence. But how much does it improve performance? How have computers gotten better at different tasks over time, since the rise of deep learning?

In games, what the data seems to show is that exponential growth in data and computation power yields exponential improvements in raw performance. In other words, you get out what you put in. Deep learning matters, but only because it provides a way to turn Moore’s Law into corresponding performance improvements, for a wide class of problems. It’s not even clear it’s a discontinuous advance in performance over non-deep-learning systems.

In image recognition, deep learning clearly is a discontinuous advance over other algorithms. But the returns to scale and the improvements over time seem to be flattening out as we approach or surpass human accuracy.

In speech recognition, deep learning is again a discontinuous advance. We are still far away from human accuracy, and in this regime, accuracy seems to be improving linearly over time.

In machine translation, neural nets seem to have made progress over conventional techniques, but it’s not yet clear if that’s a real phenomenon, or what the trends are.

In natural language processing, trends are positive, but deep learning doesn’t generally seem to do better than trendline.


The learned agent performs much better than the hard-coded agent, but moves more jerkily and “randomly” and doesn’t know the law of reflection. Similarly, the reports of AlphaGo producing “unusual” Go moves are consistent with an agent that can do pattern-recognition over a broader space than humans can, but which doesn’t find the “laws” or “regularities” that humans do.

Perhaps, contrary to the stereotype that contrasts “mechanical” with “outside-the-box” thinking, reinforcement learners can “think outside the box” but can’t find the box?

ratty  core-rats  summary  prediction  trends  analysis  spock  ai  deep-learning  state-of-art  🤖  deepgoog  games  nlp  computer-vision  nibble  reinforcement  model-class  faq  org:bleg  shift  chart  technology  language  audio  accuracy  speaking  foreign-lang  definite-planning  china  asia  microsoft  google  ideas  article  speedometer  whiggish-hegelian  yvain  ssc  smoothness  data  hsu  scitariat  genetics  iq  enhancement  genetic-load  neuro  neuro-nitgrit  brain-scan  time-series  multiplicative  iteration-recursion  additive  multi  arrows 
january 2017 by nhaliday
CS 731 Advanced Artificial Intelligence - Spring 2011
- statistical machine learning
- sparsity in regression
- graphical models
- exponential families
- variational methods
- dimensionality reduction, eg, PCA
- Bayesian nonparametrics
- compressive sensing, matrix completion, and Johnson-Lindenstrauss
course  lecture-notes  yoga  acm  stats  machine-learning  graphical-models  graphs  model-class  bayesian  learning-theory  sparsity  embeddings  markov  monte-carlo  norms  unit  nonparametric  compressed-sensing  matrix-factorization  features 
january 2017 by nhaliday
Xavier Amatriain's answer to What is the difference between L1 and L2 regularization? - Quora
So, as opposed to what Andrew Ng explains in his "Feature selection, l1 vs l2 regularization, and rotational invariance" (Page on stanford.edu), I would say that as a rule-of-thumb, you should always go for L2 in practice.
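
For context on what the two penalties do, here is a small sketch under a simplifying assumption that is mine, not the answer's: an orthonormal design, so each coefficient can be treated independently. In that case the L1-penalized solution is a soft-threshold of the unpenalized estimate and the L2-penalized solution is a uniform rescaling. The coefficient values and the penalty strength `lam` are invented.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5*(w - z)**2 + lam*abs(w): the L1 (lasso) update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def ridge_shrink(z, lam):
    """Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2: the L2 (ridge) update."""
    return z / (1.0 + lam)

# Hypothetical unpenalized coefficient estimates.
ols = [3.0, -0.4, 0.2, -2.5]
lam = 0.5

l1 = [soft_threshold(z, lam) for z in ols]   # small coefficients become exactly 0
l2 = [ridge_shrink(z, lam) for z in ols]     # every coefficient shrinks, none hits 0

num_zero_l1 = sum(1 for w in l1 if w == 0.0)
num_zero_l2 = sum(1 for w in l2 if w == 0.0)
```

L1 performs feature selection by zeroing small coefficients, while L2 keeps every feature and only shrinks it; which behavior is preferable in practice is the point the answer is disputing.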
best-practices  q-n-a  machine-learning  acm  optimization  tidbits  advice  qra  regularization  model-class  regression  sparsity  features  comparison  model-selection  norms  nibble 
november 2016 by nhaliday