nhaliday + acm + model-class   49

Workshop Abstract | Identifying and Understanding Deep Learning Phenomena
ICML 2019 workshop, June 15th 2019, Long Beach, CA

We solicit contributions that view the behavior of deep nets as natural phenomena, to be investigated with methods inspired by the natural sciences, such as physics, astronomy, and biology.
unit  workshop  acm  machine-learning  science  empirical  nitty-gritty  atoms  deep-learning  model-class  icml  data-science  rigor  replication  examples  ben-recht  physics 
april 2019 by nhaliday
Sequence Modeling with CTC
A visual guide to Connectionist Temporal Classification, an algorithm used to train deep neural networks in speech recognition, handwriting recognition and other sequence problems.
acmtariat  techtariat  org:bleg  nibble  better-explained  machine-learning  deep-learning  visual-understanding  visualization  analysis  let-me-see  research  sequential  audio  classification  model-class  exposition  language  acm  approximation  comparison  markov  iteration-recursion  concept  atoms  distribution  orders  DP  heuristic  optimization  trees  greedy  matching  gradient-descent 
december 2017 by nhaliday
Fitting a Structural Equation Model
seems rather unrigorous: nonlinear optimization, possibility of nonconvergence, doesn't even mention local vs. global optimality...
pdf  slides  lectures  acm  stats  hypothesis-testing  graphs  graphical-models  latent-variables  model-class  optimization  nonlinearity  gotchas  nibble  ML-MAP-E  iteration-recursion  convergence 
november 2017 by nhaliday
New Theory Cracks Open the Black Box of Deep Learning | Quanta Magazine
A new idea called the “information bottleneck” is helping to explain the puzzling success of today’s artificial-intelligence algorithms — and might also explain how human brains learn.

sounds like he's just talking about autoencoders?
news  org:mag  org:sci  popsci  announcement  research  deep-learning  machine-learning  acm  information-theory  bits  neuro  model-class  big-surf  frontier  nibble  hmm  signal-noise  deepgoog  expert  ideas  wild-ideas  summary  talks  video  israel  roots  physics  interdisciplinary  ai  intelligence  shannon  giants  arrows  preimage  lifts-projections  composition-decomposition  characterization  markov  gradient-descent  papers  liner-notes  experiment  hi-order-bits  generalization  expert-experience  explanans  org:inst  speedometer 
september 2017 by nhaliday
How do these "neural network style transfer" tools work? - Julia Evans
When we put an image into the network, it starts out as a vector of numbers (the red/green/blue values for each pixel). At each layer of the network we get another intermediate vector of numbers. There’s no inherent meaning to any of these vectors.

But! If we want to, we could pick one of those vectors arbitrarily and declare “You know, I think that vector represents the content” of the image.

The basic idea is that the further down you get in the network (and the closer towards classifying objects in the network as a “cat” or “house” or whatever), the more the vector represents the image’s “content”.

In this paper, they designate the “conv4_2” layer as the “content” layer. This seems to be pretty arbitrary – it’s just a layer that’s pretty far down the network.

Defining “style” is a bit more complicated. If I understand correctly, the definition of “style” is actually the major innovation of this paper – they don’t just pick a layer and say “this is the style layer”. Instead, they take all the “feature maps” at a layer (there is a whole bunch of vectors at the layer, one for each “feature”), and define the “Gram matrix” of all the pairwise inner products between those vectors. This Gram matrix is the style.
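
A minimal sketch of the Gram-matrix idea described above, in plain Python. It assumes each feature map has already been flattened into a list of numbers; the names `gram_matrix` and `style_distance` and the toy values are illustrative, not taken from the paper or from any library. One thing it shows is that the Gram matrix does not change when spatial positions are shuffled consistently across feature maps, which is one way to see why it can stand in for "style" rather than layout.

```python
def gram_matrix(feature_maps):
    """Pairwise inner products between flattened feature maps.

    feature_maps: list of C lists, each holding the H*W activations
    of one feature at a given layer (hypothetical toy input).
    Returns a C x C nested list G with G[i][j] = <F_i, F_j>.
    """
    n = len(feature_maps)
    return [
        [sum(a * b for a, b in zip(feature_maps[i], feature_maps[j]))
         for j in range(n)]
        for i in range(n)
    ]


def style_distance(gram_a, gram_b):
    """Sum of squared differences between two Gram matrices (unnormalized)."""
    return sum(
        (x - y) ** 2
        for row_a, row_b in zip(gram_a, gram_b)
        for x, y in zip(row_a, row_b)
    )


# Two features observed at three spatial positions.
features = [[1.0, 0.0, 2.0],
            [0.0, 1.0, 1.0]]

# The same activations with the spatial positions reordered.
shuffled = [[2.0, 1.0, 0.0],
            [1.0, 0.0, 1.0]]

g_original = gram_matrix(features)
g_shuffled = gram_matrix(shuffled)
print(g_original)
print(style_distance(g_original, g_shuffled))
```

In a real style-transfer setup the feature maps would come from a convolutional layer and the Gram matrices are usually normalized by the layer size; both are left out here to keep the sketch short.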
techtariat  bangbang  deep-learning  model-class  explanation  art  visuo  machine-learning  acm  SIGGRAPH  init  inner-product  nibble 
february 2017 by nhaliday
CS 731 Advanced Artificial Intelligence - Spring 2011
- statistical machine learning
- sparsity in regression
- graphical models
- exponential families
- variational methods
- dimensionality reduction, eg, PCA
- Bayesian nonparametrics
- compressive sensing, matrix completion, and Johnson-Lindenstrauss
course  lecture-notes  yoga  acm  stats  machine-learning  graphical-models  graphs  model-class  bayesian  learning-theory  sparsity  embeddings  markov  monte-carlo  norms  unit  nonparametric  compressed-sensing  matrix-factorization  features 
january 2017 by nhaliday
Xavier Amatriain's answer to What is the difference between L1 and L2 regularization? - Quora
So, as opposed to what Andrew Ng explains in his "Feature selection, l1 vs l2 regularization, and rotational invariance" (Page on stanford.edu), I would say that as a rule-of-thumb, you should always go for L2 in practice.
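
For context on the question itself, here is a small sketch of how the two penalties act on a single weight. It is not from the linked answer. It assumes the one-dimensional objectives 0.5*(w - z)^2 + lam*|w| for L1 and 0.5*(w - z)^2 + 0.5*lam*w^2 for L2, where `z` is the unregularized estimate; the function names are made up for illustration. Under those assumptions, L1 sets small weights exactly to zero, while L2 only shrinks them.

```python
def l1_solution(z, lam):
    """Minimizer of 0.5*(w - z)**2 + lam*abs(w): soft-thresholding."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # weights with |z| <= lam are zeroed out entirely


def l2_solution(z, lam):
    """Minimizer of 0.5*(w - z)**2 + 0.5*lam*w**2: proportional shrinkage."""
    return z / (1.0 + lam)


# Compare the two penalties on a few unregularized estimates.
for z in (2.0, 0.3, -0.3, -2.0):
    print(z, l1_solution(z, 0.5), l2_solution(z, 0.5))
```
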
best-practices  q-n-a  machine-learning  acm  optimization  tidbits  advice  qra  regularization  model-class  regression  sparsity  features  comparison  model-selection  norms  nibble 
november 2016 by nhaliday
