nhaliday + features   37

Unsupervised learning, one notion or many? – Off the convex path
(Task A) Learning a distribution from samples. (Examples: Gaussian mixtures, topic models, variational autoencoders, etc.)

(Task B) Understanding latent structure in the data. This is not the same as Task A; for example, principal component analysis, clustering, manifold learning, etc. identify latent structure but don’t learn a distribution per se.

(Task C) Feature Learning. Learn a mapping from datapoint → feature vector such that classification tasks are easier to carry out on feature vectors than on datapoints. For example, unsupervised feature learning could help reduce the number of labeled samples needed for learning a classifier, or be useful for domain adaptation.

Task B is often a subcase of Task C, as the intended users of “structure found in data” are humans (scientists) who pore over the representation of data to gain some intuition about its properties, and these “properties” can often be phrased as a classification task.

This post explains the relationship between Tasks A and C, and why they get mixed up in students’ minds. We hope there is also some food for thought here for experts, namely, our discussion about the fragility of the usual “perplexity” definition of unsupervised learning. That discussion explains why Task A doesn’t in practice lead to a good enough solution for Task C. For example, it has been believed for many years that for deep learning, unsupervised pretraining should help supervised training, but this has been hard to show in practice.
acmtariat  org:bleg  nibble  machine-learning  acm  thinking  clarity  unsupervised  conceptual-vocab  concept  explanation  features  bayesian  off-convex  deep-learning  latent-variables  generative  intricacy  distribution  sampling  grokkability-clarity  org:popup 
june 2017 by nhaliday
CS 731 Advanced Artificial Intelligence - Spring 2011
- statistical machine learning
- sparsity in regression
- graphical models
- exponential families
- variational methods
- MCMC
- dimensionality reduction, e.g., PCA
- Bayesian nonparametrics
- compressive sensing, matrix completion, and Johnson-Lindenstrauss
course  lecture-notes  yoga  acm  stats  machine-learning  graphical-models  graphs  model-class  bayesian  learning-theory  sparsity  embeddings  markov  monte-carlo  norms  unit  nonparametric  compressed-sensing  matrix-factorization  features 
january 2017 by nhaliday
Information Processing: Thought vectors and the dimensionality of the space of concepts
If we trained a deep net to translate sentences about Physics from Martian to English, we could (roughly) estimate the "conceptual depth" of the subject. We could even compare two different subjects, such as Physics versus Art History.
hsu  ai  deep-learning  google  speculation  commentary  news  language  embeddings  neurons  thinking  papers  summary  scitariat  dimensionality  conceptual-vocab  vague  nlp  nibble  state-of-art  features 
december 2016 by nhaliday
Xavier Amatriain's answer to What is the difference between L1 and L2 regularization? - Quora
So, as opposed to what Andrew Ng explains in his "Feature selection, l1 vs l2 regularization, and rotational invariance" (Page on stanford.edu), I would say that as a rule-of-thumb, you should always go for L2 in practice.
best-practices  q-n-a  machine-learning  acm  optimization  tidbits  advice  qra  regularization  model-class  regression  sparsity  features  comparison  model-selection  norms  nibble 
november 2016 by nhaliday
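
For readers who want the L1 vs. L2 distinction from the entry above in concrete terms, here is a minimal sketch. It is not taken from the linked answer; it assumes a simplified one-dimensional setting in which each coefficient z is penalized on its own, and the function names and sample values are hypothetical. It shows only the standard behavior: an L1 penalty sets small coefficients exactly to zero, while an L2 penalty shrinks every coefficient proportionally and leaves none at zero.

```python
def l1_shrink(z, lam):
    """Minimizer over w of 0.5*(w - z)**2 + lam*abs(w) (soft-thresholding)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # coefficients with |z| <= lam are set exactly to zero


def l2_shrink(z, lam):
    """Minimizer over w of 0.5*(w - z)**2 + 0.5*lam*w**2 (proportional shrinkage)."""
    return z / (1.0 + lam)


# Hypothetical unpenalized coefficients and penalty strength, for illustration only.
coeffs = [3.0, -0.4, 0.1, -2.0]
lam = 0.5

l1 = [l1_shrink(z, lam) for z in coeffs]
l2 = [l2_shrink(z, lam) for z in coeffs]

print("L1:", l1)  # small entries become exactly zero (sparse)
print("L2:", l2)  # every entry shrinks, none becomes zero
```

The sketch illustrates the sparsity difference only; it says nothing about which penalty predicts better on a given dataset, which is the question the linked answer addresses.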
