latent-variables (46)

Fitting a Structural Equation Model
seems rather unrigorous: nonlinear optimization, possibility of nonconvergence, doesn't even mention local vs. global optimality...
pdf  slides  lectures  acm  stats  hypothesis-testing  graphs  graphical-models  latent-variables  model-class  optimization  nonlinearity  gotchas  nibble  ML-MAP-E  iteration-recursion  convergence 
november 2017 by nhaliday
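The worries in the note above (nonconvergence, local vs. global optima) can be made concrete with a toy multi-start fit. This is not any particular SEM package's procedure — just a minimal sketch, assuming a one-factor model Sigma(theta) = lambda lambda' + diag(psi) and the normal-theory ML fit function, optimized from several random starts with `scipy`:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy one-factor model: Sigma(theta) = lambda lambda' + diag(psi),
# 3 observed indicators; the true loadings/uniquenesses are made up.
true_lam = np.array([0.8, 0.7, 0.6])
true_psi = np.array([0.36, 0.51, 0.64])
Sigma_true = np.outer(true_lam, true_lam) + np.diag(true_psi)

# Sample covariance from simulated data.
X = rng.multivariate_normal(np.zeros(3), Sigma_true, size=2000)
S = np.cov(X, rowvar=False)

def ml_discrepancy(theta):
    """Normal-theory ML fit function: log|Sigma| + tr(S Sigma^-1)."""
    lam, psi = theta[:3], theta[3:]
    Sigma = np.outer(lam, lam) + np.diag(np.maximum(psi, 1e-6))
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        return 1e10  # penalize non-positive-definite proposals
    return logdet + np.trace(S @ np.linalg.inv(Sigma))

# Multi-start optimization: different starts may reach different stationary
# points, and nothing certifies that the best one found is global.
fits = [minimize(ml_discrepancy, rng.uniform(0.1, 1.0, 6), method="Nelder-Mead")
        for _ in range(5)]
losses = sorted(f.fun for f in fits)
print("final objective values:", np.round(losses, 4))
```

If the printed objective values differ across starts, that is exactly the local-optimality problem the note complains about; agreement across starts is evidence, not proof, of a global optimum.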
[1711.00464] An Information-Theoretic Analysis of Deep Latent-Variable Models
We present an information-theoretic framework for understanding trade-offs in unsupervised learning of deep latent-variables models using variational inference. This framework emphasizes the need to consider latent-variable models along two dimensions: the ability to reconstruct inputs (distortion) and the communication cost (rate). We derive the optimal frontier of generative models in the two-dimensional rate-distortion plane, and show how the standard evidence lower bound objective is insufficient to select between points along this frontier. However, by performing targeted optimization to learn generative models with different rates, we are able to learn many models that can achieve similar generative performance but make vastly different trade-offs in terms of the usage of the latent variable. Through experiments on MNIST and Omniglot with a variety of architectures, we show how our framework sheds light on many recent proposed extensions to the variational autoencoder family.
papers  deep-learning  latent-variables  information-theory  neural-net  analysis 
november 2017 by arsyed
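The paper's two axes — rate (the KL cost of the latent code) and distortion (reconstruction error) — can be illustrated numerically without training anything. A minimal sketch, assuming a hypothetical 1-D linear-Gaussian encoder/decoder with made-up parameter values (not the paper's models):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D "VAE": rate R = E[KL(q(z|x) || p(z))] with p(z) = N(0, 1),
# distortion D = -E[log p(x|z)]. All numbers here are illustrative.
x = rng.normal(0.0, 1.0, size=1000)

def rate_distortion(enc_scale, dec_sigma=0.5):
    mu_z = enc_scale * x            # encoder q(z|x) = N(enc_scale * x, s^2)
    s = 0.3
    # Closed-form KL( N(mu, s^2) || N(0, 1) ), averaged over the data -> rate.
    rate = np.mean(0.5 * (s**2 + mu_z**2 - 1.0 - 2.0 * np.log(s)))
    # One-sample reconstruction with decoder mean x_hat = z / enc_scale.
    z = mu_z + s * rng.normal(size=x.shape)
    x_hat = z / enc_scale
    nll = 0.5 * np.log(2 * np.pi * dec_sigma**2) + (x - x_hat) ** 2 / (2 * dec_sigma**2)
    return rate, np.mean(nll)

# A more informative code pays more rate but less distortion; a weaker
# code does the opposite — two points on the rate-distortion frontier.
r_hi, d_hi = rate_distortion(1.0)
r_lo, d_lo = rate_distortion(0.2)
print(f"strong code: rate={r_hi:.2f}, distortion={d_hi:.2f}")
print(f"weak code:   rate={r_lo:.2f}, distortion={d_lo:.2f}")
```

Since the (negative) ELBO is rate + distortion, both points can score similarly under the standard objective while using the latent variable very differently — the paper's argument for targeting rates explicitly.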
A Hybrid Causal Search Algorithm for Latent Variable Models
Existing score-based causal model search algorithms such as GES (and a sped-up version, FGS) are asymptotically correct, fast, and reliable, but make the unrealistic assumption that the true causal graph does not contain any unmeasured confounders. There are several constraint-based causal search algorithms (e.g., RFCI, FCI, or FCI+) that are asymptotically correct without assuming that there are no unmeasured confounders, but they often perform poorly on small samples. We describe a combined score- and constraint-based algorithm, GFCI, that we prove is asymptotically correct. On synthetic data, GFCI is only slightly slower than RFCI but more accurate than FCI, RFCI, and FCI+.
tetrad  causal-discovery  latent-variables  gfci  fci  algorithms 
september 2017 by arsyed
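Why does the no-unmeasured-confounders assumption matter so much? A toy simulation (not the GFCI algorithm itself — just a sketch of the failure mode it is designed to avoid) shows how a latent common cause produces a strong spurious dependence between two variables with no direct edge:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Unmeasured confounder H drives both X and Y; there is NO direct X -> Y edge.
H = rng.normal(size=n)
X = 0.9 * H + rng.normal(size=n)
Y = 0.9 * H + rng.normal(size=n)

# A naive regression of Y on X — what a method assuming causal sufficiency
# effectively trusts — finds a strong spurious "effect".
beta_naive = np.cov(X, Y)[0, 1] / np.var(X)

# Adjusting for H (possible only if H were measured) removes it.
resX = X - np.cov(X, H)[0, 1] / np.var(H) * H
resY = Y - np.cov(Y, H)[0, 1] / np.var(H) * H
beta_adjusted = np.cov(resX, resY)[0, 1] / np.var(resX)

print(f"naive slope:    {beta_naive:.3f}")
print(f"adjusted slope: {beta_adjusted:.3f}")
```

Score-based searches like GES that assume causal sufficiency would orient an edge between X and Y here; FCI-family algorithms (and hence GFCI) instead allow the output to mark the dependence as possibly confounded.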
Stat 260/CS 294: Bayesian Modeling and Inference
- Priors (conjugate, noninformative, reference)
- Hierarchical models, spatial models, longitudinal models, dynamic models, survival models
- Testing
- Model choice
- Inference (importance sampling, MCMC, sequential Monte Carlo)
- Nonparametric models (Dirichlet processes, Gaussian processes, neutral-to-the-right processes, completely random measures)
- Decision theory and frequentist perspectives (complete class theorems, consistency, empirical Bayes)
- Experimental design
unit  course  berkeley  expert  michael-jordan  machine-learning  acm  bayesian  probability  stats  lecture-notes  priors-posteriors  markov  monte-carlo  frequentist  latent-variables  decision-theory  expert-experience  confidence  sampling 
july 2017 by nhaliday
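The "priors (conjugate, ...)" topic in the syllabus above has a classic two-line instance: with a Beta prior on a Bernoulli/Binomial success probability, the posterior is again Beta, and updating is just count addition. A minimal sketch (the prior and data values are made up for illustration):

```python
from fractions import Fraction

# Conjugate updating: Beta(a, b) prior on a coin's heads probability,
# Binomial likelihood -> Beta(a + heads, b + tails) posterior.
def beta_binomial_update(a, b, heads, tails):
    """Return posterior (a', b') for a Beta(a, b) prior after the data."""
    return a + heads, b + tails

a, b = 2, 2                      # weakly informative prior
a_post, b_post = beta_binomial_update(a, b, heads=7, tails=3)
post_mean = Fraction(a_post, a_post + b_post)
print(f"posterior: Beta({a_post}, {b_post}), mean = {post_mean}")
```

The same add-the-sufficient-statistics pattern underlies the other conjugate families the course covers, and its breakdown for richer models is what motivates the MCMC and sequential Monte Carlo units.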
Unsupervised learning, one notion or many? – Off the convex path
(Task A) Learning a distribution from samples. (Examples: gaussian mixtures, topic models, variational autoencoders,..)

(Task B) Understanding latent structure in the data. This is not the same as (a); for example principal component analysis, clustering, manifold learning etc. identify latent structure but don’t learn a distribution per se.

(Task C) Feature Learning. Learn a mapping from datapoint → feature vector such that classification tasks are easier to carry out on feature vectors rather than datapoints. For example, unsupervised feature learning could help lower the amount of labeled samples needed for learning a classifier, or be useful for domain adaptation.

Task B is often a subcase of Task C, as the intended users of "structure found in data" are humans (scientists) who pore over the representation of data to gain some intuition about its properties, and these "properties" can often be phrased as a classification task.

This post explains the relationship between Tasks A and C, and why they get mixed up in students' minds. We hope there is also some food for thought here for experts, namely, our discussion about the fragility of the usual "perplexity" definition of unsupervised learning. It explains why Task A doesn't in practice lead to a good enough solution for Task C. For example, it has been believed for many years that for deep learning, unsupervised pretraining should help supervised training, but this has been hard to show in practice.
acmtariat  org:bleg  nibble  machine-learning  acm  thinking  clarity  unsupervised  conceptual-vocab  concept  explanation  features  bayesian  off-convex  deep-learning  latent-variables  generative  intricacy  distribution  sampling 
june 2017 by nhaliday
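The Task A / Task C distinction drawn in the post can be made concrete on synthetic data. A minimal NumPy-only sketch (the two-cluster data and the 1-D threshold "classifier" are illustrative choices, not from the post): a fitted Gaussian is scored by log-likelihood (a Task A, perplexity-style objective), while PCA learns no density at all yet yields a feature that makes classification easy (Task C):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two-cluster data in 5 dimensions.
n = 500
labels = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 5)) + 3.0 * labels[:, None] * np.array([1, 1, 0, 0, 0])

# Task A (distribution learning): fit a single Gaussian, score mean
# log-likelihood — a perplexity-style objective.
mu, cov = X.mean(0), np.cov(X, rowvar=False)
sign, logdet = np.linalg.slogdet(cov)
d = X - mu
ll = np.mean(-0.5 * (np.einsum('ij,jk,ik->i', d, np.linalg.inv(cov), d)
                     + logdet + 5 * np.log(2 * np.pi)))

# Task C (feature learning): PCA via SVD. The top component alone
# separates the clusters — useful downstream — despite PCA learning
# no distribution per se.
U, s, Vt = np.linalg.svd(d, full_matrices=False)
z = d @ Vt[0]                                  # 1-D feature
acc = max(np.mean((z > 0) == labels), np.mean((z > 0) != labels))
print(f"mean log-likelihood (Task A score): {ll:.2f}")
print(f"1-D PCA feature threshold accuracy (Task C proxy): {acc:.2%}")
```

The single Gaussian is a poor density model of bimodal data, yet the representation objective is nearly solved — one way to see why optimizing a Task A score need not deliver, and is not needed for, Task C.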
[1705.07904] Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
We propose a new algorithm for training generative adversarial networks to jointly learn latent codes for both identities (e.g. individual humans) and observations (e.g. specific photographs). In practice, this means that by fixing the identity portion of latent codes, we can generate diverse images of the same subject, and by fixing the observation portion we can traverse the manifold of subjects while maintaining contingent aspects such as lighting and pose. Our algorithm features a pairwise training scheme in which each sample from the generator consists of two images with a common identity code. Corresponding samples from the real dataset consist of two distinct photographs of the same subject. In order to fool the discriminator, the generator must produce images that are photorealistic and distinct yet appear to depict the same person. We augment both the DCGAN and BEGAN approaches with Siamese discriminators to accommodate pairwise training. Experiments with human judges and an off-the-shelf face verification system demonstrate our algorithm's ability to generate convincing, identity-matched photographs.
papers  gan  latent-variables 
may 2017 by arsyed
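The core latent-space mechanics — split the code into an identity block and an observation block, and sample pairs sharing the identity block — can be sketched without training anything. Here the "generator" is a stand-in random linear map and the split sizes are invented, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(4)

ID_DIM, OBS_DIM = 50, 50        # illustrative split; not the paper's sizes

# Stand-in "generator": a fixed random linear map. A trained DCGAN/BEGAN
# generator would take its place in the real algorithm.
W = rng.normal(size=(ID_DIM + OBS_DIM, 64))

def generate_pair(identity_code):
    """One SD-GAN-style sample: two outputs sharing an identity code
    but with independently drawn observation codes."""
    z1 = np.concatenate([identity_code, rng.normal(size=OBS_DIM)])
    z2 = np.concatenate([identity_code, rng.normal(size=OBS_DIM)])
    return z1 @ W, z2 @ W

z_id = rng.normal(size=ID_DIM)
img_a, img_b = generate_pair(z_id)

# The shared identity block makes within-pair outputs more similar than
# outputs generated from an unrelated identity.
img_c, _ = generate_pair(rng.normal(size=ID_DIM))
corr_same = np.corrcoef(img_a, img_b)[0, 1]
corr_diff = np.corrcoef(img_a, img_c)[0, 1]
print(f"same identity corr: {corr_same:.2f}, different identity corr: {corr_diff:.2f}")
```

The Siamese discriminator's job is the converse of this check: force the generator to make within-pair similarity be identity, not mere correlation, while keeping the two images distinct.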