**cshalizi + chatterjee.sourav**

[1511.01437] The sample size required in importance sampling

october 2016 by cshalizi

"The goal of importance sampling is to estimate the expected value of a given function with respect to a probability measure ν using a random sample of size n drawn from a different probability measure μ. If the two measures μ and ν are nearly singular with respect to each other, which is often the case in practice, the sample size required for accurate estimation is large. In this article it is shown that in a fairly general setting, a sample of size approximately exp(D(ν||μ)) is necessary and sufficient for accurate estimation by importance sampling, where D(ν||μ) is the Kullback--Leibler divergence of μ from ν. In particular, the required sample size exhibits a kind of cut-off in the logarithmic scale. The theory is applied to obtain a fairly general formula for the sample size required in importance sampling for exponential families (Gibbs measures). We also show that the standard variance-based diagnostic for convergence of importance sampling is fundamentally problematic. An alternative diagnostic that provably works in certain situations is suggested."

to:NB
to_read
statistics
monte_carlo
probability
information_theory
to_teach:statcomp
chatterjee.sourav
diaconis.persi
via:ded-maxim
re:fitness_sampling

[1212.1247] Matrix estimation by Universal Singular Value Thresholding

january 2013 by cshalizi

"Consider the problem of estimating the entries of a large matrix, when the observed entries are noisy versions of a small random fraction of the original entries. This problem has received widespread attention in recent times, especially after the pioneering works of Emmanuel Candes and collaborators. Typically, it is assumed that the underlying matrix has low rank. This paper introduces a simple estimation procedure, called Universal Singular Value Thresholding (USVT), that works for any matrix that has `a little bit of structure'. In particular, the matrix need not be of low rank. The procedure is very simple and fast, works under minimal assumptions, and is applicable for very large matrices. Surprisingly, this simple estimator achieves the minimax error rate up to a constant factor. The method is applied to give simple solutions to difficult questions in low rank matrix estimation, blockmodels, distance matrix completion, latent space models, positive definite matrix completion, problems related to graph limits, and generalized Bradley-Terry models for pairwise comparison."

in_NB
estimation
statistics
spectral_clustering
graph_limits
chatterjee.sourav
to_read

[1102.2650] Estimating and Understanding Exponential Random Graph Models

february 2011 by cshalizi

Rigorous results on conditions under which mean-field approximations become exact for large ERGMs, in which case the models end up looking like Erdős–Rényi graphs (at some effective density); the results come from clever large-deviations arguments. To over-simplify some pretty theorems, the cases are: (1) all interactions between edges are positive, so introducing any positive density of edges at all tends to produce a blow-up towards a nearly complete graph (where edges are independent); or (2) all interactions are extremely weak, and consequently negligible in the large-scale limit.
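The two cases can be illustrated with a mean-field self-consistency equation for an edge–triangle ERGM, u = σ(2β₁ + 6β₂u²), where u is the effective edge density; the coefficients here assume one common normalization of the Hamiltonian, not necessarily the paper's exact variational form:

```python
import math

def sigma(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean_field_density(beta_edge, beta_tri, u=0.5, iters=200):
    """Iterate the (assumed) mean-field self-consistency equation
    u = sigma(2*beta_edge + 6*beta_tri*u**2) for an edge-triangle ERGM;
    the fixed point is the effective Erdos-Renyi edge density.
    """
    for _ in range(iters):
        u = sigma(2 * beta_edge + 6 * beta_tri * u * u)
    return u

# Case (2): negligible triangle term -> plain Erdos-Renyi density sigma(2*beta_edge)
print(mean_field_density(-1.0, 0.0))   # = sigma(-2) ~ 0.119
# Case (1): strong positive triangle coupling -> blow-up towards a complete graph
print(mean_field_density(-1.0, 3.0))   # fixed point very close to 1
```

The iteration makes the dichotomy in the note visible: with β₂ = 0 the density is just the logistic function of the edge parameter, while a large positive β₂ drags the fixed point to the nearly complete graph.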

exponential_family_random_graphs
network_data_analysis
large_deviations
mean-field_theory
statistics
stochastic_processes
re:almost_none
graph_limits
statistical_mechanics
have_read
re:smoothing_adjacency_matrices
re:your_favorite_ergm_sucks
in_NB
chatterjee.sourav
diaconis.persi

[1005.1136] Random graphs with a given degree sequence

july 2010 by cshalizi

Per Newman, Strogatz & Watts (2001; cond-mat/0007235), these graphs are very natural null models for networks, so this may be useful for getting analytic theory for tests against such null models.
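For reference, the standard stub-matching construction behind such degree-sequence null models can be sketched as follows (my toy code, not from the paper):

```python
import random

def configuration_model(degrees, seed=0):
    """Stub-matching sketch of the configuration-model null: attach
    deg(v) half-edge "stubs" to each vertex v, then pair the stubs
    uniformly at random.  Self-loops and multi-edges can occur; for a
    simple-graph null they are usually rejected or discarded.
    """
    if sum(degrees) % 2:
        raise ValueError("degree sum must be even")
    rng = random.Random(seed)
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

edges = configuration_model([3, 2, 2, 2, 1])
print(edges)  # 5 edges; vertex v appears as an endpoint exactly degrees[v] times
```

Uniform pairing makes the edge probabilities roughly proportional to degree products, which is what gives such null models their analytic tractability.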

random_graphs
network_data_analysis
via:arinaldo
re:smoothing_adjacency_matrices
graph_limits
have_read
diaconis.persi
chatterjee.sourav
