[1901.01624] Composite optimization for robust blind deconvolution
The blind deconvolution problem seeks to recover a pair of vectors from a set of rank one bilinear measurements. We consider a natural nonsmooth formulation of the problem and show that under standard statistical assumptions, its moduli of weak convexity, sharpness, and Lipschitz continuity are all dimension independent. This phenomenon persists even when up to half of the measurements are corrupted by noise. Consequently, standard algorithms, such as the subgradient and prox-linear methods, converge at a rapid dimension-independent rate when initialized within constant relative error of the solution. We then complete the paper with a new initialization strategy, complementing the local search algorithms. The initialization procedure is both provably efficient and robust to outlying measurements. Numerical experiments, on both simulated and real data, illustrate the developed theory and methods.
approximation  inverse-problems  statistics  inference  algorithms  numerical-methods 
16 days ago by Vaguery
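The abstract above does not spell out its nonsmooth formulation. As a minimal sketch, assume it is the mean absolute residual of the bilinear measurements, f(x, y) = (1/m) Σ |⟨l_i, x⟩⟨r_i, y⟩ − b_i|, and that the data are noiseless so the optimal value is 0 and a Polyak step size applies. The function names and the step rule are illustrative assumptions, not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def loss(x, y, L, R, b):
    """Mean absolute residual of the rank-one bilinear measurements."""
    return sum(abs(dot(l, x) * dot(r, y) - bi) for l, r, bi in zip(L, R, b)) / len(b)

def subgradient(x, y, L, R, b):
    """One subgradient of the loss with respect to (x, y)."""
    m = len(b)
    gx = [0.0] * len(x)
    gy = [0.0] * len(y)
    for l, r, bi in zip(L, R, b):
        lx, ry = dot(l, x), dot(r, y)
        res = lx * ry - bi
        s = (res > 0) - (res < 0)  # sign of the residual; 0 picks a valid subgradient
        for j in range(len(x)):
            gx[j] += s * ry * l[j] / m
        for j in range(len(y)):
            gy[j] += s * lx * r[j] / m
    return gx, gy

def polyak_subgradient(x, y, L, R, b, iters=200):
    """Subgradient method with Polyak steps, assuming the minimum value is 0."""
    for _ in range(iters):
        f = loss(x, y, L, R, b)
        gx, gy = subgradient(x, y, L, R, b)
        g2 = dot(gx, gx) + dot(gy, gy)
        if f == 0.0 or g2 == 0.0:
            break
        step = f / g2
        x = [xi - step * gi for xi, gi in zip(x, gx)]
        y = [yi - step * gi for yi, gi in zip(y, gy)]
    return x, y
```

Here `L` and `R` are lists of measurement vectors and `b` the observed products. The abstract's guarantee is local, so a sketch like this would only be expected to behave well when started close to the solution.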
[1712.07381] Extreme Value Analysis Without the Largest Values: What Can Be Done?
In this paper we are concerned with the analysis of heavy-tailed data when a portion of the extreme values is unavailable. This research was motivated by an analysis of the degree distributions in a large social network. The degree distributions of such networks tend to have power law behavior in the tails. We focus on the Hill estimator, which plays a starring role in heavy-tailed modeling. The Hill estimator for this data exhibited a smooth and increasing "sample path" as a function of the number of upper order statistics used in constructing the estimator. This behavior became more apparent as we artificially removed more of the upper order statistics. Building on this observation we introduce a new version of the Hill estimator. It is a function of the number of the upper order statistics used in the estimation, but also depends on the number of unavailable extreme values. We establish functional convergence of the normalized Hill estimator to a Gaussian process. An estimation procedure is developed based on the limit theory to estimate the number of missing extremes and extreme value parameters including the tail index and the bias of Hill's estimator. We illustrate how this approach works in both simulations and real data examples.
statistics  extreme-values  rather-interesting  algorithms  estimation  inference  data-analysis  nudge-targets  consider:looking-to-see  consider:performance-measures 
16 days ago by Vaguery
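For reference, the classical Hill estimator that the abstract above builds on can be sketched as follows. This is the standard textbook form, not the paper's modified estimator that accounts for missing extremes. The helper `drop_top` is a hypothetical name for imitating the "artificially removed" upper order statistics.

```python
import math

def hill_estimator(data, k):
    """Classical Hill estimate from the k largest values of positive data."""
    xs = sorted(data, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(data)")
    threshold = xs[k]  # the (k+1)-th largest observation
    return sum(math.log(x / threshold) for x in xs[:k]) / k

def hill_path(data, ks):
    """Hill estimates as a function of the number of upper order statistics."""
    return [hill_estimator(data, k) for k in ks]

def drop_top(data, r):
    """Remove the r largest observations (hypothetical helper)."""
    return sorted(data, reverse=True)[r:]
```

For Pareto-type tails with index alpha, the Hill estimate targets 1/alpha. Comparing `hill_path(data, ks)` with `hill_path(drop_top(data, r), ks)` is one way to explore how the sample path changes when the largest values are unavailable.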
Microsoft/onnxruntime: ONNX Runtime: cross-platform, high performance scoring engine for ML models
github  inference  deep_learning 
5 weeks ago by lxp121
The Friendship Paradox and Systematic Biases in Perceptions and Social Norms | Journal of Political Economy: Vol 127, No 2
"The “friendship paradox” (first noted by Feld in 1991) refers to the fact that, on average, people have strictly fewer friends than their friends have. I show that this oversampling of more popular people can lead people to perceive more engagement than exists in the overall population. This feeds back to amplify engagement in behaviors that involve complementarities. Also, people with the greatest proclivity for a behavior choose to interact the most, leading to further feedback and amplification. These results are consistent with studies finding overestimation of peer consumption of alcohol, cigarettes, and drugs and with resulting high levels of drug and alcohol consumption."

So this is about misinference from networks then?
matthew.jackson  networks  inference 
5 weeks ago by MarcK
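The oversampling effect in the quoted abstract can be checked on a small graph. A minimal sketch, assuming an undirected friendship network stored as an adjacency dictionary; the function names are illustrative.

```python
def mean_degree(adj):
    """Average number of friends per person."""
    return sum(len(friends) for friends in adj.values()) / len(adj)

def mean_friend_degree(adj):
    """Average number of friends that a friend has, over all (person, friend) pairs."""
    total = sum(len(adj[f]) for friends in adj.values() for f in friends)
    pairs = sum(len(friends) for friends in adj.values())
    return total / pairs

# A star network: one popular hub with three friends who know only the hub.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
```

On `star`, `mean_degree` is 1.5 while `mean_friend_degree` is 2.0, because the hub is counted once for each of its friends. In a network where everyone has the same number of friends the two quantities coincide.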
Introduction to Gravity Models of Migration & Trade | Programming Historian
> A gravity model's goal is to tell the user: given a number of influencing forces (distance, cost of living) affecting migration or movement of a large number of entities of the same type (people, coffee beans, widgets) between a set number of points (39 counties and London or Colombia and various countries), the model can suggest the most probable distribution of those people, coffee beans, or widgets. It operates on the principle that if you know the volume of movement, and you know the factors influencing it, you can predict with reasonable accuracy the outcome of even complex movement within a confined system.
history  modeling  statistical-modeling  inference  migration 
6 weeks ago by tarakc02
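The idea in the quote above, distributing a known volume of movement according to the forces acting on it, can be sketched with the simplest textbook gravity form: flow proportional to the product of the two masses divided by a power of the distance. This form, the exponent `beta`, and all names below are assumptions for illustration; the linked lesson's own model may use different variables and a different fitting method.

```python
def gravity_score(mass_origin, mass_dest, distance, beta=2.0):
    """Unscaled attraction between an origin and a destination."""
    return mass_origin * mass_dest / distance ** beta

def predict_distribution(total, origins, mass_dest=1.0, beta=2.0):
    """Split a known total volume across origins in proportion to gravity scores.

    origins maps a name to a (mass, distance_to_destination) pair.
    """
    scores = {
        name: gravity_score(mass, mass_dest, dist, beta)
        for name, (mass, dist) in origins.items()
    }
    norm = sum(scores.values())
    return {name: total * s / norm for name, s in scores.items()}
```

For example, `predict_distribution(1000, {"A": (100, 10), "B": (100, 20)})` assigns four times as many movers to the nearer origin `A` as to `B`, since doubling the distance divides the score by four when `beta` is 2.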
problog / DeepProbLog — Bitbucket
DeepProbLog is an extension of ProbLog that integrates Probabilistic Logic Programming with Deep Learning.
logic  probabilistic  programming  prolog  knowledgegraph  inference  deeplearning 
6 weeks ago by tobym
Introduction. — ProbLog: Probabilistic Programming
Probabilistic logic programs are logic programs in which some of the facts are annotated with probabilities.

ProbLog is a tool that allows you to intuitively build programs that encode not only complex interactions between large sets of heterogeneous components but also the inherent uncertainties that are present in real-life situations.
logic  probabilistic  programming  prolog  knowledgegraph  inference 
6 weeks ago by tobym
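To make "facts annotated with probabilities" concrete, here is a minimal sketch in plain Python that computes the probability of a query by enumerating every possible world. It does not use the ProbLog library; the fact names, the probabilities, and the `alarm` rule are made up for illustration, and the facts are assumed to be independent.

```python
from itertools import product

def query_probability(prob_facts, query_holds):
    """Sum the weights of all truth assignments in which the query holds."""
    names = list(prob_facts)
    total = 0.0
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        weight = 1.0
        for name in names:
            p = prob_facts[name]
            weight *= p if world[name] else 1.0 - p
        if query_holds(world):
            total += weight
    return total

# Roughly what a ProbLog program would write as
#   0.7::burglary.  0.2::earthquake.
#   alarm :- burglary.  alarm :- earthquake.
facts = {"burglary": 0.7, "earthquake": 0.2}

def alarm(world):
    return world["burglary"] or world["earthquake"]
```

Calling `query_probability(facts, alarm)` gives 1 − 0.3 × 0.8 = 0.76 up to floating-point rounding. Enumeration is exponential in the number of facts, so this only illustrates the semantics, not how a real inference engine would work.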
Common statistical tests are linear models (or: how to teach stats)
Most of the common statistical models (t-test, correlation, ANOVA, chi-square, etc.) are special cases of linear models or a very close approximation. This beautiful simplicity means that there is less to learn. In particular, it all comes down to y=a⋅x+b, which most students know from high school. Unfortunately, stats intro courses are usually taught as if each test is an independent tool, needlessly making life more complicated for students and teachers alike.
statistics  linear-models  inference  statistical-tests 
6 weeks ago by tarakc02
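One instance of the claim above can be checked directly: the equal-variance two-sample t-test gives the same t statistic as the slope of y = a⋅x + b when x is a 0/1 group indicator. A minimal sketch in plain Python; the function names are illustrative.

```python
import math

def pooled_t_statistic(g0, g1):
    """Classical two-sample t statistic with pooled variance."""
    n0, n1 = len(g0), len(g1)
    m0, m1 = sum(g0) / n0, sum(g1) / n1
    ss = sum((v - m0) ** 2 for v in g0) + sum((v - m1) ** 2 for v in g1)
    sp2 = ss / (n0 + n1 - 2)  # pooled variance estimate
    return (m1 - m0) / math.sqrt(sp2 * (1 / n0 + 1 / n1))

def slope_t_statistic(x, y):
    """t statistic for the slope a in the least-squares fit y = a*x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = my - a * mx
    rss = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return a / se

g0, g1 = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
x = [0.0] * len(g0) + [1.0] * len(g1)  # dummy-coded group membership
y = g0 + g1
```

With this coding the intercept b is the mean of the first group and the slope a is the difference in means, so `pooled_t_statistic(g0, g1)` and `slope_t_statistic(x, y)` agree up to rounding.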
andrewheiss/diff-means-half-dozen-ways: Run standard t-tests, simulations, and Bayesian difference in means tests with R and Stan
statistics  Data_Science  inference  github  CDS_101  Pedagogical_Resources 
6 weeks ago by jkglasbrenner

