**heavy_tails** 144

A Non‐Gaussian Spatio‐Temporal Model for Daily Wind Speeds Based on a Multi‐Variate Skew‐t Distribution - Tagle - - Journal of Time Series Analysis - Wiley Online Library

10 weeks ago by cshalizi

"Facing increasing domestic energy consumption from population growth and industrialization, Saudi Arabia is aiming to reduce its reliance on fossil fuels and to broaden its energy mix by expanding investment in renewable energy sources, including wind energy. A preliminary task in the development of wind energy infrastructure is the assessment of wind energy potential, a key aspect of which is the characterization of its spatio‐temporal behavior. In this study we examine the impact of internal climate variability on seasonal wind power density fluctuations over Saudi Arabia using 30 simulations from the Large Ensemble Project (LENS) developed at the National Center for Atmospheric Research. Furthermore, a spatio‐temporal model for daily wind speed is proposed with neighbor‐based cross‐temporal dependence, and a multi‐variate skew‐t distribution to capture the spatial patterns of higher‐order moments. The model can be used to generate synthetic time series over the entire spatial domain that adequately reproduce the internal variability of the LENS dataset."

to:NB
spatio-temporal_statistics
heavy_tails
meteorology
statistics
to_teach:data_over_space_and_time

Skewed Wealth Distributions: Theory and Empirics

december 2018 by cshalizi

"Invariably across a cross-section of countries and time periods, wealth distributions are skewed to the right displaying thick upper tails, that is, large and slowly declining top wealth shares. In this survey we categorize the theoretical studies on the distribution of wealth in terms of the underlying economic mechanisms generating skewness and thick tails. Further, we show how these mechanisms can be micro-founded by the consumption-saving decisions of rational agents in specific economic and demographic environments. Finally we map the large empirical work on the wealth distribution to its theoretical underpinnings."

to:NB
heavy_tails
inequality
economics

Deadly Quarrels by David Wilkinson - Paperback - University of California Press

october 2018 by cshalizi

"Lewis Fry Richardson was one of the first to develop the systematic study of the causes of war; yet his great war data archive, Statistics of Deadly Quarrels, posthumously published, has yet to be fully systematized and assimilated by war-causation scholars. David Wilkinson has reanalyzed Richardson's data and drawn together the results of kindred quantitative work on the causes of war, from others as well as from Richardson. He has translated this classic of international relations literature into contemporary idiom, fully and accurately presenting the substance of Richardson's idea and at the same time bringing it up to date with judicious comment, updating the references to the critical and successor literature, and dealing in some detail with Richardson himself. Professor Wilkinson lists among the findings: 1. the death toll of war is largely the product of a very few immense wars; 2. most wars do not escalate out of control, they are very likely to be small, brief, and exclusive; 3. great powers have done most of the world's fighting, inflicting and suffering most of the casualties; 4. the propensity of any two groups to fight increases as the ethnocultural differences between them increase. Contemporary peace strategy would therefore seem to be to avoid World War III by promoting superpower detente, and reanimating, accelerating, and civilizing the process of world economic development.

"This title is part of UC Press's Voices Revived program, which commemorates University of California Press’s mission to seek out and cultivate the brightest minds and give them voice, reach, and impact. Drawing on a backlist dating to 1893, Voices Revived makes high-quality, peer-reviewed scholarship accessible once again using print-on-demand technology. This title was originally published in 1980."

to:NB
violence
war
heavy_tails
lives_of_the_scholars
books:noted

How much has wealth concentration grown in the United States? A re-examination of data from 2001-2013

may 2018 by cshalizi

"Well known research based on capitalized income tax data shows robust growth in wealth concentration in the late 2000s. We show that these robust growth estimates rely on an assumption—homogeneous rates of return across the wealth distribution—that is not supported by data. When the capitalization model incorporates heterogeneous rates of return (on just interest-bearing assets), wealth concentration estimates in 2011 fall from 40.5% to 33.9%. These estimates are consistent in levels and trend with other micro wealth data and show that wealth concentration increases until the Great Recession, then declines before increasing again."

to:NB
economics
inequality
heavy_tails
class_struggles_in_america

Robust Regression on Stationary Time Series: A Self‐Normalized Resampling Approach - Akashi - 2018 - Journal of Time Series Analysis - Wiley Online Library

may 2018 by cshalizi

"This article extends the self‐normalized subsampling method of Bai et al. (2016) to the M‐estimation of linear regression models, where the covariate and the noise are stationary time series which may have long‐range dependence or heavy tails. The method yields an asymptotic confidence region for the unknown coefficients of the linear regression. The determination of these regions does not involve unknown parameters such as the intensity of the dependence or the heaviness of the distributional tail of the time series. Additional simulations can be found in a supplement. The computer codes are available from the authors."

to:NB
time_series
statistics
linear_regression
heavy_tails
long-range_dependence

The power of absolute discounting: all-dimensional distribution estimation

november 2017 by cshalizi

"Categorical models are the natural fit for many problems. When learning the distribution of categories from samples, high-dimensionality may dilute the data. Minimax optimality is too pessimistic to remedy this issue. A serendipitously discovered estimator, absolute discounting, corrects empirical frequencies by subtracting a constant from observed categories, which it then redistributes among the unobserved. It outperforms classical estimators empirically, and has been used extensively in natural language modeling. In this paper, we rigorously explain the prowess of this estimator using less pessimistic notions. We show (1) that absolute discounting recovers classical minimax KL-risk rates, (2) that it is adaptive to an effective dimension rather than the true dimension, (3) that it is strongly related to the Good-Turing estimator and inherits its competitive properties. We use power-law distributions as the cornerstone of these results. We validate the theory via synthetic data and an application to the Global Terrorism Database."

to:NB
to_read
density_estimation
statistics
heavy_tails
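
A minimal sketch of absolute discounting as the abstract above describes it: subtract a constant from each observed category's count and spread the freed mass over the unobserved categories. The function name, the default discount of 0.5, the uniform redistribution, and the fallback when every category has been seen are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def absolute_discounting(samples, categories, discount=0.5):
    """Estimate category probabilities by absolute discounting.

    Hypothetical helper: assumes every sample belongs to `categories`
    and that freed mass is shared equally among unseen categories.
    """
    if not 0.0 < discount < 1.0:
        raise ValueError("discount must lie strictly between 0 and 1")
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    seen = [c for c in categories if counts[c] > 0]
    unseen = [c for c in categories if counts[c] == 0]
    if not unseen:
        # Nothing to redistribute to: fall back to empirical frequencies
        # (an assumption of this sketch).
        return {c: counts[c] / n for c in categories}
    # Each seen category gives up `discount` counts.
    estimate = {c: (counts[c] - discount) / n for c in seen}
    freed_mass = discount * len(seen) / n
    for c in unseen:
        estimate[c] = freed_mass / len(unseen)
    return estimate


probs = absolute_discounting(list("aaab"), categories="abcd")
```

With four samples and two unseen categories, "a" keeps (3 - 0.5)/4 of the mass and the two unseen categories split the 0.25 that was freed.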

Phys. Rev. Lett. 117, 230601 (2016) - Interevent Correlations from Avalanches Hiding Below the Detection Threshold

december 2016 by cshalizi

"Numerous systems ranging from deformation of materials to earthquakes exhibit bursty dynamics, which consist of a sequence of events with a broad event size distribution. Very often these events are observed to be temporally correlated or clustered, evidenced by power-law-distributed waiting times separating two consecutive activity bursts. We show how such interevent correlations arise simply because of a finite detection threshold, created by the limited sensitivity of the measurement apparatus, or used to subtract background activity or noise from the activity signal. Data from crack-propagation experiments and numerical simulations of a nonequilibrium crack-line model demonstrate how thresholding leads to correlated bursts of activity by separating the avalanche events into subavalanches. The resulting temporal subavalanche correlations are well described by our general scaling description of thresholding-induced correlations in crackling noise."

in_NB
heavy_tails
point_processes
time_series
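
The thresholding effect in the abstract above can be illustrated with a toy signal: raising the detection threshold cuts one burst into several sub-events separated by waiting times. This is only a sketch; `split_into_events` and the example signal are hypothetical and have nothing to do with the paper's crack-line model.

```python
def split_into_events(signal, threshold):
    """Return (start, end) index pairs of maximal runs above `threshold`.

    `end` is exclusive. Hypothetical helper for illustration only.
    """
    events = []
    start = None
    for t, value in enumerate(signal):
        if value > threshold and start is None:
            start = t                      # an event begins
        elif value <= threshold and start is not None:
            events.append((start, t))      # the event ends
            start = None
    if start is not None:                  # signal ended mid-event
        events.append((start, len(signal)))
    return events


signal = [0, 3, 5, 1, 4, 6, 0]
one_avalanche = split_into_events(signal, threshold=0)
sub_avalanches = split_into_events(signal, threshold=2)
```

At threshold 0 the signal is a single event covering indices 1 to 5; at threshold 2 the dip to 1 splits it into two sub-events, which then look like separate but closely spaced bursts.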

AEAweb: JEP (30,1) p. 185 - Power Laws in Economics: An Introduction

february 2016 by cshalizi

"Many of the insights of economics seem to be qualitative, with many fewer reliable quantitative laws. However a series of power laws in economics do count as true and nontrivial quantitative laws—and they are not only established empirically, but also understood theoretically. I will start by providing several illustrations of empirical power laws having to do with patterns involving cities, firms, and the stock market. I summarize some of the theoretical explanations that have been proposed. I suggest that power laws help us explain many economic phenomena, including aggregate economic fluctuations. I hope to clarify why power laws are so special, and to demonstrate their utility. In conclusion, I list some power-law-related economic enigmas that demand further exploration."

to:NB
heavy_tails
economics
to_be_shot_after_a_fair_trial
gabaix.xaiver

[1504.04580] Robust estimation of U-statistics

december 2015 by cshalizi

"An important part of the legacy of Evarist Giné is his fundamental contributions to our understanding of U-statistics and U-processes. In this paper we discuss the estimation of the mean of multivariate functions in case of possibly heavy-tailed distributions. In such situations, reliable estimates of the mean cannot be obtained by usual U-statistics. We introduce a new estimator, based on the so-called median-of-means technique. We develop performance bounds for this new estimator that generalizes an estimate of Arcones and Giné (1993), showing that the new estimator performs, under minimal moment conditions, as well as classical U-statistics for bounded random variables. We discuss an application of this estimator to clustering."

in_NB
heavy_tails
statistics
estimation
deviation_inequalities
re:smoothing_adjacency_matrices
u-statistics
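
For orientation, here is a sketch of the basic median-of-means technique for an ordinary mean, which is the building block the abstract above refers to; it is not the paper's U-statistic estimator. The function name, the random shuffling, and the choice to drop leftover points are assumptions of this sketch.

```python
import random
import statistics


def median_of_means(values, n_blocks, seed=0):
    """Median of block means: a mean estimate that tolerates heavy tails.

    Hypothetical helper: shuffles the data, splits it into `n_blocks`
    equal-sized blocks (dropping any remainder), and returns the median
    of the block averages.
    """
    values = list(values)
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    random.Random(seed).shuffle(values)
    size = len(values) // n_blocks
    block_means = [
        statistics.fmean(values[i * size:(i + 1) * size])
        for i in range(n_blocks)
    ]
    return statistics.median(block_means)


data = [1.0] * 99 + [1e9]          # one enormous outlier
robust = median_of_means(data, n_blocks=10)
naive = statistics.fmean(data)
```

The single outlier can contaminate only one of the ten blocks, so the median of the block means stays at 1.0 while the plain average is dragged up to roughly ten million.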

Anticipating Rare Events: Can Acts of Terror, Use of Weapons of Mass Destruction or Other High Profile Acts Be Anticipated? A Scientific Perspective on Problems, Pitfalls and Prospective Solutions

october 2015 by cshalizi

"This white paper covers topics related to the field of anticipating/forecasting specific categories of 'rare events' such as acts of terror, use of a weapon of mass destruction, or other high profile attacks. It is primarily meant for the operational community in DoD, DHS, and other USG agencies. […] The body of work before you should be viewed as the commencement of a journey with a somewhat murky destination, an exploration of terra incognita. Indeed the challenge addressed in this white paper, that of anticipating 'rare events' is daunting and represents a gathering threat to national security. The threat is supercharged by the increasing lateral connectedness of global societies enabled by the internet, cell phones and other technologies. This 'connected collective' as Carl Hunt has termed it, has allowed violent ideologies to metastasize globally often with no hierarchical, command-directed rules to govern their expansion. It is the emergent franchising of violence whose metaphorical 'genome' is exposed to constant co-evolutionary pressures and non-linearity that results in continuous adaptation and increasing resiliency making the task of effectively anticipating their courses of action all the more difficult. So what distinguishes a rare event in the context of national security? The easy response is to describe them as unlikely actions of high consequence and for which there is a sparse historical record from which to develop predictive patterns or indications."

--- I wonder how much of a period piece this now appears.

to:NB
heavy_tails
terrorism_fears
terrorism
prediction
to_be_shot_after_a_fair_trial

[1507.03293] Tail Analysis without Tail Information : A Worst-case Perspective

august 2015 by cshalizi

"Tail modeling refers to the task of selecting the best probability distributions that describe the occurrences of extreme events. One common bottleneck in this task is that, due to their very nature, tail data are often very limited. The conventional approach uses parametric fitting, but the validity of the choice of a parametric model is usually hard to verify. This paper describes a reasonable alternative that does not require any parametric assumption. The proposed approach is based on a worst-case analysis under the geometric premise of tail convexity, a feature shared by all known parametric tail distributions. We demonstrate that the worst-case convex tail behavior is either extremely light-tailed or extremely heavy-tailed. We also construct low-dimensional nonlinear programs that can both distinguish between the two cases and find the worst-case tail. Numerical results show that the proposed approach gives a competitive performance versus using conventional parametric methods."

to:NB
statistics
heavy_tails

[1503.05077] Tail index estimation, concentration and adaptivity

may 2015 by cshalizi

"This paper presents an adaptive version of the Hill estimator based on Lepski's model selection method. This simple data-driven index selection method is shown to satisfy an oracle inequality and is checked to achieve the lower bound recently derived by Carpentier and Kim. In order to establish the oracle inequality, we derive non-asymptotic variance bounds and concentration inequalities for Hill estimators. These concentration inequalities are derived from Talagrand's concentration inequality for smooth functions of independent exponentially distributed random variables combined with three tools of Extreme Value Theory: the quantile transform, Karamata's representation of slowly varying functions, and Rényi's characterisation of the order statistics of exponential samples. The performance of this computationally and conceptually simple method is illustrated using Monte-Carlo simulations."

to:NB
heavy_tails
statistics
to_read
concentration_of_measure
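
As background for the abstract above, this is a sketch of the plain Hill estimator with a fixed number k of upper order statistics. It does not implement the paper's adaptive, Lepski-style choice of k; the function name and the example parameters (shape 2, 20000 draws, k = 2000) are assumptions chosen only for illustration.

```python
import math
import random


def hill_estimator(values, k):
    """Hill estimate of the tail index gamma = 1/alpha.

    Averages log(X_(i) / X_(k+1)) over the k largest observations,
    where X_(1) >= X_(2) >= ... are the order statistics.
    """
    ordered = sorted(values, reverse=True)
    if not 1 <= k < len(ordered):
        raise ValueError("k must satisfy 1 <= k < number of observations")
    threshold = ordered[k]                 # the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


rng = random.Random(1)
pareto_sample = [rng.paretovariate(2.0) for _ in range(20000)]
gamma_hat = hill_estimator(pareto_sample, k=2000)   # true value is 1/2
```

For an exact Pareto sample any k works reasonably well; the difficulty the paper addresses is choosing k from the data when only the far tail is Pareto-like.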

[1505.01547] Understanding the Heavy Tailed Dynamics in Human Behavior

may 2015 by cshalizi

"The recent availability of electronic datasets containing large volumes of communication data has made it possible to study human behavior on a larger scale than ever before. From this, it has been discovered that across a diverse range of data sets, the inter-event times between consecutive communication events obey heavy tailed power law dynamics. Explaining this has proved controversial, and two distinct hypotheses have emerged. The first holds that these power laws are fundamental, and arise from the mechanisms such as priority queuing that humans use to schedule tasks. The second holds that they are a statistical artifact which only occur in aggregated data when features such as circadian rhythms and burstiness are ignored. We use a large social media data set to test these hypotheses, and find that although models that incorporate circadian rhythms and burstiness do explain part of the observed heavy tails, there is residual unexplained heavy tail behavior which suggests a more fundamental cause. Based on this, we develop a new quantitative model of human behavior which improves on existing approaches, and gives insight into the mechanisms underlying human interactions."

in_NB
to_read
heavy_tails
time_series
point_processes
statistics

What Do Data on Millions of U.S. Workers Reveal about Life-Cycle Earnings Risk?

february 2015 by cshalizi

"We study the evolution of individual labor earnings over the life cycle, using a large panel data set of earnings histories drawn from U.S. administrative records. Using fully nonparametric methods, our analysis reaches two broad conclusions. First, earnings shocks display substantial deviations from lognormality—the standard assumption in the literature on incomplete markets. In particular, earnings shocks display strong negative skewness and extremely high kurtosis—as high as 30 compared with 3 for a Gaussian distribution. The high kurtosis implies that, in a given year, most individuals experience very small earnings shocks, and a small but non-negligible number experience very large shocks. Second, these statistical properties vary significantly both over the life cycle and with the earnings level of individuals. We also estimate impulse response functions of earnings shocks and find important asymmetries: Positive shocks to high-income individuals are quite transitory, whereas negative shocks are very persistent; the opposite is true for low-income individuals. Finally, we use these rich sets of moments to estimate econometric processes with increasing generality to capture these salient features of earnings dynamics."

--- Last tag conditional on what exactly is in the "data appendix" at https://fguvenendotcom.files.wordpress.com/2014/04/moments_for_publication.xls

to:NB
to_read
economics
inequality
heavy_tails
to_teach:undergrad-ADA
statistics
great_risk_shift
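
To make the kurtosis claim in the abstract above concrete, here is a small sketch computing the (non-excess) sample kurtosis, the scale on which a Gaussian scores 3. The function and the toy "shocks" data are hypothetical and are not drawn from the paper's administrative records.

```python
def kurtosis(values):
    """Sample kurtosis m4 / m2**2 (not excess kurtosis)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for constant data")
    return m4 / m2 ** 2


# Toy data: 98 people with no shock at all, two with large opposite shocks.
shocks = [0.0] * 98 + [-5.0, 5.0]
toy_kurtosis = kurtosis(shocks)   # 50.0
```

Most of the mass sits at zero and a tiny fraction sits far out, which is the pattern the abstract describes: many very small shocks, a few very large ones, and a kurtosis far above 3.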

[1405.0058] Underestimating extreme events in power-law behavior due to machine-dependent cutoffs

january 2015 by cshalizi

"Power-law distributions are typical macroscopic features occurring in almost all complex systems observable in nature. As a result, researchers in quantitative analyses must often generate random synthetic variates obeying power-law distributions. The task is usually performed through standard methods that map uniform random variates into the desired probability space. Whereas all these algorithms are theoretically solid, in this paper we show that they are subject to severe machine-dependent limitations. As a result, two dramatic consequences arise: (i) the sampling in the tail of the distribution is not random but deterministic; (ii) the moments of the sample distribution, which are theoretically expected to diverge as functions of the sample sizes, converge instead to finite values. We provide quantitative indications for the range of distribution parameters that can be safely handled by standard libraries used in computational analyses. Whereas our findings indicate possible reinterpretations of numerical results obtained through flawed sampling methodologies, they also pave the way for the search for a concrete solution to this central issue shared by all quantitative sciences dealing with complexity."

to:NB
to_read
heavy_tails
approximation
computational_statistics
have_skimmed
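
One way a machine-dependent cutoff of the kind described in the abstract above can arise, sketched under two assumptions: that variates are generated by inverse-transform sampling, and that the uniform generator returns multiples of 2^-53 (as CPython documents for `random.random()`). The function names are hypothetical and this is an illustration, not the paper's analysis.

```python
import random


def sample_power_law(alpha, xmin=1.0, rng=random):
    """Draw from p(x) proportional to x**(-alpha) on [xmin, inf), alpha > 1."""
    u = rng.random()                       # uniform on [0, 1)
    return xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0))


def largest_reachable_value(alpha, xmin=1.0, bits=53):
    """Largest variate the sampler can emit when 1 - u >= 2**(-bits)."""
    return xmin * 2.0 ** (bits / (alpha - 1.0))


rng = random.Random(0)
draws = [sample_power_law(2.0, rng=rng) for _ in range(10000)]
cap = largest_reachable_value(2.0)         # 2**53 for alpha = 2
```

Because 1 - u can never be smaller than 2^-53, no draw can exceed the cap, however many samples are taken; for exponents close to 1 the cap moves far out, while for large exponents it sits surprisingly close to xmin.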

[1410.3192] Learning without Concentration for General Loss Functions

january 2015 by cshalizi

"We study prediction and estimation problems using empirical risk minimization, relative to a general convex loss function. We obtain sharp error rates even when concentration is false or is very restricted, for example, in heavy-tailed scenarios. Our results show that the error rate depends on two parameters: one captures the intrinsic complexity of the class, and essentially leads to the error rate in a noise-free (or realizable) problem; the other measures interactions between class members, the target and the loss, and is dominant when the problem is far from realizable. We also explain how one may deal with outliers by choosing the loss in a way that is calibrated to the intrinsic complexity of the class and to the noise-level of the problem (the latter is measured by the distance between the target and the class)."

to:NB
learning_theory
heavy_tails
statistics
to_read
re:your_favorite_dsge_sucks

▶ Darius Kazemi, Tiny Subversions - How I Won the Lottery

october 2014 by cshalizi

Completely, utterly, and totally correct.

funny:pointed
funny:because_its_true
advice
heavy_tails
market_failures_in_everything
blogged
via:pinboard

[1408.1554] A complete data framework for fitting power law distributions

august 2014 by cshalizi

"Over the last few decades power law distributions have been suggested as forming generative mechanisms in a variety of disparate fields, such as astrophysics, criminology and database curation. However, fitting these heavy tailed distributions requires care, especially since the power law behaviour may only be present in the distributional tail. Current state of the art methods for fitting these models rely on estimating the cut-off parameter xmin. This results in the majority of collected data being discarded. This paper provides an alternative, principled approach for fitting heavy tailed distributions. By directly modelling the deviation from the power law distribution, we can fit and compare a variety of competing models in a single unified framework."

to:NB
heavy_tails
statistics
estimation
to_read
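
A sketch of the cut-off approach that the abstract above criticises, not of the paper's own method: given a chosen xmin, fit the exponent of a continuous power law by maximum likelihood on the tail and report how much data was thrown away. The function name and the toy data are hypothetical, and xmin is taken as given rather than estimated.

```python
import math


def fit_tail_exponent(values, xmin):
    """Continuous power-law MLE above `xmin`, plus the fraction discarded.

    alpha_hat = 1 + n_tail / sum(log(x / xmin)) over observations >= xmin.
    """
    if xmin <= 0:
        raise ValueError("xmin must be positive")
    tail = [x for x in values if x >= xmin]
    if not tail:
        raise ValueError("no observations at or above xmin")
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0:
        raise ValueError("all tail observations equal xmin")
    alpha_hat = 1.0 + len(tail) / log_sum
    discarded = 1.0 - len(tail) / len(values)
    return alpha_hat, discarded


alpha_hat, discarded = fit_tail_exponent([0.5, 1.0, math.e, math.e], xmin=1.0)
```

Everything below xmin plays no part in the fit, which is the loss of data the abstract objects to.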
