After a year of intensely investigating password theft, here's what Google found
Between March 2016 and March 2017, Google found 12 million credentials obtained from phishing and 3.3 billion credentials swiped during third-party breaches. Google says that 15 percent of web users report having had an account breached by hackers.
hacking  statistics 
17 hours ago by dandv
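In the past three years, namely 2014, 2015 and 2016, the number of visitors refused entry to Hong Kong was 42,177, 56,855 and 53,499 respectively. Total visitor arrivals in those three years were 60,839,732, 59,307,617 and 56,654,903, so refusals accounted for 0.07%, 0.1% and 0.09% of each year's total visitor arrivals. This is a very small proportion, which shows that the SAR welcomes visitors and makes it as convenient as possible for them to come to Hong Kong. Reasons for refusal include a doubtful purpose of visit, not holding proper travel documents, and using forged travel documents. Examples of visitors refused entry because of a doubtful purpose of visit include suspected parallel traders, Mainland pregnant women without a delivery booking, persons suspected of coming to Hong Kong to take up illegal employment, and persons suspected of intending to overstay.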
Hongkong-inbound  Entry-denial  Statistics 
21 hours ago by quant18
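The percentages quoted in the Hong Kong entry above can be re-derived from the raw counts. A minimal sketch, using only the figures given in that entry; the variable and function names are illustrative:

```python
# Figures copied from the entry above:
# (year, visitors refused entry, total visitor arrivals).
figures = [
    (2014, 42_177, 60_839_732),
    (2015, 56_855, 59_307_617),
    (2016, 53_499, 56_654_903),
]


def refusal_rate_percent(refused: int, arrivals: int) -> float:
    """Refused entries as a percentage of total visitor arrivals."""
    return 100.0 * refused / arrivals


rates = {year: refusal_rate_percent(r, a) for year, r, a in figures}

for year, rate in rates.items():
    # Two decimal places matches the precision quoted in the entry.
    print(f"{year}: {rate:.2f}%")
```

Rounded to two decimal places, the results agree with the quoted 0.07%, 0.1% and 0.09%.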
Datasets | Kaggle
"The best place to discover and seamlessly analyze open data"
Tons of public datasets.
statistics  opendata  data  dataset 
23 hours ago by to
What is the Future of Behavioral Research and Large-scale Nudges? Five Practical Tips | The BE Hub
Tremendous results have been found for nudges – behavioral interventions designed to facilitate choice for welfare-promoting outcomes.

The future of behavioral research will require understanding how the very act of introducing the event of an intervention causes individual-level differences in behavior, not only aggregate-level data. It may be possible to harm a subgroup in light of preferences while improving results on average.
statistics  economics  userneeds 
23 hours ago by robertocarroll
Standard Deviation - Data Science in JavaScript - Fun Fun Function - YouTube
This was great. Goes through how to calculate standard deviation using JavaScript.
statistics  data  javascript 
yesterday by robertocarroll
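For reference, the calculation the video entry above describes looks roughly like this. This is a sketch in Python rather than the JavaScript used in the video, it is not the video's code, and the sample data is made up. It computes the population standard deviation (dividing by n, not n - 1).

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def population_std_dev(values):
    """Square root of the average squared distance from the mean."""
    m = mean(values)
    squared_diffs = [(x - m) ** 2 for x in values]
    return math.sqrt(mean(squared_diffs))


# Made-up sample data, chosen so the arithmetic is easy to check by hand.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
result = population_std_dev(sample)
print(result)  # mean is 5, squared differences sum to 32, sqrt(32 / 8) = 2.0
```

Python's standard library offers the same calculation as `statistics.pstdev`, and `statistics.stdev` for the sample version that divides by n - 1.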
[1702.01522] Inverse statistical problems: from the inverse Ising problem to data science
Inverse problems in statistical physics are motivated by the challenges of 'big data' in different fields, in particular high-throughput experiments in biology. In inverse problems, the usual procedure of statistical physics needs to be reversed: Instead of calculating observables on the basis of model parameters, we seek to infer parameters of a model based on observations. In this review, we focus on the inverse Ising problem and closely related problems, namely how to infer the coupling strengths between spins given observed spin correlations, magnetisations, or other data. We review applications of the inverse Ising problem, including the reconstruction of neural connections, protein structure determination, and the inference of gene regulatory networks. For the inverse Ising problem in equilibrium, a number of controlled and uncontrolled approximate solutions have been developed in the statistical mechanics community. A particularly strong method, pseudolikelihood, stems from statistics. We also review the inverse Ising problem in the non-equilibrium case, where the model parameters must be reconstructed based on non-equilibrium statistics.
data-science  statistics  inverse-problems  complexology  rather-interesting  inference  to-write-about  review  to-simulate  philosophy-of-science 
yesterday by Vaguery
Bayesian aggregation of average data: An application in drug development
"Here we consider the challenge when the model of interest is complex (hierarchical and nonlinear) and one dataset is given as raw data while the second dataset is given as averages only. ... We provide a Bayesian solution by using simulation to approximately reconstruct the likelihood of the external summary and allowing the parameters in the model to vary under the different conditions."
bayes  statistics  metaanalysis 
yesterday by aapl
Creative Destruction Whips through Corporate America | Innosight
At current churn rate, 75% of the S&P 500 will be replaced by 2027.
Company  Statistics  Research 
yesterday by lbenjamin
Frequency Trails: Modes and Modality
I've worked many performance issues where the latency or response time was multimodal, and higher-latency modes turned out to be the cause of the problem. Their existence isn't shown by the average – the arithmetic mean; it could only be seen by examining the distribution as a histogram, density plot, heat map, or frequency trail. Once you know that more than one mode is present, it's often straightforward to determine what causes the slower mode, by seeing what parameters of the operations involved are different: their type, size, URL, code path, etc.
performance  statistics  data-visualization 
yesterday by brandon.w.barry
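The point in the entry above, that an average can hide a slow mode which a histogram exposes, can be illustrated with a small sketch. The latency values are synthetic and the 10 ms bucket width is an arbitrary choice; none of this comes from the linked article.

```python
from collections import Counter

# Synthetic latencies in milliseconds: a fast mode near 10 ms (70 samples)
# and a slow mode near 100 ms (30 samples).
fast = [9, 10, 11, 10, 10] * 14
slow = [95, 100, 105] * 10
latencies = fast + slow

mean_latency = sum(latencies) / len(latencies)

# Bucket each sample by the lower edge of its 10 ms bin.
BUCKET_MS = 10
histogram = Counter((x // BUCKET_MS) * BUCKET_MS for x in latencies)

print(f"mean = {mean_latency} ms")
for bucket in sorted(histogram):
    print(f"{bucket:>4}-{bucket + BUCKET_MS - 1:<4} {'#' * histogram[bucket]}")

# The mean falls in a bucket that contains no samples at all.
mean_bucket = (int(mean_latency) // BUCKET_MS) * BUCKET_MS
print(f"samples in the mean's bucket: {histogram[mean_bucket]}")
```

Here the mean is 37 ms, a latency that no request actually experienced, while the histogram shows two separate clusters.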
Most winning A/B test results are illusory
"In this article I’ll show that badly performed A/B tests can produce winning results which are more likely to be false than true."
statistics  testing  webdesign 
yesterday by sometimesfood
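One way to see how a winning A/B result can be more likely false than true is to compute the share of "significant" results that reflect a real effect. A rough sketch follows; the prior, power and significance values are illustrative assumptions, not figures taken from the linked article.

```python
def share_of_true_winners(prior: float, power: float, alpha: float) -> float:
    """Fraction of statistically significant results that are real effects.

    prior: assumed fraction of tested variants that truly have an effect
    power: probability a real effect is detected as significant
    alpha: probability a non-effect is flagged as significant (false positive rate)
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# Assumed scenario: 10% of variants really work, 5% significance threshold.
well_powered = share_of_true_winners(prior=0.10, power=0.80, alpha=0.05)
under_powered = share_of_true_winners(prior=0.10, power=0.30, alpha=0.05)

print(f"80% power: {well_powered:.0%} of winners are real")
print(f"30% power: {under_powered:.0%} of winners are real")
```

Under these assumed numbers, the under-powered test yields winners that are real only 40% of the time, so a winning result is more likely false than true.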

