sharon_howard / bundles / all_the_data_things (758)
Keyword Extraction using RAKE – CodeLingo
If you’ve ever wanted to know what a document or piece of text is about without reading the entire thing, you’ll be glad to know you can do so using keywords. Keywords, in this context, are words or short phrases that concisely describe the contents of a larger text. This post describes the working of a relatively new approach to automatically generating keywords from a given document, called Rapid Automatic Keyword Extraction (RAKE).
keywords  data_mining  RAKE 
2 days ago by sharon_howard
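A rough sketch of the RAKE scoring idea for the entry above, not taken from the linked post. It splits text into candidate phrases at stopwords and punctuation, scores each word as degree divided by frequency, and sums the word scores per phrase. The tiny stopword list and the function names `candidate_phrases` and `rake` are assumptions made for illustration; real implementations use a much longer stopword list.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real RAKE uses a far longer one).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "from",
             "in", "is", "of", "on", "or", "the", "to", "with"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    # Punctuation marks act as hard phrase boundaries.
    for fragment in re.split(r'[.,;:!?()\[\]"\n]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9_'-]+", fragment):
            if word in STOPWORDS:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake(text):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs in candidates
    degree = defaultdict(int)  # summed length of the phrases a word appears in
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = ("Keyword extraction is fast. Rapid automatic keyword "
              "extraction is a method for keyword extraction.")
    for phrase, score in rake(sample):
        print(f"{score:6.2f}  {phrase}")
```

Longer phrases made of words that tend to co-occur with others score highest, which is why the four-word phrase in the sample outranks the two-word one.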
New e-book by looks very promising: wordless tutorials on designing with Excel…
dataviz  from twitter_favs
5 days ago by sharon_howard
The Make Data Count (MDC) project is funded by the Alfred P. Sloan Foundation to develop and deploy the social and technical infrastructure necessary to elevate data to a first-class research output alongside more traditional products, such as publications. It will run between May 2017 and April 2019.

The project will address the significant social as well as technical barriers to widespread incorporation of data-level metrics in the research data management ecosystem through consultation, recommendation, new technical capability, and community outreach. Project work will build upon long-standing partner initiatives supporting research data management and DLM, leverage prior Sloan investments in key technologies such as Lagotto, and enlist the cooperation of the research, library, funder, and publishing stakeholder communities.
data  data_sharing 
7 days ago by sharon_howard
rvest: easy web scraping with R | RStudio Blog
package that makes it easy to scrape (or harvest) data from html web pages
r  data  html 
8 days ago by sharon_howard
rOpenSci | tinkr: editing Markdown documents using XML tools
The goal of tinkr is to convert Markdown files to XML and back to allow their editing with xml2 (XPath!) instead of numerous complicated regular expressions.
r  xml  markdown 
8 days ago by sharon_howard
WTF Visualizations
Want to see some truly awful ? Check out . I've already wasted half an hour scrollin…
dataviz  from twitter_favs
11 days ago by sharon_howard
What Are We Plotting, What Are We Animating · Data Imaginist
> When we animate data visualisations we often do it by calculating intermediary data points resulting in a smooth transition between the states represented by the raw data. In gganimate this is done by adding a transition which defines how data should be expanded across the animation frames. Underneath it all most transitions calculate intermediary data representations using tweenr and transformr — so far, so good.

What we have glanced over, and what is at the center of the problem, is what state of the data we decide to use as basis for our expansion.
r  visualization  dataviz 
14 days ago by sharon_howard
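The "intermediary data points" in the quote above can be pictured with a small Python sketch. It is plain linear interpolation between two states and is not how gganimate, tweenr, or transformr are implemented; the function name `tween` and the sample values are hypothetical.

```python
def tween(start, end, n_frames):
    """Linearly interpolate between two states.

    Returns n_frames frames, including both the start and the end state.
    """
    if len(start) != len(end):
        raise ValueError("start and end must have the same length")
    if n_frames < 2:
        raise ValueError("need at least two frames")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # progress through the transition, 0.0 to 1.0
        frames.append([s + (e - s) * t for s, e in zip(start, end)])
    return frames


if __name__ == "__main__":
    # Two raw data states and five animation frames between them.
    for frame in tween([0.0, 10.0], [4.0, 20.0], 5):
        print(frame)
```

Which state of the data gets interpolated (raw values, statistical summaries, or drawn positions) is the question the linked post raises; the sketch only covers the interpolation step itself.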
Understanding Regression Error Metrics
Smart statisticians have developed error metrics to judge the quality of a model and enable us to compare regressions against other regressions with different parameters. These metrics are short and useful summaries of the quality of our data. This article will dive into four common regression metrics and discuss their use cases.

There are many types of regression, but this article will focus exclusively on metrics related to the linear regression.
statistics  regression 
16 days ago by sharon_howard
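The excerpt above does not name its four metrics, so the following Python sketch simply shows three widely used ones (MAE, MSE, RMSE) as an assumption about the kind of summary meant. The function names and sample numbers are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average size of the errors, ignoring sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error: average squared error, penalising large misses."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: MSE expressed in the units of the target."""
    return math.sqrt(mse(actual, predicted))


if __name__ == "__main__":
    actual = [3.0, 5.0, 2.0, 7.0]      # made-up observed values
    predicted = [2.0, 5.0, 4.0, 8.0]   # made-up model predictions
    print("MAE :", mae(actual, predicted))
    print("MSE :", mse(actual, predicted))
    print("RMSE:", rmse(actual, predicted))
```

Because MSE squares each error, the single miss of 2 contributes more to it than the two misses of 1 combined, while MAE weights them equally.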
⚡ 📝 "Label line ends of time series with ggplot2"

trees  rstats  dataviz  from twitter_favs
20 days ago by sharon_howard