prcleary + r   697

Automatically plot, analyse and rebase multiple run charts • runcharter
Automated analysis and re-basing of run charts at scale.
R  spc 
2 days ago by prcleary
Explore longitudinal data with brolgar | Credibly Curious
I’ve spent a fair bit of time this year with Di Cook and also Tania Prvan thinking about some ways to improve how we look at and explore longitudinal data. It is a hard problem, and I’m certainly not done yet, but we created the brolgar package to make it easier to explore and visualise longitudinal data.
R 
2 days ago by prcleary
Index
Learn statistics & R coding
A series of modules that teach in an intuitive, playful, and approachable way.
R 
2 days ago by prcleary
Subsetting and subassignment • tibble
This vignette is an attempt to provide a comprehensive overview over all subsetting and subassignment operations, highlighting where the tibble implementation differs from the data frame implementation.
R 
8 days ago by prcleary
Creating Custom Step Functions • recipes
The recipes package contains a number of different operations:
R 
10 days ago by prcleary
Preprocessing Tools to Create Design Matrices • recipes
The recipes package is an alternative method for creating and preprocessing design matrices that can be used for modeling or visualization.
R 
10 days ago by prcleary
Introducing correlationfunnel v0.1.0 - Speed Up Exploratory Data Analysis by 100X
I’m pleased to announce the introduction of correlationfunnel version 0.1.0, which officially hit CRAN yesterday. The correlationfunnel package is a tool that enables efficient exploration of data, helps you understand relationships, and gets you to business insights as fast as possible. I’ve taught correlationfunnel to my 500+ students enrolled in the Advanced Machine Learning course (DS4B 201-R) at Business Science University. The results have been great so far. Students are using it as the EDA step prior to Machine Learning to ensure that features have a relationship before they spend significant time tuning ML models on bad data. It’s helped me get to business insights 100X faster, which has been a massive productivity boost. And, it’s a great communication tool that has helped explain business insights to executives and project stakeholders! Win-win-win.
R 
10 days ago by prcleary
David's blog
You need to declare generic functions in S4 before you can define methods for them.
R 
10 days ago by prcleary
Google’s R Style Guide | styleguide
R is a high-level programming language used primarily for statistical computing and graphics. The goal of the R Programming Style Guide is to make our R code easier to read, share, and verify.
R  coding 
10 days ago by prcleary
Introducing the funneljoin package
In this post, I’ll use funneljoin::after_join() to analyze data about all Stack Overflow questions and answers with the tag R up to September 24th, 2017. The data was downloaded from Kaggle here. The next post in this series will look at the funnel_start() and funnel_step() functions, which we’ll use when all of the events or behavior are in one table.
R 
11 days ago by prcleary
GitHub - dmi3kno/bunny: Magick Helper
The goal of bunny is to provide useful helper functions for working with magick.
R 
11 days ago by prcleary
Code Highlighting with demoR • demoR
The primary goal of the demoR package is to simplify the presentation of R code.
R 
11 days ago by prcleary
Create Interactive Chart with the JavaScript 'ApexCharts' Library • apexcharter
Htmlwidget for apexcharts.js : A modern JavaScript charting library to build interactive charts and visualizations with simple API.
R 
11 days ago by prcleary
GitHub - ThinkR-open/chameleon: Build And Highlight Package Documentation With Customized Templates
The goal of {chameleon} is to build and highlight package documentation with customized templates.
R 
11 days ago by prcleary
R Packages
Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data. In this book you’ll learn how to turn your code into packages that others can easily download and use. Writing a package can seem overwhelming at first. So start with the basics and improve it over time. It doesn’t matter if your first version isn’t perfect as long as the next version is better.
R 
11 days ago by prcleary
Best practices for API packages
So you want to write an R client for a web API? This document walks through the key issues involved in writing API wrappers in R. If you’re new to working with web APIs, you may want to start by reading “An introduction to APIs” by zapier.
R 
13 days ago by prcleary
Exact Matching in R - NHS-R Community
I’ve been working with a group of analysts in East London who are interested in joined-up health and social care data. They’ve created a powerful, unique dataset that shows how each resident of the London Borough of Barking & Dagenham interacts with NHS and council services. The possibilities are enormous, both in terms of understanding the way that people use services across a whole system, and for more general public health research.

Today we’re interested in whether social isolation is related to healthcare costs, and we’re going to use exact matching to explore this issue. Our theory is that people who live alone have higher healthcare costs because they have less support from family members.
R 
14 days ago by prcleary
Count of working days function - NHS-R Community
It’s at this time of year I need to renew my season ticket and I usually get one for the year. Out of interest, I wanted to find out how much the ticket cost per day, taking into account I don’t use it on weekends or my paid holidays. I started my workings out initially in Excel but got as far as typing the formula =WORKDAYS() before I realised it was going to take some working out and perhaps I should give it a go in R as a function…

@ChrisBeeley had recently shown me functions in R and I was surprised how familiar they were as I’ve seen them on Stack Overflow (usually skimmed over those) and they are similar to functions in SQL which I’ve used (not written) where you feed in parameters. When I write code I try to work out how each part works and build it up but writing a function requires running the whole thing and then checking the result, the objects that are created in the function do not materialise so are never available to check. Not having objects building up in the environment console is one of the benefits of using a function, that and not repeating scripts which then ALL need updating if something changes.
R 
14 days ago by prcleary
GitHub - edzer/UseR2019
This tutorial dives into some of the modern spatial and spatiotemporal analysis packages available in R. It will show how support, the spatial size of the area to which a data value refers, plays a role in spatial analysis, and how this is handled in R. It will show how package stars complements package sf for handling spatial time series, raster data, raster time series, and more complex multidimensional data such as dynamic origin-destination matrices. It will also show how stars handles out-of-memory datasets, with an example that uses Sentinel-2 satellite time series. This will be connected to analysing the data with packages that assume spatial processes as their modelling framework, including gstat, spdep, and R-INLA. Familiarity with package sf and the tidyverse will be helpful for taking this tutorial.
R  Bayesian 
14 days ago by prcleary
useR! 2019: Getting the most out of Git
Colin Gillespie (@csgillespie), @jumping_uk
© 2019 Jumping Rivers (jumpingrivers.com) http://bit.ly/user2019-git
Contents
Course Pre-Reqs
Chapter 1: Introduction to CI and Git
Chapter 2: Travis Part 1
Chapter 3: GitHub Pat
Chapter 4: Travis Part 2
Chapter 5: The pkgdown website

Jumping Rivers are RStudio Certified partners. We are currently developing methods for live monitoring of RStudio Server and Connect. If you are interested in hearing more, please sign up to get information about our beta version.
R  git 
14 days ago by prcleary
Deal with Dependencies • attachment
The goal of attachment is to help to deal with package dependencies during package development. It also gives useful tools to install or list missing packages used inside Rscripts or Rmds.
R 
14 days ago by prcleary
The Ultimate Helper for Clumsy Fingers • fcuk
A package designed to help people with clumsy fingers.
R 
14 days ago by prcleary
Thoughts on Animation and Movement in Data Visualization
The Pudding’s project was my inspiration and motivation for d3rain, my latest package. At its core, the package is a fun way to visualize distributions, and the downward movement can reinforce various subtleties within the subject. The example below shows the distribution of 2015 police killings by ‘armed status’.
R  dataviz 
14 days ago by prcleary
Can {drake} RAP? - rostrum.blog
The {drake} package records file interdependencies in your analysis. When files are changed, {drake} only re-runs the parts that need to be re-run. This saves time and reduces error.

This could be useful for Reproducible Analytical Pipelines (RAP), an automated approach to producing UK government statistics that minimises error and speeds production.
R 
14 days ago by prcleary
Create Tessellated Hexagon Maps • sugarbag
The sugarbag package creates tessellated hexagon maps for visualising geo-spatial data. Hexagons of equal size are positioned to best preserve relationships between individual areas and the closest focal point, and minimise distance from their actual location. This method allows all regions to be compared on the same visual scale, and provides an alternative to cartograms.

Maps containing regions with a few small and densely populated areas are extremely distorted in cartograms. An example of this is a population cartogram of Australia, which distorts the map into an unrecognisable shape. The technique implemented in this package is particularly useful for these regions.
R 
14 days ago by prcleary
Just Quickly: How to show verbatim inline R code | Credibly Curious
I’ve recently asked on the Rstudio community page how to make code chunks appear verbatim.

Not sure what I mean by this?

Well, showing your entire code chunk is something that comes up when you are teaching people about rmarkdown. One of the issues is that you want to show people what an rmarkdown code chunk looks like - not just echo the output.
R 
14 days ago by prcleary
Start an event - SatRdays Knowledge Base
Woohoo! You’re interested in running a satRday event near you. That’s super awesome, but of course you need to know what’s involved before you commit to an event.
R 
14 days ago by prcleary
Creating RStudio addin to modify selection
A lot of the popular addins follow the same simple formula:

extract highlighted text
modify extracted text
replace highlighted text with modified text.
If your problem can be solved with the above steps, then this post is for you.
R 
15 days ago by prcleary
Statistical Rethinking – Richard McElreath
Statistical Rethinking is an introduction to applied Bayesian data analysis, aimed at PhD students and researchers in the natural and social sciences. This audience has had some calculus and linear algebra, and one or two joyless undergraduate courses in statistics. I've been teaching applied statistics to this audience for about a decade now, and this book has evolved from that experience.
R  Bayesian 
17 days ago by prcleary
GitHub - rundel/ghclass: Tools for managing classroom organizations
Tools for managing github class organization accounts
This package is for everyone! But really, if you're an instructor who uses GitHub for your class management, e.g. students submit assignments via GitHub repos, this package is definitely for you! The package also assumes that you're an R user, and you probably teach R as well, though that's not a requirement since this package is all about setting up repositories with the right permissions, not what your students put in those repositories.

See package vignette for details on how to use the package.
R  git 
18 days ago by prcleary
Happy Git and GitHub for the useR
Happy Git provides opinionated instructions on how to:

Install Git and get it working smoothly with GitHub, in the shell and in the RStudio IDE.
Develop a few key workflows that cover your most common tasks.
Integrate Git and GitHub into your daily work with R and R Markdown.
The target reader is someone who uses R for data analysis or who works on R packages, although some of the content may be useful to those working in adjacent areas.
R  git 
18 days ago by prcleary
STAT 360 Syllabus – Spring 2019
This course introduces students to an advanced statistical software package to effectively apply statistical methods, in general. Students create data sets from raw data files, create variables within a data set, append and/or modify data sets, create subsets, then apply a whole host of statistical procedures, create graphs and produce reports. The course will be based on several leading advanced statistical software packages, which will be chosen from semester to semester to match the needs of the community.
R 
18 days ago by prcleary
Counting commits and peer code review · Teach Data Science
This past semester, I taught two sections of a course called Advanced Statistical Software (yes, I’m aware of the acronym. We’re changing the course title soon…). The course was focused on R: we spent the first half of the semester going through R for Data Science and learning about doing data science in R, and the second half reading selections from Advanced R, and learning about R as a language. It was the most computationally-focused class I’ve ever been able to teach, which was fantastic.

On top of R, I also taught students how to use git and GitHub, and all student work was submitted through private repositories. At the beginning of the semester, 63% of my students said they had never used git before (even though the majority of students were upper-level CS majors), and by the end of the semester everyone could commit, push, pull, manage merge conflicts, and more. I’m so proud of them!
R  git 
18 days ago by prcleary
Scraping Dynamic Websites with PhantomJS | Robert Hickman
For a recent blogpost, I required data on the ELO ratings of national football teams over time. Such a list exists online at eloratings.net and so in theory this was just a simple task for rvest to read the html pages on that site and then fish out the data I wanted. However, while this works for the static websites which make up the vast majority of sites containing tables of data, it struggles with websites that use JavaScript to dynamically generate pages.
R 
20 days ago by prcleary
Communicating IELTS Averages with Maps - Part II - Educators R Learners
The previous post focused on collecting the data needed to make custom maps. This post will make use of that data to create maps like the one above as well as some others. While it isn’t necessary to read the previous post, it is recommended.
R 
20 days ago by prcleary
Logging and Error Handling in Operational Systems | Working With Data
Operational systems, by definition, need to work without human input. Systems are considered “operational” after they have been thoroughly tested and shown to work properly with a variety of input.

However, no software is perfect and no real-world system operates with 100% availability or 100% consistent input. Things occasionally go wrong – perhaps intermittently. In a situation with occasional failures it is vitally important to have good logging and error handling. The newly released MazamaCoreUtils package helps with these tasks.
R  python 
20 days ago by prcleary
Statistical Rethinking with brms, ggplot2, and the tidyverse
So, this project is an attempt to reexpress the code in McElreath’s textbook. His models are re-fit with brms, the figures are reproduced or reimagined with ggplot2, and the general data wrangling code now predominantly follows the tidyverse style.
R  Bayesian 
20 days ago by prcleary
Retrieve Magic Attributes from Files and Directories • wand
MIME types are shorthand descriptors for file contents and can be determined from “magic” bytes in file headers, file contents or intuited from file extensions. Tools are provided to perform curated “magic” tests as well as mapping MIME types from a database of over 1,800 extension mappings.
R 
20 days ago by prcleary
Forecasting unemployment
Forecasting unemployment is hard, with lots of complex bi-directional causality. Also, while AIC is asymptotically equivalent to cross-validation, it's probably better to check. It turns out that interest rates or stock prices don't have any useful information for nowcasting unemployment.
R 
20 days ago by prcleary
Unsupervised Machine Learning in R: K-Means – data technik
K-Means clustering is unsupervised machine learning because there is not a target variable. Clustering can be used to create a target variable, or simply group data by certain characteristics.
R 
20 days ago by prcleary
Explaining Predictions: Random Forest Post-hoc Analysis (permutation & impurity variable importance)
There are 2 approaches to explaining models

Use simple interpretable models. This approach was covered in the previous posts where we looked at logistic regression and decision trees as examples of white box models.
Conduct post-hoc interpretation on models. There are two types of post-hoc analysis which can be done: model-specific and model-agnostic.
R 
20 days ago by prcleary
Advanced Graphics and Image-Processing in R • rOpenSci: magick
Bindings to ImageMagick: the most comprehensive open-source image processing library available. Supports many common formats (png, jpeg, tiff, pdf, etc) and manipulations (rotate, scale, crop, trim, flip, blur, etc). All operations are vectorized via the Magick++ STL meaning they operate either on a single frame or a series of frames for working with layers, collages, or animation. In RStudio images are automatically previewed when printed to the console, resulting in an interactive editing environment.
R 
20 days ago by prcleary
Drawing maps in R
If you want to start drawing maps in R, the best place to begin is to familiarize yourself with the Simple Features (sf) data format. This is an open standard developed by the Open Geospatial Consortium, meant to represent geographical vector data.

The sf package in R is a great implementation of this, and allows you to work with the sf data as regular data frames. I.e. you can do typical data operations, like filters, grouped calculations, join with other datasets, etc. There has also been functionality implemented for ggplot2, so you can draw maps using geom_sf. All of the good stuff.
R 
23 days ago by prcleary
RCC: R to C Compiler at Rice University
We are working to create an open-source, portable, retargetable, high-quality R compiler suitable for use with production codes.
R 
25 days ago by prcleary
Data Science: R Basics | edX
The first in our Professional Certificate Program in Data Science, this course will introduce you to the basics of R programming. You can better retain R when you learn it to solve a specific problem, so you’ll use a real-world dataset about crime in the United States. You will learn the R skills needed to answer essential questions about differences in crime across the different states.

We’ll cover R's functions and data types, then tackle how to operate on vectors and when to use advanced functions like sorting. You’ll learn how to apply general programming features like “if-else,” and “for loop” commands, and how to wrangle, analyze and visualize data.

Rather than covering every R skill you might need, you’ll build a strong foundation to prepare you for the more in-depth courses later in the series, where we cover concepts like probability, inference, regression, and machine learning. We help you develop a skill set that includes R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with UNIX/Linux, version control with git and GitHub, and reproducible document preparation with RStudio.

The demand for skilled data science practitioners is rapidly growing, and this series prepares you to tackle real-world data analysis challenges.
R 
25 days ago by prcleary
A.8. Using this DHIS2 Web API with R
DHIS2 has a powerful Web API which can be used to integrate applications together. In this section, we will illustrate a few trivial examples of the use of the Web API, and how we can retrieve data and metadata for use in R. The Web API uses basic HTTP authentication (as described in the Web API section of this document). Using two R packages "RCurl" and "XML", we will be able to work with the output of the API in R. In the first example, we will get some metadata from the database.
R  e-health 
4 weeks ago by prcleary
Interactions and contrasts
As a running example to learn about more complex linear models, we will be using a dataset which compares the different frictional coefficients on the different legs of a spider. Specifically, we will be determining whether more friction comes from a pushing or pulling motion of the leg. The original paper from which the data was provided is:

Jonas O. Wolff & Stanislav N. Gorb, Radial arrangement of Janus-like setae permits friction control in spiders, Scientific Reports, 22 January 2013.
R 
4 weeks ago by prcleary
Headless Chrome Automation with R
About the crrri package
Romain Lesur & Christophe Dervieux
useR! 2019 - 2019/07/12
Toulouse - France
R 
4 weeks ago by prcleary
Read Untidy Excel Files • tidyxl
tidyxl imports non-tabular data from Excel files into R. It exposes cell content, position, formatting and comments in a tidy structure for further manipulation, especially by the unpivotr package. It supports the xml-based file formats ‘.xlsx’ and ‘.xlsm’ via the embedded RapidXML C++ library. It does not support the binary file formats ‘.xlsb’ or ‘.xls’.

It also provides a function xlex() for tokenizing formulas. See the vignette for details. It is useful for detecting ‘spreadsheet smells’ (poor practice such as embedding constants in formulas, or using deep levels of nesting), and for understanding the dependency structures within spreadsheets.
R 
5 weeks ago by prcleary
A feast of time series tools | Rob J Hyndman
Modern time series are often high-dimensional and observed at high frequency, but most existing R packages for time series are designed to handle low-dimensional and low frequency data such as annual, monthly and quarterly data. The feasts package is part of a new collection of tidyverts packages designed for modern time series analysis using the tidyverse framework and structures. It uses the tsibble package to provide the basic data class and data manipulation tools.

The feasts package provides Feature Extraction And Statistics for Time Series, and includes tools for exploratory data analysis, data visualization, and data summary. For example, it includes autocorrelation plots, seasonality plots, time series decomposition, tests for unit roots and autocorrelations, etc.

I will demonstrate the design and use of the feasts package using a variety of real data, highlighting its power for handling large collections of related time series in an efficient and user-friendly manner.
R 
5 weeks ago by prcleary
SF on R 3.5 can't find correct version of gdal - Stack Overflow
Update: Going through the same thing with an updated R to 3.6, I found that if I removed rgdal and reinstalled (with no special options) I was able to then install sf successfully.
R 
5 weeks ago by prcleary
How to use `recipes` package from `tidymodels` for one hot encoding
Since one of the best ways to learn is to explain, I want to share with you this quick introduction to the recipes package, from the tidymodels family.
It can help us to automate some data preparation tasks.

The overview is:

How to create a recipe
How to add a step
How to do the prep
Getting the data with juice!
Apply the prep to new data
What is the difference between bake and juice?
Dealing with new values in recipes (step_novel)
R 
5 weeks ago by prcleary
emayili: Sending Email from R - datawookie
At Exegetic we do a lot of automated reporting with R. Being able to easily and reliably send emails is a high priority.

There is already a selection of packages for sending email from R:

{mailR}
{gmailr}
{blastula}
{blatr} (Windows)
{mail} and
{sendmailR}.
We’ve had the most experience with the first two, both of which are really solid packages. However, {gmailr} uses the Google Mail API so it doesn’t work with all SMTP servers and {mailR} has a dependency on {rJava} which can be a bit of a hurdle for deploying in some environments.

We wrote {emayili} with the following two design goals: works with all SMTP servers and has few (or easily satisfied) dependencies.
R 
5 weeks ago by prcleary
From base R • stringr
This vignette compares stringr functions to their base R equivalents to help users transitioning from using base R to stringr.
R 
5 weeks ago by prcleary
davidgohel/rvg
rvg provides two graphics devices that produce vector graphics output in DrawingML format: dml_pptx for Microsoft PowerPoint and dml_xlsx for Microsoft Excel. These formats let users edit the graphic elements (editable graphics) within PowerPoint or Excel and render very well.
R 
5 weeks ago by prcleary
Putting the R into Reproducible Research
Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.
R  markdown 
5 weeks ago by prcleary