R / Notes

This page contains a collection of short notes on using the R statistical software.
Subscribe to updates via its Atom feed, or read other R blogs at R-Bloggers.

Asking R questions

This note lists a few places where one can ask R-related questions and get an answer, usually in no more than a few hours.

  • July 24th, 2017

Some thoughts about the R language survey

The only point of this note is to invite you to fill in the R language survey launched by the R Consortium earlier this month.

  • July 24th, 2017

More components of the R ecosystem

This note lists a few of the organizations that are pushing the R language forward, as of early 2017. R is a happy language right now.

  • January 16th, 2017

R as a data science language

The R language is a ‘DSL’ – a domain-specific language. The domain that it deals with, however, is not well-defined. In this note, I call R a “data science language” and link to a few resources that make the point better than I could.

  • January 5th, 2017

Technologies worth learning for data science

As a complement to my note on R as a data science language, this note lists ten other technologies that you might want to learn to use, or at least monitor, if you are interested in learning data science.

  • January 5th, 2017

Turning KML into tidy data frames

This note briefly introduces the tidykml package, which turns basic KML geometries into tidy data frames that can be visualized with ggplot2.

  • December 31st, 2016

Scraping Web sources: Two illustrations

At the request of a couple of students in a course on open data that I contribute to, here's a short guide to the “why” and “how” of (Web) scraping, with links to examples that illustrate the usefulness of the technique.

  • December 14th, 2016

Remember to use the RDS format

Note to self – Remember to serialize R objects as RDS files when it makes sense.

  • December 12th, 2016

Compiling the ggplot2 book on Mac OS

This note explains how to compile Hadley Wickham's ggplot2 book on Mac OS.

  • December 10th, 2016

One year of R / Notes

My collection of R notes is now slightly over one year old. This note reflects on how useful the exercise of blogging about R has been so far, and answers some of the questions that I have received about it.

  • September 21st, 2016

Collapsing a bipartite co-occurrence network

This note is a follow-up to the previous one. It shows how to use student-submitted keywords to find clusters of shared interests between the students.

  • September 16th, 2016

Turning keywords into a co-occurrence network

This note is addressed to the GLM Fall 2016 students who are currently taking my Statistical Reasoning and Quantitative Methods course at Sciences Po in Paris.

  • September 10th, 2016

An awesome list of network analysis resources

Inspired by the awesome R list that I mentioned a few months ago, I have started the awesome-network-analysis list, which features a large section on R packages.

  • April 11th, 2016

ggnetwork: Network geometries for ggplot2

This note is a ~~shameless plug~~ demo of the ggnetwork package, which provides several geoms to plot network objects with ggplot2, and which just got published on CRAN. See the package vignette for a more detailed guide to its functionalities.

  • March 28th, 2016

Submitting packages to CRAN

This note lists a few of the mistakes that one can make before submitting a package to CRAN. The list is based on my own mistakes when submitting the ggnetwork package to CRAN for the first time (see this other note for comments about the package itself).

  • March 24th, 2016

Sustainable code for social scientists

Over at his blog “One Tip per Day”, Xianjun Dong has produced an excellent list of “15 Practical Tips For a Bioinformatician”. This note is my own version of these tips, aimed at social scientists who need to write sustainable (i.e. reproducible) code for either individual or collective research projects.

  • February 19th, 2016

Scraping legislative data with R: A progress report

This note discusses the results of this project, which collects legislative data from several European parliaments (plus Israel). The project is coded in R, which has had consequences for its development.

  • February 7th, 2016

Quick shell commands for R users

This note explains how to use an application launcher along with text expansion and shell commands to accomplish a few specific tasks that can be useful to R users.

  • February 7th, 2016

Exponential random graph models with R

This note documents the small but growing microverse of R packages on CRAN to produce various forms of exponential random graph models (ERGMs), which are a kind of modelling strategy akin to logistic regression for dyadic data.

  • February 6th, 2016

Convex hulls with dplyr and ggplot2

This note shows a quick way to draw convex hulls, using dplyr and ggplot2.

  • January 11th, 2016

Scraping form results with httr

This note shows how to use the httr package to scrape the results of a search form.

  • January 10th, 2016

String manipulations on full names

This note shows how to use the stringr package to clean a list of full names that need to be turned into unique identifiers, i.e. something that can be assigned as row names to a data frame.

  • January 8th, 2016

Latest R bookmarks

This note describes how I compile(d) my R-related bookmarks on Pinboard.

  • December 14th, 2015

Teaching with RStudio

The RStudio IDE is a central component of the R software ecology that makes it easier to code in R, or to use R with other tools. This note discusses its use in a teaching environment.

  • December 14th, 2015

Unit tests for R packages

Karl Broman has written a nice blog post recommending that R package authors write unit tests. Here are a few more pointers on how to write these tests.

  • December 8th, 2015

From networkx to igraph

This note translates the Python code from an interesting blog post (in French) into R. The code includes a function to compute closeness vitality with the igraph package.

  • October 31st, 2015

Visualizing networks with ggplot2

This note describes a few ways to handle network objects (which might be objects of class igraph or network, or data frames representing edge lists) through graphical methods that rely on ggplot2.

  • October 5th, 2015

Serrano et al.'s disparity filter algorithm for directed networks

The disparity filter algorithm by Serrano et al. is a network reduction technique to identify the ‘backbone’ of a weighted network. This note explains how to implement the algorithm in full, based on existing implementations and on Serrano et al.'s paper.

  • October 2nd, 2015

Weighting co-authorship networks

This note explains how to implement two edge weighting schemes that are relevant to co-authorship networks, based on my work on legislative cosponsorship networks.

  • September 18th, 2015

Working with R network objects

Here are a few things that I have learnt while working with R network objects, using the igraph and network + sna packages (the last two packages go well together).

  • September 17th, 2015

Working with statistical model results

Working with statistical model results in R often means that the user has to learn about the class of the model to further manipulate it. A few packages can help with that.

  • September 17th, 2015

Mapping the R software ecology: awesome-R

Qin Wenfeng maintains a “curated list of awesome R frameworks, packages and software”, which also includes links to websites, books and other resources to learn R.

  • September 6th, 2015