Introduction to Data Analysis


This section lists the handbooks and optional readings from the course syllabus. We will cover R's internal documentation and online help in the early stages of the course itself.

Course handbooks

Handbook chapters are assigned weekly to establish a learning baseline, and independent study of online material is strongly encouraged. Read each week's assigned chapters after the session to expand on the course material.

Additional readings

Finally, these books go further than what we will explore, but serve as good examples of what you can learn to do with slightly more advanced computing and visualization skills:


This course is our best effort to document empirical data analysis with R by example, but you are very welcome to find other resources that better fit your interests. There are tons of R tutorials out there: here's a good one, and here's another good one.

In our experience, these tutorials will teach you something new every time you read them:

You can see from the links above that R is taught all over the United States and increasingly in Europe. There are a few initiatives in France, including courses at Sciences Po and a few R user groups (RUGs), like the FlτR group run by Ensae and Insee users.

If you like video tutorials, we will link to a few of them, in particular:

The R video tutorials by Google developers are also worth a look.


In addition to R tutorials, there are R blogs and online communities like Stack Overflow where R users share questions and answers about the software. Also check out a few data and stats communities like Cross Validated or the /r/datasets subreddit.

The visualization component of the course can be explored through Alberto Cairo's selection of resources in that area, which branches out into infographics. The most passionate are referred to John P. Boyd's teaching notes on Scientific Visualization and Information Architecture.

Last, the SRQM blog, created for another course that we teach together with Ivaylo, does its best to deliver a regular stream of illustrated links on statistical analysis for the social sciences, as well as on data visualization and related themes.

Next: Setup.