Introduction to Data Analysis

# 1.3. Practice

Now that you are fully set up to work in RStudio, you get to download your first exercise and run the entire code while reading the comments along the way. This exercise will teach you a few more things about keyboard shortcuts, R syntax, and trivia like this:

```r
rep("See you next week!", 6)
```

```
[1] "See you next week!" "See you next week!" "See you next week!"
[4] "See you next week!" "See you next week!" "See you next week!"
```


Instructions: this week’s exercise is called 1_hello.R. Download or copy-paste that file into a new R script, then open it and run it with the working directory set as the IDA folder. If you download the script, make sure that your browser preserved its .R file extension.

Be careful if you download the script by simply clicking its link, as most users do, rather than right-clicking the link to save the source. Your browser might save R scripts as HTML (.html) or plain text (.txt): in that case, rename the file so that it ends with the .R file extension.
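If the browser did append an extra extension, the file can also be renamed from within R. The lines below are only a minimal sketch: the helper name `fix_r_extension` and the file name `1_hello.R.txt` are illustrative assumptions, and the helper only handles the case where .txt or .html was added after .R.

```r
# Hypothetical helper: strip a trailing .txt or .html that a browser
# may have appended to an R script name, e.g. "1_hello.R.txt".
fix_r_extension <- function(path) {
  # Only touch names that look like "<name>.R.txt" or "<name>.R.html".
  if (!grepl("\\.R\\.(txt|html)$", path)) return(path)
  fixed <- sub("\\.(txt|html)$", "", path)
  # Refuse to overwrite an existing script with the same name.
  if (file.exists(fixed)) stop("File already exists: ", fixed)
  if (!file.rename(path, fixed)) stop("Could not rename: ", path)
  fixed
}
```

Calling `fix_r_extension("1_hello.R.txt")` from the folder that holds the file renames it and returns the new name; a name that already ends in .R is returned unchanged.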

Once you are done with the exercise, please turn to the final setup instructions below.

## Folder architecture

The exercises for this course require that you keep things tidy inside your working directory. Start by making sure that you understand, from reading the previous pages, what the working directory is, and how to set it in RStudio. We recommended that you simply call your working directory IDA. You will then need to create two folders inside it:

• A code folder to archive the course exercises. You can move all previously created scripts, and create all future scripts, in there. Even when you run a script from that folder, make sure that your working directory stays the IDA folder.
• A data folder to archive the course datasets. This is as much a requirement as the previous step, because our scripts assume that this is where you store the data, and will therefore look for it there. You will run into errors if you do not create that directory.
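To see why the working directory matters, recall that R resolves relative paths such as those starting with data against it. The following is a minimal sketch, assuming a made-up dataset name `example.csv` that is not an actual course file:

```r
# Relative paths are resolved against the working directory (ideally IDA).
getwd()

# Build a path to a file in the data folder; "example.csv" is a
# hypothetical name used only for illustration.
csv_path <- file.path("data", "example.csv")

# Check for the file before reading it, to get a readable message
# instead of an error.
if (file.exists(csv_path)) {
  example <- read.csv(csv_path)
} else {
  message("Not found: ", csv_path, " (working directory: ", getwd(), ")")
}
```

In the same way, a script stored in the code folder can be run with `source("code/1_hello.R")` while the working directory remains the IDA folder.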

All scripts in this class assume that you have this folder architecture in place inside your working directory. You can ask R to create the folders for you. First, check that the working directory is the IDA folder by typing getwd(). If not, change it with the 'Session' menu of RStudio or the 'Misc' menu in R. Then run the following lines:

```r
# Check the working directory.
if(!grepl("IDA$", getwd()))
  warning("Not sure whether the working directory is really ",
          "the IDA folder...", "\nCarrying on anyway...")
# Create folders if necessary.
if(!file.exists("code")) dir.create("code")
if(!file.exists("data")) dir.create("data")
```

## Security notice

The code above will warn you if the working directory does not end with “IDA” and then create two folders. Notice that R can easily create, copy or move files and folders: it also has the same ability to destroy them, by removal or overwrite, just as if you were doing it by hand. If you are running Linux, you probably know this already and know how to avoid such issues.

The code block below illustrates what is meant here: it will move any .R script that might be lying around in your main IDA folder to the code subfolder, so as to keep your files tidy. You can run this code safely because the scripts object, which is a list of files ending in .R matched by a regular expression, is identical at the copying and removal stages:

```r
# Match filenames ending in .R in the working directory.
scripts <- list.files(".", "\\.R$")
# This variable will be 0 (FALSE) if nothing is matched.
clean <- length(scripts)
# Move .R scripts to code/ subfolder.
if(clean) {
  message("Moving files to code folder:\n", paste(scripts, collapse = "\n"))
  file.copy(scripts, "code")
  file.remove(scripts)
} else {
  message("No R script was found lying around.")
}
```

```
No R script was found lying around.
```
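A slightly more defensive variant removes only the files whose copy actually succeeded. The sketch below is not the course's own code: `move_scripts` is a hypothetical helper, and it is exercised in a throwaway directory created with `tempfile()` so that nothing in the IDA folder is touched.

```r
# Hypothetical helper: move files into a folder, deleting an original
# only after its copy is confirmed.
move_scripts <- function(files, to) {
  if (!dir.exists(to)) dir.create(to)
  # file.copy() returns one TRUE/FALSE per file; by default it does
  # not overwrite files that already exist at the destination.
  copied <- file.copy(files, to)
  # Remove only the originals that were copied successfully.
  file.remove(files[copied])
  invisible(copied)
}

# Try it in a scratch directory rather than in the real IDA folder.
scratch <- tempfile("ida_test_")
dir.create(scratch)
writeLines("# empty script", file.path(scratch, "demo.R"))

scripts <- list.files(scratch, "\\.R$", full.names = TRUE)
move_scripts(scripts, file.path(scratch, "code"))

# List what is left in the scratch directory after the move.
list.files(scratch, recursive = TRUE)
```

Because a failed copy leaves the corresponding original in place, a script is never deleted without a copy existing in the destination folder.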


If you ever plan to run R code without fully understanding what it might do, first consider whether you really want to do that, and take a second look at the code and its source. Consider using a “sandboxed” environment like the one at Rapporter.net to check the code without any possible effect on your system.

The course itself is documented in more detail on its wiki. You might also turn to the README files that were written for the course and its datasets if you need additional details on the contents and examples of each session.