
Best practices for setting your working environment

RStudio environment best practices: creating an RStudio project for each data analysis and keeping comments in your scripts.

Good science needs to follow the FAIR principles: it needs to be Findable, Accessible, Interoperable and Reusable. Although originally applied to data science, these principles are relevant to many areas and scientific disciplines.

When performing data analysis for instance, it is a common best practice to prepare a document detailing and explaining each step. This is how you will be able to document your work, reproduce your own results, and share it with your colleagues or the wider community for efficient reuse. Here are important tips on how to follow these principles in RStudio and when using the R programming environment.

1. Create a Project

For each project, RStudio users will at some point need to gather different types of files: input data brought in from outside the RStudio environment, an R script recording the analytical steps for reproducibility, analytical output files, and probably many plots. Experienced RStudio users make sure to keep all files related to a project together, and RStudio has built-in support for this: Projects.

Let’s create a project for you to work on during this course. Using the RStudio toolbar and the windows that will open, click successively on:

a) File > New Project

b) Choose New Directory

c) Select New Project

d) Name your directory and choose location

At this final stage you will be asked to provide a name for your project and to place it in your preferred directory (and sub-directory if needed).

Once this is done, a new RStudio project will be created for this course. You can check that your working directory has been properly set by typing getwd() in your Console.

> getwd()

An example output could be:

[1] "/Users/macbookair/Desktop/R/B4B_II_Advanced"
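
Keeping all files related to a project together is easier with a few sub-folders inside the project directory. The lines below are a minimal sketch of one possible layout. The folder names ("data", "scripts", "output", "plots") and the "my_project" name are assumptions chosen for illustration, and a temporary directory stands in for your real project directory so that running the example does not touch your own files.

```r
# Hypothetical layout: adapt the folder names to your own project.
# tempdir() is used here as a stand-in for your real project directory.
project_dir <- file.path(tempdir(), "my_project")
subfolders <- c("data", "scripts", "output", "plots")

for (folder in subfolders) {
  # recursive = TRUE also creates project_dir itself if it does not exist yet
  dir.create(file.path(project_dir, folder), recursive = TRUE, showWarnings = FALSE)
}

# Show the sub-folders that now exist inside the project directory
list.files(project_dir)
```

Inside a real RStudio project you would typically create these folders directly in the working directory, for example with dir.create("data").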

2. Set your working directory

If you have just opened your new project, your working directory should have been set automatically, as explained above. If you later want to switch from one project to another, or do some testing in a new directory, setting your working directory is one of the first steps you need to get used to. To become an efficient RStudio user, assign a specific directory to each project and, when working on a project, set your working directory to that project’s directory.

Once this is done, R will, by default, look into this directory to search for your input files, and this is where it will save your output files. If you need to set it, you can do this using setwd():

> setwd("/path/to/your/working/directory")

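Once your working directory is set, it is recommended that you refer to files in your scripts using relative paths (paths given relative to the working directory), not absolute paths. This way your scripts will still work if the project directory is moved or shared with someone else.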
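
As a small illustration of the difference between the two kinds of path, the sketch below builds a relative path and shows the absolute location it corresponds to from the current working directory. The folder name "data" and the file name "example_input.csv" are hypothetical. No file is read or written.

```r
# A relative path: interpreted from the current working directory.
# "data" and "example_input.csv" are made-up names for illustration.
relative_path <- file.path("data", "example_input.csv")

# The absolute location this relative path points to right now
absolute_path <- file.path(getwd(), relative_path)

print(relative_path)
print(absolute_path)
```

The relative path stays the same on every computer, whereas the absolute path changes with the location of the project directory.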

To check what your current directory is in R code, run getwd():

> getwd()
[1] "/Users/macbookair/Desktop/R/B4B_II_Advanced"

Alternatively, RStudio displays your current working directory at the top of the Console panel.
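
If you only want to do some testing in another directory and then come back, one possible approach is sketched below. It relies on the fact that setwd() invisibly returns the previous working directory. The temporary directory is used purely as an example destination.

```r
# Remember where we started
original_wd <- getwd()

# setwd() invisibly returns the directory we are leaving
previous_wd <- setwd(tempdir())   # tempdir() is only an example destination

# ... do some testing in the new directory ...
print(getwd())

# Go back to where we were before
setwd(previous_wd)
identical(getwd(), original_wd)
```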

3. Document your commands: Open a script

When working in RStudio, you might have different projects running in parallel that you would like to keep separate. Even more importantly, you might want to reproduce an analysis you did previously with new data. Well, with an R script and data files, you will be able to reproduce your analysis.

You can open a new script by clicking on the New File icon in the toolbar and choosing R Script, or via File > New File > R Script. The script opens by default in the upper left-hand panel. You can then name and save your script to your current directory.


Notice that a “.R” file (“your_script_name.R”) has been created in your working directory. You can check this using the “Files” tab in the lower right-hand panel, or using the list.files() command.

> list.files()
[1] "Script_B4B_Advanced.R"

The output shows that this working directory now contains the newly created script file.
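
Once a directory holds many files, it can help to list only the scripts. The following is a minimal sketch using the pattern argument of list.files(). The demo directory and the extra file "notes.txt" are assumptions created in a temporary location so that the example is self-contained.

```r
# Build a small demo directory (hypothetical content, in a temporary location)
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "Script_B4B_Advanced.R"))
file.create(file.path(demo_dir, "notes.txt"))

# Keep only the files whose names end in ".R"
r_scripts <- list.files(demo_dir, pattern = "\\.R$")
print(r_scripts)

# Check for one specific file
file.exists(file.path(demo_dir, "Script_B4B_Advanced.R"))
```

In your own project you would call list.files(pattern = "\\.R$") without a directory argument to search the current working directory.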

4. Document your strategy: Add explicit comments to each step

Comments are classically used to describe a piece of code in a script, whether in R, bash or other languages. They consist of explanations or metadata that explain the steps of a script. In R, comments are single-line comments: simply add the “#” symbol at the start of each line you want to turn into a comment. You can also make R ignore a step of a program by “commenting it out” (placing a “#” in front of it).

# this code prints Hello World
> print("Hello World")

Comments are not essential to the code itself, as they are not interpreted and are ignored during the execution of the code. However, commenting is a wise and very common practice as it allows a user to:

  • Remember each working step of their code
  • Increase script readability for additional readers when the script is shared
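
As an illustration, here is a short commented script. The values are made up for the example. Each step carries an explanatory comment, and the last step has been commented out so that R skips it.

```r
# Made-up read counts for three samples (illustrative values only)
read_counts <- c(120, 95, 143)

# Step 1: compute the mean read count across samples
mean_count <- mean(read_counts)

# Step 2: report the result
print(mean_count)

# Step 3 is disabled: the leading "#" stops R from running this line
# print(max(read_counts))
```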

Summary

In summary, the RStudio environment gives you the possibility to:

  • Create a specific RStudio project for each data analysis project you have
  • Associate each project with a specific directory / sub-directory
  • Organize input and output data files in one project and one location (input files, analytical output files, scripts, plots, etc.), keeping all data associated with a project separate from others
  • Use comments in scripts to ease the understanding of each analytical step in a project
  • Load external input files into RStudio
© Wellcome Connecting Science
This article is from the free online

Bioinformatics for Biologists: Analysing and Interpreting Genomics Datasets

Created by
FutureLearn - Learning For Life
