
Starting Hadoop, RStudio and RHadoop

How to start the required services and check that everything is working inside the virtual machine. Dr Leon Kos will explain these tests.
Screen which demonstrates how to start DFS, YARN and RStudio
© PRACE and University of Ljubljana

We need to quickly check that everything works in our virtual machine and become familiar with the environment we will be working in.

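Open the terminal by clicking the black icon at the bottom left and type the following commands (the symbol $ is the terminal prompt, not part of the command you type). The first two start the HDFS and YARN daemons; the third lists our home directory in HDFS, which confirms that the filesystem responds: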

$ start-dfs.sh
$ start-yarn.sh
$ hadoop fs -ls
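
Before moving on, it can help to confirm that the daemons actually started. The following is a minimal sketch, not part of the course material. It assumes the JDK tool jps is on the PATH and that the virtual machine runs all five standard daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) on one host. The function name missing_daemons is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read jps output on stdin and print every expected
# Hadoop daemon that is NOT listed. Prints nothing if all are running.
missing_daemons() {
    jps_output=$(cat)
    # Assumed daemon set for a single-node setup (HDFS + YARN).
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <ClassName>"; the leading space keeps
        # "NameNode" from matching inside "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" | grep -q " ${daemon}\$"; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Run the check only if jps is available on this machine.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

An empty result suggests that all five daemons are up. Any name printed points to a daemon whose log file is worth inspecting.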

Then type:

$ rstudio &

to open the RStudio GUI. We should open a new R script file and save it (e.g., as init.R) in the local folder or any other folder of our choice. Next, we set the required environment variables by copying the following lines into the script file and executing them:

Sys.setenv(HADOOP_OPTS="-Djava.library.path=/usr/local/hadoop/lib/native")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
Sys.setenv(HADOOP_CMD="/usr/local/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.5.jar")
Sys.setenv(JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64")

Alternatively, we can execute the above lines one by one in the RStudio console.
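
The paths in these settings are specific to the course virtual machine, so a mistyped path or a different Hadoop version (for example, another hadoop-streaming jar) would only surface later as an error in R. As a rough sketch, assuming the paths are exactly those listed above, the following terminal check prints any of them that do not exist. The function name missing_paths is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print every argument that is not an existing
# file or directory. Prints nothing if all paths exist.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || printf '%s\n' "$p"
    done
}

# Paths copied from the Sys.setenv() lines; adjust them if your
# installation differs.
missing_paths \
    /usr/local/hadoop \
    /usr/local/hadoop/bin/hadoop \
    /usr/local/hadoop/lib/native \
    /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.5.jar \
    /usr/lib/jvm/java-8-openjdk-amd64
```

No output means every path was found. A printed path should be corrected in the matching Sys.setenv() line before loading the libraries.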

We then load RHadoop (in the RStudio console) by loading the libraries rhdfs and rmr2 and calling hdfs.init():

library(rhdfs)
library(rmr2)
hdfs.init()

You may see log output and some warning messages, but there should be no filesystem errors while executing the above commands. Any glyph rendering errors can simply be ignored.

Finally, try quitting RStudio and stopping the Hadoop servers with the counterpart commands stop-yarn.sh and stop-dfs.sh in the terminal.

This article is from the free online course Managing Big Data with R and Hadoop.

Created by
FutureLearn - Learning For Life

Our purpose is to transform access to education.

We offer a diverse selection of courses from leading universities and cultural institutions from around the world. These are delivered one step at a time, and are accessible on mobile, tablet and desktop, so you can fit learning around your life.

We believe learning should be an enjoyable, social experience, so our courses offer the opportunity to discuss what you’re learning with others as you go, helping you make fresh discoveries and form new ideas.
You can unlock new opportunities with unlimited access to hundreds of online short courses for a year by subscribing to our Unlimited package. Build your knowledge with top universities and organisations.

Learn more about how FutureLearn is transforming access to education