RStudio: A Walkthrough
I’m going to use R and RStudio a lot in subsequent posts, so I would first like to demonstrate how to install and use these extremely useful tools.
R
RStudio depends on R, so download and install R first before moving on to installing RStudio.
Ubuntu
On Ubuntu, all you need to do to download and install R and its dependencies is run the following commands (the repository line targets Ubuntu 16.04, codenamed xenial):
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo add-apt-repository 'deb [arch=amd64] https://cran.rstudio.com/bin/linux/ubuntu xenial/'
sudo apt-get update
sudo apt-get install r-base
Other Operating Systems
If you have Windows, Mac OS X or a different distribution of Linux, follow the instructions in the Download and Install R section of the RStudio website.
RStudio
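Whichever route you take, one way to confirm that the installation worked is to start R and ask it which version is running. This is only a minimal sketch; the exact output depends on your machine and on the R version you installed.

```r
# Print the version string of the running R installation
R.version.string

# Compare the running version against a minimum you care about;
# "3.0.0" is just an illustrative threshold, not a requirement of RStudio
getRversion() >= "3.0.0"

# Summarise the platform, locale and attached packages
sessionInfo()
```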
Download
RStudio Desktop can be downloaded from the RStudio home page. As described here, the free version has the same features as the paid version but comes with a different license (AGPL) and without support.
If you decide to go with the free version, click on the Download button located at the bottom of the FREE column and focus on the Installers section of the download page. There you should pick the installer that matches your operating system – there is an EXE installer for Windows, DMG for Mac OS X, DEB package for Ubuntu/Debian and RPM for Fedora/RedHat/openSUSE.
Installation
Run the installer. When it completes, start RStudio Desktop so that you can inspect its user interface.
Working with Projects
Download the Functions.zip archive – it contains a sample project which we will use in this section to describe the various tabs of RStudio. After the archive has been downloaded, extract it, go into the newly created Functions folder and double-click on the Functions.Rproj icon. This opens up the Functions project in RStudio.
On the Files tab in the bottom-right corner of the window (indicated with red in the screenshot above), double-click on the 01_linear.R script. A new tab with that script opens up in the top-left section of RStudio’s window. In that new tab, select the following lines and press Ctrl+Enter.
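Opening a project also makes the project folder the working directory of the R session. If you want to check this, the following lines can be typed into the RStudio console; they are a small sketch that assumes the Functions project is the one currently open.

```r
# Show the current working directory; with the project open this
# should be the extracted Functions folder
getwd()

# List the R scripts that live in the working directory
list.files(pattern = "\\.R$")

# Check whether the script used in this walkthrough is present
file.exists("01_linear.R")
```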
constFun <- function(x) rep(3, length(x))
curve(constFun, -1, 6)
Pressing Ctrl+Enter executes the selected portion of a script, which causes multiple tabs to update:
- Console (bottom-left section of the window) – the executed code appears here, along with any errors and warnings.
- Environment (top-right section of the window) – the first line of the selected script defines a new object, namely a function called constFun. Once this line has executed, the function shows up in the Environment tab.
- Plots (bottom-right section of the window) – when the second line of the script is executed, the Plots tab takes focus from the Files tab and shows a plot of the constant function constFun.
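To see why constFun is written with rep() instead of simply returning 3, it helps to call it by hand in the console. curve() evaluates the function on a whole vector of x values and expects one y value back for each of them. The sketch below illustrates this; scalarFun is a hypothetical counter-example that is not part of the sample project.

```r
# The function from 01_linear.R: returns one 3 for every element of x
constFun <- function(x) rep(3, length(x))

# Calling it on three x values gives three y values
constFun(c(-1, 0, 6))

# After the definition has run, the object exists in the session,
# which is what the Environment tab reflects
exists("constFun")

# Hypothetical counter-example: ignores the length of x and always
# returns a single value, so it does not give one y per x
scalarFun <- function(x) 3
length(scalarFun(c(-1, 0, 6)))
```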
Getting Help on R
You may want to go through the Introduction to R to familiarize yourself with the R language.
Also, please keep in mind that you can open the documentation of any R function by running ?functionname in the RStudio console. For example, the command below will get you the documentation of the lm function:
?lm
If you want to study the documentation of a particular package, issue the command library(help = "packagename"). For example, you can use the following command to get the documentation for the datasets package:
library(help = "datasets")
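A few other built-in helpers can be useful when you do not remember the exact function name. The lines below are a small sketch using only base R; the help calls are wrapped in interactive() because they open the Help tab and only make sense in a live session.

```r
if (interactive()) {
  help("lm")                    # the long form of ?lm
  help(package = "datasets")    # an index of the help pages in a package
  help.search("linear model")   # full-text search, same as ??"linear model"
}

# Show the argument list of a function without opening its help page
args(lm)

# List objects whose names start with "lm"
apropos("^lm")
```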
Getting Sample Data
Sample data will certainly come in handy while you are learning the R language or the functions in its packages.
The following commands can be used to load and explore the Iris data set from the datasets package (it comes with the standard R installation):
data(iris)
dim(iris)
levels(iris$Species)
head(iris)
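Beyond dim() and head(), a few more base R functions give a quick overview of a data frame. The following sketch continues with the iris data; species_means is just an illustrative variable name.

```r
data(iris)

# Column names, types and the first few values of each column
str(iris)

# Per-column summary statistics (quartiles for numbers, counts for factors)
summary(iris)

# Number of observations per species
table(iris$Species)

# Mean of every numeric column, grouped by species
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
species_means
```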
Or you can install a new package with new data sets, as illustrated below:
install.packages("mlbench")
library(mlbench)
library(help = "mlbench")
data(BostonHousing)
dim(BostonHousing)
head(BostonHousing)
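If you would like to see which data sets a package offers before loading any of them, data(package = ...) returns an index you can inspect. The sketch below wraps this in a small helper; list_datasets is a hypothetical name chosen for this example, and the mlbench part only runs if that package is already installed.

```r
# Hypothetical helper: returns the names and titles of the data sets
# shipped with an installed package
list_datasets <- function(pkg) {
  stopifnot(requireNamespace(pkg, quietly = TRUE))
  data(package = pkg)$results[, c("Item", "Title")]
}

# The datasets package comes with the standard R installation
head(list_datasets("datasets"))

# mlbench is optional, so only query it when it is installed
if (requireNamespace("mlbench", quietly = TRUE)) {
  head(list_datasets("mlbench"))
}
```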
Further Learning Resources
RStudio Wiki
YouTube
Data Sets
- Machine Learning Datasets in R (10 datasets you can use right now)
- The R Datasets Package
- R Built-in Data Sets