Using RStudio To Write Scripts
ASI: Introduction to R
RStudio
Introduction to RStudio
R and RStudio are two separate but connected things
- R is like the engine of your car
  - We’ve just tinkered with the engine
- RStudio is the ‘cabin’ we use to control the engine
  - Known as an IDE (Integrated Development Environment)
  - Comes with extra features not related to R
- R does all the calculations, manages the data, generates plots
- RStudio helps manage our code, display the plots etc
What is RStudio
- RStudio is a product of a for-profit company (Posit)
- RStudio (Desktop) is free
- RStudio Server has an annual licence fee in the $’000s
- Posit employs many of the best & brightest package developers
  - e.g. tidyverse, bookdown, reticulate, roxygen2 etc
- The CEO (JJ Allaire) is still an active developer
- Other IDEs also exist (e.g. emacs, VSCode, Positron)
- I remember being at the launch of RStudio (Coventry, 2011). It was a room full of R programmers thinking “holy crap, this changes everything”
- RStudio/Posit is a corporation whilst R is an academic-led volunteer community. So far relatively good relationship
- Heard JJ Allaire present some of his latest work a couple of years ago
Some very helpful features of RStudio
- We can write scripts and execute code interactively
- We can see everything we need (directories, plots, code, history etc.)
- Predictive auto-completion
- Integration with GitHub Copilot
- Integration with other languages
- markdown, \(\LaTeX\), bash, python, C++, git etc.
- Numerous add-ons to simplify larger tasks
- Posit is now developing Positron to better enable a variety of languages
Create an R Project
I use R Projects to manage each analysis
- Create a directory on your computer for today’s material
  - We recommend R_Training in your home directory
- Now open RStudio
- RStudio will always open in a directory somewhere
  - Look in the Files pane (bottom-right) to see where it’s looking
  - (Or type getwd() in the Console pane)
  - This is the current working directory for R
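As a small sketch of the check described above, the following can be typed in the Console. The folder name R_Training is just the recommendation from this slide, and the path printed on your machine will differ.

```r
# Store and print the current working directory
wd <- getwd()
wd

# List the files R can see in that directory
list.files(wd)

# Check whether the recommended folder exists in your home directory
# ("~" is expanded to the home directory by path.expand())
dir.exists(path.expand(file.path("~", "R_Training")))
```

If the last line returns FALSE, the directory has not been created yet (or was created somewhere else).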
Create An Empty R Script
File > New File > R Script
- Save As Introduction.R
RStudio
This is the basic layout we often work with
We’ll come back to the script window in the next section
The R Console
- This is the R Console within the RStudio IDE
- We’ve already explored this briefly
- In the same pane we also have two other tabs
  - Terminal: An approximation of a bash terminal (or PowerShell for Windows)
  - Background Jobs: shows progress when compiling RMarkdown & Quarto \(\implies\) Not super relevant
The R Environment
Like we did earlier, in the R Console type:
> x <- 5
Where have we created the object x?
- Is it on your hard drive somewhere?
- Is it in a file somewhere?
- We have placed x in our R Environment
- Formally known as your Global Environment
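One way to convince yourself that x lives in memory rather than in a file is to ask R what it currently holds. This is only a sketch; when typed in the Console, "the current environment" is the Global Environment shown in the Environment Tab.

```r
# Create the object, as we did in the Console
x <- 5

# List the names of all objects in the current environment
ls()

# Ask whether an object called x can be found
exists("x")

# Print its value
x
```

Nothing has been written to disk here: closing R without saving the workspace discards x, which is why the script, not the environment, is the record we keep.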
The History Tab
- Next to the Environment Tab is the History Tab
- Keeps a record of the last ~200 lines of code
- Very useful for remembering steps during exploration
- Best practice is to enter + execute code from the Script Window
- We can generally ignore the Connections and any other tabs
  - A git tab will also appear for those who use git in their project
Accessing Help
- May be issues with: URL '/help/library/base/html/00Index.html' not found
- The examples in this help page are a bit rubbish…
> ?sqrt
- This will take you to the Help Tab for the sqrt() function
  - Contents may look confusing at this point but will become clearer
- Many inbuilt functions are organised into a package called base
  - Packages group similar/related functions together
  - base is always installed and loaded with R
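To see for yourself that sqrt() comes from base, and that base is already loaded, something like the following should work in a standard R session (find() lives in the utils package, which is attached by default).

```r
# Which attached package provides sqrt()?
find("sqrt")

# Is base among the currently attached packages?
"package:base" %in% search()

# The function itself behaves as the help page describes
sqrt(16)
```

Typing help("sqrt") is equivalent to ?sqrt and opens the same page in the Help Tab.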
Additional Sources For Help
As a package author, I’m always reading my own help pages. I simply can’t remember everything I’ve written
- Help pages in R can be hit & miss
  - Some are excellent and informative \(\implies\) some aren’t
- I regularly read my own help pages
- Bioconductor has a support forum for Bioconductor packages
- https://support.bioconductor.org
- Many packages have a vignette (again of varying quality)
- Google is your friend \(\implies\) maybe ChatGPT?
The Plots Pane
- We’ve already seen the Files Tab
- Plots appear in the Plots Tab
> plot(cars)
Cheatsheet and Shortcuts
Help > Cheatsheets > RStudio IDE Cheat Sheet
Page 2 has lots of hints:
- Ctrl + 1 places focus on the Script Window
- Ctrl + 2 places focus on the Console
- Ctrl + 3 places focus on the Help Tab
The Script Window
RStudio: The Script Window
Best practice for analysis is to enter all code in the Script Window
- This is a plain text editor \(\implies\) RStudio will:
  - highlight syntax for us
  - help manage indenting
  - enable auto-completion (it can be slower than your typing)
- Enter code in your script and send it to the R Console
- We save this file as a record of what we’ve done
- Code is the important object \(\implies\) can recreate all results
Vectors
- The object x is a vector \(\implies\) fundamental structure in R
  - Like a single column in a spreadsheet
- In R when we pass a vector to a function, the entire vector is evaluated
  - No need to select a column from your spreadsheet
Add the following to your script
# Which values of x are greater than one?
x > 1
[1] FALSE TRUE TRUE TRUE TRUE
Notice we have a value returned for each element of x
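The same element-by-element behaviour applies to most functions and operators. The sketch below assumes x holds the integers 1 to 5, which is what the output above implies; that definition is not shown in this section, so treat it as an assumption.

```r
# Assumption: x contains the integers 1 to 5
x <- 1:5

# The comparison is applied to every element
x > 1

# Arithmetic and functions are also applied to every element
x * 2
sqrt(x)

# Some functions summarise the whole vector into a single value
sum(x)
```

Here x * 2 returns five values while sum(x) returns one, so it is worth checking the length of what comes back.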
Subsetting Vectors
- We can subset vectors using square brackets []
# What are the first 5 letters of the alphabet?
letters[1:5]
[1] "a" "b" "c" "d" "e"
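Square brackets accept more than a simple range. A few further patterns, using the built-in letters vector, are sketched here.

```r
# Pick specific positions by combining them with c()
letters[c(1, 26)]

# A negative index drops those positions instead of keeping them
letters[-(1:23)]

# The index can itself be a stored vector
first_five <- 1:5
letters[first_five]
```

The first call returns "a" and "z"; the second drops the first 23 letters and so returns "x", "y" and "z".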
Vector Types
R has 6 types of atomic vectors \(\implies\) only 4 are commonly used
- logical: Can only contain TRUE or FALSE (or the missing value NA)
  - Conceptually binary values, although R stores each one in 32 bits
- integer: Only contains whole numbers
  - 32 bit upper limit
- numeric: Contains numbers with decimal points (aka doubles)
  - Larger memory requirements than integers
- character: Contains text
Remaining types are:
- complex: numbers with an imaginary part (i.e. involving \(\sqrt{-1}\))
- raw: holds raw bytes, e.g. charToRaw("abc")
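typeof() reports which of these types a vector has. A quick tour of all six, as a sketch:

```r
typeof(TRUE)              # "logical"
typeof(1L)                # "integer" (the L suffix requests an integer)
typeof(1.5)               # "double"
typeof("a")               # "character"
typeof(1i)                # "complex"
typeof(charToRaw("abc"))  # "raw"

# The largest value an integer can hold (the 32 bit limit)
.Machine$integer.max

# Doubles need more memory than integers for the same number of values
object.size(numeric(1000)) > object.size(integer(1000))
```

Note that a plain 1 typed at the Console is a double; only 1L is an integer.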
Examples
A logical vector is returned by a logical test
# This returns a logical vector the same length as x
# Let's save the output as a new vector using the <- symbol
logi_vec <- x > 1
logi_vec
[1] FALSE TRUE TRUE TRUE TRUE
typeof(logi_vec)
[1] "logical"
Taking square roots will return values with decimal points
# The square roots have decimal points so they are doubles
dbl_vec <- sqrt(x)
dbl_vec
[1] 1.000000 1.414214 1.732051 2.000000 2.236068
typeof(dbl_vec)
[1] "double"
Coercion
- Vectors can be coerced to other types
# Coercing x to a character vector will show every element with quotation marks
char_vec <- as.character(x)
char_vec
[1] "1" "2" "3" "4" "5"
typeof(char_vec)
[1] "character"
- Can easily coerce in order of complexity (logical, then integer, then numeric, then character) without information loss
- Information can be lost going backwards
as.integer(logi_vec)
[1] 0 1 1 1 1
as.logical(x)
[1] TRUE TRUE TRUE TRUE TRUE
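The loss of information becomes visible when we try to make a round trip. This sketch again assumes x holds the integers 1 to 5.

```r
# Assumption: x contains the integers 1 to 5
x <- 1:5

# integer -> character -> integer gets us back where we started
identical(as.integer(as.character(x)), x)

# integer -> logical -> integer does not: every non-zero value became TRUE
as.integer(as.logical(x))

# double -> integer drops everything after the decimal point
as.integer(c(1.9, -1.9))

# Text that is not a number becomes NA (R also gives a warning)
suppressWarnings(as.integer("abc"))
```

The second call returns five 1s, so the original values 2 to 5 cannot be recovered from the logical vector.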
Advanced Subsetting
- We could use our results from the logical test to subset x
# These two commands return the same vector
x[x > 1]
[1] 2 3 4 5
x[logi_vec]
[1] 2 3 4 5
# This returns the positions within logi_vec which are TRUE
which(logi_vec)
[1] 2 3 4 5
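A few related patterns follow from the same idea. As before, x is assumed to hold the integers 1 to 5 and logi_vec is recreated so the sketch runs on its own.

```r
# Assumption: x contains the integers 1 to 5
x <- 1:5
logi_vec <- x > 1

# ! flips TRUE and FALSE, so this keeps the values that failed the test
x[!logi_vec]

# Positions from which() can be used as an index too
x[which(logi_vec)]

# TRUE counts as 1, so sum() tells us how many values passed the test
sum(logi_vec)
```

Because the values of x happen to equal their positions, which(logi_vec) and x[logi_vec] print the same numbers here; with other data they would differ.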
Creating Vectors
- Normally to create vectors we use c()
- Stands for combine (i.e. we combine vectors)
- Returns an empty vector (i.e. NULL) when called with no arguments
c()
NULL
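In practice c() is given values, or existing vectors, to combine. The object names below are made up for illustration.

```r
# Combine individual values into a vector
num_vec <- c(1, 5, 10)
num_vec

# Combine existing vectors into a longer one
longer_vec <- c(num_vec, 20, 50)
length(longer_vec)

# Mixing types coerces everything to the most complex type present
mixed_vec <- c(TRUE, 2L, 3.5, "a")
typeof(mixed_vec)

# With no arguments there is nothing to combine
is.null(c())
```

The mixed vector ends up as character, which ties back to the coercion order seen earlier.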
Conclusion
- Make sure you save the file Introduction.R
- This is now a complete R Script
- Can be re-run at any time in the same order
- Will always produce identical results
- We’ve also (accidentally) learned about vectors
- Will be super helpful for the rest of the workshop
- Vectors are the most fundamental structure in R