RAdelaide 2024
July 9, 2024
Vectors (e.g. x <- 1:5)
The R equivalent to a spreadsheet is known as a data.frame
tibble: a data.frame with pretty bows & ribbons
tbl_df objects referring to SQL tables
data.frames are structured with vectors as columns
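As a minimal sketch of the idea that a data.frame is built from vectors, the example below makes a vector and then uses it as one column of a small data.frame. The column names and values are made up for illustration and are not part of the workshop data.

```r
# A vector: a sequence of values of a single type
x <- 1:5

# A data.frame: each column is a vector, and all columns have the same length
# (hypothetical column names and values, for illustration only)
df <- data.frame(id = x, letter = letters[1:5])

df$id      # extract one column, which is again a vector
nrow(df)   # number of rows
ncol(df)   # number of columns
```

Each column can hold a different type, but every value within a column shares the same type.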
If we’re not careful:
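File > New File > R Script (or Ctrl+Shift+N)
Save the script as GuineaPigs.R
Then get the data for this exercise.
Download data.zip from the workshop homepage
Extract it into the RAdelaide24 directory so the files sit in data
(The files should be in data, not in data/data)
Navigate to the data directory using the Files pane
(You should see pigs.csv in data)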
Preview pigs.csv by clicking on it (View File)
Click on pigs.csv, choose Import Dataset then stop! 🛑

(Click Update if you don’t see this)
We have a preview of the data
We also have a preview of the code we’re about to execute
Copy the code from the Code Preview Box
Click Import
Now paste the copied code at the top of your script
The code we copied has 3 lines:
library(readr) loads the package readr
readr functions are about importing data
readr contains the function read_csv()
read_csv() tells R what to do with a csv file
The second line creates an object in our R Environment, named pigs by using the file name (pigs.csv)
View(pigs) opens a preview in an Excel-like format
Close the preview by clicking the cross
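Using read_csv(), pigs is now in our R Environment
In the Environment Tab click the broom icon 🧹
This clears the R Environment
Select the code in the script and click Run
Check the Environment Tab again and pigs is back
You can delete the line View(pigs)
Code can also be run with Ctrl/Cmd + Enter
Add a line after library(readr), then type a comment
R considers anything after # to be a comment
pigs is known as a data.frame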
The R equivalent to a spreadsheet
Missing values appear as NA
Instead of View() \(\implies\) preview by typing the object name
Gives a preview up to 10 lines with:
The object type: A tibble
The dimensions: 60 X 3
The column names: len, supp, dose
The column types: <dbl>, <chr>, <chr>
I personally find this more informative than View()
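To see this kind of preview without needing pigs.csv, here is a small sketch that builds a stand-in tibble from text typed directly into the script. The three rows are invented, pigs_demo is a hypothetical name, and passing literal text through I() assumes readr version 2.0 or later.

```r
library(readr)

# Hypothetical stand-in for pigs.csv: three made-up rows typed inline.
# I() tells read_csv() this is literal data, not a file path (readr >= 2.0).
pigs_demo <- read_csv(
  I("len,supp,dose\n4.2,VC,Low\n11.5,VC,Low\n16.5,OJ,Med"),
  show_col_types = FALSE
)

# Typing the object name prints the tibble preview:
# object type, dimensions, column names and column types
pigs_demo
```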
readr uses a variant called a tbl_df or tbl (pronounced tibble)
A data.frame with convenient features
The tidyverse is a collection of thematically-linked packages
library(tidyverse) loads all of these packages
readr is one of these \(\implies\) usually just load the tidyverse
[1] "broom" "conflicted" "cli" "dbplyr"
[5] "dplyr" "dtplyr" "forcats" "ggplot2"
[9] "googledrive" "googlesheets4" "haven" "hms"
[13] "httr" "jsonlite" "lubridate" "magrittr"
[17] "modelr" "pillar" "purrr" "ragg"
[21] "readr" "readxl" "reprex" "rlang"
[25] "rstudioapi" "rvest" "stringr" "tibble"
[29] "tidyr" "xml2" "tidyverse"
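The listing above has the shape of the output from tidyverse_packages(). Assuming that is where it came from, a minimal sketch to reproduce it and to check that readr is included would be:

```r
library(tidyverse)

# List every package that belongs to the tidyverse
pkgs <- tidyverse_packages()
pkgs

# Check that readr is one of them
"readr" %in% pkgs
```

The exact list depends on the installed version of the tidyverse, so it may not match the listing above line for line.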
Replace library(readr) with library(tidyverse)
glimpse() is from the package pillar and is available after library(tidyverse)
What were the differences between each method?
head() and glimpse() are both R functions
Both were given pigs as their input
The help page for head() shows two arguments \(\implies\) x and n
x has no default value \(\implies\) we need to provide something
n = 6L means n has a default value of 6 (L \(\implies\) integer)
Lower down the page you’ll see
Arguments
x an object
n an integer vector of length up to dim(x) (or 1, for non-dimensioned objects). Blah, blah, blah…
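A short sketch of how the two arguments behave in practice. It uses ToothGrowth, a dataset that ships with R, as a stand-in so the code runs without pigs.csv; the object names first_six and first_three are made up for this example.

```r
# x must always be supplied; n falls back to its default of 6
first_six <- head(ToothGrowth)

# Supplying n by name overrides the default
first_three <- head(ToothGrowth, n = 3)

nrow(first_six)
nrow(first_three)
```

Calling head() with no arguments at all gives an error, because x has no default value.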
head() prints the first part of an object
glimpse() is from the package pillar
Try changing the width argument to see what happens
The R function read_csv()
read_csv(
file,
col_names = TRUE, col_types = NULL, col_select = NULL,
id = NULL, locale = default_locale(),
na = c("", "NA"), quoted_na = TRUE,
quote = "\"", comment = "",
trim_ws = TRUE,
skip = 0, n_max = Inf,
guess_max = min(1000, n_max),
name_repair = "unique",
num_threads = readr_threads(),
progress = show_progress(),
show_col_types = should_show_types(),
skip_empty_rows = TRUE,
lazy = should_read_lazy()
)
read_csv() has many arguments (file, col_names etc.)
Most have default values (e.g. col_names = TRUE)
All arguments were defined somewhere in the GUI.
e.g. the First Row as Names checkbox sets col_names
Try clicking/unclicking a few more & try to understand the consequences
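The effect of unticking First Row as Names can be sketched in code by setting col_names = FALSE. The data below is made up and typed inline with I(), which assumes readr version 2.0 or later; csv_text, with_header and no_header are hypothetical names.

```r
library(readr)

# Made-up data: one header row followed by two rows of values
csv_text <- "len,supp\n4.2,VC\n11.5,OJ"

# Default (col_names = TRUE): the first row becomes the column names
with_header <- read_csv(I(csv_text), show_col_types = FALSE)

# col_names = FALSE: the first row is treated as data and
# readr creates the names X1, X2, ...
no_header <- read_csv(I(csv_text), col_names = FALSE, show_col_types = FALSE)

names(with_header)
names(no_header)
nrow(no_header)   # one extra row, because the header is now data
```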
read_csv() and NAs
read_csv() Vs read.csv()
RStudio now uses read_csv() from readr by default
You will often see read.csv() (from utils) in older scripts
The newer (readr) version is:
Functions in utils are read.*() (csv, delim etc.)
readr has the functions read_*() (csv, tsv, delim etc.)
readxl is for loading .xls and .xlsx files.
Import Dataset
Sheet1 looks pretty simple
Sheet2 & Sheet3