RAdelaide 2024
July 10, 2024
The key building blocks for R objects: Vectors
What is a vector?
Definition
A vector is zero or more values of the same type
A simple vector would be
[1] 1 2 3 4 5 6 7 8 9 10
What type of values are in this vector?
Another vector might be
[1] "a" "cat" "video"
What type of values are in this vector?
[1] "742" "Evergreen" "Tce"
What type of values are in this vector?
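As a minimal sketch, the three vectors printed above could be created as follows. The object names (`int_vec`, `char_vec`, `addr_vec`) are assumptions made for illustration; only the values come from the slides.

```r
# Object names are assumed; the values match the output shown above
int_vec  <- 1:10                           # the integers 1 to 10
char_vec <- c("a", "cat", "video")         # three character values
addr_vec <- c("742", "Evergreen", "Tce")   # note that "742" is quoted

# Typing the name of an object prints it to the Console
int_vec
char_vec
addr_vec
```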
The four main atomic vector types in R are:
logical: TRUE or FALSE
integer
numeric (also known as double)
character
Why are these called doubles?
These are the basic building blocks for all R objects
There are two more atomic vector types: complex & raw
All R data structures are built on these 6 vector types
What four defining properties might a vector have?
length()
typeof()
class()
attributes(): names etc.
Let’s try them on our vectors
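A short sketch of these functions in action. The vector definitions here are assumptions, chosen to resemble the vectors used elsewhere in the session.

```r
# Assumed example vectors
int_vec <- 1:10
dbl_vec <- c(0.618, 1, 2.618)

length(int_vec)      # number of elements
typeof(int_vec)      # "integer"
class(int_vec)       # "integer"
attributes(int_vec)  # NULL: a plain vector has no extra attributes

typeof(dbl_vec)      # "double"
class(dbl_vec)       # "numeric"
```

Note that `typeof()` and `class()` can disagree: a double vector reports the class "numeric".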
Were you surprised by any of the results?
We can combine two vectors in R, using the function c()
In a call such as c(1, 2), 1 & 2 are both vectors with length 1
What would happen if we combined two vectors of different types?
Q: What happened to the logical values?
Answer: R will coerce them into a common type (i.e. integers).
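The coercion can be checked directly. This sketch assumes `int_vec` holds the integers 1 to 10 and `logi_vec` holds three logical values; both definitions are illustrative.

```r
# Assumed definitions
int_vec  <- 1:10
logi_vec <- c(TRUE, TRUE, FALSE)

# Combining an integer and a logical vector
combined <- c(int_vec, logi_vec)
combined          # TRUE becomes 1, FALSE becomes 0
typeof(combined)  # "integer"
```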
What other types could logical vectors be coerced into?
Try using the functions: as.integer(), as.double() & as.character() on logi_vec
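One possible way to run this exercise, assuming `logi_vec` contains a mix of `TRUE` and `FALSE`:

```r
# Assumed definition
logi_vec <- c(TRUE, TRUE, FALSE)

as.integer(logi_vec)    # 1 1 0
as.double(logi_vec)     # same values, stored as doubles
as.character(logi_vec)  # "TRUE" "TRUE" "FALSE"
```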
One or more elements of a vector can be called using []
Double brackets ([[]]) can be used to return single elements only
If you tried y[[1:3]] you would receive an error message
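A brief sketch of both bracket styles. The contents of `y` are hypothetical, and `tryCatch()` is used only so that the error message can be inspected without stopping the script.

```r
# Hypothetical vector for illustration
y <- c(10, 20, 30, 40)

y[2]     # a single element
y[1:3]   # several elements at once
y[[2]]   # double brackets: exactly one element

# Asking for several elements with [[ ]] is an error on an atomic vector
msg <- tryCatch(y[[1:3]], error = function(e) conditionMessage(e))
msg
```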
If a vector has name attributes, we can call values by name
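As an illustration, the built-in vector `euro` (currency conversion rates) has a names attribute, so values can be requested by name. Using `euro` at this point is an assumption based on its appearance later in the slides.

```r
# euro is a named numeric vector that ships with R
head(names(euro))

# Call a single value by name
euro["ATS"]

# Or several values at once
euro[c("ATS", "BEF")]
```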
Try repeating the call-by-name approach using double brackets
What was the difference in the output?
[] returned the vector with the identical structure
[[]] removed the attributes & just gave the value
Is it better to call by position, or by name?
Things to consider:
What is really happening in a line such as euro[1:5]?
We are using the integer vector 1:5 to extract values from euro
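To make this explicit, the index can be stored as its own vector first. The name `idx` is hypothetical.

```r
# 1:5 is itself an integer vector
idx <- 1:5
typeof(idx)  # "integer"

# Using the stored vector gives the same result as typing 1:5 directly
euro[idx]
identical(euro[idx], euro[1:5])

# Any integer vector of positions will work
euro[c(1, 3, 5)]
```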
R Functions are designed to work on vectors
This is one of the real strengths of R
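A small sketch of vectorised behaviour, assuming a double vector `dbl_vec` with the illustrative values below. Each function or operator is applied to every element without writing a loop.

```r
# Assumed definition
dbl_vec <- c(0.618, 1, 2.618)

dbl_vec * 2    # arithmetic is applied to each element
sqrt(dbl_vec)  # so are most functions
dbl_vec > 1    # a logical test returns one TRUE/FALSE per element
```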
We can also combine the above logical test and subsetting
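For example, assuming the logical test is `dbl_vec > 1` and that `dbl_vec` is defined as below (both are assumptions), the result of the test can be placed inside the square brackets:

```r
# Assumed definition
dbl_vec <- c(0.618, 1, 2.618)

# Store the result of the logical test (hypothetical name)
is_large <- dbl_vec > 1

# Keep only the elements where the test is TRUE
dbl_vec[is_large]

# The same thing in a single step
dbl_vec[dbl_vec > 1]
```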
An additional logical test: %in% (read as: “is in”)
Returns TRUE/FALSE for each value in dbl_vec if it is in int_vec
NB: int_vec was coerced silently to a double vector
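A minimal example of `%in%`. The names `dbl_vec` and `int_vec` come from the slides, but the values assigned here are assumptions.

```r
# Assumed definitions
dbl_vec <- c(0.618, 1, 2.618)
int_vec <- 1:10

# One TRUE/FALSE for each value in dbl_vec
dbl_vec %in% int_vec

# The result can also be used for subsetting
dbl_vec[dbl_vec %in% int_vec]
```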
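Vectors are one dimensional & have a length attribute.
A matrix is the two dimensional equivalent
Matrices have the additional attributes dim(), nrow() & ncol()
Matrices can also have rownames() & colnames()
Some commands to try: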
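The commands might look like the following sketch. The name `int_mat` appears in the slides, but its definition here is an assumption.

```r
# Assumed definition: 6 integers arranged in 2 columns
int_mat <- matrix(1:6, ncol = 2)
int_mat

dim(int_mat)       # rows then columns: 3 2
nrow(int_mat)      # 3
ncol(int_mat)      # 2
length(int_mat)    # 6: the total number of values
rownames(int_mat)  # NULL until we assign them
```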
Ask questions if anything is confusing
Matrices are subset using x[row, col]
Leaving either row or col blank selects the entire row/column
How would we just get the first column?
NB: Forgetting the comma when subsetting will treat the matrix as a single vector running down the columns
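These rules can be sketched as follows, assuming `int_mat` is a 3 x 2 integer matrix as defined here.

```r
# Assumed definition
int_mat <- matrix(1:6, ncol = 2)

int_mat[2, 2]  # row 2, column 2: 5
int_mat[1, ]   # the entire first row: 1 4
int_mat[, 1]   # the entire first column: 1 2 3

# Without the comma, the matrix is treated as one vector running down the columns
int_mat[5]     # the 5th value: 5
```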
Requesting a row or column that doesn’t exist is the source of a very common error message
Error in int_mat[5, ] : subscript out of bounds
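The error can be reproduced safely by wrapping the call in `tryCatch()`, which is used here only to capture the message. The definition of `int_mat` is an assumption.

```r
# Assumed definition: only 3 rows exist
int_mat <- matrix(1:6, ncol = 2)

# Row 5 does not exist, so R raises an error
msg <- tryCatch(int_mat[5, ], error = function(e) conditionMessage(e))
msg
```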
Arrays extend matrices to 3 or more dimensions
Beyond the scope of today, but we just have more commas in the square brackets, e.g. x[i, j, k]
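A minimal three dimensional example. The object name `arr` and its dimensions are hypothetical.

```r
# 24 values arranged as 2 rows x 3 columns x 4 layers
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)

# One index per dimension, separated by commas
arr[1, 2, 3]

# Leaving the first two blank returns an entire 2 x 3 layer
arr[, , 1]
```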
Summary of main data types in R
| Dimension | Homogeneous | Heterogeneous |
|---|---|---|
| 1d | vector | list |
| 2d | matrix | data.frame |
| 3d+ | array | |
A list is a heterogeneous vector.
Each component can be a different type of R object:
a vector, or matrix
a list
an R object type we haven’t seen yet
These are incredibly common in R
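A small list built by hand shows how different the components can be. All names and values below are hypothetical.

```r
# Each component is a different kind of R object
my_list <- list(
  numbers = 1:3,                   # an integer vector
  words   = c("a", "cat"),         # a character vector
  mat     = matrix(1:4, ncol = 2), # a matrix
  nested  = list(flag = TRUE)      # another list
)

typeof(my_list)  # "list"
length(my_list)  # 4 components
names(my_list)
```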
Many R functions provide output as a list
NB: There is a function (print.htest()) that tells R how to print the results to the Console
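As a sketch, a one-sample t-test returns this kind of list. The input values and the name `test_results` are assumptions made for illustration.

```r
# Assumed input data
dbl_vec <- c(0.618, 1, 2.618)

# t.test() returns an object of class "htest"
test_results <- t.test(dbl_vec)
class(test_results)   # "htest"
typeof(test_results)  # "list": underneath, it is just a list
names(test_results)   # the components of the list

# Printing uses the method for htest objects to format the output
test_results
```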
We can call the individual components of a list using the $ symbol followed by the name
Note that each component is quite different to the others.
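For instance, using a t-test result as the example list (the data and object name are assumptions):

```r
# Assumed example: the output of a one-sample t-test
test_results <- t.test(c(0.618, 1, 2.618))

test_results$statistic  # a named numeric vector of length 1
test_results$p.value    # a single number
test_results$conf.int   # two numbers, with an extra attribute
```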
A list is a vector so we can also subset using the [] method
Single brackets always return a list
Double brackets again retrieve a single element of the vector
This returns the component itself, as the underlying R object
When would we use either method?
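One way to compare the two methods is to check the type of what comes back. The example list below is an assumed t-test result.

```r
# Assumed example list
test_results <- t.test(c(0.618, 1, 2.618))

# Single brackets: still a list, now with one component
sub_list <- test_results["p.value"]
typeof(sub_list)  # "list"

# Double brackets: the component itself
p <- test_results[["p.value"]]
typeof(p)         # "double"

# Double brackets and $ retrieve the same object
identical(p, test_results$p.value)
```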
Finally! Data frames
Each column is a vector
Data frames have many of the same attributes as matrices: dim(), nrow(), ncol(), rownames(), colnames()
colnames() & rownames() are NOT optional \(\implies\) assigned by default
tibble variants have simple row numbers as rownames
Let’s load pigs again
Individual entries can also be extracted using the square brackets
Thinking of columns being vectors is quite useful
We can call each column vector of a data.frame using the $ operator
This does NOT work for rows!!!
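The `pigs` data is loaded from a file that is not available here, so this sketch uses the built-in `ToothGrowth` data frame as a stand-in. The name `pigs_like` is hypothetical.

```r
# Stand-in for pigs: a built-in data frame with 3 columns
pigs_like <- ToothGrowth

dim(pigs_like)             # 60 rows, 3 columns
colnames(pigs_like)
head(rownames(pigs_like))  # row names were assigned by default

# Square brackets work as for a matrix
pigs_like[1, 2]    # a single entry
pigs_like[1:3, ]   # the first three rows

# Each column can be extracted as a vector
head(pigs_like$len)
```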
R is column major by default (as is FORTRAN & Matlab)
R was designed for statistical analysis, but has developed capabilities far beyond this
We will see this advantage this afternoon
Data frames are actually special cases of lists
Each column of a data.frame is a component of a list
Forgetting the comma, now gives a completely different result to a matrix!
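A sketch of this behaviour, using the built-in `ToothGrowth` data frame as a stand-in for `pigs` (an assumption; the name `df` is hypothetical):

```r
# Stand-in data frame
df <- ToothGrowth

typeof(df)  # "list": a data frame is stored as a list of columns
length(df)  # 3: the number of columns, not the number of values

# Without a comma, [ ] selects list components, i.e. whole columns
first <- df[1]
class(first)  # still a data.frame
dim(first)    # 60 rows, 1 column
```

For a matrix, the same call would have returned only the first value.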
Was that what you expected?
Try using the double bracket method
What do you think will happen if we type:
Error: Column index must be at most 3 if positive, not 5
R Objects
How do we assign names?
Can we remove names?
The NULL, or empty, vector in R is created using c()
We can also use this to remove names
Don’t forget to put the names back…
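A minimal sketch of assigning, removing and restoring names. The vector `x` and the helper name `saved_names` are hypothetical.

```r
# Hypothetical vector
x <- c(1.5, 2.5, 3.5)

# Assign names
names(x) <- c("a", "b", "c")
x

# c() with no arguments returns NULL
is.null(c())

# Keep a copy of the names, then remove them
saved_names <- names(x)
names(x) <- c()
x

# Put the names back
names(x) <- saved_names
x
```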
We can convert vectors to matrices, as earlier
R is column major so fills columns by default
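A short sketch of the default fill order, assuming `int_vec` holds the integers 1 to 10. The `byrow` argument of `matrix()` changes this behaviour.

```r
# Assumed definition
int_vec <- 1:10

# Values fill the first column, then the second
by_col <- matrix(int_vec, ncol = 2)
by_col

# Setting byrow = TRUE fills along the rows instead
by_row <- matrix(int_vec, ncol = 2, byrow = TRUE)
by_row
```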
We can assign row names & column names after creation
Or using dimnames()
This is a list of length 2 with rownames then colnames as the components.
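Both approaches can be sketched on a small matrix. The matrix and all of the names used are hypothetical.

```r
# Hypothetical matrix
mat <- matrix(1:6, ncol = 2)

# Assign row names & column names separately
rownames(mat) <- c("r1", "r2", "r3")
colnames(mat) <- c("a", "b")
dimnames(mat)  # a list: rownames first, then colnames

# Or set both at once with a list of length 2
dimnames(mat) <- list(c("x", "y", "z"), c("first", "second"))
mat

# Names can then be used for subsetting
mat["y", "second"]
```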
OR
What happens if we try this?
Creating a data.frame is exactly the same as creating lists, but
The names attribute will also be the colnames()
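The similarity can be seen by building both objects from the same components. All object and column names below are hypothetical.

```r
# A list and a data.frame created with the same syntax
my_list <- list(id = 1:3, name = c("a", "b", "c"))
my_df   <- data.frame(id = 1:3, name = c("a", "b", "c"))

names(my_list)
names(my_df)

# For a data.frame, the names are also the column names
colnames(my_df)
identical(names(my_df), colnames(my_df))
```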
What happens if we try to combine components that aren’t the same length?
