RAdelaide 2024
July 9, 2024

library(tidyverse)
text.R
character vectors
Run 1 in one file and Run_001 in another
Regular expressions (regexp) are incredibly powerful tools in this space
regexp syntax is not unique to R
R does have a few unique “quirks” though
stringr contains functions for text manipulation
str_detect()
str_remove()
str_extract()
str_replace()
grepl(), grep(), gsub() etc. from base

stringr::str_detect()
str_detect() returns a logical vector
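As a minimal sketch of the point that str_detect() returns a logical vector, the snippet below searches a small character vector. The vector `hello` and its contents are illustrative assumptions, not the workshop data.

```r
library(stringr)

# Hypothetical example vector (an assumption for illustration)
hello <- c("Hi Mum", "Hi Mother")

# One TRUE/FALSE per element of the input
has_mum <- str_detect(hello, "Mum")
has_mum            # TRUE FALSE

# A logical vector can be used directly for subsetting
hello[has_mum]     # "Hi Mum"

# The closest base R equivalent
grepl("Mum", hello)
```

Because the result has the same length as the input, it can also be used inside dplyr::filter() when the text lives in a column.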
stringr::str_detect()
. as a wild card
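A short sketch of the wild card, assuming a hypothetical vector `hello`. In a regular expression, `.` matches any single character, so a literal full stop has to be escaped.

```r
library(stringr)

# Hypothetical example vector (an assumption for illustration)
hello <- c("Hi Mum", "Hi Mom", "Hi Mother")

# "M.m" means: M, then any single character, then m
wild <- str_detect(hello, "M.m")
wild               # TRUE TRUE FALSE

# To match a literal full stop, escape it
literal_dot <- str_detect(c("a.b", "acb"), "a\\.b")
literal_dot        # TRUE FALSE
```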
* has different meaning to many other contexts
. obviously needed to follow M in this search

stringr::str_detect()
[]
o or u needed to follow the M

stringr::str_detect()
Start of the string (^)
End of the string ($)

stringr::str_view()
str_view()

stringr::str_extract()
str_extract() to extract patterns
+
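The pieces above (character sets in [], the ^ and $ anchors, and the + quantifier) can be sketched together as follows. The vector `hello` is a hypothetical example, and the patterns are one reasonable choice rather than the ones used in the session.

```r
library(stringr)

# Hypothetical example vector (an assumption for illustration)
hello <- c("Hi Mum", "Hi Mom", "Hi Mam", "Mum says hi")

# [] lists alternatives: here o or u must follow the M
set_match <- str_detect(hello, "M[ou]m")
set_match          # TRUE TRUE FALSE TRUE

# ^ anchors the pattern to the start of the string
starts_hi <- str_detect(hello, "^Hi")
starts_hi          # TRUE TRUE TRUE FALSE

# $ anchors the pattern to the end of the string
ends_mum <- str_detect(hello, "Mum$")
ends_mum           # TRUE FALSE FALSE FALSE

# + means "one or more" of the preceding item
words <- str_extract(hello, "M[a-z]+")
words              # "Mum" "Mom" "Mam" "Mum"

# str_view() highlights matches; mainly useful in an interactive session
if (interactive()) str_view(hello, "M[ou]m")
```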
stringr::str_extract()
[:alpha:]
?base::regex

stringr::str_extract_all()
str_extract() will only return the first match
x
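A minimal sketch contrasting str_extract() with str_extract_all(), and showing a POSIX class such as [:alpha:] (documented in ?base::regex). The input strings are assumptions built from the "Run 1" / "Run_001" example mentioned earlier. Note that a POSIX class has to sit inside a bracket expression, hence the double brackets.

```r
library(stringr)

# Hypothetical example vector (an assumption for illustration)
x <- c("Run 1 and Run 22", "no digits here")

# Only the first match per string; NA where nothing matches
first_num <- str_extract(x, "[0-9]+")
first_num          # "1" NA

# All matches, returned as a list with one element per string
all_num <- str_extract_all(x, "[0-9]+")
all_num

# POSIX classes go inside [], giving [[:alpha:]]
letters_only <- str_extract("Run_001", "[[:alpha:]]+")
letters_only       # "Run"
```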
stringr::str_remove()
str_remove_all() will remove all occurrences
Very useful for removing file suffixes etc.
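As a sketch of the file-suffix use case, the file names below are hypothetical. The suffix pattern escapes the full stops and is anchored with $ so that only the end of the name is removed.

```r
library(stringr)

# Hypothetical file names (an assumption for illustration)
files <- c("sample1.fastq.gz", "sample2.fastq.gz")

# Remove the suffix, anchored to the end of the string
ids <- str_remove(files, "\\.fastq\\.gz$")
ids                # "sample1" "sample2"

# str_remove() drops only the first match ...
first_only <- str_remove("a-b-c", "-")
first_only         # "ab-c"

# ... while str_remove_all() drops every match
every <- str_remove_all("a-b-c", "-")
every              # "abc"
```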
stringr::str_replace()
str_replace() is used for extracting/modifying text strings
str_extract()
string “Hi Mum” for the pattern “Mum”, and

stringr::str_replace()
(pattern)
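A minimal sketch of str_replace() on the string "Hi Mum", including a captured (pattern). The replacement text "Dad" and the reordering example are assumptions chosen for illustration. Captured groups are referred to in the replacement as \\1, \\2 and so on.

```r
library(stringr)

# Search "Hi Mum" for the pattern "Mum" and replace it
# ("Dad" is an assumed replacement, not taken from the source)
swapped <- str_replace("Hi Mum", "Mum", "Dad")
swapped            # "Hi Dad"

# Parentheses capture parts of the match for reuse in the replacement
reordered <- str_replace("Hi Mum", "(Hi) (Mum)", "\\2! \\1!")
reordered          # "Mum! Hi!"
```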
stringr::str_replace()
* is like + except the match is zero or more times

stringr::str_replace()
str_replace() only replaces the first match in a string
str_replace_all() replaces all matches

paste() is a very useful one
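The difference between + and *, and between str_replace() and str_replace_all(), can be sketched as below. The test strings are hypothetical and chosen only to make the contrast visible.

```r
library(stringr)

# Hypothetical example vector (an assumption for illustration)
x <- c("Mm", "Mum", "Muuum")

# + requires at least one u between M and m
one_or_more <- str_detect(x, "Mu+m")
one_or_more        # FALSE TRUE TRUE

# * also accepts zero u's
zero_or_more <- str_detect(x, "Mu*m")
zero_or_more       # TRUE TRUE TRUE

# str_replace() changes only the first match in each string
first_vowel <- str_replace("Hi Mum", "[aeiou]", "o")
first_vowel        # "Ho Mum"

# str_replace_all() changes every match
all_vowels <- str_replace_all("Hi Mum", "[aeiou]", "o")
all_vowels         # "Ho Mom"
```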
" "paste0() has the default separator as ""glue has revolutionised text manipulation
glue
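As a rough sketch of building strings, the snippet below compares paste(), paste0() and glue. The variable names and values are assumptions for illustration. glue is loaded explicitly here because it is not attached by library(tidyverse).

```r
library(glue)

# paste() joins with " " by default
greeting <- paste("Hi", "Mum")
greeting           # "Hi Mum"

# paste0() joins with "" and is vectorised
runs <- paste0("Run_", 1:3)
runs               # "Run_1" "Run_2" "Run_3"

# collapse merges a vector into a single string
joined <- paste(c("a", "b", "c"), collapse = "+")
joined             # "a+b+c"

# glue evaluates R expressions placed inside {}
name <- "Mum"
n <- 3
msg <- glue("Hi {name}, you have {n} messages")
as.character(msg)  # "Hi Mum, you have 3 messages"
```

The glue version keeps the sentence readable as a template, which tends to matter more as the number of inserted values grows.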
character
tidyverse syntax (e.g. rlang)

A common data type in statistics is a categorical variable (i.e. a factor)
character vector/column
character vector
as.factor()
We can manually set these categories as levels using factor()
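A minimal sketch of the two routes from a character vector to a factor, using a hypothetical vector `pet`. as.factor() sorts the categories alphabetically, whereas factor() lets the levels be set by hand.

```r
# Hypothetical example vector (an assumption for illustration)
pet <- c("Dog", "Cat", "Dog", "Bird")

# as.factor() sorts the categories to form the levels
auto_levels <- levels(as.factor(pet))
auto_levels        # "Bird" "Cat" "Dog"

# factor() allows the levels to be set manually
f <- factor(pet, levels = c("Dog", "Cat", "Bird"))
manual_levels <- levels(f)
manual_levels      # "Dog" "Cat" "Bird"

# Underneath, a factor stores integer codes pointing at the levels
codes <- as.integer(f)
codes              # 1 2 1 3
```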
level

forcats
forcats is a part of the core tidyverse
factors
stringr
fct_inorder() sets categories in the order they appear
data.frame then apply fct_inorder() for nice structured plots
n entries
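As a sketch of fct_inorder(), the snippet below first applies it to a plain vector and then to a data.frame that has been sorted beforehand, which is one way to get bars or axes in a meaningful order when plotting. The data, column names and sort order are assumptions for illustration.

```r
library(forcats)

# Hypothetical example vector (an assumption for illustration)
pet <- c("Dog", "Cat", "Dog", "Bird")

# Levels follow the order of first appearance
appearance <- levels(fct_inorder(pet))
appearance         # "Dog" "Cat" "Bird"

# Hypothetical summary table: sort first, then fix the order as levels
df <- data.frame(pet = c("Dog", "Cat", "Bird"), n = c(2, 5, 1))
df <- df[order(df$n, decreasing = TRUE), ]
df$pet <- fct_inorder(df$pet)
sorted_levels <- levels(df$pet)
sorted_levels      # "Cat" "Dog" "Bird"
```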