body(summary)
UseMethod("summary")
RAdelaide 2024
July 11, 2024
Tools > Install Packages...
install.packages("pkg_name")
Seurat for scRNA
ngsReports nearly every day (still…)
R) in 2001
R generally has bi-annual releases (R 4.4.0 April 24th, 2024)
BiocManager is a CRAN package
GitHub
DESeq2 & edgeR for bulk RNA-Seq Analysis
DiffBind & extraChIPs for ChIP-Seq Analysis
fgsea for GSEA within R
GenomicRanges for working with GRanges objects
Taken from https://carpentries-incubator.github.io/bioc-project/02-introduction-to-bioconductor.html
body(function_name) always has comments removed
R directory
browseVignettes()
R has two common types of objects
S3 objects are very common & old (S itself dates from the 1970s)
lm() or t.test()
S4 introduced in the ’90s
S4 objects
summary()
vector or data.frame
How does summary() know what to do for different data structures?
summary() it’s a bit odd
summary() uses different methods depending on the object class
 [1] summary.aov summary.aovlist*
[3] summary.aspell* summary.check_packages_in_dir*
[5] summary.connection summary.data.frame
[7] summary.Date summary.default
[9] summary.ecdf* summary.factor
[11] summary.glm summary.infl*
[13] summary.lm summary.loess*
[15] summary.manova summary.matrix
[17] summary.mlm* summary.nls*
[19] summary.packageStatus* summary.POSIXct
[21] summary.POSIXlt summary.ppr*
[23] summary.prcomp* summary.princomp*
[25] summary.proc_time summary.rlang_error*
[27] summary.rlang_message* summary.rlang_trace*
[29] summary.rlang_warning* summary.rlang:::list_of_conditions*
[31] summary.srcfile summary.srcref
[33] summary.stepfun summary.stl*
[35] summary.table summary.tukeysmooth*
[37] summary.warnings
see '?methods' for accessing help and source code
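summary.data.frame() is used when summary() is called on a data.frame
summary.lm() is used for an object of class lm (produced by lm())
summary.prcomp() is used for an object of class prcomp (produced by prcomp())
Anything without its own method falls back to summary.default()
body(summary.default)
summary(letters)
 [1] [ [[ [[<- [<- $<- aggregate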
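The dispatch rule above can be tried out with a toy class. This is only a minimal sketch: the class name `temperature_log` and its `values` element are made up for illustration and do not come from any package.

```r
# An S3 object is an ordinary R object with a class attribute
temps <- structure(
  list(values = c(21.5, 19.0, 24.2)),
  class = "temperature_log"   # hypothetical class name
)

# Naming a function generic.class makes it the method for that class
summary.temperature_log <- function(object, ...) {
  cat("Temperature log with", length(object$values), "readings\n")
  invisible(range(object$values))
}

# summary() calls UseMethod("summary"), which finds the method above
rng <- summary(temps)

# A character vector has no summary.character(), so summary.default() runs
default_out <- summary(letters)
names(default_out)
```

Running `methods(summary)` after defining the function should list `summary.temperature_log` alongside the built-in methods.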
[7] anyDuplicated anyNA as.data.frame as.list as.matrix as.vector
[13] by cbind coerce dim dimnames dimnames<-
[19] droplevels duplicated edit format formula head
[25] initialize is.na Math merge na.exclude na.omit
[31] Ops plot print prompt rbind row.names
[37] row.names<- rowsum show slotsFromS3 sort_by split
[43] split<- stack str subset summary Summary
[49] t tail transform type.convert unique unstack
[55] within xtfrm
see '?methods' for accessing help and source code
print() method
print(my_tbl, n = 20)
print.tbl (which is hidden)
S3 objects
(data.frame, list, htest, lm etc)
is() instead of class()
R looks for print.tbl_df() → print.tbl() → print.data.frame() etc
print.default()
Many Bioconductor Packages define S4 objects
@ symbol for “slots” as well as $ for list elements
tidyverse
tidyverse by > 10 years
tidyomics is an active area of Bioconductor development

S4 implementations of S3 objects
data.frame (S3) vs DataFrame (S4)
list (S3) vs List (S4)
vector (S3) vs Vector (S4)
rle (S3) vs Rle (S4)
DataFrame and you have a data.frame
 [1] "X" "X" "X" "X" "X" "X" "X" "X" "X" "X" "Y" "Y" "Y" "Y" "Y"
character-Rle of length 15 with 2 runs
Lengths: 10 5
Values : "X" "Y"
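The same run-length idea can be tried with the base R (S3) `rle()`, without installing anything from Bioconductor. This sketch rebuilds the 15-element vector shown above; it uses only base functions and does not show the S4 `Rle` class itself.

```r
# Ten "X" followed by five "Y", as in the output above
x <- rep(c("X", "Y"), times = c(10, 5))

# Base R run-length encoding returns an S3 object of class "rle"
runs <- rle(x)
runs$lengths   # length of each run
runs$values    # value repeated in each run

# inverse.rle() expands the runs back to the full vector
restored <- inverse.rle(runs)
identical(restored, x)
```

Storing two lengths and two values instead of 15 strings is where the memory saving comes from, and it grows with longer runs.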
data.frame Objects
data.frame
rownames
tibble aka tbl_df
rownames are always 1:nrow(df)
data.frame type
DataFrame objects
S4 version
tidyverse (dplyr, ggplot2 etc)
tidyomics
tibble directly
as_tibble() for DataFrame objects
extraChIPs
S4 objects to ggplot()
DataFrame objects
dplyr will not work on DataFrame objects
tidyverse)
subset() pre-dates dplyr::filter()
rbind() and combineRows() ⇒ bind_rows()
cbind(), combineCols() and merge() ⇒ joins
sort() ⇒ arrange()
unique() ⇒ distinct()
mutate(), summarise(), across(), pivot_*()
DataFrame objects
tbl_df objects)
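The base R functions listed above can be tried on an ordinary data.frame before moving to DataFrame objects. This is a rough sketch with made-up genes and values; it assumes, as the list above suggests, that the same function names carry over to DataFrame. Note that a base data.frame is reordered with `order()` rather than `sort()`.

```r
# Made-up example data, for illustration only
df <- data.frame(
  gene  = c("A", "B", "C", "B"),
  logFC = c(1.2, -0.4, 2.5, -0.4)
)
anno <- data.frame(gene = c("A", "B"), chr = c("chr1", "chr2"))

up      <- subset(df, logFC > 0)            # roughly dplyr::filter()
dedup   <- unique(df)                       # roughly dplyr::distinct()
sorted  <- df[order(df$logFC), ]            # roughly dplyr::arrange()
joined  <- merge(df, anno, by = "gene")     # roughly dplyr::inner_join()
stacked <- rbind(df, data.frame(gene = "D", logFC = 0.1))  # roughly bind_rows()
```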
CharacterList() from IRanges
S4 lists can be typed ⇒ memory efficiency
List objects can exist in a compressed form ⇒ memory efficiency
DataFrame objects can have S4 objects as columns
S3 data frames (including tibbles) cannot
DataFrame objects
library(IRanges)
genes <- c("A", "B")
transcripts <- CharacterList(
c("A1", "A2", "A3"), c("B1", "B2")
)
transcripts
CharacterList of length 2
[[1]] A1 A2 A3
[[2]] B1 B2
DataFrame objects
list
Formal class 'DFrame' [package "S4Vectors"] with 6 slots
..@ rownames : NULL
..@ nrows : int 2
..@ elementType : chr "ANY"
..@ elementMetadata: NULL
..@ metadata :List of 1
.. ..$ details: chr "Created for RAdelaide 2024"
..@ listData :List of 2
.. ..$ Gene : chr [1:2] "A" "B"
.. ..$ Transcripts:Formal class 'CompressedCharacterList' [package "IRanges"] with 5 slots
DataFrame objects
mcols()
DataFrame with 2 rows and 1 column
meta
<character>
Gene Made-up genes
Transcripts Made-up transcripts
Formal class 'DFrame' [package "S4Vectors"] with 6 slots
..@ rownames : NULL
..@ nrows : int 2
..@ elementType : chr "ANY"
..@ elementMetadata:Formal class 'DFrame' [package "S4Vectors"] with 6 slots
..@ metadata :List of 1
.. ..$ details: chr "Created for RAdelaide 2024"
..@ listData :List of 2
.. ..$ Gene : chr [1:2] "A" "B"
.. ..$ Transcripts:Formal class 'CompressedCharacterList' [package "IRanges"] with 5 slots
S4 Object Structure
S4 objects have slots denoted with @
S4 class
NULL) objects
S3 or S4 objects
S4 Object Structure
lapply our way through these objects
object@slotName
slot(object, "slotName")
S4 Object Structure
slotNames(object)
S4 Methods
S3 method dispatch uses the method.class syntax (e.g. summary.lm)
S4 is very different but has some similarities
S4 objects almost always have hierarchical classes
S3 objects
Generic function must be defined for each method/class
S4 Methods
is()
 [1] "DFrame" "DataFrame" "SimpleList" "RectangularData"
[5] "List" "DataFrame_OR_NULL" "Vector" "list_OR_List"
[9] "Annotated" "vector_OR_Vector"
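Slots and hierarchical classes can be explored on a small scale with the `methods` package that ships with R. The two classes below, `Feature` and `Gene`, are hypothetical and far simpler than `DFrame`; they only illustrate the `@`, `slot()`, `slotNames()` and `is()` calls mentioned above.

```r
library(methods)  # attached by default in most sessions; explicit for safety

# Hypothetical parent and child classes, for illustration only
setClass("Feature", slots = c(name = "character"))
setClass("Gene", contains = "Feature", slots = c(transcripts = "list"))

g <- new("Gene", name = "A", transcripts = list(c("A1", "A2", "A3")))

slotNames(g)            # all slots, including the inherited one
g@name                  # direct slot access with @
slot(g, "transcripts")  # the same access with the slot name as a string
is(g)                   # the class and everything it extends
is(g, "Feature")        # a Gene is also a Feature
```

The vector returned by `is()` plays the same role as the longer list shown for `DFrame` above: every class in it is a candidate when a method is looked up.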
S4 Methods
body() will return standardGeneric()
UseMethod()
R
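A small sketch of how an S4 generic and its methods fit together, again using only the `methods` package. The generic `describe()` and the classes `Interval` and `Exon` are invented for this example and are not part of any Bioconductor package.

```r
library(methods)

# Hypothetical classes: Exon extends Interval
setClass("Interval", slots = c(name = "character"))
setClass("Exon", contains = "Interval")

# The generic must exist before methods can be attached to it
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# A method is registered for a class signature, not by function name
setMethod("describe", "Interval", function(x, ...) paste("Interval", x@name))

# The body of the generic is only the dispatch call
generic_body <- deparse(body(describe))

# Exon has no method of its own, so the Interval method is inherited
result <- describe(new("Exon", name = "E1"))
```

This mirrors the S3 case, where the body of `summary()` is just `UseMethod("summary")`, except that S4 methods are registered with `setMethod()` rather than found by their name.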
S4 object classes are common
CRAN packages (spatial/GIS)
tidyverse
