body(summary)
UseMethod("summary")
RAdelaide 2024
July 11, 2024
Tools > Install Packages...
install.packages("pkg_name")
Seurat
for scRNA
ngsReports
nearly every day (still…)
R
) in 2001
R
generally has bi-annual releases (R 4.4.0 April 24th, 2024)
BiocManager
is a CRAN package
github
DESeq2 & edgeR for bulk RNA-Seq Analysis
DiffBind & extraChIPs for ChIP-Seq Analysis
fgsea for GSEA within R
GenomicRanges for working with GRanges objects
Taken from https://carpentries-incubator.github.io/bioc-project/02-introduction-to-bioconductor.html
body(function_name)
always has comments removed
R
directory
browseVignettes()
R has two common types of objects
S3
are very common & old (1970s)
lm()
or t.test()
S4
introduced in ’90s
S4
objects
summary()
vector
or data.frame
How does summary() know what to do for different data structures?
summary()
it’s a bit odd
summary() uses different methods depending on the object class
 [1] summary.aov summary.aovlist*
[3] summary.aspell* summary.check_packages_in_dir*
[5] summary.connection summary.data.frame
[7] summary.Date summary.default
[9] summary.ecdf* summary.factor
[11] summary.glm summary.infl*
[13] summary.lm summary.loess*
[15] summary.manova summary.matrix
[17] summary.mlm* summary.nls*
[19] summary.packageStatus* summary.POSIXct
[21] summary.POSIXlt summary.ppr*
[23] summary.prcomp* summary.princomp*
[25] summary.proc_time summary.rlang_error*
[27] summary.rlang_message* summary.rlang_trace*
[29] summary.rlang_warning* summary.rlang:::list_of_conditions*
[31] summary.srcfile summary.srcref
[33] summary.stepfun summary.stl*
[35] summary.table summary.tukeysmooth*
[37] summary.warnings
see '?methods' for accessing help and source code
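A listing like the one above is what `methods(summary)` prints. As a minimal sketch using only base R and the built-in `cars` dataset (the object names `df` and `fit` are illustrative, not from the slides), the dispatch can be inspected directly:

```r
# summary() is an S3 generic: its body is only a call to UseMethod()
body(summary)

# List the methods registered for the generic (varies with loaded packages)
methods(summary)

# The method chosen depends on the class of the first argument
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
fit <- lm(dist ~ speed, data = cars)

class(df)    # "data.frame", so summary.data.frame() is used
class(fit)   # "lm", so summary.lm() is used

df_summary <- summary(df)
fit_summary <- summary(fit)
class(fit_summary)   # "summary.lm"
```

The two calls to `summary()` look identical, but each one is routed to a different function based on `class()`.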
summary.data.frame() is used when summary() is called on a data.frame
summary.lm() is used for objects of class lm (produced by lm())
summary.prcomp() is used for objects of class prcomp (produced by prcomp())
summary.default()
body(summary.default)
summary(letters)
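To make the fallback to `summary.default()` concrete, here is a small sketch in base R. The class `my_record` and its method are hypothetical, invented only for this illustration:

```r
# A hypothetical S3 class, created only for this illustration
rec <- structure(list(values = c(2, 4, 6)), class = "my_record")

# No summary.my_record() exists yet, so summary() falls back to summary.default()
fallback <- summary(rec)

# A function named <generic>.<class> is picked up by S3 dispatch
summary.my_record <- function(object, ...) {
  c(n = length(object$values), mean = mean(object$values))
}

# The same call now dispatches to the new method
custom <- summary(rec)
custom

# A character vector has no class-specific method, so summary.default() is used
summary(letters)
```

Nothing about the call `summary(rec)` changed between the two uses; only the set of available methods did.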
[1] [ [[ [[<- [<- $<- aggregate
[7] anyDuplicated anyNA as.data.frame as.list as.matrix as.vector
[13] by cbind coerce dim dimnames dimnames<-
[19] droplevels duplicated edit format formula head
[25] initialize is.na Math merge na.exclude na.omit
[31] Ops plot print prompt rbind row.names
[37] row.names<- rowsum show slotsFromS3 sort_by split
[43] split<- stack str subset summary Summary
[49] t tail transform type.convert unique unstack
[55] within xtfrm
see '?methods' for accessing help and source code
print()
method
print(my_tbl, n = 20)
print.tbl
(which is hidden)
S3
objects
(data.frame, list, htest, lm etc)
is()
instead of class()
R looks for print.tbl_df() \(\rightarrow\) print.tbl() \(\rightarrow\) print.data.frame() etc
print.default()
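The search along the class vector described above can be reproduced with two made-up classes in base R. This is a sketch only: `child` and `parent` are hypothetical names standing in for something like `tbl_df` and `data.frame`:

```r
# Hypothetical classes: "child" comes before "parent" in the class vector
obj <- structure(list(a = 1), class = c("child", "parent"))

print.parent <- function(x, ...) {
  cat("<parent>\n")
  invisible(x)
}

# No print.child() exists yet, so dispatch moves on to print.parent()
out_before <- capture.output(print(obj))

print.child <- function(x, ...) {
  cat("<child>\n")
  NextMethod()   # hand over to the method for the next class in the vector
  invisible(x)
}

# Now print.child() runs first, then passes control to print.parent()
out_after <- capture.output(print(obj))
out_after

# inherits() checks the whole class vector, not only the first entry
inherits(obj, "parent")
```

If neither method existed, the call would end at `print.default()`.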
Many Bioconductor Packages define S4
objects
@
symbol for “slots” as well as $
for list elements
tidyverse
tidyverse
by > 10 years
tidyomics
is an active area of Bioconductor development
S4
implementations of S3
objects
data.frame (S3) Vs DataFrame (S4)
list (S3) Vs List (S4)
vector (S3) Vs Vector (S4)
rle (S3) Vs Rle (S4)
DataFrame
and you have a data.frame
[1] "X" "X" "X" "X" "X" "X" "X" "X" "X" "X" "Y" "Y" "Y" "Y" "Y"
character-Rle of length 15 with 2 runs
Lengths: 10 5
Values : "X" "Y"
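For comparison, the S3 side of the `rle` Vs `Rle` pairing can be tried in base R alone. The sketch below rebuilds the same 15-element vector and encodes it with `rle()`; the S4 version is only mentioned in a comment because it assumes the Bioconductor package S4Vectors is installed:

```r
# The same character vector as in the output above: 10 "X" then 5 "Y"
x <- rep(c("X", "Y"), times = c(10, 5))

# Base R's S3 run-length encoding stores run lengths and values in a list
r <- rle(x)
r$lengths
r$values
class(r)

# The encoding is lossless: inverse.rle() rebuilds the original vector
identical(inverse.rle(r), x)

# The S4 counterpart would be created with S4Vectors::Rle(x)
# (assumes S4Vectors is installed, so it is not run here)
```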
data.frame Objects
data.frame
rownames
tibble
aka tbl_df
rownames
are always 1:nrow(df)
data.frame
type
DataFrame objects
S4
version
tidyverse (dplyr, ggplot2 etc)
tidyomics
tibble
directly
as_tibble()
for DataFrame objects
extraChIPs
S4
objects to ggplot()
DataFrame objects
dplyr will not work on DataFrame objects
tidyverse
)
subset() pre-dates dplyr::filter()
rbind() and combineRows() \(\implies\) bind_rows()
cbind(), combineCols() and merge() \(\implies\) joins
sort() \(\implies\) arrange()
unique() \(\implies\) distinct()
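The base-style functions in the list above also work on ordinary data frames, so their behaviour can be tried without any Bioconductor packages. This is a minimal sketch with made-up data (`genes` and `extra` are hypothetical objects). `combineRows()` and `combineCols()` come from S4Vectors and are not used here, and `order()` is used for row sorting in place of `sort()`:

```r
genes <- data.frame(id = c("B", "A", "A"), score = c(2, 5, 5))
extra <- data.frame(id = c("A", "B"), chr = c("chr1", "chr2"))

# subset(): keep rows matching a condition (compare dplyr::filter())
high <- subset(genes, score > 3)

# rbind(): stack rows (compare dplyr::bind_rows())
stacked <- rbind(genes, data.frame(id = "C", score = 1))

# merge(): combine tables on a shared column (compare the dplyr joins)
joined <- merge(genes, extra, by = "id")

# order(): sort rows by a column (compare dplyr::arrange())
sorted <- genes[order(genes$id), ]

# unique(): drop duplicated rows (compare dplyr::distinct())
deduped <- unique(genes)
```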
mutate(), summarise(), across(), pivot_*()
DataFrame objects
tbl_df
objects)
CharacterList()
from IRanges
S4 lists can be typed \(\implies\) memory efficiency
List objects can exist in a compressed form \(\implies\) memory efficiency
DataFrame objects can have S4 objects as columns
S3 data frames (including tibbles) cannot
DataFrame objects
library(IRanges)
genes <- c("A", "B")
transcripts <- CharacterList(
c("A1", "A2", "A3"), c("B1", "B2")
)
transcripts
CharacterList of length 2
[[1]] A1 A2 A3
[[2]] B1 B2
DataFrame objects
list
Formal class 'DFrame' [package "S4Vectors"] with 6 slots
..@ rownames : NULL
..@ nrows : int 2
..@ elementType : chr "ANY"
..@ elementMetadata: NULL
..@ metadata :List of 1
.. ..$ details: chr "Created for RAdelaide 2024"
..@ listData :List of 2
.. ..$ Gene : chr [1:2] "A" "B"
.. ..$ Transcripts:Formal class 'CompressedCharacterList' [package "IRanges"] with 5 slots
DataFrame objects
mcols()
DataFrame with 2 rows and 1 column
meta
<character>
Gene Made-up genes
Transcripts Made-up transcripts
Formal class 'DFrame' [package "S4Vectors"] with 6 slots
..@ rownames : NULL
..@ nrows : int 2
..@ elementType : chr "ANY"
..@ elementMetadata:Formal class 'DFrame' [package "S4Vectors"] with 6 slots
..@ metadata :List of 1
.. ..$ details: chr "Created for RAdelaide 2024"
..@ listData :List of 2
.. ..$ Gene : chr [1:2] "A" "B"
.. ..$ Transcripts:Formal class 'CompressedCharacterList' [package "IRanges"] with 5 slots
S4 Object Structure
S4 objects have slots denoted with @
S4
class
NULL
) objects
S3
or S4
objects
lapply
our way through these objects
object@slotName
slot(object, "slotName")
slotNames(object)
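The slot accessors above can be tried on a small S4 object built with the `methods` package that ships with R. The class `GeneRecord` and its slots are hypothetical, defined only for this sketch:

```r
library(methods)

# A hypothetical S4 class, defined only for this illustration
setClass(
  "GeneRecord",
  slots = c(gene = "character", transcripts = "list")
)

rec <- new(
  "GeneRecord",
  gene = "A",
  transcripts = list(c("A1", "A2", "A3"))
)

isS4(rec)            # confirms this is an S4 object
slotNames(rec)       # names of all slots
rec@gene             # direct slot access with @
slot(rec, "gene")    # the same slot, accessed by name

# Iterate over the slot names to inspect the class of every slot
slot_classes <- lapply(slotNames(rec), function(nm) class(slot(rec, nm)))
names(slot_classes) <- slotNames(rec)
slot_classes
```

Using `slot()` with a character name is what makes the loop over `slotNames()` possible, since `@` needs the slot name written out literally.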
S4 Methods
S3 method dispatch uses the generic.class syntax
S4 is very different but has some similarities
S4 objects almost always have hierarchical classes
S3
objects
Generic
function must be defined for each method/class
is()
[1] "DFrame" "DataFrame" "SimpleList" "RectangularData"
[5] "List" "DataFrame_OR_NULL" "Vector" "list_OR_List"
[9] "Annotated" "vector_OR_Vector"
body()
will return standardGeneric()
UseMethod()
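A small sketch of S4 dispatch using only the `methods` package. The classes `DemoBase` and `DemoDerived` and the generic `describeDemo()` are hypothetical names invented for this example:

```r
library(methods)

# Hypothetical classes: DemoDerived extends DemoBase
setClass("DemoBase", slots = c(id = "character"))
setClass("DemoDerived", contains = "DemoBase", slots = c(score = "numeric"))

# The generic must exist before methods can be attached to it
setGeneric("describeDemo", function(object, ...) standardGeneric("describeDemo"))

# A method is defined for DemoBase only
setMethod("describeDemo", "DemoBase", function(object, ...) {
  paste("DemoBase object with id", object@id)
})

d <- new("DemoDerived", id = "x1", score = 2)

is(d)                # the class hierarchy of the object
describeDemo(d)      # DemoDerived inherits the DemoBase method
body(describeDemo)   # only the standardGeneric() call
```

As with `UseMethod()` for S3 generics, the body of the S4 generic shows only the dispatch call. The actual code lives in the methods.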
R
S4
object classes are common
CRAN
packages (spatial/GIS)
tidyverse