RAdelaide 2024
July 11, 2024
limma was first released in 2004 (Smyth 2004)
Image courtesy of National Human Genome Research Institute
Wang, M. (2021). Next-Generation Sequencing (NGS). In: Pan, S., Tang, J. (eds) Clinical Molecular Diagnostics. Springer, Singapore. https://doi.org/10.1007/978-981-16-1037-0_23
fastq file
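Usually compressed, with an fq.gz suffix
Paired-end reads come as two files: {prefix}_R1.fq.gz + {prefix}_R2.fq.gz
Each read takes four lines: a header starting with @, the sequence, a + separator, and the quality scores
@SRR001666.1 071112_SLXA-EAS1_s_7:5:1:817:345 length=72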
GGGTGATGGCCGCTGCCGATGGCGTCAAATCCCACCAAGTTACCCTTAACAACTTAAGGGTTTTCAAATAGA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IG9ICIIIIIIIIIIIIIIIIIIIIDIIIIIII>IIIIII/
@SRR001666.2 071112_SLXA-EAS1_s_7:5:1:801:338 length=72
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGAAGCAGAAGTCGATGATAATACGCGTCGTTTTATCAT
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBIIIIIIIIIIIIIIIIIIIIIIIGII>IIIII-I)8I
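The four-line record layout shown above can be read with a few lines of code. The following is a minimal Python sketch, not how any of the tools in this session do it: the function names `read_fastq` and `phred_scores` are hypothetical, the toy record is made up, and the quality encoding is assumed to be Phred+33.

```python
import io
from itertools import islice

def read_fastq(handle):
    """Yield (name, sequence, quality) for each four-line FASTQ record."""
    while True:
        lines = [line.rstrip("\n") for line in islice(handle, 4)]
        if not lines:
            return  # clean end of file
        if len(lines) < 4:
            raise ValueError("Truncated FASTQ record")
        header, seq, plus, qual = lines
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"Malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual

def phred_scores(qual, offset=33):
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in qual]

# Toy record (made up for illustration); a real file with the fq.gz suffix
# would be opened with gzip.open(path, "rt") instead of io.StringIO.
example = io.StringIO("@read1 demo\nACGT\n+\nII/I\n")
for name, seq, qual in read_fastq(example):
    print(name, seq, phred_scores(qual))
```

Under the Phred+33 assumption, the `I` characters that dominate the records above decode to a score of 40 and `/` decodes to 14.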
FastQC
cutadapt, AdapterRemoval, trimmomatic, TrimGalore etc.
fastp combines QC reports with read trimming
FastQC, fastp, cutadapt all return reports after running
MultiQC is an excellent standalone tool for combining all reports
ngsReports is the “go-to” Bioconductor package for this
Aligners such as STAR, hisat2 or bowtie2
bam files produced

| Column | Field | Description |
|---|---|---|
| 1 | QNAME | Read name, taken from the original FastQ header line |
| 2 | FLAG | Information regarding pairing, primary alignment, duplicate status, unmapped etc |
| 3 | RNAME | Reference sequence name (e.g. chr1) |
| 4 | POS | Left-most co-ordinate in the alignment |
| 5 | MAPQ | Mapping quality score |
| 6 | CIGAR | Code summarising exact matches, insertions, deletions etc. |
| 7 | RNEXT | Reference sequence the mate aligned to |
| 8 | PNEXT | Left-most co-ordinate the mate aligned to |
| 9 | TLEN | Observed template (fragment) length inferred from the alignment |
| 10 | SEQ | The original read sequence |
| 11 | QUAL | The read quality scores |
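To make the eleven mandatory fields concrete, here is a hedged Python sketch that splits one tab-separated alignment line (the text form of a bam record) into those fields, decodes the FLAG bits, and reads any optional `TAG:TYPE:VALUE` entries that follow column 11. The helper names and the example line are invented for illustration; the bit meanings follow the SAM specification. In practice samtools or Rsamtools would handle this.

```python
import re

MANDATORY = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]

# FLAG bit meanings as defined in the SAM specification
FLAG_BITS = {
    0x1: "paired", 0x2: "proper_pair", 0x4: "unmapped",
    0x8: "mate_unmapped", 0x10: "reverse_strand",
    0x20: "mate_reverse_strand", 0x40: "first_in_pair",
    0x80: "second_in_pair", 0x100: "secondary",
    0x200: "qc_fail", 0x400: "duplicate", 0x800: "supplementary",
}

def decode_flag(flag):
    """Return the names of all bits set in an integer FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]

def reference_span(cigar):
    """Count reference bases covered; M, D, N, = and X consume the reference."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(n) for n, op in ops if op in "MDN=X")

def parse_sam_line(line):
    """Split one alignment line into the 11 mandatory fields plus tags."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(MANDATORY, fields[:11]))
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    tags = {}
    for item in fields[11:]:
        tag, typ, value = item.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    record["tags"] = tags
    return record

# Hypothetical alignment line, made up for illustration only
line = "\t".join(["read1", "99", "chr1", "1000", "255", "5M1D3M", "=",
                  "1100", "150", "ACGTACGT", "IIIIIIII",
                  "NH:i:1", "AS:i:290", "NM:i:2"])
rec = parse_sam_line(line)
print(decode_flag(rec["FLAG"]))
print(reference_span(rec["CIGAR"]))
print(rec["tags"])
```

A FLAG of 99 is the sum 1 + 2 + 32 + 64, so the made-up read is paired, properly paired, first in its pair, and has a mate on the reverse strand.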
NH:i:1 indicates this read aligned only once
AS:i:290 is the actual alignment score produced by the aligner
NM:i:2 means two edits are required to perfectly match the reference
samtools in bash
The Bioconductor interface to bam files is Rsamtools
BamFile or BamFileList objects
ScanBamParam()
GRanges
DNAStringSets
GRanges objects
featureCounts from the Subread tool
RSEM, htseq
R:
Rsubread or GenomicAlignments