data import and normalization

Question

edgeR analysis: glm functions and offset matrix

0

Entering edit mode

Marianna ▴ 20

@7cc5052f

Last seen 18 days ago

Italy

Dear Bioconductor community,

I'm trying to analyze a RNA-seq dataset with edgeR after importing the Kallisto counts with tximport and using the offset matrix. But I'm not sure if I'm using the offset matrix correctly. Can you kindly advise me? Below the code

data import and normalization

Txi_gene <- tximport(path, 
                     type = "kallisto", 
                     tx2gene = Tx2gene, 
                     txOut = FALSE,
                     ignoreAfterBar=TRUE,
                     ignoreTxVersion =TRUE)

myCounts <- Txi_gene$counts
myTPM <- Txi_gene$abundance
normMat <- Txi_gene$length
colnames(myCounts)<-c(sampleLabels)

#obtaining per-observation scaling factors for length
normMat <- normMat/exp(rowMeans(log(normMat)))
normCounts <- myCounts/normMat

#computing effective library sizes from scaled counts
eff.lib <- calcNormFactors(normCounts) * colSums(normCounts)

#combining effective library sizes with the length factors, and calculating offsets for a log-link GLM
normMat <- sweep(normMat, 2, eff.lib, "*")
normMat <- log(normMat)

edgeR: create a DGElist

myDGEList <- DGEList(counts= myCounts)
myDGEList <- scaleOffset(myDGEList, normMat)

#creating a matrix of CPMs and logCPMs
cpms <- edgeR::cpm(myDGEList, offset = myDGEList$offset, log = FALSE)
colSums(cpms)
log2.cpms <- edgeR::cpm(myDGEList, offset = myDGEList$offset, log=TRUE)

#set the design
group <- paste(myo, sex, sep="_")
design <- model.matrix(~0+group)

#filtering
keep<-filterByExpr(myDGEList, design)
F_myDGEList<- myDGEList[keep, , keep.lib.sizes=TRUE]

#NORMALIZATION (normLibSizes) IS NOT NECESSARY

estimate dispersion, set contrasts, fit the data

disp <- estimateDisp(F_myDGEList, design, robust=TRUE) 

### SHOULD I add here the offset??? or it is automatically "into" the DGElist? ### 

my.contrasts <- makeContrasts(SM_F = groupNormal_F - groupSM_F, 
                                 WS_F = groupNormal_F - groupWS_F,
                                 WB_F = groupNormal_F - groupWB_F,
                                 SM_M = groupNormal_M - groupSM_M, 
                                 WS_M = groupNormal_M - groupWS_M,
                                 WB_M = groupNormal_M - groupWB_M,
                                 levels=design)

fit_qlf<- glmQLFit(disp, design, robust=TRUE, legacy=FALSE) 

### SHOULD I add here the offset??? or it is automatically "into" the DGElist? ###

QLFtest (example for a single comparison)

QLFtest.SM_F<- glmQLFTest(fit_qlf, contrast=my.contrasts [, "SM_F"])
toplist.QLFtest.SM_F<-topTags(QLFtest.SM_F, n = Inf, sort.by = "p.value",adjust.method = "BH")
is.de.QLFtest.SM_F<- decideTestsDGE(QLFtest.SM_F)
summary (is.de.QLFtest.SM_F)

(sessionInfo)
R version 4.3.2 (2023-10-31 ucrt) -- "Eye Holes"
Platform: x86_64-w64-mingw32/x64 (64-bit) 
EdgeR v 4.0.1

Thank you to everyone who takes the time to respond to my question.

Best

Marianna

tximport edgeR Normalization • 151 views

ADD COMMENT • link 27 days ago • updated 26 days ago Marianna ▴ 20

score 0 · Answer 1 · 2024-04-03

0

Entering edit mode

Gordon Smyth 50k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

The instructions are in the tximport vignette. The offset is added to the DGEList. You don't add it anywhere else, it is automatically propagated.

ADD COMMENT • link 26 days ago Gordon Smyth 50k

0

Entering edit mode

Thank you Gordon Smyth for your prompt reply. I really appreciate it.

Yes, the tximport vignette says that after including the offset in the dgelist (asI did using the code provided in the vignette), the data are ready for the estimate dispersion functions (see edgeR User's Guide)

So after including the offset in the DGEList by using the code

myDGEList <- scaleOffset(myDGEList, normMat)

I'm done.

No need to normalize again with the edgeR normLibSizes, no need to "recall" again the offset that is automatically propagated.

Thank you very much for your time and support.

Best regards

Marianna

ADD REPLY • link 26 days ago Marianna ▴ 20