edgeR analysis: glm functions and offset matrix
1
0
Entering edit mode
Marianna ▴ 20
@7cc5052f
Last seen 11 weeks ago
Italy

Dear Bioconductor community,

I'm trying to analyze a RNA-seq dataset with edgeR after importing the Kallisto counts with tximport and using the offset matrix. But I'm not sure if I'm using the offset matrix correctly. Can you kindly advise me? Below the code

data import and normalization

Txi_gene <- tximport(path, 
                     type = "kallisto", 
                     tx2gene = Tx2gene, 
                     txOut = FALSE,
                     ignoreAfterBar=TRUE,
                     ignoreTxVersion =TRUE)

myCounts <- Txi_gene$counts
myTPM <- Txi_gene$abundance
normMat <- Txi_gene$length
colnames(myCounts)<-c(sampleLabels)

#obtaining per-observation scaling factors for length
normMat <- normMat/exp(rowMeans(log(normMat)))
normCounts <- myCounts/normMat

#computing effective library sizes from scaled counts
eff.lib <- calcNormFactors(normCounts) * colSums(normCounts)

#combining effective library sizes with the length factors, and calculating offsets for a log-link GLM
normMat <- sweep(normMat, 2, eff.lib, "*")
normMat <- log(normMat)

edgeR: create a DGElist

myDGEList <- DGEList(counts= myCounts)
myDGEList <- scaleOffset(myDGEList, normMat)

#creating a matrix of CPMs and logCPMs
cpms <- edgeR::cpm(myDGEList, offset = myDGEList$offset, log = FALSE)
colSums(cpms)
log2.cpms <- edgeR::cpm(myDGEList, offset = myDGEList$offset, log=TRUE)

#set the design
group <- paste(myo, sex, sep="_")
design <- model.matrix(~0+group)

#filtering
keep<-filterByExpr(myDGEList, design)
F_myDGEList<- myDGEList[keep, , keep.lib.sizes=TRUE]

#NORMALIZATION (normLibSizes) IS NOT NECESSARY

estimate dispersion, set contrasts, fit the data

disp <- estimateDisp(F_myDGEList, design, robust=TRUE) 

### SHOULD I add here the offset??? or it is automatically "into" the DGElist? ### 

my.contrasts <- makeContrasts(SM_F = groupNormal_F - groupSM_F, 
                                 WS_F = groupNormal_F - groupWS_F,
                                 WB_F = groupNormal_F - groupWB_F,
                                 SM_M = groupNormal_M - groupSM_M, 
                                 WS_M = groupNormal_M - groupWS_M,
                                 WB_M = groupNormal_M - groupWB_M,
                                 levels=design)

fit_qlf<- glmQLFit(disp, design, robust=TRUE, legacy=FALSE) 

### SHOULD I add here the offset??? or it is automatically "into" the DGElist? ###

QLFtest (example for a single comparison)

QLFtest.SM_F<- glmQLFTest(fit_qlf, contrast=my.contrasts [, "SM_F"])
toplist.QLFtest.SM_F<-topTags(QLFtest.SM_F, n = Inf, sort.by = "p.value",adjust.method = "BH")
is.de.QLFtest.SM_F<- decideTestsDGE(QLFtest.SM_F)
summary (is.de.QLFtest.SM_F)
(sessionInfo)
R version 4.3.2 (2023-10-31 ucrt) -- "Eye Holes"
Platform: x86_64-w64-mingw32/x64 (64-bit) 
EdgeR v 4.0.1

Thank you to everyone who takes the time to respond to my question.

Best

Marianna

tximport edgeR Normalization • 506 views
ADD COMMENT
0
Entering edit mode
@gordon-smyth
Last seen 6 hours ago
WEHI, Melbourne, Australia

The instructions are in the tximport vignette. The offset is added to the DGEList. You don't add it anywhere else, it is automatically propagated.

ADD COMMENT
0
Entering edit mode

Thank you Gordon Smyth for your prompt reply. I really appreciate it.

Yes, the tximport vignette says that after including the offset in the dgelist (asI did using the code provided in the vignette), the data are ready for the estimate dispersion functions (see edgeR User's Guide)

So after including the offset in the DGEList by using the code

myDGEList <- scaleOffset(myDGEList, normMat)

I'm done.

No need to normalize again with the edgeR normLibSizes, no need to "recall" again the offset that is automatically propagated.

Thank you very much for your time and support.

Best regards

Marianna

ADD REPLY

Login before adding your answer.

Traffic: 656 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6