use EDAseq normalized data as input of DESeq2
2
0
Entering edit mode
@zoppoli-pietro-4792
Last seen 5.6 years ago
United States

Hi

 

I'd like to use EDAseq normalized data as input of DESeq2.

I found the vignette for DESeq but not for DESeq2.

Is there a way to do DESeq2 DEA starting from  EDAseq normalized data ? 

I understand I can use this

dds <- DESeqDataSetFromMatrix(countData = cts,

                              colData = coldata,

                              design = ~ condition)

 

to extract the data I need and then use them in the formula BUT

doing this I miss all the information of the SeqExpressionSet object and the normalizations done by EDAseq. 

Thanks,

Pietro

deseq2 edaseq • 23k views
ADD COMMENT
0
Entering edit mode
davide risso ▴ 950
@davide-risso-5075
Last seen 4 weeks ago
University of Padova

Hi Pietro,

first, the DESeq2 vignette can be found at this link.

As you will see, Mike did a great job at covering as many use cases as possible and he has a section on how to import gene-specific size factors from EDASeq and similar packages: https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#sample-gene-dependent-normalization-factors

In short, you have to use the offset=TRUE option in EDASeq to compute the offsets and pass them to DESeq2 as explained in the DESeq2 vignette (note the transformation!).

ADD COMMENT
0
Entering edit mode
@zoppoli-pietro-4792
Last seen 5.6 years ago
United States

Sorry, I was unclear in my question.

I know the DESeq2 vignette but in the EDAseq I didn't find how to handle DESeq2 while there is EgdeR and DESeq.

I was wondering if there was something like this (reported in EDAseq):

## ----deseq------------------------------------------------

library(DESeq)

counts <- as(dataNorm, "CountDataSet")

sizeFactors(counts) <- rep(1,4)

counts <- estimateDispersions(counts)

res <- nbinomTest(counts, "wt", "mut")

head(res)

 

BUT for DESseq2.

What I do is: 

## ----deseq2------------------------------------------------

library(DESeq2)
# data is data from EDAseq example
dataOffset <- withinLaneNormalization(data,"gc",
                                      which="full",offset=TRUE)
dataOffset <- betweenLaneNormalization(dataOffset,
                                       which="full",offset=TRUE)


dds <- DESeqDataSetFromMatrix(countData = dataOffset@assayData$normalizedCounts,
                              colData = pData(dataOffset),
                              design = ~ conditions)


EDASeqNormFactors <- exp(-1 * offst(dataOffset))
normalizationFactors(dds) <- EDASeqNormFactors
# dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)
res <- results(dds)
res <- res[order(res$pvalue),]
head(res)

 

 

Thank you,

P

 

 

ADD COMMENT
1
Entering edit mode

Hi Pietro, please use the "comment" function to comment on an answer, and the "answer" function only to add an answer to the original question.

I see what you mean now: you are looking for a coercion method to DESeqDataSet similar to the one that we have for CountDataSet.

There isn't one because nobody ever asked, but I can certainly add it to the EDASeq package.

Note that the DESeq section in the EDASeq package is a workaround due to the fact that DESeq could not handle gene-specific offsets. Since DESeq2 can, the recommended pipeline is now similar to the one described in the edgeR section.

Namely, computing offsets with EDASeq and pass them to DESeq2, with no need to use the normalized counts.

In your example, you need to modify the code to use the raw counts to create the DESeqDataSet otherwise, you are normalizing twice. i.e., changing the line that creates the dds object to:

dds <- DESeqDataSetFromMatrix(countData = counts(dataOffset), 
                              colData = pData(dataOffset),
                              design = ~ conditions)
ADD REPLY
0
Entering edit mode

Thanks a lot!

Everything much clear now!

Pietro

ADD REPLY
0
Entering edit mode

Sorry to bother but I'd like to know if it's correct to pass the normCounts(dataOffset) obtained in the previous example directly to voom in order to to process RNA-Seq data prior to linear modelling in limma.

I'm trying to make a comparison on the different pipelines to make a DEA from my RNAseq data.

Thanks in advance,

P

 

##  ----voom-limma-----------------------------------------------

logCPM <- limma::voom(normCounts(dataOffset), design)

cont.matrix <- limma::makeContrasts(contrasts = contr,  levels = design)

fit <- limma::lmFit(logCPM, design)
fit <- contrasts.fit(fit, cont.matrix)

fit <- limma::eBayes(fit, trend = FALSE)

DEGs <- limma::toptable(fit, coef = 1, adjust.method = "fdr", 

                                  number = nrow(normCounts(dataOffset)))

ADD REPLY
1
Entering edit mode

It looks reasonable, but we haven't tested this approach, so I would carefully look at the results, residuals, etc. to make sure that everything looks reasonable in terms of model fit. It might be good to check with Gordon Smyth or other limma developers.

ADD REPLY
0
Entering edit mode

If you have a question about limma, I'd make a new post. But please note the advice here:

A: edgeR and DESeq2

ADD REPLY

Login before adding your answer.

Traffic: 639 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6