Question: use EDAseq normalized data as input of DESeq2
0
19 months ago by
United States
zoppoli pietro10 wrote:

Hi

I'd like to use EDAseq normalized data as input of DESeq2.

I found the vignette for DESeq but not for DESeq2.

Is there a way to do DESeq2 DEA starting from  EDAseq normalized data ?

I understand I can use this

dds <- DESeqDataSetFromMatrix(countData = cts,

colData = coldata,

design = ~ condition)

to extract the data I need and then use them in the formula BUT

doing this I miss all the information of the SeqExpressionSet object and the normalizations done by EDAseq.

Thanks,

Pietro

edaseq deseq2 • 754 views
modified 19 months ago • written 19 months ago by zoppoli pietro10
Answer: use EDAseq normalized data as input of DESeq2
0
19 months ago by
davide risso830
davide risso830 wrote:

Hi Pietro,

first, the DESeq2 vignette can be found at this link.

As you will see, Mike did a great job at covering as many use cases as possible and he has a section on how to import gene-specific size factors from EDASeq and similar packages: https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#sample-gene-dependent-normalization-factors

In short, you have to use the offset=TRUE option in EDASeq to compute the offsets and pass them to DESeq2 as explained in the DESeq2 vignette (note the transformation!).

Answer: use EDAseq normalized data as input of DESeq2
0
19 months ago by
United States
zoppoli pietro10 wrote:

Sorry, I was unclear in my question.

I know the DESeq2 vignette but in the EDAseq I didn't find how to handle DESeq2 while there is EgdeR and DESeq.

I was wondering if there was something like this (reported in EDAseq):

## ----deseq------------------------------------------------

library(DESeq)

counts <- as(dataNorm, "CountDataSet")

sizeFactors(counts) <- rep(1,4)

counts <- estimateDispersions(counts)

res <- nbinomTest(counts, "wt", "mut")

BUT for DESseq2.

What I do is:

## ----deseq2------------------------------------------------

library(DESeq2)
# data is data from EDAseq example
dataOffset <- withinLaneNormalization(data,"gc",
which="full",offset=TRUE)
dataOffset <- betweenLaneNormalization(dataOffset,
which="full",offset=TRUE)

dds <- DESeqDataSetFromMatrix(countData = dataOffset@assayData$normalizedCounts, colData = pData(dataOffset), design = ~ conditions) EDASeqNormFactors <- exp(-1 * offst(dataOffset)) normalizationFactors(dds) <- EDASeqNormFactors # dds <- estimateSizeFactors(dds) dds <- estimateDispersions(dds) dds <- nbinomWaldTest(dds) res <- results(dds) res <- res[order(res$pvalue),]

Thank you,

P

1

I see what you mean now: you are looking for a coercion method to DESeqDataSet similar to the one that we have for CountDataSet.

There isn't one because nobody ever asked, but I can certainly add it to the EDASeq package.

Note that the DESeq section in the EDASeq package is a workaround due to the fact that DESeq could not handle gene-specific offsets. Since DESeq2 can, the recommended pipeline is now similar to the one described in the edgeR section.

Namely, computing offsets with EDASeq and pass them to DESeq2, with no need to use the normalized counts.

In your example, you need to modify the code to use the raw counts to create the DESeqDataSet otherwise, you are normalizing twice. i.e., changing the line that creates the dds object to:

dds <- DESeqDataSetFromMatrix(countData = counts(dataOffset),
colData = pData(dataOffset),
design = ~ conditions)


Thanks a lot!

Everything much clear now!

Pietro

Sorry to bother but I'd like to know if it's correct to pass the normCounts(dataOffset) obtained in the previous example directly to voom in order to to process RNA-Seq data prior to linear modelling in limma.

I'm trying to make a comparison on the different pipelines to make a DEA from my RNAseq data.

P

##  ----voom-limma-----------------------------------------------

logCPM <- limma::voom(normCounts(dataOffset), design)

cont.matrix <- limma::makeContrasts(contrasts = contr,  levels = design)

fit <- limma::lmFit(logCPM, design)
fit <- contrasts.fit(fit, cont.matrix)

fit <- limma::eBayes(fit, trend = FALSE)

DEGs <- limma::toptable(fit, coef = 1, adjust.method = "fdr",

number = nrow(normCounts(dataOffset)))

1

It looks reasonable, but we haven't tested this approach, so I would carefully look at the results, residuals, etc. to make sure that everything looks reasonable in terms of model fit. It might be good to check with Gordon Smyth or other limma developers.

If you have a question about limma, I'd make a new post. But please note the advice here:

A: edgeR and DESeq2