Question: use EDAseq normalized data as input of DESeq2
0
gravatar for zoppoli pietro
19 months ago by
United States
zoppoli pietro10 wrote:

Hi

 

I'd like to use EDAseq normalized data as input of DESeq2.

I found the vignette for DESeq but not for DESeq2.

Is there a way to do DESeq2 DEA starting from  EDAseq normalized data ? 

I understand I can use this

dds <- DESeqDataSetFromMatrix(countData = cts,

                              colData = coldata,

                              design = ~ condition)

 

to extract the data I need and then use them in the formula BUT

doing this I miss all the information of the SeqExpressionSet object and the normalizations done by EDAseq. 

Thanks,

Pietro

edaseq deseq2 • 754 views
ADD COMMENTlink modified 19 months ago • written 19 months ago by zoppoli pietro10
Answer: use EDAseq normalized data as input of DESeq2
0
gravatar for davide risso
19 months ago by
davide risso830
University of Padova
davide risso830 wrote:

Hi Pietro,

first, the DESeq2 vignette can be found at this link.

As you will see, Mike did a great job at covering as many use cases as possible and he has a section on how to import gene-specific size factors from EDASeq and similar packages: https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#sample-gene-dependent-normalization-factors

In short, you have to use the offset=TRUE option in EDASeq to compute the offsets and pass them to DESeq2 as explained in the DESeq2 vignette (note the transformation!).

ADD COMMENTlink written 19 months ago by davide risso830
Answer: use EDAseq normalized data as input of DESeq2
0
gravatar for zoppoli pietro
19 months ago by
United States
zoppoli pietro10 wrote:

Sorry, I was unclear in my question.

I know the DESeq2 vignette but in the EDAseq I didn't find how to handle DESeq2 while there is EgdeR and DESeq.

I was wondering if there was something like this (reported in EDAseq):

## ----deseq------------------------------------------------

library(DESeq)

counts <- as(dataNorm, "CountDataSet")

sizeFactors(counts) <- rep(1,4)

counts <- estimateDispersions(counts)

res <- nbinomTest(counts, "wt", "mut")

head(res)

 

BUT for DESseq2.

What I do is: 

## ----deseq2------------------------------------------------

library(DESeq2)
# data is data from EDAseq example
dataOffset <- withinLaneNormalization(data,"gc",
                                      which="full",offset=TRUE)
dataOffset <- betweenLaneNormalization(dataOffset,
                                       which="full",offset=TRUE)


dds <- DESeqDataSetFromMatrix(countData = dataOffset@assayData$normalizedCounts,
                              colData = pData(dataOffset),
                              design = ~ conditions)


EDASeqNormFactors <- exp(-1 * offst(dataOffset))
normalizationFactors(dds) <- EDASeqNormFactors
# dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)
res <- results(dds)
res <- res[order(res$pvalue),]
head(res)

 

 

Thank you,

P

 

 

ADD COMMENTlink written 19 months ago by zoppoli pietro10
1

Hi Pietro, please use the "comment" function to comment on an answer, and the "answer" function only to add an answer to the original question.

I see what you mean now: you are looking for a coercion method to DESeqDataSet similar to the one that we have for CountDataSet.

There isn't one because nobody ever asked, but I can certainly add it to the EDASeq package.

Note that the DESeq section in the EDASeq package is a workaround due to the fact that DESeq could not handle gene-specific offsets. Since DESeq2 can, the recommended pipeline is now similar to the one described in the edgeR section.

Namely, computing offsets with EDASeq and pass them to DESeq2, with no need to use the normalized counts.

In your example, you need to modify the code to use the raw counts to create the DESeqDataSet otherwise, you are normalizing twice. i.e., changing the line that creates the dds object to:

dds <- DESeqDataSetFromMatrix(countData = counts(dataOffset), 
                              colData = pData(dataOffset),
                              design = ~ conditions)
ADD REPLYlink modified 19 months ago • written 19 months ago by davide risso830

Thanks a lot!

Everything much clear now!

Pietro

ADD REPLYlink written 19 months ago by zoppoli pietro10

Sorry to bother but I'd like to know if it's correct to pass the normCounts(dataOffset) obtained in the previous example directly to voom in order to to process RNA-Seq data prior to linear modelling in limma.

I'm trying to make a comparison on the different pipelines to make a DEA from my RNAseq data.

Thanks in advance,

P

 

##  ----voom-limma-----------------------------------------------

logCPM <- limma::voom(normCounts(dataOffset), design)

cont.matrix <- limma::makeContrasts(contrasts = contr,  levels = design)

fit <- limma::lmFit(logCPM, design)
fit <- contrasts.fit(fit, cont.matrix)

fit <- limma::eBayes(fit, trend = FALSE)

DEGs <- limma::toptable(fit, coef = 1, adjust.method = "fdr", 

                                  number = nrow(normCounts(dataOffset)))

ADD REPLYlink written 19 months ago by zoppoli pietro10
1

It looks reasonable, but we haven't tested this approach, so I would carefully look at the results, residuals, etc. to make sure that everything looks reasonable in terms of model fit. It might be good to check with Gordon Smyth or other limma developers.

ADD REPLYlink written 19 months ago by davide risso830

If you have a question about limma, I'd make a new post. But please note the advice here:

A: edgeR and DESeq2

ADD REPLYlink written 19 months ago by Michael Love26k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 241 users visited in the last hour