Question: DESeq2 Normalized Counts for Mfuzz
gravatar for rbenel
10 months ago by
rbenel0 wrote:


I would like to preform temporal patterning using the package Mfuzz which assumes the given expression data are fully pre-processed including data normalization and averaging of replicates. 

As @Michael Love explained in this question expression profiles between DESeq2 and Mfuzz "the normalized counts in DESeq2 are just provided for visualization, they aren't actually used in the DESeq2 model which uses all samples on the original counts scale and keeps track of size factors on the right-hand side of the equations as shown in the DESeq2 paper".

So, is performing one of DESeq2's transformations and then adjusting for a covariate, using limma's removeBatchEffect() function on the transformed values, as shown here How do I extract read counts from DESeq2 still the most effective way to extract normalized counts from a dds object for downstream use?



rnaseq mfuzz deseq2 • 324 views
ADD COMMENTlink modified 10 months ago by Michael Love24k • written 10 months ago by rbenel0
Answer: DESeq2 Normalized Counts for Mfuzz
gravatar for Michael Love
10 months ago by
Michael Love24k
United States
Michael Love24k wrote:

This really depends on the downstream use.

Count based methods will want unnormalized counts. If a method wants normalized, variance stabilized transformations we provide these, see the transformations section of the vignette. Normalized counts are useful for visualization mostly.


ADD COMMENTlink written 10 months ago by Michael Love24k

OK. This specific method wants normalized counts, and since the sample size is n < 30, in the vignette you suggest to use rlog.  So if my design = ~ replicate + day it is in my best interest to use mat <- assay(rld) 

mat2 <- removeBatchEffect(mat, rld$replicate)? 

and ​then use mat2 as an input for Mfuzz?

ADD REPLYlink modified 10 months ago • written 10 months ago by rbenel0

If it wants normalized counts, you don't want to use log-transformed, VST or rlog counts.

I guess you should ask the Mfuzz developers how to deal with batch. I can't give too much advice because I don't know what they expect as input.

ADD REPLYlink written 10 months ago by Michael Love24k

OK, thanks for the input. I tagged Mfuzz as well, but not every developer is as active as you are ;) 

On the topic of downstream analysis and normalized counts, if I wanted to look at the correlation between genes, (at the expression level), is using one of the transformed counts appropriate? 

as always, thank you for your time!

ADD REPLYlink written 10 months ago by rbenel0

I would look at correlation on the log2 scale. I tend to use vst() for transformation, and if I want to remove batches, I use removeBatchEffect on the matrix in assay(vsd).

ADD REPLYlink written 10 months ago by Michael Love24k

Just to update anyone who might be interested in using the Mfuzz package on normalized DESeq2 counts: 

From the developers of Mfuzz: "For RNA-seq, you could input log-transformed normalised cpm and then use the standardise function. If you are looking specifically for peaks, you can spare the log transformation, since it reduces the dynamic range... TPM would also be OK, as TPM and CPM only differ by the inverse length of the transcript since Mfuzz::standardise() divides through the standard deviation for each gene, this factor is taken out again."


ADD REPLYlink written 10 months ago by rbenel0

Hi, rbenel

Could you tell how did you import normalized objects (rld or vsd) to Mfuzz? Just eset <- new('ExpressionSet', exprs=mat2, phenoData=phenodf)? But if so, how did you include group information?

Thanks for your time!

ADD REPLYlink modified 6 months ago • written 6 months ago by takashi.k12010
It was mentioned in GOexpress: use DESeq2 data object as input
ADD REPLYlink written 6 months ago by takashi.k12010

I am not sure about the link you posted, but I don't import directly the vsd object, after correcting for batch effects, I save the object as .RData and then when I worked with MFuzz I created an ExpressionSet like I showed below. 

#The clustering is based soley on the exprs matrix and no information is used from the phenoData.
#so it is possible to just create the expression set using biobase (see below)


exp.set <- ExpressionSet(assayData = as.matrix(vsd.matrix))
ADD REPLYlink written 6 months ago by rbenel0
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 199 users visited in the last hour