Average biological replicates in RNA-seq
boczniak767 (@maciej-jonczyk-3945)

Hi,

I want to use Mfuzz to cluster RNA-seq results.

The problem is that I can't find an easy way to average biological replicates. I assumed there would be a simple function for averaging replicates by condition, using the info in phenoData for an ExpressionSet or the sample info in the case of a DESeqDataSet.

I use DESeq2, so I have a DESeqDataSet object. I've prepared an ExpressionSet object with standardised FPKM values for Mfuzz, but I've realised that I need to average the values for each condition.

I know I could use aggregate to average the data in a data frame, but that is error-prone when used on an object with many columns.

ATpoint (@atpoint-13662)

There are many ways to do that; here are two:

library(DESeq2)
library(limma)

set.seed(1)
dds <- makeExampleDESeqDataSet()
ntf <- normTransform(dds)

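# base R, assuming condition is a factor
rs <- rowsum(t(assay(ntf)), group = ntf$condition)  # per-condition sums (conditions x genes)
rn <- as.numeric(table(ntf$condition))              # number of replicates per condition
averaged_base <- t(rs/rn)                           # per-condition means (genes x conditions)

# limma with transpose (avereps averages rows that share an ID)
averaged_limma <- t(avereps(t(assay(ntf)), ID = ntf$condition))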

Thank you,

I've used the limma method and it works. However, I had to change assay to exprs, since my object is an ExpressionSet.

Why did you use ntf <- normTransform(dds) before averaging? I assumed that I would standardise the data in Mfuzz using its standardise function.
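
For an ExpressionSet, the same averaging can be sketched in base R by taking the matrix from exprs() and the grouping from pData(). The helper name average_by_group, the toy matrix, and the "condition" column name are assumptions made for illustration, not taken from the real data; the Biobase part only runs if Biobase is installed.

```r
# Hypothetical helper: average the columns (samples) of a genes x samples
# matrix within each level of `group`.
average_by_group <- function(mat, group) {
  group <- factor(group)                   # also drops unused levels
  stopifnot(length(group) == ncol(mat))
  sums <- rowsum(t(mat), group = group)    # groups x genes, rows in level order
  n <- as.numeric(table(group))            # replicates per group, same order
  t(sums / n)                              # genes x groups matrix of means
}

# Toy data (made-up values): 3 genes, 2 conditions with 2 replicates each
mat <- matrix(
  c(1, 3, 10, 20,
    2, 4,  6,  8,
    5, 5,  0,  2),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), c("A_1", "A_2", "B_1", "B_2"))
)
condition <- factor(c("A", "A", "B", "B"))

avg <- average_by_group(mat, condition)
print(avg)

# With an ExpressionSet the inputs come from exprs() and pData();
# the column name "condition" is an assumption about the phenoData.
if (requireNamespace("Biobase", quietly = TRUE)) {
  pheno <- Biobase::AnnotatedDataFrame(
    data.frame(condition = condition, row.names = colnames(mat))
  )
  eset <- Biobase::ExpressionSet(assayData = mat, phenoData = pheno)
  avg_eset <- average_by_group(Biobase::exprs(eset),
                               Biobase::pData(eset)$condition)
  # New ExpressionSet with one column per condition
  eset_avg <- Biobase::ExpressionSet(assayData = avg_eset)
}
```

Because the grouping is read from the sample annotation rather than typed in by column position, this avoids the bookkeeping that makes aggregate error-prone on wide tables.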

Use whatever you feel is correct; I just put together a random example for the sake of demonstration. Note that before standardization you would still typically normalize and log2-transform your data.

ATpoint, thank you for directing my attention to normalization. I assumed that using FPKM is OK, as it is mentioned on the Mfuzz page. Do you think I can use FPKM?

Or do you think that the approach presented at sthda in "Normalization using DESeq2 (size factors)" is OK for clustering?

What you usually do is normalize the data first with respect to library size and composition (that is what the size factors do), then log2-transform, and then Z-scale (aka standardize). In DESeq2, the vst function does the first two steps plus some extra magic that is beneficial for downstream analysis, so I would go with that. Alternatively, log2-transformed normalized counts work as well; that is what normTransform does.
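
The three steps above can be sketched on a toy matrix in base R. This is only an illustration under stated assumptions: the counts are made up, the size factors come from a plain median-of-ratios calculation (the idea behind DESeq2's size factors, not a call into the package), and the pseudocount of 1 is chosen to mirror the normTransform default. On a real DESeqDataSet you would let vst or normTransform handle steps 1 and 2.

```r
# Toy count matrix (made-up values): 4 genes x 4 samples
counts <- matrix(
  c( 10,  20, 40,  80,
    100, 200, 50, 100,
     30,  60, 30,  60,
      5,  10, 20,  40),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), c("A_1", "A_2", "B_1", "B_2"))
)

# Step 1: normalize for library size and composition.
# Median-of-ratios: compare each sample to the per-gene geometric mean
# and take the median ratio as that sample's size factor.
# (All toy counts are > 0, so no zero handling is needed here.)
log_geo_means <- rowMeans(log(counts))
size_factors <- apply(counts, 2, function(col) {
  exp(median(log(col) - log_geo_means))
})
norm_counts <- sweep(counts, 2, size_factors, "/")

# Step 2: log2-transform with a pseudocount of 1
log_counts <- log2(norm_counts + 1)

# Step 3: Z-scale each gene (row) to mean 0 and standard deviation 1
z <- t(scale(t(log_counts)))

print(size_factors)
print(round(z, 3))
```

Step 3 is the part that a standardise call in Mfuzz would cover, so with that package only steps 1 and 2 need to happen beforehand.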

Thank you very much for this clear explanation. Since my last post I've read several posts about normalization but hadn't found such clear information.
