Question

Average biological replicates in RNA-seq

boczniak767 ▴ 740

@maciej-jonczyk-3945

Poland

Hi,

I want to use Mfuzz to cluster RNA-seq results.

The problem is that I can't find an easy way to average biological replicates. I assumed there would be a simple function for averaging replicates by condition, using the info from phenoData for an ExpressionSet or the sample info in the case of a DESeqDataSet.

I use DESeq2, so I have a DESeqDataSet object. I've prepared an ExpressionSet object with standardised FPKM values for Mfuzz, but I've realised that I need to average the values for each condition.

I know I could use aggregate to average the data in a data frame, but that is error-prone for an object with many columns.

Mfuzz RNASeq • 2.1k views

ADD COMMENT • link 23 months ago • updated 4 months ago boczniak767 ▴ 740

score 2 · Accepted Answer · 2024-02-19

ATpoint ★ 5.0k

@atpoint-13662

Germany

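There are many ways to do that; here are two:

library(DESeq2)
library(limma)

set.seed(1)
dds <- makeExampleDESeqDataSet()
ntf <- normTransform(dds)  # log2(normalized counts + 1)

# base R, assuming condition is a factor
# rowsum() sums the rows (samples, after transposing) within each condition
rs <- rowsum(t(assay(ntf)), group = ntf$condition)
# replicates per condition, in factor-level order (same order as the rows of rs)
rn <- as.numeric(table(ntf$condition))
averaged <- t(rs/rn)  # genes x conditions

# limma with transpose: avereps() averages rows that share an ID
averaged <- t(avereps(t(assay(ntf)), ID = ntf$condition))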
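
As a quick sanity check of the base R approach, here is a minimal sketch on a toy matrix with an unbalanced design. The matrix `x`, the `condition` factor and the `expected` object are made up for illustration and are not part of the example above; the idea is only to confirm that dividing the `rowsum()` result by the replicate counts gives the same numbers as taking `rowMeans()` per condition.

```r
# Toy data (hypothetical): 5 genes x 6 samples, unbalanced groups
set.seed(1)
x <- matrix(rnorm(5 * 6), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("s", 1:6)))
condition <- factor(c("A", "A", "A", "B", "B", "C"))

# Sum samples within each condition, then divide by the group sizes.
# rs has one row per condition, so the length-3 vector rn is recycled
# down each column and every row is divided by its own group size.
rs <- rowsum(t(x), group = condition)
rn <- as.numeric(table(condition))
averaged <- t(rs / rn)  # genes x conditions

# Reference: explicit per-condition row means
expected <- sapply(levels(condition), function(l) {
  rowMeans(x[, condition == l, drop = FALSE])
})

isTRUE(all.equal(averaged, expected))
```

Because both `rowsum()` and `table()` order the groups by factor level, the division lines up even when the conditions have different numbers of replicates.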

ADD COMMENT • link 23 months ago ATpoint ★ 5.0k

Thank you,

I used the limma method and it works. However, I had to change assay to exprs, since my data are in an ExpressionSet.

Why did you use ntf <- normTransform(dds) before averaging? I assumed that I would standardise the data in Mfuzz using the standardise function.

ADD REPLY • link 23 months ago boczniak767 ▴ 740

Use whatever you feel is correct; I just put together an arbitrary example for the sake of demonstration. Note that before standardization, you would still typically normalize and log2-transform your data.

ADD REPLY • link 23 months ago ATpoint ★ 5.0k

ATpoint, thank you for directing my attention to normalization. I assumed that using FPKM is OK, as it is mentioned on the Mfuzz page. Do you think I can use FPKM?

Or do you think that the approach presented at STHDA in "Normalization using DESeq2 (size factors)" is OK for clustering?

ADD REPLY • link 23 months ago boczniak767 ▴ 740

What you usually do is first normalize the data with respect to library size and composition (that is what the size factors do), then log2-transform, and then Z-scale, aka standardize. In DESeq2, the vst function covers the first two steps and additionally stabilizes the variance across the range of mean expression, which is beneficial for downstream analysis, so I would go with that. Alternatively, log2-transformed normalized counts work as well; that is what normTransform does.
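
To make the order of these steps concrete, here is a rough base R sketch on a made-up count matrix. Everything in it (the `counts` values, the three conditions, the object names) is hypothetical, and the size factors are computed with a simplified median-of-ratios calculation that assumes no zero counts. In a real analysis you would let DESeq2 handle the first two steps via vst or normTransform rather than doing them by hand.

```r
# Toy counts (hypothetical): 4 genes x 6 samples, 3 conditions x 2 replicates
counts <- matrix(c(
   10,  12,  30,  28, 90, 95,
  200, 210, 100,  90, 50, 55,
    5,   6,   5,   7, 20, 22,
   80,  85, 160, 150, 80, 90
), nrow = 4, byrow = TRUE,
   dimnames = list(paste0("gene", 1:4), paste0("s", 1:6)))
condition <- factor(rep(c("A", "B", "C"), each = 2))

# 1. Normalize: median-of-ratios size factors (simplified, no zero counts)
log_geo_means <- rowMeans(log(counts))
sf <- apply(counts, 2, function(s) exp(median(log(s) - log_geo_means)))
norm_counts <- sweep(counts, 2, sf, "/")

# 2. log2-transform with a pseudocount of 1
lg <- log2(norm_counts + 1)

# 3. Average the replicates within each condition
avg <- sapply(levels(condition), function(l) {
  rowMeans(lg[, condition == l, drop = FALSE])
})

# 4. Z-scale each gene (row) to mean 0 and standard deviation 1
z <- t(scale(t(avg)))
```

The last step plays the role of the standardization done before clustering, so the clustering sees each gene's profile across conditions rather than its absolute expression level.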

ADD REPLY • link 23 months ago ATpoint ★ 5.0k

Thank you very much for this clear explanation. Since my last post I've read several posts about normalization but haven't found such clear information.