Representative gene expression value in one condition with several replicates
Jack • 0
@jack-14069

Hi all,

I want to know how to get a single gene expression value for a condition that has several replicates.

For example, I have two conditions, M and N, each with two replicates: M1, M2, N1 and N2.

I want one value to represent the gene expression (FPKM or TPM) of M. Can I just use the mean of the replicates, i.e. M = (M1 + M2)/2?

Is there any other way to calculate the gene expression value for a condition?

rnaseq gene expression edger • 1.1k views
Aaron Lun ★ 27k
@alun
The city by the bay

As Mike says, this isn't an edgeR question. But I will pretend it is. If you have the counts, run an edgeR analysis, at least as far as calling glmFit, with the following design matrix:

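# One label per sample, in the same order as the columns of the count matrix
group <- c("M", "M", "N", "N")
# No-intercept design, so each coefficient corresponds to one group
design <- model.matrix(~0 + group)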


You didn't specify the nature of your replicates, but you may need to add a blocking factor if M1 is related to N1 (e.g., from the same individual) and M2 is related to N2.

Anyway, once you've done that, you can obtain the log-average expression of each level of group from the $coefficients field of the output of glmFit. This provides a general approach to getting condition-specific expression values, taking advantage of NB GLMs to give a more precise estimate than averaging FPKMs.
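
To make the steps above concrete, here is a minimal sketch of how this might look end to end. The count matrix is simulated purely for illustration (the gene names, sample names and simulation parameters are assumptions, not from the question), and the conversion to log2 counts-per-million at the end is my own addition for readability, relying on the fact that glmFit uses the log effective library size as the offset.

```r
library(edgeR)

# Simulated counts for illustration only: 100 genes x 4 samples (an assumption)
set.seed(1)
counts <- matrix(rnbinom(400, mu = 50, size = 10), nrow = 100,
                 dimnames = list(paste0("gene", 1:100),
                                 c("M1", "M2", "N1", "N2")))

group <- factor(c("M", "M", "N", "N"))
design <- model.matrix(~0 + group)   # one coefficient per group

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)              # TMM normalization factors
y <- estimateDisp(y, design)         # NB dispersion estimates
fit <- glmFit(y, design)

# Coefficients are natural logs of the fitted abundance relative to the
# effective library size; rescale to log2 counts-per-million.
log2cpm <- (fit$coefficients + log(1e6)) / log(2)
head(log2cpm)
```

Each column of log2cpm then holds one value per gene for the corresponding group. If the replicates are paired, the design would need the extra blocking term mentioned above, and the coefficients would no longer be simple per-group averages.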


Thank you very much!!

@gordon-smyth
WEHI, Melbourne, Australia

If you want expression values on a log-scale, then you can use the process explained by Aaron, which is similar to but better than just averaging the individual log-expression values.

If you want expression values on the unlogged scale, then the edgeR package provides functions to do this. Type

library(edgeR)
?cpmByGroup
?rpkmByGroup
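
As a rough sketch of how these two functions might be called, the example below uses simulated counts and made-up gene lengths (both are assumptions for illustration only). It is not a full analysis, just the minimum needed to get one unlogged value per gene per group.

```r
library(edgeR)

# Simulated counts and hypothetical gene lengths, for illustration only
set.seed(1)
counts <- matrix(rnbinom(400, mu = 50, size = 10), nrow = 100,
                 dimnames = list(paste0("gene", 1:100),
                                 c("M1", "M2", "N1", "N2")))
gene_length <- sample(500:5000, 100, replace = TRUE)  # in bp, made up

group <- factor(c("M", "M", "N", "N"))
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
y <- estimateDisp(y, model.matrix(~0 + group))

# One column per group; the grouping is taken from y$samples$group
cpm_by_group  <- cpmByGroup(y)
rpkm_by_group <- rpkmByGroup(y, gene.length = gene_length)
head(cpm_by_group)
```

Both results are matrices with one row per gene and one column per group.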


Thank you very much for your advice!

@mikelove
United States

This isn't a DESeq2 (or edgeR) question really, so I'm removing the DESeq2 tag. The arithmetic or geometric mean of the TPM seems to be a reasonable number for the average relative abundance. I don't have any strong opinions about this though.
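
For what it's worth, a small base R sketch of the two averages might look like the following. The TPM values are invented for illustration, and the pseudocount of 1 in the geometric mean is an assumption I've added so that a zero in one replicate doesn't force the result to zero; other choices are possible.

```r
# Hypothetical TPM values for three genes in the two M replicates
tpm <- matrix(c( 10,  14,
                  0,   2,
                200, 180),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB", "geneC"), c("M1", "M2")))

# Arithmetic mean: M = (M1 + M2) / 2
arith_mean <- rowMeans(tpm)

# Geometric mean computed on the log scale; the pseudocount (an assumption)
# is added before the log and subtracted afterwards to avoid log(0)
pseudo <- 1
geo_mean <- exp(rowMeans(log(tpm + pseudo))) - pseudo

cbind(arith_mean, geo_mean)
```

The geometric mean is never larger than the arithmetic mean, and it is less affected by a single replicate with an unusually high value.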


Yes, you are right. I think it is good to hear your opinion.