Question: Representative gene expression value in one condition with several replicates
16 months ago
Jack0

Hi all,

I want to know how to get a single gene expression value for a condition that has several replicates.

For example, I have two conditions, M and N, each with two replicates: M1, M2, N1 and N2.

I want one value to represent the gene expression (FPKM or TPM) of M. Can I just use the mean of the replicates, i.e. M = (M1 + M2)/2?

Is there any other way to calculate the gene expression value for a condition?

Answer: Representative gene expression value in one condition with several replicates
16 months ago
Aaron Lun
Cambridge, United Kingdom

As Mike says, this isn't an edgeR question. But I will pretend it is. If you have the counts, go through an edgeR analysis - or at least to calling glmFit - with the following design matrix:

group <- c("M", "M", "N", "N")
design <- model.matrix(~0 + group)


You didn't specify the nature of your replicates, but you may need to add a blocking factor if M1 is related to N1 (e.g., from the same individual) and M2 is related to N2.

Anyway, once you've done that, you can obtain the log-average expression of each level of group from the $coefficients field of the output of glmFit. This provides a general approach to getting condition-specific expression values, taking advantage of NB GLMs to give a more precise estimate than averaging FPKMs.
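
The steps described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the count matrix is simulated, the object names (counts, y, fit, log2cpm) are made up for the example, and the conversion to log2-CPM relies on the glmFit coefficients being on the natural-log scale relative to the effective library size.

```r
library(edgeR)

set.seed(1)
# Simulated counts for 100 genes and 4 samples (hypothetical data, illustration only)
counts <- matrix(rnbinom(400, mu = 50, size = 10), nrow = 100,
                 dimnames = list(paste0("gene", 1:100),
                                 c("M1", "M2", "N1", "N2")))

group <- factor(c("M", "M", "N", "N"))
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)   # columns named "M" and "N"

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)             # TMM normalization factors
y <- estimateDisp(y, design)        # NB dispersion estimates
fit <- glmFit(y, design)

# One coefficient per gene and group, on the natural-log scale
# relative to the effective library size (assumption stated above).
# Rescale to log2 counts-per-million for easier reading.
log2cpm <- (fit$coefficients + log(1e6)) / log(2)
head(log2cpm)
```

If the samples are paired, the design would need the blocking factor mentioned above, and the group coefficients would then no longer be simple per-group averages.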

Answer: Representative gene expression value in one condition with several replicates
16 months ago
Gordon Smyth
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia

If you want expression values on a log-scale, then you can use the process explained by Aaron, which is similar to but better than just averaging the individual log-expression values.

If you want expression values on the unlogged scale, then the edgeR package provides functions to do this. Type

library(edgeR)
?cpmByGroup
?rpkmByGroup
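
A minimal usage sketch of these two functions, assuming a DGEList whose group factor is already set. The counts are simulated and the gene lengths (2000 bp for every gene) are hypothetical placeholders; real lengths would come from your annotation.

```r
library(edgeR)

set.seed(1)
# Simulated counts for 100 genes and 4 samples (hypothetical data, illustration only)
counts <- matrix(rnbinom(400, mu = 50, size = 10), nrow = 100,
                 dimnames = list(paste0("gene", 1:100),
                                 c("M1", "M2", "N1", "N2")))

y <- DGEList(counts = counts, group = c("M", "M", "N", "N"))
y <- calcNormFactors(y)

# One CPM value per gene and group; groups are taken from y$samples$group
cpm_by_group <- cpmByGroup(y)

# RPKM needs gene lengths in bp; these are placeholder values
gene_length <- rep(2000, nrow(counts))
rpkm_by_group <- rpkmByGroup(y, gene.length = gene_length)

head(cpm_by_group)
head(rpkm_by_group)
```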


Answer: Representative gene expression value in one condition with several replicates
16 months ago
Michael Love
United States

This isn't a DESeq2 (or edgeR) question really, so I'm removing the DESeq2 tag. The arithmetic or geometric mean of the TPM seems to be a reasonable number for the average relative abundance. I don't have any strong opinions about this though.
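
For illustration, both kinds of mean can be computed per condition in base R. The TPM values below are made up, and the object names (tpm, arith, geo) are just for this sketch. Note that the geometric mean is zero whenever any replicate has TPM of zero, so in practice a small offset is sometimes added before taking logs; that choice is not made here.

```r
# Hypothetical TPM values: 2 genes, 2 replicates per condition
tpm <- matrix(c(10, 40, 2, 8,
                 5,  5, 1, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB"),
                              c("M1", "M2", "N1", "N2")))
group <- factor(c("M", "M", "N", "N"))

# Arithmetic mean of the replicates within each condition
arith <- sapply(levels(group), function(g)
  rowMeans(tpm[, group == g, drop = FALSE]))

# Geometric mean: average on the log scale, then back-transform
geo <- sapply(levels(group), function(g)
  exp(rowMeans(log(tpm[, group == g, drop = FALSE]))))

arith
geo
```

For geneA in condition M the arithmetic mean is 25 while the geometric mean is 20, which shows how the geometric mean is less influenced by the larger replicate.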