Question

Representative gene expression value in one condition with several replicates

Jack • 0

@jack-14069

Last seen 4.5 years ago

Hi all,

I want to know how to get a single gene expression value for a condition that has several replicates.

For example, I have two conditions, M and N, each with two replicates: M1, M2 and N1, N2.

I want one value to represent the gene expression (FPKM or TPM) of M. Can I just use the mean of the replicates, i.e. M = (M1 + M2) / 2?

Is there any other way to calculate the gene expression value for a condition?

Any advice is appreciated!

rnaseq gene expression edger • 1.8k views

ADD COMMENT • link updated 6.4 years ago by Gordon Smyth 50k • written 6.4 years ago by Jack • 0

Michael Love 41k

@mikelove

Last seen 1 day ago

United States

This isn't a DESeq2 (or edgeR) question really, so I'm removing the DESeq2 tag. The arithmetic or geometric mean of the TPM seems to be a reasonable number for the average relative abundance. I don't have any strong opinions about this though.
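To make the two averages concrete, here is a minimal base-R sketch. The TPM values are made-up numbers for illustration, and the pseudocount added before taking logs is an assumption (one common way to cope with zeros), not something prescribed in this thread.

```r
# Hypothetical TPM values for three genes in replicates M1 and M2 (made-up numbers)
tpm <- matrix(c( 10,  20,
                  0,   4,
                100, 400),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("geneA", "geneB", "geneC"), c("M1", "M2")))

# Arithmetic mean per gene: M = (M1 + M2) / 2
arith_mean <- rowMeans(tpm)

# Geometric mean per gene, computed on the log scale.
# The pseudocount is an assumption to avoid log(0) for unexpressed genes.
pseudo <- 1
geo_mean <- exp(rowMeans(log(tpm + pseudo))) - pseudo

print(cbind(arith_mean, geo_mean))
```

The geometric mean is never larger than the arithmetic mean and is less influenced by a single high replicate, so the two can differ noticeably when replicates disagree.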

ADD COMMENT • link 6.4 years ago Michael Love 41k

Yes, you are right. I think it is good to hear your opinion.

ADD REPLY • link 6.4 years ago Jack • 0

score 5 · Accepted Answer · 2017-12-20

As Mike says, this isn't an edgeR question. But I will pretend it is. If you have the counts, run an edgeR analysis, at least as far as calling glmFit, with the following design matrix:

group <- c("M", "M", "N", "N")
design <- model.matrix(~0 + group)

You didn't specify the nature of your replicates, but you may need to add a blocking factor if M1 is related to N1 (e.g., from the same individual) and M2 is related to N2.

Anyway, once you've done that, you can obtain the log-average expression of each level of group from the $coefficients field of the glmFit output (the values are on the natural-log scale). This is a general approach to getting condition-specific expression values, and it takes advantage of NB GLMs to give a more precise estimate than averaging FPKMs.
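The steps described above can be sketched end to end as follows. This is only an illustration under stated assumptions: the count matrix is simulated, the column names of the design are shortened for readability, and no blocking factor is included. It assumes the edgeR package is installed.

```r
library(edgeR)

set.seed(1)
# Simulated counts purely for illustration: 200 genes x 4 samples
counts <- matrix(rnbinom(200 * 4, mu = 50, size = 10), ncol = 4,
                 dimnames = list(paste0("gene", 1:200),
                                 c("M1", "M2", "N1", "N2")))

group <- factor(c("M", "M", "N", "N"))
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)   # rename "groupM"/"groupN" to "M"/"N"

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)             # TMM normalization factors
y <- estimateDisp(y, design)        # NB dispersion estimates
fit <- glmFit(y, design)            # one coefficient per gene per group

# Coefficients are natural-log expression relative to effective library size
log_expr <- fit$coefficients

# Back-transform to a counts-per-million-like scale for easier reading
cpm_like <- exp(log_expr) * 1e6
head(cpm_like)
```

Each row of `log_expr` holds one value for M and one for N, which is the per-condition summary the question asks for.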

score 4 · Accepted Answer · 2017-12-20

Gordon Smyth 50k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

If you want expression values on a log-scale, then you can use the process explained by Aaron, which is similar to but better than just averaging the individual log-expression values.

If you want expression values on the unlogged scale, then the edgeR package provides functions to do this. Type:

library(edgeR)
?cpmByGroup
?rpkmByGroup
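As a rough sketch of how these two functions might be called, the example below uses simulated counts and hypothetical gene lengths. Neither comes from this thread, and the help pages above remain the authoritative description of the arguments.

```r
library(edgeR)

set.seed(2)
# Simulated counts purely for illustration: 200 genes x 4 samples
counts <- matrix(rnbinom(200 * 4, mu = 50, size = 10), ncol = 4,
                 dimnames = list(paste0("gene", 1:200),
                                 c("M1", "M2", "N1", "N2")))
group <- factor(c("M", "M", "N", "N"))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Group-wise CPM on the unlogged scale; the grouping is taken from y$samples$group
cpm_group <- cpmByGroup(y)

# RPKM needs gene lengths in bases; these lengths are hypothetical
gene_length <- sample(500:5000, nrow(counts), replace = TRUE)
rpkm_group <- rpkmByGroup(y, gene.length = gene_length)

head(cpm_group)
head(rpkm_group)
```

Both results are genes-by-groups matrices, so the column for M gives the single per-condition value asked about in the question.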