read counts for gene families
zhou_hye ▴ 10
@zhou_hye-15558

Hi, I am using edgeR to analyze RNA-seq data. I have generated a count table for each gene under each condition. However, my goal is not to look at differential expression of specific genes but of gene families. For example, I have genes A1, A2, ..., An in gene family A, and I want to see whether gene family A is differentially expressed.

My question is whether the expression of gene family A is simply the sum of the normalized read counts of genes A1 to An. Note that gene lengths differ slightly within a family. My understanding is that FPKM values are additive but read counts are not. If read counts are not additive, how should I analyze DE of gene families? And if they are additive, can I simply generate a new table of gene-family counts and follow the edgeR guide, treating families as genes? Thank you for your time.

 

 

@gordon-smyth
WEHI, Melbourne, Australia

As Ryan has already pointed out, you don't need to add anything, counts or otherwise. Just do a gene set test for your family of genes.

Suppose your DGEList object is dge, and suppose you have an annotation column called "Symbol". Then

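# Symbols of the genes that make up the family
Family <- c("A1","A2","A3")
# Tests the last coefficient of the design matrix by default
fry(dge, index=Family, design=design, geneid="Symbol")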

will test whether the gene family is differentially expressed.
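For anyone who wants to try this end to end, here is a minimal sketch of the surrounding workflow. The simulated counts, the two-group layout, and the gene names G4 to G200 are assumptions made up for illustration; only dge, design, Family and the "Symbol" column come from the answer above. It also assumes dispersions have been estimated before fry() is called.

```r
library(edgeR)

set.seed(1)

# Simulated counts: 200 genes x 6 samples (made-up data, for illustration only)
ngenes  <- 200
counts  <- matrix(rnbinom(ngenes * 6, mu = 100, size = 10), nrow = ngenes)
symbols <- c(paste0("A", 1:3), paste0("G", 4:ngenes))
rownames(counts) <- symbols

# Hypothetical design: 3 control and 3 treated samples
group <- factor(rep(c("ctrl", "trt"), each = 3))

# Build the DGEList with a "Symbol" annotation column
dge <- DGEList(counts = counts, group = group,
               genes = data.frame(Symbol = symbols))
dge <- calcNormFactors(dge)

# Design matrix and dispersion estimates
design <- model.matrix(~ group)
dge <- estimateDisp(dge, design)

# Gene set test for the family; tests the last design column (trt vs ctrl)
Family <- c("A1", "A2", "A3")
res <- fry(dge, index = Family, design = design, geneid = "Symbol")
res
```

The result is a one-row data frame giving the number of genes in the set, the direction of change, and p-values for the directional and mixed alternatives. No counts are summed at any point.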

@ryan-c-thompson-5618
Icahn School of Medicine at Mount Sinai…

Generally I wouldn't recommend trying to combine counts from multiple different genes. What precisely is your null hypothesis? If you want to know whether any genes in a family are differentially expressed, you can use roast or fry. If you want to know whether a family is enriched for differentially expressed genes, you can use camera.
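To make the two hypotheses concrete, here is a rough sketch that runs both kinds of test on the same family. The simulated data, the set name FamilyA, and the use of ids2indices() to turn symbols into row indices are my own assumptions for illustration, not something from the original question.

```r
library(edgeR)  # also attaches limma, which provides ids2indices()

set.seed(2)

# Made-up data: 200 genes x 6 samples, family A is the first three genes
ngenes  <- 200
counts  <- matrix(rnbinom(ngenes * 6, mu = 100, size = 10), nrow = ngenes)
symbols <- c(paste0("A", 1:3), paste0("G", 4:ngenes))
rownames(counts) <- symbols
group <- factor(rep(c("ctrl", "trt"), each = 3))

dge <- DGEList(counts = counts, group = group,
               genes = data.frame(Symbol = symbols))
dge <- calcNormFactors(dge)
design <- model.matrix(~ group)
dge <- estimateDisp(dge, design)

# Convert gene symbols to row indices (FamilyA is a hypothetical set name)
idx <- ids2indices(list(FamilyA = c("A1", "A2", "A3")),
                   identifiers = dge$genes$Symbol)

# Self-contained test: are the genes in the family DE at all?
self_contained <- fry(dge, index = idx, design = design)

# Competitive test: is the family more DE than the other genes?
competitive <- camera(dge, index = idx, design = design)

self_contained
competitive
```

The two p-values answer different questions, so they need not agree: a family can be clearly DE in the self-contained sense without standing out from the other genes.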


For an edgeR analysis, fry() is quicker and better than roast(). It's actually equivalent to using roast() with nrot=Inf.


Thanks for the correction. I've done most of my gene set testing with limma, so I'm less familiar with the options for edgeR.
