Question

Competitive gene set testing between two sets of genes

le2336 ▴ 20

@le2336-10789

Hello,

I ran camera() in edgeR to test whether 2 gene sets are highly ranked in my mutant data compared to my wild-type data in terms of differential expression relative to other genes.

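library(edgeR)  # also attaches limma, which provides camera() and makeContrasts()
design <- model.matrix(~0 + genotype)
# Strip the "genotype" prefix so the column names match the contrast below
colnames(design) <- sub("^genotype", "", colnames(design))
contrast <- makeContrasts(mutant - wildtype, levels=design)
camera_test <- camera(y, id_matrix, design=design, contrast=contrast)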

wildtype vs mutant	NGenes	Direction	PValue	FDR
Gene set 1	1879	Down	1.92E-20	4.2E-20
Gene set 2	4196	Down	2.76E-13	3.1E-13

To follow up on these results, I would like to test whether the difference in rank between these 2 gene sets is significant, i.e. to test whether Gene set 1 is more significantly downregulated in the mutant than Gene set 2. What is the best way to approach this? Thank you.

gene set testing camera edger • 2.3k views

updated 9.5 years ago by Gordon Smyth 53k • written 9.5 years ago by le2336 ▴ 20

score 2 · Accepted Answer · 2016-07-05

Gordon Smyth 53k

@gordon-smyth

WEHI, Melbourne, Australia

Well, this isn't a standard thing to do. I guess you could simply compute a two-sample t-test between the test statistics for the two sets. Something like this:

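fit <- glmFit(y, design)
lrt <- glmLRT(fit, contrast=contrast)
# Signed z-like statistic per gene: square root of the LR statistic, signed by logFC
z <- sign(lrt$table$logFC) * sqrt(lrt$table$LR)
names(z) <- rownames(lrt$table)  # lets id_matrix index by gene ID as well as by position
z.geneset1 <- z[ id_matrix[["Gene set 1"]] ]
z.geneset2 <- z[ id_matrix[["Gene set 2"]] ]
# Welch two-sample t-test comparing the statistics of the two sets
t.test(z.geneset1, z.geneset2)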

You could also visualize the differences using:

barcodeplot(z, index=id_matrix[["Gene set 1"]], index2=id_matrix[["Gene set 2"]] )

9.5 years ago Gordon Smyth 53k

Hi Gordon,

Thank you for your response. I ran the t-tests and all the results are "p-value < 2.2e-16". However, the barcode plots suggest that Gene set 1 tends to have more downregulated genes with more negative log-fold-changes than Gene set 2. I wonder if the two-sample t-test may be too sensitive to appreciate this difference.
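
A note on the "p-value < 2.2e-16" output: R prints this whenever a p-value falls below machine epsilon, and with thousands of genes per set even a modest shift in the mean statistic can reach that floor. The sketch below uses simulated z-statistics (the means and standard deviation are assumptions chosen for illustration; only the set sizes come from the camera table above) to show how the exact p-value, the estimated difference in means, and its confidence interval can be read from the t.test result. The effect size and interval may say more about how different the two sets are than the p-value alone.

```r
set.seed(42)

# Hypothetical z-statistics; set sizes match the camera table,
# but the means and sd are made up for illustration only.
z_set1 <- rnorm(1879, mean = -0.6, sd = 1)
z_set2 <- rnorm(4196, mean = -0.3, sd = 1)

# Welch two-sample t-test, as in the suggested approach
tt <- t.test(z_set1, z_set2)

# Exact p-value (the printed summary truncates it at machine epsilon)
tt$p.value

# Effect size: difference in mean z-statistic (set 1 minus set 2)
mean_diff <- unname(tt$estimate[1] - tt$estimate[2])
mean_diff

# 95% confidence interval for that difference
ci <- as.numeric(tt$conf.int)
ci
```

A negative mean_diff with an interval that excludes zero would indicate that set 1 is shifted further down than set 2, and its magnitude shows by how much on the z-statistic scale.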

Would the following type of comparison be reasonable? Within each genotype, I would first obtain a test statistic comparing Gene set 1 with Gene set 2. I would then compare those statistics between genotypes. The comparison would thus be: mutant (Gene set 1 vs Gene set 2) vs wildtype (Gene set 1 vs Gene set 2).
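
For what it is worth, when the statistic is a plain mean of log-expression values, this "difference of differences" is algebraically the same quantity as the difference in mean log-fold-change between the two sets. The sketch below checks that on simulated data. Everything in it is an assumption made for illustration: the matrix logexpr, the set indices set1 and set2, the sample sizes, and the use of simple means rather than the edgeR GLM fit.

```r
set.seed(1)

# Hypothetical log-expression matrix: 200 genes, 3 samples per genotype
genotype <- factor(rep(c("wildtype", "mutant"), each = 3))
logexpr <- matrix(rnorm(200 * 6, mean = 5), nrow = 200)
set1 <- 1:40     # hypothetical row indices for gene set 1
set2 <- 41:120   # hypothetical row indices for gene set 2

# Shift both sets down in the mutant samples, set 1 more strongly
is_mut <- genotype == "mutant"
logexpr[set1, is_mut] <- logexpr[set1, is_mut] - 1.0
logexpr[set2, is_mut] <- logexpr[set2, is_mut] - 0.4

# Route A: (set 1 vs set 2) within each genotype, then mutant vs wildtype
group_mean <- function(rows, g) mean(logexpr[rows, genotype == g])
within_mut <- group_mean(set1, "mutant") - group_mean(set2, "mutant")
within_wt <- group_mean(set1, "wildtype") - group_mean(set2, "wildtype")
route_a <- within_mut - within_wt

# Route B: per-gene log-fold-change, then set 1 vs set 2
logfc <- rowMeans(logexpr[, is_mut]) - rowMeans(logexpr[, !is_mut])
route_b <- mean(logfc[set1]) - mean(logfc[set2])

# The two routes agree up to floating-point error
all.equal(route_a, route_b)
```

Under these assumptions the two-step comparison targets the same contrast as comparing the per-gene statistics of the two sets directly, so it would not be expected to give a different answer on the mean scale.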

9.5 years ago le2336 ▴ 20

I don't understand what you mean. You say that you ran multiple t-tests, but I advised you to do only one t-test.

9.5 years ago Gordon Smyth 53k

My apologies for the confusion! I did only run one t-test as you recommended for this specific comparison, and obtained "p-value < 2.2e-16" as the result. The other t-tests I mentioned were run for other gene sets in the same dataset, again with only one t-test per comparison -- in those instances I also obtained "p-value < 2.2e-16". Perhaps this is why this isn't a standard thing to do. Many thanks again for your help.

I am still curious if the alternative comparison I described makes sense to perform, specifically for Gene set 1 and another set of genes that do not change in my mutant. As this is now a separate analysis that deviates from the original question, I could open this as a new question if you would prefer.

9.5 years ago le2336 ▴ 20