Question: How to Find Common Dysregulated Genes in Two or More Sets of Microarray Data?
3.7 years ago, pankajnarula84 wrote:

I have two (or more) microarray gene expression datasets, one for SARS and one for Parkinson's disease. I found the dysregulated genes in each set by applying the criteria log fold change greater than 1.5 and p-value < 0.01. My question is: how do I find the common dysregulated genes in these two sets? Which statistical tests should be applied? Which R packages are available for this kind of analysis? I am new to bioinformatics, so kindly bear with me if the question is very basic. Thanks in advance.



By common dysregulated genes, do you mean the ones that are significantly differentially expressed in both sets?

For that I usually use Venn diagrams. In the limma manual you can find examples of how to make Venn diagrams, or you can make them yourself with the gplots package.

Good luck.


Reply written 3.7 years ago by b.nota

Dear b.nota, thanks for your reply. Actually, visualization is not my problem. I want to know which statistical approach to use to identify the common dysregulated genes, given that the two sets contain different numbers of control and infected samples. Which criteria should be imposed to get the number of common significantly up-regulated and down-regulated genes in the two diseases?

Reply written 3.7 years ago by pankajnarula84

3.7 years ago, Chris Seidel (United States) wrote:

You could make a contingency table and use Fisher's exact test, or you could use the hypergeometric distribution (see ?phyper in R). Given a universe of genes in two experiments, if you identify a set of genes in experiment 1 and another set of genes in experiment 2, these tests can help you evaluate how likely a given degree of overlap is to arise by chance. As b.nota mentioned, I usually make a Venn diagram and then evaluate it with either of those tests. There's an R package that does this for you, called GeneOverlap.
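
To make the overlap test concrete, here is a minimal sketch in plain Python (not R, and not the GeneOverlap implementation) of the one-sided hypergeometric tail probability described above. The function name and the gene counts are hypothetical, chosen only for illustration, and the universe is assumed to be the set of genes measured in both experiments. In R the same quantity should be obtainable as phyper(overlap - 1, size_a, universe_size - size_a, size_b, lower.tail = FALSE).

```python
from math import comb


def overlap_p_value(universe_size, size_a, size_b, overlap):
    """Probability of seeing at least `overlap` shared genes by chance.

    Assumes gene list B (size_b genes) is drawn at random, without
    replacement, from a universe of `universe_size` genes, of which
    `size_a` belong to gene list A. This is the upper tail of the
    hypergeometric distribution, i.e. a one-sided Fisher's exact test.
    """
    largest_possible = min(size_a, size_b)
    all_draws = comb(universe_size, size_b)
    # Count the draws of size_b genes that contain k genes from list A,
    # for every k from the observed overlap up to the maximum possible.
    favourable = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(overlap, largest_possible + 1)
    )
    return favourable / all_draws


# Hypothetical numbers: 20000 genes on both arrays, 300 dysregulated in
# dataset 1, 500 in dataset 2, and 25 genes appearing in both lists.
p = overlap_p_value(universe_size=20000, size_a=300, size_b=500, overlap=25)
print(f"P(overlap >= 25 by chance) = {p:.3g}")
```

A small p-value here says only that the two lists overlap more than expected by chance; it does not say which individual genes are shared, which is simply the intersection of the two lists.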


Thanks, Chris Seidel. So effectively this means finding the dysregulated genes in list 1 and the dysregulated genes in list 2, then applying GeneOverlap to the two lists (using the function newGeneOverlap in the GeneOverlap package)? Sorry for my late reply and my limited understanding of bioinformatics.

Reply written 3.6 years ago by pankajnarula84

3.7 years ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

For each gene, you can use the maximum of the two p-values from the SARS and Parkinson datasets to test whether the gene is dysregulated in both diseases.

In other words, a gene is a common significant gene if it is significant in both diseases. It is as simple as that.

However, the method you have used to assess significance in each individual dataset does not seem the best. It would be better to apply an analysis method that controls the false discovery rate across the whole genome.
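
As a rough illustration of the two points above, the following Python sketch takes the maximum of the two per-gene p-values after a Benjamini-Hochberg adjustment within each dataset. This is one possible reading of the advice, not a prescribed pipeline: the function names, the gene identifiers, the p-values, and the 0.05 cutoff are all hypothetical, and restricting the adjustment to genes present in both datasets is an assumption. In R, p.adjust(p, method = "BH") performs the adjustment step.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def common_genes(p_first, p_second, alpha=0.05):
    """Genes whose larger adjusted p-value across both datasets is < alpha.

    `p_first` and `p_second` map gene identifiers to raw p-values.
    Only genes measured in both datasets are considered (an assumption).
    """
    genes = sorted(set(p_first) & set(p_second))
    adj_first = dict(zip(genes, benjamini_hochberg([p_first[g] for g in genes])))
    adj_second = dict(zip(genes, benjamini_hochberg([p_second[g] for g in genes])))
    # max(...) < alpha is the same as "significant in both datasets".
    return [g for g in genes if max(adj_first[g], adj_second[g]) < alpha]


# Hypothetical raw p-values for a handful of made-up gene identifiers.
p_sars = {"geneA": 0.001, "geneB": 0.5, "geneC": 0.002}
p_parkinson = {"geneA": 0.003, "geneB": 0.001, "geneC": 0.9, "geneD": 0.0001}
print(common_genes(p_sars, p_parkinson))
```

The sketch ignores the direction of change; to separate common up-regulated from common down-regulated genes, one would additionally require the log fold changes to have the same sign in both datasets.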
