How to Find Common Dysregulated Genes in Two or More Sets of Microarray Data?
@pankajnarula84-7534
Last seen 8.9 years ago
India

I have two (or more) microarray datasets, one for SARS (http://www.ncbi.nlm.nih.gov/geo/geo2r/?acc=GSE1739) and one for Parkinson's disease (http://www.ncbi.nlm.nih.gov/geo/geo2r/?acc=GSE7621). I identified the dysregulated genes in each set by requiring a log fold change greater than 1.5 and a p-value < 0.01. My question is: how do I find the dysregulated genes common to these two sets? Which statistical tests should be applied? Which R packages are available for this kind of analysis? I am new to bioinformatics, so kindly bear with me if the question is very basic. Thanks in advance.

disease genetics micro array data microarray • 2.2k views

Hi,

By common dysregulated genes, do you mean the ones that are significantly differentially expressed in both sets?

For that I usually use Venn diagrams. In the limma manual you can find examples of how to make Venn diagrams, or you can make them yourself with the gplots package.
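As a rough illustration of the overlap behind such a Venn diagram, here is a minimal sketch in base R. The gene identifiers (`g1`, `g2`, ...) and the two lists are made-up placeholders, not results from the GEO datasets; the optional plotting step assumes the limma package is installed.

```r
# Hypothetical lists of significant genes from each dataset (placeholders only).
sars_genes      <- c("g1", "g2", "g3", "g4", "g5")
parkinson_genes <- c("g6", "g2", "g5", "g7")

# Genes significant in both sets, and genes unique to each set.
common         <- intersect(sars_genes, parkinson_genes)
only_sars      <- setdiff(sars_genes, parkinson_genes)
only_parkinson <- setdiff(parkinson_genes, sars_genes)

# 0/1 membership matrix: one row per gene, one column per dataset.
all_genes  <- union(sars_genes, parkinson_genes)
membership <- cbind(SARS      = as.integer(all_genes %in% sars_genes),
                    Parkinson = as.integer(all_genes %in% parkinson_genes))
rownames(membership) <- all_genes

print(common)

# Optional: draw the Venn diagram with limma if it is available.
if (requireNamespace("limma", quietly = TRUE)) {
  limma::vennDiagram(limma::vennCounts(membership))
}
```

The membership matrix is the form that `vennCounts` accepts, so the same object serves for both counting and plotting.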

Good luck.

Ben


Dear B.Nota, thanks for your reply. Representation is not actually my problem. I want to know what statistical approach should be used to identify the common dysregulated genes, given that the two sets contain different numbers of control and infected samples. Which criteria should be imposed to get the number of common significantly up-regulated and down-regulated genes in the two diseases?

Chris Seidel ▴ 80
@chris-seidel-5840
Last seen 2.9 years ago
United States

You could make a 2x2 contingency table and use Fisher's exact test, or you could use the hypergeometric distribution (see ?phyper in R). Given a universe of genes measured in both experiments, if you identify one set of genes in experiment 1 and another set in experiment 2, these tests help you evaluate how likely a given degree of overlap would be by chance. Like b.nota, I usually make a Venn diagram and then evaluate the overlap with either of those tests. There is an R package called GeneOverlap that does this for you.
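A minimal base-R sketch of both tests follows. All counts here (universe size, set sizes, overlap) are invented for illustration; in practice the universe would be the genes measured in both experiments.

```r
# Hypothetical counts (assumptions for illustration only).
universe_size <- 20000  # genes measured in both experiments
n_set1        <- 300    # dysregulated genes in experiment 1
n_set2        <- 200    # dysregulated genes in experiment 2
overlap       <- 25     # genes dysregulated in both

# Hypergeometric upper tail: P(overlap >= observed) if set 2 were drawn
# at random from the universe.
p_hyper <- phyper(overlap - 1,
                  m = n_set1,
                  n = universe_size - n_set1,
                  k = n_set2,
                  lower.tail = FALSE)

# The same question as a 2x2 contingency table (filled column by column).
tab <- matrix(c(overlap,          n_set1 - overlap,
                n_set2 - overlap, universe_size - n_set1 - n_set2 + overlap),
              nrow = 2,
              dimnames = list(in_set2 = c("yes", "no"),
                              in_set1 = c("yes", "no")))

# One-sided Fisher's exact test for enrichment of the overlap.
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(tab)
print(c(hypergeometric = p_hyper, fisher = p_fisher))
```

The one-sided Fisher's exact test and the hypergeometric upper tail answer the same question, so the two p-values should agree. Note that these tests assess whether the overlap as a whole is larger than expected by chance; they do not say which individual genes are shared.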


Thanks, Chris Seidel. So effectively this means: find the dysregulated genes in list 1 and the dysregulated genes in list 2, then apply GeneOverlap to the two lists (using the function newGeneOverlap in the GeneOverlap package)? Sorry for my late reply and my limited understanding of bioinformatics.
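A sketch of that workflow, assuming the GeneOverlap package is installed from Bioconductor. The gene identifiers and the universe are hypothetical placeholders, and the block is skipped if the package is not available.

```r
# Hypothetical universe and gene lists (placeholders, not real results).
universe <- paste0("gene", 1:1000)   # genes measured in both experiments
list1    <- universe[1:100]          # dysregulated genes, dataset 1
list2    <- universe[51:130]         # dysregulated genes, dataset 2

shared <- intersect(list1, list2)

if (requireNamespace("GeneOverlap", quietly = TRUE)) {
  library(GeneOverlap)
  # Build the overlap object; genome.size is the size of the gene universe.
  go_obj <- newGeneOverlap(list1, list2, genome.size = length(universe))
  # Run Fisher's exact test on the overlap.
  go_obj <- testGeneOverlap(go_obj)
  print(go_obj)
  print(getPval(go_obj))
}
```

The choice of `genome.size` matters: it should reflect the genes that could have been detected in both experiments, not the whole genome by default.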

@gordon-smyth
Last seen 48 minutes ago
WEHI, Melbourne, Australia

For each gene, you can use the maximum of the two p-values from the SARS and Parkinson datasets to test whether the gene is dysregulated in both diseases.

In other words, a gene is a common significant gene if it is significant in both diseases. It is as simple as that.

However, the method you have used to assess significance in each individual dataset does not seem the best. It would be better to apply an analysis method that controls the false discovery rate across the whole genome.
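The maximum-p-value rule above can be sketched in base R as follows. The two tables are tiny made-up examples; the column names `logFC` and `P.Value` follow limma-style output and are an assumption about how the results are stored. Applying a Benjamini-Hochberg adjustment to the maximum p-values and requiring the same direction of change are illustrative choices, not part of the answer above.

```r
# Hypothetical per-gene results for each disease (invented numbers).
sars <- data.frame(gene    = c("g1", "g2", "g3", "g4", "g5"),
                   logFC   = c(2.1, -1.8, 0.3, 1.9, -2.5),
                   P.Value = c(0.001, 0.002, 0.5, 0.004, 0.003))
park <- data.frame(gene    = c("g1", "g2", "g3", "g4", "g6"),
                   logFC   = c(1.7, -2.0, 1.6, -1.9, 2.2),
                   P.Value = c(0.003, 0.0005, 0.001, 0.002, 0.004))

# Keep only genes measured in both datasets.
both <- merge(sars, park, by = "gene", suffixes = c(".sars", ".park"))

# A gene is significant in both diseases only if its larger p-value is small.
both$p.max <- pmax(both$P.Value.sars, both$P.Value.park)

# Assumption: control the false discovery rate on the maximum p-values.
both$fdr <- p.adjust(both$p.max, method = "BH")

# Assumption: also require the same direction of change in both diseases.
both$same.dir <- sign(both$logFC.sars) == sign(both$logFC.park)

common <- both[both$fdr < 0.05 & both$same.dir, ]
print(common)
```

Dropping the `same.dir` condition would also keep genes that are up-regulated in one disease and down-regulated in the other, which may or may not be of interest.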
