Question

How to Find Common Dysregulated Genes in Two or More Sets of Microarray Data?

pankajnarula84 ▴ 20

@pankajnarula84-7534

India

I have two (or more) microarray datasets, one for SARS (http://www.ncbi.nlm.nih.gov/geo/geo2r/?acc=GSE1739) and one for Parkinson's disease (http://www.ncbi.nlm.nih.gov/geo/geo2r/?acc=GSE7621). I found the dysregulated genes in each set by applying the criteria log fold change greater than 1.5 and p-value < 0.01. My question is how to find the common dysregulated genes in these two sets. Which statistical tests should be applied? Which packages are available in R for this kind of analysis? I am new to bioinformatics, so kindly bear with me if the question is very basic. Thanks in advance.

disease genetics micro array data microarray • 2.2k views

ADD COMMENT • link updated 9.0 years ago by Gordon Smyth 50k • written 9.0 years ago by pankajnarula84 ▴ 20

Hi,

By common dysregulated genes, do you mean the ones that are significantly differentially expressed in both sets?

For that I usually use Venn diagrams. In the limma manual you can find examples of how to make Venn diagrams, or you can make them yourself with the gplots package.

Good luck.

Ben

ADD REPLY • link 9.0 years ago b.nota ▴ 360

Dear B. Nota, thanks for your reply. Actually, representation is not my problem. I want to know the statistical approach for identifying the common dysregulated genes, since the two sets contain different numbers of control and infected samples. Which criteria should be imposed to get the number of common significantly up-regulated and down-regulated genes in the two diseases?

ADD REPLY • link 9.0 years ago pankajnarula84 ▴ 20

score 1 · Answer 1 · 2015-04-03

Chris Seidel ▴ 80

@chris-seidel-5840

United States

You could make a contingency table and use Fisher's exact test, or you could use the hypergeometric distribution (see ?phyper in R). Given a universe of genes measured in two experiments, if you identify a set of genes in experiment 1 and another set of genes in experiment 2, these tests tell you how likely an overlap at least as large as the one you observe would be by chance. As b.nota mentioned, I usually make a Venn diagram and then evaluate it with either of those tests. There's a package in R which does this for you, called GeneOverlap.
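
To make the hypergeometric calculation concrete, here is a minimal sketch in plain Python (standard library only). The function name, the gene identifiers, and the toy universe are all made up for illustration, and the sketch assumes the universe is the set of genes measured in both experiments. The upper-tail probability it returns should correspond to phyper(k - 1, K, N - K, n, lower.tail = FALSE) in R, which is also the one-sided Fisher's exact test on the 2x2 table.

```python
from math import comb


def overlap_p_value(universe, list1, list2):
    """Return (overlap size, P(overlap >= observed)) under a hypergeometric null.

    universe: all genes that could have been called in both experiments
    list1, list2: dysregulated genes from experiment 1 and experiment 2
    """
    universe = set(universe)
    a = set(list1) & universe  # ignore genes outside the shared universe
    b = set(list2) & universe
    N, K, n = len(universe), len(a), len(b)
    k = len(a & b)  # observed overlap
    # Sum hypergeometric probabilities for overlaps of size k, k+1, ..., min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, tail / comb(N, n)


# Toy example with hypothetical gene IDs: 10 genes, lists of size 5 and 4
universe = [f"g{i}" for i in range(10)]
list1 = ["g0", "g1", "g2", "g3", "g4"]
list2 = ["g3", "g4", "g5", "g6"]
k, p = overlap_p_value(universe, list1, list2)
print(k, p)
```

A small p-value here says the two lists overlap more than expected by chance; it does not say which individual genes are dysregulated in both diseases.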

ADD COMMENT • link 9.0 years ago Chris Seidel ▴ 80

Thanks, Chris Seidel. So effectively it means finding the dysregulated genes in list 1 and the dysregulated genes in list 2, and then applying GeneOverlap to the two lists (using the function newGeneOverlap in the GeneOverlap package)? Sorry for my late reply and poor understanding of bioinformatics.

ADD REPLY • link 9.0 years ago pankajnarula84 ▴ 20

score 1 · Answer 2 · 2015-04-04

For each gene, you can use the maximum of the two p-values from the SARS and Parkinson datasets to test whether the gene is dysregulated in both diseases.

In other words, a gene is a common significant gene if it is significant in both diseases. It is as simple as that.
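
As a rough illustration of the maximum p-value rule, here is a small Python sketch. The gene names and p-values are invented, the function names are hypothetical, and the 0.01 cutoff is simply the one from the question; it assumes each dataset has already been reduced to one p-value per gene with matching gene identifiers.

```python
def max_p_per_gene(p_sars, p_parkinson):
    """Combine two {gene: p-value} maps by taking the larger p-value per gene.

    Only genes present in both datasets are kept.
    """
    shared = p_sars.keys() & p_parkinson.keys()
    return {gene: max(p_sars[gene], p_parkinson[gene]) for gene in shared}


def common_genes(p_sars, p_parkinson, alpha=0.01):
    """Genes whose maximum p-value is below alpha, i.e. significant in both."""
    combined = max_p_per_gene(p_sars, p_parkinson)
    return sorted(gene for gene, p in combined.items() if p < alpha)


# Made-up p-values, for illustration only
p_sars = {"GENE_A": 0.001, "GENE_B": 0.2, "GENE_C": 0.004}
p_parkinson = {"GENE_A": 0.005, "GENE_B": 0.0001, "GENE_D": 0.003}
print(common_genes(p_sars, p_parkinson))
```

Requiring the maximum to fall below the cutoff is the same as requiring both p-values to fall below it, which is why the two statements above are equivalent.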

However, the method you have used to assess significance in each individual dataset does not seem the best. It would be better to apply an analysis method that controls the false discovery rate across the whole genome.
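
One widely used way to control the false discovery rate is the Benjamini-Hochberg adjustment. The answer does not name a specific method, so treat this as an assumed example and not necessarily what was intended. The sketch below is plain Python with a hypothetical function name and made-up p-values; in R the equivalent is p.adjust(p, method = "BH").

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four genes
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Genes would then be called significant when the adjusted value, not the raw p-value, falls below the chosen FDR level.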