I am currently studying the biology of irradiated cells, and I have downloaded 9 human microarray datasets from the GEO repository in order to perform “some kind” of meta-analysis. In detail, I would like to identify genuinely differentially expressed genes and subsequently conduct functional enrichment analysis. My main problem is that I am a newbie in R and statistical analysis, so I wonder how I should proceed with the analysis of my datasets. Specifically, 7 of the 9 datasets are Agilent (6 use the same platform, Agilent-014850 Whole Human Genome Microarray 4x44K G4112F, and one uses Agilent-026652 Whole Human Genome Microarray 4x44K v2), whereas both Illumina datasets use the Illumina HumanWG-6 v3.0 expression beadchip platform.
My first naive thought was to perform some kind of cross-normalization between the datasets of each platform, run two separate analyses (one per platform), and then compare the results (for instance, the final DE lists). However, apart from the obvious problem of dataset-specific effects, the experimental design adds further complexity: although there are 3 conditions in each dataset (control, bystander and irradiated cells), some time points differ between datasets or are missing from some of them. So, how could I proceed in a “safe way” with the actual analysis? Should I analyze each dataset separately, export the lists of differentially expressed genes (i.e. gene symbols), and then somehow identify the genes shared between the comparisons that the datasets have in common? Moreover, is there a package that can perform a meta-analysis on the statistics exported from each dataset?
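To make the "common genes between common comparisons" idea concrete, here is a minimal sketch of what I have in mind. It is written in Python purely for illustration (the real analysis would be done in R), and everything in it is an assumption: the dataset names, the comparison labels, the time point and the gene symbols are made-up placeholders, not results from my data.

```python
from collections import defaultdict

# Hypothetical per-dataset DE results: dataset -> comparison -> gene symbols.
# All names and genes below are placeholders, not real results.
de_lists = {
    "agilent_1": {
        "irradiated_vs_control_4h": {"GENE_A", "GENE_B", "GENE_C"},
        "bystander_vs_control_4h": {"GENE_B", "GENE_D"},
    },
    "agilent_2": {
        # This dataset has no bystander comparison at this time point.
        "irradiated_vs_control_4h": {"GENE_A", "GENE_C", "GENE_E"},
    },
    "illumina_1": {
        "irradiated_vs_control_4h": {"GENE_A", "GENE_C", "GENE_F"},
        "bystander_vs_control_4h": {"GENE_B", "GENE_G"},
    },
}


def common_genes(de_lists, min_datasets=2):
    """For each comparison found in at least `min_datasets` datasets,
    return those datasets and the genes that are DE in all of them."""
    # Regroup the results so that each comparison maps to its datasets.
    by_comparison = defaultdict(dict)
    for dataset, comparisons in de_lists.items():
        for comparison, genes in comparisons.items():
            by_comparison[comparison][dataset] = set(genes)

    result = {}
    for comparison, per_dataset in by_comparison.items():
        if len(per_dataset) < min_datasets:
            continue  # comparison is not shared by enough datasets
        shared_genes = set.intersection(*per_dataset.values())
        result[comparison] = {
            "datasets": sorted(per_dataset),
            "genes": sorted(shared_genes),
        }
    return result


shared = common_genes(de_lists)
for comparison, info in sorted(shared.items()):
    print(comparison, info["datasets"], info["genes"])
```

A plain intersection like this ignores the direction and size of the changes and becomes very strict as more datasets are added, which is part of the reason I am asking whether a proper meta-analysis of the exported statistics would be a better route.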
Any suggestion or feedback on this matter would be very helpful!