Question: Meta-analysis of microarrays (and RNA-seq) data
2.6 years ago, giroudpaul wrote:

Dear Bioconductor Scientists,

I am rather new to NGS data analysis. I taught myself what I know, as there are no bioinformaticians where I work.

Nevertheless, I have already analyzed my own Affymetrix HTA 2.0 data, as well as a couple of public datasets from Affymetrix HGU133 and Illumina HumanHT12 V3/4 microarrays (and I should analyze some Agilent 4x44K data soon). In the past I also got my hands on ChIP-seq/RNA-seq data and learned the basics (but it's been some time; I hope it's like riding a bike, you never really forget it ;) ).

I am working on an immune cell subtype, which can be summarized this way:

Primary cells are extracted from blood, let's call them the O cells. They can be differentiated into A, B and C subtypes (and more).

The trouble with public data is the inconsistency in the differentiation methods, the low number of replicates, and the fact that the controls are not the same (sometimes fresh O cells, sometimes O cells cultured in medium alone for the same time, sometimes no controls at all...). So I figured that it would make sense to combine all these data in a meta-analysis to gain power, and maybe also add RNA-seq results from similar experiments for increased precision.

The results I would like to obtain are:

  • Identify genes differentially expressed in specific conditions (A, B or C against O, B/C against A), either in all studies or in a majority of studies.
  • Identify gene expression profiles of A, B and C, in order to detect enrichment of potentially similar cells in cancer tissue data (microarray/RNA-seq).
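The "all studies or a majority of studies" criterion in the first goal can be sketched as a simple vote count. This is only an illustration in base R: the gene names, study names, and significance calls below are made up, and NA is assumed to mean that a gene was not measured on that study's platform.

```r
# Hypothetical per-study calls: TRUE = significant (e.g. adjusted p < 0.05),
# FALSE = not significant, NA = gene not measured in that study.
sig <- rbind(
  GENE1 = c(TRUE,  TRUE,  TRUE),
  GENE2 = c(TRUE,  FALSE, TRUE),
  GENE3 = c(FALSE, NA,    TRUE),
  GENE4 = c(FALSE, FALSE, NA)
)
colnames(sig) <- c("study1", "study2", "study3")

# Number of studies that measured each gene, and number calling it significant
n_measured <- rowSums(!is.na(sig))
n_sig      <- rowSums(sig, na.rm = TRUE)

# Significant in every study that measured the gene
in_all <- n_sig == n_measured

# Significant in a strict majority of the studies that measured the gene
in_majority <- n_sig / n_measured > 0.5

data.frame(n_measured, n_sig, in_all, in_majority)
```

Vote counting like this ignores effect direction and magnitude, so it is best read as a quick consistency check rather than a replacement for a proper meta-analysis.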

So far, I have read some literature (10.1371/journal.pmed.0050184; 10.1186/1471-2105-14-368) and found some packages (crossmeta, GeneMeta, metaArray, MetaOmics), but as I have limited statistical knowledge, the explanations are somewhat obscure to me, in particular as to which method (p-value vs effect size vs rank?) is best suited to my purpose.

I guess my questions for you, dear members, are:

  • Is this kind of analysis (as I explain it) possible? Even for a neophyte?
  • What is your advice on how to perform this? Which packages do you recommend? Could you share some experience on similar subjects?

Thank you for your time,


Answer: Meta-analysis of microarrays (and RNA-seq) data
2.6 years ago, alexvpickering wrote:

Hi Paul,

I am the author of crossmeta, which uses the same effect-size methods as GeneMeta. The methods were modified so that genes that are only measured in a subset of studies can still be included in the meta-analysis. I chose an effect-size (as opposed to p-value or rank) meta-analysis method largely because crossmeta was designed to produce a signature that can be used by ccmap to find drug candidates that either reverse or mimic a gene expression signature. From my current understanding, effect-size meta-analyses are generally preferable to p-value combination methods (see, e.g., the metap vignette). Rank-combination methods are less preferable still and mainly make sense when all you have is ordered lists of genes.
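To make the effect-size vs p-value distinction concrete, here is a minimal base-R sketch for a single gene. It is not crossmeta's or GeneMeta's actual code: it assumes a standard random-effects (DerSimonian-Laird) combination for the effect sizes and Fisher's method for the p-values, the function names combine_effects and fisher_combine are hypothetical, and the numbers are made up. NA is assumed to mark a study that did not measure the gene, which is simply dropped.

```r
# Random-effects (DerSimonian-Laird) combination for one gene.
# d: per-study effect sizes, v: their variances, NA = gene not measured.
combine_effects <- function(d, v) {
  keep <- !is.na(d) & !is.na(v)
  d <- d[keep]
  v <- v[keep]
  k <- length(d)

  # Fixed-effect estimate and Cochran's Q (between-study heterogeneity)
  w        <- 1 / v
  mu_fixed <- sum(w * d) / sum(w)
  Q        <- sum(w * (d - mu_fixed)^2)

  # Between-study variance, truncated at zero (undefined for one study)
  tau2 <- if (k > 1) max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w))) else 0

  # Random-effects weights, pooled effect, and two-sided p-value
  w_re <- 1 / (v + tau2)
  mu   <- sum(w_re * d) / sum(w_re)
  se   <- sqrt(1 / sum(w_re))
  z    <- mu / se
  c(mu = mu, se = se, z = z, p = 2 * pnorm(-abs(z)), n_studies = k)
}

# Fisher's method: combines p-values only, ignoring effect size and direction.
fisher_combine <- function(p) {
  p    <- p[!is.na(p)]
  stat <- -2 * sum(log(p))
  pchisq(stat, df = 2 * length(p), lower.tail = FALSE)
}

# Hypothetical gene measured in three of four studies
d <- c(1.2, 0.8, NA, 1.0)
v <- c(0.10, 0.20, NA, 0.15)
combine_effects(d, v)

# Hypothetical two-sided p-values for the same gene
fisher_combine(c(0.01, 0.20, NA, 0.03))
```

One practical difference: the effect-size result keeps a direction and magnitude (mu), whereas Fisher's method on two-sided p-values can report a small combined p-value even when studies disagree on the direction of change.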

Is this kind of analysis (as I explain it) possible ? Even for neophyte ?

This is a big part of what I hope crossmeta accomplishes. All you need is a list of microarray GSEs (crossmeta does not currently support RNAseq data) from GEO that you would like to include in your meta-analysis. After that, the basic workflow is:

library(crossmeta)

# studies from GEO
gse_names  <- c("GSE9601", "GSE15069")

# get raw data for specified studies
get_raw(gse_names)
# load and annotate raw data
esets <- load_raw(gse_names)

# perform differential expression analysis
anals <- diff_expr(esets)

# add sample sources (if you want to perform separate meta-analyses for different tissue sources)
anals <- add_sources(anals)

# perform effect-size meta-analysis
es_res <- es_meta(anals, by_source = TRUE)

crossmeta also does pathway meta-analyses using PADOG, which outperforms other methods at prioritizing expected pathways (ref1, ref2). To do so:

# pathway analysis for each contrast
path_anals <- diff_path(esets, anals)

# pathway meta analysis by tissue source
path_res <- path_meta(path_anals, by_source = TRUE)

Other than a list of GSEs that you want to include, all that you have to do is select control and test samples (when running diff_expr) and specify tissue sources (when running add_sources). Both of these functions use a GUI for user input.



For the true neophyte, I just released a web-app adaptation of crossmeta, RNA Meta Analysis. Unlike crossmeta, it lets you search for similar contrasts in 26,000+ studies and includes support for both microarray and RNA-Seq data.




Thank you for your answer, and sorry for replying this late.

I haven't had time to investigate the different packages I cited any further, as I am on sick leave at the moment, but I was planning to work out which package is the most suitable. Thank you for your explanation, it gives me a head start with your package ;)

written 2.5 years ago by giroudpaul

Hello Alex,

I had some time to look at crossmeta and ran into some trouble trying to follow the vignette. I also have some additional questions about the package and what it can do, so would it be possible to contact you in private?

written 2.4 years ago by giroudpaul

Hi Paul, 

Get in touch and I'll try to help you out: alexvpickering at gmail dot com.

written 2.4 years ago by alexvpickering