Question: Meta-analysis of microarrays (and RNA-seq) data
9 months ago by
giroudpaul30 wrote:

Dear Bioconductor Scientists,

I am rather new to NGS data analysis. I learned what I know on my own, as there are no bioinformaticians where I work.

Nevertheless, I have already analyzed my own Affymetrix HTA 2.0 data, as well as a couple of public datasets from Affymetrix HGU133 and Illumina HumanHT12 V3/4 microarrays (and should analyze some Agilent 4x44K data soon). In the past I also got my hands on ChIP-seq/RNA-seq data and learned the basics (but it's been some time; I hope it's like riding a bike, you never really forget it ;) ).

I am working on an immune cell subtype, which can be summarized as follows:

Primary cells are extracted from blood, let's call them the O cells. They can be differentiated into A, B and C subtypes (and more).

The trouble with public data is the inconsistency in differentiation methods, the low number of replicates, and the fact that controls are not the same (sometimes fresh O cells, sometimes O cells cultured in medium alone for the same duration, sometimes no controls...). So I figured that it would make sense to combine all these data in a meta-analysis to gain power, and maybe also add RNA-seq results from similar experiments for increased precision.

The results I would like to obtain are:

  • Identify genes differentially expressed in specific conditions (A, B or C against O; B/C against A), either in all studies or in a majority of studies.
  • Identify gene expression profiles of A, B and C in order to detect enrichment of potentially similar cells in cancer tissue data (microarray/RNA-seq).

So far, I have read some literature (10.1371/journal.pmed.0050184; 10.1186/1471-2105-14-368) and found some packages (crossmeta, GeneMeta, metaArray, MetaOmics), but as I have limited statistical knowledge, the explanations are somewhat obscure to me, in particular as to which method (p-value vs. effect size vs. rank?) is best suited to my purpose.

I guess my questions for you, dear members, are:

  • Is this kind of analysis (as I explain it) possible? Even for a neophyte?
  • What is your advice on how to perform it? Which packages do you recommend? Could you share some experience on similar subjects?

Thank you for your time,


ADD COMMENTlink modified 8 months ago by alexvpickering80 • written 9 months ago by giroudpaul30
8 months ago by
alexvpickering80 wrote:

Hi Paul,

I am the author of crossmeta, which uses the same effect-size methods as GeneMeta. The methods were modified so that genes that are only measured in a subset of studies can still be included in the meta-analysis. I chose to use an effect-size (as opposed to p-value or rank) meta-analysis method largely because crossmeta was designed to produce a signature that can be used by ccmap to find drug candidates to either reverse or mimic a gene expression signature. From my current understanding, effect-size meta-analyses are generally preferable to p-value combination methods (e.g. see the metap vignette). Rank-combination methods are less preferable still and would be chosen only if all you have is ordered lists of genes.
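To make the idea of an effect-size meta-analysis concrete, here is a minimal base-R sketch of a random-effects (DerSimonian-Laird) combination for a single gene. It is an illustration only and is not taken from crossmeta or GeneMeta: the function name pool_effects and the numbers are made up, and it assumes you already have one standardized effect size and its variance per study. Studies that did not measure the gene are passed as NA and skipped, which mirrors the idea of including genes measured in only a subset of studies.

```r
# Hypothetical helper (not part of crossmeta/GeneMeta): pool per-study
# effect sizes for ONE gene with a DerSimonian-Laird random-effects model.
# d: standardized effect size per study (NA if the gene was not measured)
# v: sampling variance of each effect size
pool_effects <- function(d, v) {
  keep <- !is.na(d) & !is.na(v)          # use only studies that measured the gene
  d <- d[keep]
  v <- v[keep]
  k <- length(d)

  w        <- 1 / v                       # inverse-variance (fixed-effect) weights
  mu_fixed <- sum(w * d) / sum(w)         # fixed-effect pooled estimate
  Q        <- sum(w * (d - mu_fixed)^2)   # Cochran's Q heterogeneity statistic
  denom    <- sum(w) - sum(w^2) / sum(w)

  # between-study variance, truncated at zero; undefined for a single study
  tau2 <- if (k > 1) max(0, (Q - (k - 1)) / denom) else 0

  w_re <- 1 / (v + tau2)                  # random-effects weights
  mu   <- sum(w_re * d) / sum(w_re)       # pooled effect size
  se   <- sqrt(1 / sum(w_re))             # standard error of the pooled effect
  z    <- mu / se

  c(mu = mu, se = se, z = z, p = 2 * pnorm(-abs(z)), n_studies = k)
}

# Made-up example: three studies, the second one did not measure the gene
res <- pool_effects(d = c(1.2, NA, 1.5), v = c(0.20, NA, 0.25))
print(res)
```

A p-value combination method would only see each study's p-value, so a large effect in a small study and a small effect in a large study could look alike; pooling effect sizes keeps both direction and magnitude, which is what a signature needs.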

Is this kind of analysis (as I explain it) possible? Even for a neophyte?

This is a big part of what I hope crossmeta accomplishes. All you need is a list of microarray GSEs (crossmeta does not currently support RNAseq data) from GEO that you would like to include in your meta-analysis. After that, the basic workflow is:

library(crossmeta)

# studies from GEO
gse_names  <- c("GSE9601", "GSE15069")

# get raw data for specified studies
get_raw(gse_names)

# load and annotate raw data
esets <- load_raw(gse_names)

# perform differential expression analysis
anals <- diff_expr(esets)

# add sample sources (if you want to perform separate meta-analyses for different tissue sources)
anals <- add_sources(anals)

# perform effect-size meta-analysis
es_res <- es_meta(anals, by_source = TRUE)

crossmeta also does pathway meta-analyses using PADOG, which outperforms other methods at prioritizing expected pathways (ref1, ref2). To do so:

# pathway analysis for each contrast
path_anals <- diff_path(esets, anals)

# pathway meta analysis by tissue source
path_res <- path_meta(path_anals, by_source = TRUE)

Other than a list of GSEs that you want to include, all that you have to do is select control and test samples (when running diff_expr) and specify tissue sources (when running add_sources). Both of these functions use a GUI for user input.



For the true neophyte, I just released a web-app adaptation of crossmeta. Unlike crossmeta, RNA Meta Analysis lets you upload your own data and includes support for both microarray and RNA-Seq data.



ADD COMMENTlink modified 4 months ago • written 8 months ago by alexvpickering80

Thank you for your answer, and sorry for replying this late.

I haven't had time to go further in my investigation of the different packages I cited, as I am on sick leave at the moment, but I had planned to work out which package was the most suitable. Thank you for your explanation, it gives me a head start with your package ;)

ADD REPLYlink written 8 months ago by giroudpaul30

Hello Alex,

I had some time to look into crossmeta and ran into some trouble trying to follow the vignette. I also have some additional questions about the package and what it can do, so would it be possible to contact you privately?

ADD REPLYlink written 7 months ago by giroudpaul30

Hi Paul, 

Get in touch and I'll try to help you out: alexvpickering at gmail dot com.

ADD REPLYlink written 7 months ago by alexvpickering80