Combining RNAseq and microarray data using edgeR/limma-voom
@5028264b

Hi all,

I am working on a large dataset consisting of multiple RNAseq and microarray studies from different labs and times. While we have a (functional) pipeline set up for this, I recently saw a post that mentioned using voomLmFit to counter the issues that can stem from having excess zeroes in the data.

Since we are combining RNAseq data and microarray data for a combined analysis, we see some of these data-sparsity issues: a number of the genes in the RNAseq dataset are simply not found in the microarray datasets, and not all of our microarray datasets share a full gene set either. Ideally, we would like to avoid simply removing the partially sparse genes from the dataset, since doing that would drastically reduce the number of genes available for further analyses.

My question is therefore whether a voomLmFit pipeline could be used for both the RNAseq and microarray data. That is, is voom transformation of microarray data harmful, and if so, is there another way to account for these data-sparsity issues without having to cut down our gene sets drastically?

Thanks, Adam

edit: For some more context, we are not combining samples across studies into any single groups. Rather, we want to perform a group-wise comparison (with the original groups from each study), contrasting the changes between conditions. We have no repeats of condition comparisons across studies.

limma voom RNASeq edgeR Microarray
@james-w-macdonald-5106

No, that's not a good idea at all. You should be doing a meta-analysis with these data rather than trying to combine like that. GeneMeta is one package you could use, but there are others that you can search for on the BioC website, like metapod.

The latter is a bit tricky if you use Stouffer's method, because it's supposed to be based on one-tailed p-values and there's no obvious way to implement that in a package. You have to know that and use your own code to accommodate it: convert all your p-values to one-tailed, do the test, and then convert back to two-tailed.
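To make the one-tailed conversion concrete, here is a minimal sketch of Stouffer's method in plain Python (standard library only). It is not code from metapod or any other package; the function names `to_one_tailed` and `stouffer_two_sided` are hypothetical, and it assumes you have, for each gene, a two-tailed p-value and the sign of the estimated effect (e.g. the logFC) from each separate analysis. In R, the same steps would use `pnorm` and `qnorm`.

```python
from statistics import NormalDist

_NORM = NormalDist()   # standard normal distribution
_EPS = 1e-300          # assumption: clamp p-values away from exactly 0 or 1


def to_one_tailed(p_two, effect):
    """Convert a two-tailed p-value into a one-tailed p-value for the
    alternative 'effect > 0', using the sign of the estimated effect."""
    half = p_two / 2.0
    p_one = half if effect > 0 else 1.0 - half
    return min(max(p_one, _EPS), 1.0 - 1e-16)


def stouffer_two_sided(p_two_list, effects):
    """Combine two-tailed p-values for one gene across studies.

    1. convert each p-value to one-tailed (direction-aware),
    2. turn each into a z-score and sum them, scaled by sqrt(k),
    3. convert the combined z-score back to a two-tailed p-value.
    """
    one_tailed = [to_one_tailed(p, e) for p, e in zip(p_two_list, effects)]
    z_scores = [-_NORM.inv_cdf(p) for p in one_tailed]
    z_comb = sum(z_scores) / len(z_scores) ** 0.5
    p_comb = 2.0 * _NORM.cdf(-abs(z_comb))
    return z_comb, p_comb


# Same direction in both studies: evidence accumulates.
z_same, p_same = stouffer_two_sided([0.05, 0.05], [1.2, 0.8])
# Opposite directions: the z-scores cancel instead of reinforcing.
z_opp, p_opp = stouffer_two_sided([0.05, 0.05], [1.2, -0.8])
```

The point of the conversion is the second case: if you fed the two-tailed p-values in directly, two studies that disagree on direction would still look like strong combined evidence.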

Hi again,

Thank you for the response! Can you expound on why it is not a good idea? I realize that the expression levels/differential expression levels are not directly comparable, but would this also hold true if we are performing a pathway analysis (ORA/GSEA)?

And which part of the question are you referring to with the last comment, the voom-transform or if there is another way?

Maybe also for some more context, we are not combining samples across studies into any single groups. Rather, we want to perform a group-wise comparison (with the original groups from each study), contrasting the changes between conditions. We have no repeats of condition comparisons across studies.

If you're saying that (as an example) you have treated and control in both assay types, and you want to combine and make comparisons between the two, then that's what I'm talking about. That's a bad idea IMO, and it sounds like a meta-analysis is better.

It's far better to make the comparisons within assay type, and then combine the results afterwards. You can combine using effect sizes (GeneMeta) or p-values (metapod). I would probably use p-values, and would probably use Stouffer's method, which is why I mentioned the one-tailed p-values.
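For the effect-size route, a rough sketch of the idea is an inverse-variance weighted average of the per-study estimates, computed gene by gene. This is a generic fixed-effect combination written in plain Python, not GeneMeta's actual implementation. The function name `combine_effects` and the input layout (one dict per study, mapping gene to a logFC and its standard error) are assumptions for illustration, and it presumes the same comparison (e.g. treated versus control) was fitted separately in each study. A gene that is missing from a study is simply combined over the studies where it was measured, so no gene has to be dropped up front.

```python
from statistics import NormalDist

_NORM = NormalDist()


def combine_effects(per_study):
    """Fixed-effect (inverse-variance) combination per gene.

    per_study: list of dicts, one per study, mapping
               gene -> (estimate, standard_error).
    Genes absent from a study are skipped for that study only.
    """
    genes = set()
    for study in per_study:
        genes.update(study.keys())

    combined = {}
    for gene in sorted(genes):
        observed = [study[gene] for study in per_study if gene in study]
        weights = [1.0 / se ** 2 for _, se in observed]
        w_sum = sum(weights)
        estimate = sum(w * est for w, (est, _) in zip(weights, observed)) / w_sum
        se = (1.0 / w_sum) ** 0.5
        z = estimate / se
        combined[gene] = {
            "estimate": estimate,
            "se": se,
            "p": 2.0 * _NORM.cdf(-abs(z)),   # two-tailed p-value
            "n_studies": len(observed),
        }
    return combined


# Hypothetical per-study results: gene B was only measured in the first study.
rnaseq = {"A": (1.0, 0.5), "B": (2.0, 1.0)}
microarray = {"A": (1.0, 0.5)}
result = combine_effects([rnaseq, microarray])
```

Keeping `n_studies` in the output makes it easy to see which combined results rest on a single platform.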

And it doesn't matter if you're doing gene set analyses after, if the statistics are questionable to begin with.
