svaseq: how many and which surrogate variables to pick
nicklesd ▴ 10

I have a general question concerning surrogate variable analysis.

I have a large RNA-seq data set from a heterogeneous population, and I'd like to identify the major hidden sources of variation so that I can adjust for them when performing differential gene expression analysis. svaseq() from the sva package finds 33 significant surrogate variables. That is a lot, and I don't want to include all of them in my model. Apparently the sva package previously had a function called svaplot() that let you visualize the percent of variation explained by each surrogate variable (I envision something like a scree plot), but that function is no longer included in the package.

So my question is: how do I pick the surrogate variables that explain most of the variation? And how do I determine a good number of variables to include?

Thanks,

Doro

 


Also wondering the same thing.   Did you find an answer, nicklesd?

Jeff Leek ▴ 640

You could try the alternative of using method = "be" in the software, which is sometimes a little better if the sample size of your experiment is very large. I removed the svaplot() function because it is a bit hard to judge by eye how many surrogate variables to include, and while the automated ways aren't entirely better, at least they are reproducible.
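For reference, a minimal sketch of what this can look like in R, assuming a current sva release in which num.sv() takes method = "be" (permutation-based) or "leek" and svaseq() takes an explicit n.sv argument. The simulated counts, the hidden factor, and all variable names are made up for illustration. The last part is a rough scree-like summary (share of total log-scale variance captured by each surrogate variable taken on its own); it is not a reimplementation of the removed svaplot().

```r
library(sva)

set.seed(1)
n_genes   <- 1000
n_samples <- 40

# Simulated data (an assumption for illustration): one known group and one
# hidden binary factor that shifts the first 200 genes.
group  <- factor(rep(c("ctrl", "case"), each = n_samples / 2))
hidden <- rep(c(0, 1), times = n_samples / 2)
mu <- matrix(100, nrow = n_genes, ncol = n_samples)
mu[1:200, hidden == 1] <- 300
counts <- matrix(rnbinom(n_genes * n_samples, mu = mu, size = 10),
                 nrow = n_genes)

mod  <- model.matrix(~ group)                          # full model
mod0 <- model.matrix(~ 1, data = data.frame(group))    # null model

# Estimate the number of surrogate variables on the log scale.
# "be" is the permutation-based estimate; "leek" is the other documented option.
logdat <- log(counts + 1)
n_be   <- num.sv(logdat, mod, method = "be")

# Pass the number explicitly so the choice is visible and reproducible.
# max(..., 1) only keeps this sketch from ending up with zero SVs.
n_sv  <- max(n_be, 1)
svobj <- svaseq(counts, mod, mod0, n.sv = n_sv)
sv    <- as.matrix(svobj$sv)

# Rough scree-like summary: fraction of the total (gene-centered) sum of
# squares explained by regressing every gene on one SV at a time.
centered <- logdat - rowMeans(logdat)
total_ss <- sum(centered^2)
pve <- apply(sv, 2, function(s) {
  s <- s - mean(s)
  sum((centered %*% s)^2) / sum(s^2) / total_ss
})
print(round(100 * pve, 1))
```

The per-variable percentages are marginal, so they need not add up cleanly when the surrogate variables are correlated with the known covariates, and they inherit the same by-eye ambiguity mentioned above.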

If you have a measured batch effect, one way some people select the number of surrogate variables is to pick the number of batches - but again that is a bit of a hack. 
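A sketch of that heuristic, again with simulated data and hypothetical names (batch, group, counts); the only sva call used is svaseq() with its n.sv argument, and the caveat about it being a hack still applies.

```r
library(sva)

set.seed(2)
n_genes <- 500

# Hypothetical measured batch (3 levels) and biological group.
batch <- factor(rep(c("b1", "b2", "b3"), each = 8))
group <- factor(rep(c("ctrl", "case"), times = 12))

# Simulated counts with a batch-specific shift on two gene sets.
mu <- matrix(100, nrow = n_genes, ncol = length(batch))
mu[1:100,   batch == "b2"] <- 250
mu[101:200, batch == "b3"] <- 250
counts <- matrix(rnbinom(length(mu), mu = mu, size = 10), nrow = n_genes)

mod  <- model.matrix(~ group)
mod0 <- model.matrix(~ 1, data = data.frame(group))

# Heuristic: use as many surrogate variables as there are batches.
n_sv  <- nlevels(batch)
svobj <- svaseq(counts, mod, mod0, n.sv = n_sv)
print(dim(svobj$sv))   # samples x surrogate variables
```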

To be honest, how many artifact estimates to include is quite a hard and open problem in the analysis of data from these experiments.

 

Jeff
