estimating variability in absence of replicates for edgeR
hcnbox • 0
@hcnbox-23063

I would like some help getting DE genes from what is turning out to be a more difficult design than I thought. The data come from elsewhere, so the design isn't mine (nor can I just repeat everything). I want to get DE genes between two gestational ages in fetal mouse development (E14 and E16), and I also want to look at fetal sex as a variable. My proposed analysis would be:

  1. E14 male vs E16 male
  2. E14 female vs E16 female
  3. E14 male vs E14 female
  4. E16 male vs E16 female

The fundamental design problem is that there are no replicates. I have RNAseq data from each of the four conditions, but only one sample for each. I have been following the edgeR Users Guide and accompanying vignette, and I run into problems at the step that uses voom to look at heteroscedasticity and transform the data. Here is the code where I get an error:

    par(mfrow=c(1,2))
    v <- voom(edger, design, plot=TRUE)

And here is the error message:

Warning message in min(x): "no non-missing arguments to min; returning Inf"
Warning message in max(x): "no non-missing arguments to max; returning -Inf"

Error in plot.window(...): need finite 'ylim' values
Traceback:

  1. voom(edger, design, plot = TRUE)
  2. plot(sx, sy, xlab = "log2( count size + 0.5 )", ylab = "Sqrt( standard deviation )", pch = 16, cex = 0.25)
  3. plot.default(sx, sy, xlab = "log2( count size + 0.5 )", ylab = "Sqrt( standard deviation )", pch = 16, cex = 0.25)
  4. localWindow(xlim, ylim, log, asp, ...)
  5. plot.window(...)

The 'plot=TRUE' argument isn't important overall; I get the same error if I make it 'plot=FALSE'. My research indicates the step won't run because voom must have data sets with replicates. I read about alternatives for when replicates are not available in section 2.11 of the edgeR Users Guide, and also all the communications I could locate on this issue here on the Bioconductor help pages. It looks like scenario 3 in the Users Guide is most applicable to my situation, but I'm not sure how to put that into effect, or whether another alternative is better. I have these two questions:

1) The "source" of my RNAseq data actually did RNAseq as part of three similar experiments over the course of a year or so, so there are three separate, full data files. I can't guarantee that everything was identical in each of those efforts, so I intended to treat these as three different looks at the same questions and do my analysis three times (and subsequently comment on how much variability in DE I found across the different data sets). I know the same Affymetrix unit was used each time, and the same mouse breed was used, but people and other factors were different. Am I better off considering the three data sets as replicates, realizing that unknown factors are a potential limitation, or using the approximation of variability described in section 2.11, which has the known limitation of being an approximation?

2) If instead I use the section 2.11 approach, I don't understand how to incorporate that information back into the analytical R code to move the analysis forward. Are there some guidelines for just how to do that?

I recognize much of this is asking for opinions, but I'm uncomfortable relying on just my opinion and hope to get input from others more expert than I. Thanks.

edgeR RNAseq voom variability estimator
@gordon-smyth
WEHI, Melbourne, Australia

If you have three repeats of this experiment, then it would be far better to analyse all the data at once. It is easy to allow for possible batch effects: just add an extra term to the model for experimental time.
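A minimal sketch of what such a combined analysis might look like, assuming the three data files are merged into one count matrix with 12 libraries. The sample sheet `targets`, the run labels `run1` to `run3`, the simulated counts and the contrast names are all hypothetical stand-ins, and the sign of each contrast (for example E16 minus E14) is an arbitrary choice to adapt to the real data:

```r
library(edgeR)  # attaches limma as well

# Hypothetical sample sheet: 2 ages x 2 sexes x 3 experimental runs = 12 libraries
targets <- expand.grid(age   = c("E14", "E16"),
                       sex   = c("M", "F"),
                       batch = c("run1", "run2", "run3"))
targets$group <- factor(paste(targets$age, targets$sex, sep = "."))

# Simulated counts stand in for the real data (1000 genes x 12 libraries)
set.seed(1)
counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), NULL))

# One coefficient per age-by-sex group, plus additive terms for experimental run
design <- model.matrix(~ 0 + group + batch, data = targets)
colnames(design) <- sub("^group", "", colnames(design))

y <- DGEList(counts = counts, group = targets$group)
keep <- filterByExpr(y, design)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)

# 12 libraries and 6 coefficients leave 6 residual df,
# so voom can now estimate the mean-variance trend
v <- voom(y, design, plot = FALSE)  # set plot = TRUE to view the trend
fit <- lmFit(v, design)

# The four comparisons from the question, adjusted for run
contr <- makeContrasts(
  AgeInMales   = E16.M - E14.M,
  AgeInFemales = E16.F - E14.F,
  SexAtE14     = E14.M - E14.F,
  SexAtE16     = E16.M - E16.F,
  levels = design)
fit2 <- eBayes(contrasts.fit(fit, contr))
topTable(fit2, coef = "AgeInMales")
```

With a single run the same design would have four libraries and four group coefficients, leaving no residual degrees of freedom, which is consistent with the failure reported in the question.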

As we have discussed before, you are obviously using an Illumina rather than an Affymetrix unit.


Thank you again. And yes again. I don't know why I have Affy on my brain. I'll have to make a big sign in front of my monitor that says "Illumina, you dolt!". Heber
