Edge R Issue from a basic differential expression analysis
1
0
Entering edit mode
@peterchernek-10682
Last seen 7.1 years ago

Hi, I am using edgeR to do a differential expression analysis. I am getting this subscript error after topTable highlighted below. Any help or hint would be appreciated!

 

dge <- DGEList(counts=exprs(eset.rnaseq))
               
## Require at least 5 counts in at least 5% of samples:
dge$counts <- dge$counts[(apply(dge$counts, 1, function(x) sum(x > 5)) > ncol(dge$counts) * 0.05), ]
design = model.matrix(~ batch_number +  gleason_score  + racevar, data=pData(eset.rnaseq))

v <- voom(dge, design, plot=TRUE)


fit <- lmFit(v, design)

fit <- eBayes(fit)

tt <- topTable(fit)

 

Removing intercept from test coefficients

Error in object$cov.coefficients[j, j, drop = FALSE] : 
  subscript out of bounds

edger • 2.1k views
ADD COMMENT
0
Entering edit mode

You say you're using edgeR, but the code you show is for a limma analysis.

ADD REPLY
0
Entering edit mode

Hi Ryan,

 

The package I am using to get this code to run is EdgeR. Should I also be using something else? 

ADD REPLY
0
Entering edit mode

I'm not telling what you should be using, I'm telling you what you are currently using. The voom, lmFit, eBayes, and topTable functions are all part of limma, not edgeR, as you can see from reading the help pages for those functions. If this is the analysis pipeline you are using, you should tag your question with the limma tag, not the edgeR tag.

ADD REPLY
0
Entering edit mode

The first standard question that we always ask is, what version of R and limma are you using? Please give output from sessionInfo().

Would it be right to assume that you are trying to run an example from Levi Waldron's LeviRmisc package? All the examples in that package run fine.

ADD REPLY
0
Entering edit mode

Hi Gordon,

No, this is not an example from LeviRmisc. This is my thesis that I am working on with Levi, though. He is very busy and I do not want to interrupt him with technical programming issues so that is why I am using the forum. We switched from Deseq2 to edgeR for some data since it was too large. I can get everything to work up until the topTable in which I am getting this subscript error. I am an medium level programmer so do not know how to troubleshoot every error I come across.

 sessionInfo()
R Under development (unstable) (2017-01-05 r71919)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0   

ADD REPLY
0
Entering edit mode

R development version, LeviRmisc I do not think works with this version, correct?

ADD REPLY
0
Entering edit mode

Why would it not?

ADD REPLY
0
Entering edit mode

You are supposed to show us the sessionInfo() output from the same R session that gave you the error. You have simply started a new R session with no packages loaded so we can't see what versions of edgeR and limma you are using. You could look at:

packageDescription("limma")

If you are not an experienced R user, you might be better of using the official release of Bioconductor and R rather than working with the devel version. The devel version is for developers who can diagnose problems when they see them.

ADD REPLY
0
Entering edit mode

Package: limma
Version: 3.31.16
Date: 2017-03-02
Title: Linear Models for Microarray Data
Description: Data analysis, linear models and differential expression for microarray
          data.
Author: Gordon Smyth [cre,aut], Yifang Hu [ctb], Matthew Ritchie [ctb], Jeremy Silver
          [ctb], James Wettenhall [ctb], Davis McCarthy [ctb], Di Wu [ctb], Wei Shi
          [ctb], Belinda Phipson [ctb], Aaron Lun [ctb], Natalie Thorne [ctb], Alicia
          Oshlack [ctb], Carolyn de Graaf [ctb], Yunshun Chen [ctb], Mette Langaas [ctb],
          Egil Ferkingstad [ctb], Marcus Davy [ctb], Francois Pepin [ctb], Dongseok Choi
          [ctb]
Maintainer: Gordon Smyth

ADD REPLY
0
Entering edit mode

I am using the devel version so I can work with Levi's code using multiassayexperiment

ADD REPLY
1
Entering edit mode
@gordon-smyth
Last seen 3 hours ago
WEHI, Melbourne, Australia

All the code you have given (and even the comment) is identical to code in the vignette of Levi Waldron's LeviRmisc package:

   https://github.com/lwaldron/tcga_prad/blob/master/TCGA_prad.Rmd

except that you have incorrectly omitted the 'coef' argument from the topTable() call. I suggest that you study Levi's code and fix the call to topTable(). Try for example

topTable(fit, coef="racevarblack")

Since you are working with Levi and obviously following his code, he would be your first point of call if you have problems. You might be confused by his pseudo code where he has given you a choice of options for the coef.

As for the error message, we cannot reproduce the error with any dataset that we have so we don't know what the problem is. I have asked you what version of limma you are using but you have not told me. As a first step, you could try updating limma and edgeR and see if it helps. If that doesn't work, then you could mail to me an RData file containing your eset.rnaseq object.

As Ryan has explained to you, you are actually using a mainly limma package DE pipeline. You have only used edgeR to create the DGEList object. All the other commands are limma. edgeR works with limma and limma is automatically loaded by edgeR.

Edit 3 hours later.

I can now see that you are using a recent version of limma on the devel repository, so the version isn't the problem. You should not be using topTable(fit) without specifying coef, because it is not appropriate for your data, but at the same time it should not give an error. If you want me to investigate further, please give a reproducible code example that I can run myself in which the required data is loaded as well as the code leading to the error.

If you are open to advice on your actual analysis, then the following code would be quicker and possibly better:

logCPM <- dpm(dge, prior.count=3)
fit <- lmFit(logCPM, design)
fit <- eBayes(fit, trend=TRUE, robust=TRUE)
topTable(fit, coef="racevarblack")
ADD COMMENT
0
Entering edit mode

limma version 3.31.16, I just showed my output of packageDescription("limma"), above

ADD REPLY
0
Entering edit mode

I am doing this analysis over again since I have to update some of the plots with new labels, but now I am getting this error so it is frustrating.

ADD REPLY
0
Entering edit mode

Ok, thanks for your help! I will try and put in the coef argument and see if it changes the error. If not I will send you all my code.

ADD REPLY
0
Entering edit mode

Hi Gordon,

I was able to get it to work fine using the coef argument. Why is there no space between racevar and black or african american necessary? Also, how would the code above be better in terms of my analysis?

coef="racevarblack or african american"
ADD REPLY

Login before adding your answer.

Traffic: 558 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6