Question

convertId returning "attempt to select less than one element in integerOneIndex" Error

CF556 • 0

I am just learning R and familiarizing myself with the Bioconductor packages. I am running through the KEGGprofile application example to play around with annotation. Early on I ran into the error shown below.

Here is the code for a smaller version of my data frame:

> sam <- Rnacounts[1:4, 1:4]

> sam
              X     USC_1     USC_2     USC_3
1 0610009B22Rik 26.993871 37.116868 19.684120
2 0610009O20Rik  3.470513  2.319068  4.517183
3 0610010F05Rik  8.087210  7.767922  6.730636
4 0610010K14Rik 21.892374 21.090751 28.342038
> convertId(sam, dataset = "mmusculus_gene_ensembl")

Error in args[[i]] : 
  attempt to select less than one element in integerOneIndex
> traceback()
7: is.factor(x)
6: as.factor(args[[i]])
5: interaction(f, drop = drop, sep = sep, lex.order = lex.order)
4: split.default(testStat, convertIdTable)
3: split(testStat, convertIdTable)
2: newIdMatrix(x, genesKept = genesKept, convertIdTable = newIdTable)
1: convertId(sam, dataset = "mmusculus_gene_ensembl")

I am not used to debugging R code, so I am not entirely sure where this error is coming from. I assume it is an indexing error where i has gone out of range, but I am not sure how to track down the specific issue (where does i come from, and what is it iterating over?). My other guess is that some of my genes may not be properly named, since many have the form '#######Rik', but I am not sure why that would throw this error.

Output of sessionInfo():

R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] grid      parallel  stats4    stats     graphics  grDevices utils     datasets 
 [9] methods   base     

other attached packages:
 [1] biomaRt_2.36.1              KEGGprofile_1.22.0          RCurl_1.95-4.10            
 [4] bitops_1.0-6                BiocInstaller_1.30.0        edgeR_3.22.2               
 [7] limma_3.36.1                knitr_1.20                  BiocStyle_2.8.2            
[10] EDASeq_2.14.0               ShortRead_1.38.0            GenomicAlignments_1.16.0   
[13] SummarizedExperiment_1.10.1 DelayedArray_0.6.0          matrixStats_0.53.1         
[16] Rsamtools_1.32.0            Biostrings_2.48.0           XVector_0.20.0             
[19] BiocParallel_1.14.1         Biobase_2.40.0              Gviz_1.24.0                
[22] GenomicRanges_1.32.3        GenomeInfoDb_1.16.0         IRanges_2.14.10            
[25] S4Vectors_0.18.2            BiocGenerics_0.26.0        

loaded via a namespace (and not attached):
Error in x[["Version"]] : subscript out of bounds
In addition: Warning message:
In FUN(X[[i]], ...) :
  DESCRIPTION file of package 'rtracklayer' is missing or broken

I also read that KEGG.db was deprecated, so if you have any recommendations about annotation packages I would love to hear them. Sorry if the post is improperly formatted; this is my first time on this forum.

Thanks!

Edit: I have re-formatted the post. I also removed all the Rik genes, and it still does not work. Making the gene names the index does not work either.

bioconductor annotation keggprofile • 8.7k views

updated 7.6 years ago by James W. MacDonald 68k • written 7.6 years ago by CF556 • 0

score 0 · Answer 1 · 2018-06-05

The help page for convertId is rather sparse, but the examples are instructive:

 temp<-cbind(rnorm(10),rnorm(10))
 row.names(temp)<-c("Q04837","P0C0L4","P0C0L5","O75379","Q13068",
                    "A2MYD1","P60709","P30462","P30475","P30479")
 colnames(temp)<-c("Exp1","Exp2")
 convertId(temp,filters="uniprotswissprot",keepMultipleId=TRUE)

Note that the input here is a matrix, where the row.names are the IDs. You are passing in a data.frame, where the first column contains the IDs, and the row.names (which is what will be used) are simply integers from 1:nrow(sam). In addition, the help page says this:

Usage:

     convertId(x, dataset = "hsapiens_gene_ensembl",
       filters = "uniprotswissprot", attributes = c(filters, "entrezgene"),
       genesKept = c("foldchange", "first", "random", "var", "abs"),
       keepNoId = T, keepMultipleId = F, verbose = F)

This is also instructive, as the default filter is 'uniprotswissprot', and the default attributes are that filter plus 'entrezgene'. Do note that the IDs you have there are mouse gene symbols (RIKEN clone names such as 0610009B22Rik), and are not UniProt/Swiss-Prot IDs. So if you are going to use the function and rely on the defaults, the lookup will not match your IDs and you will be sorely disappointed in the results.
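To make the two points above concrete, here is a minimal sketch using only the four rows shown in the question. The first part moves the ID column into the row names of a numeric matrix. The second part defines a wrapper, convert_mouse_symbols, which is a hypothetical name; it is not run here because it needs KEGGprofile and a connection to Ensembl BioMart. It assumes that "mgi_symbol" is the right biomaRt filter for these IDs, which you should confirm with biomaRt's listFilters() for the mouse dataset. The default 'entrezgene' attribute may also be named differently on newer Ensembl releases, so check listAttributes() as well.

```r
# Toy stand-in for the poster's `sam` data.frame (values copied from the question).
sam <- data.frame(
  X     = c("0610009B22Rik", "0610009O20Rik", "0610010F05Rik", "0610010K14Rik"),
  USC_1 = c(26.993871, 3.470513, 8.087210, 21.892374),
  USC_2 = c(37.116868, 2.319068, 7.767922, 21.090751),
  USC_3 = c(19.684120, 4.517183, 6.730636, 28.342038),
  stringsAsFactors = FALSE
)

# Drop the ID column so only numeric columns remain, then use the IDs
# as row names, which is the layout the convertId example uses.
sam_mat <- as.matrix(sam[, -1])
rownames(sam_mat) <- sam$X

# Hypothetical wrapper (assumption: "mgi_symbol" is the matching filter).
# Defined but not called: it requires KEGGprofile and network access.
convert_mouse_symbols <- function(expr_matrix) {
  KEGGprofile::convertId(
    expr_matrix,
    dataset = "mmusculus_gene_ensembl",
    filters = "mgi_symbol"
  )
}
```

With the packages installed and BioMart reachable, the call would be convert_mouse_symbols(sam_mat). Whether every symbol maps to an ID depends on the current annotation, so the result should be inspected rather than assumed.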

It is very rare that an end user needs to debug somebody's package code; the smart play is to peruse the help pages for the functions you are using and make sure you understand what is said there.