Question

affycoretools::annotateEset(eset, pd.ht.hg.u133.plus.pm) returns an error


giroudpaul ▴ 40

@giroudpaul-10031

Last seen 4.4 years ago

France

Hi

I have been trying to analyse the microarray dataset GSE85543, which was generated using the Affymetrix HT HG-U133+ PM Array Plate.

I usually annotate microarray data using affycoretools::annotateEset and the corresponding ChipDb package (e.g. hgu133plus2.db). The reference manual for affycoretools indicates that annotateEset can work with either a ChipDb object or an AffyGenePDInfo object.

However, when I try to annotate my data (after RMA), I get the following error:

"There is no annotation object provided with the x package"

What does this mean? Is there a problem with the package, or did I do something wrong?

Code:

library("BiocManager")
library("GEOquery")
library("affy")
library("oligo")
library("pd.ht.hg.u133.plus.pm")
library("affycoretools")
library("ggplot2")

celpath = "C:/Users/pgiroud/OneDrive - Elsalys Biotech/Bioinfo/GSE85543/CEL/"
celFiles <- list.celfiles(celpath, full.names=TRUE)
data <- oligo::read.celfiles(celFiles)

data.rma = oligo::rma(data, background=TRUE, normalize=TRUE) 

data.ann <- affycoretools::annotateEset(data.rma, pd.ht.hg.u133.plus.pm)

Session info:

R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252    LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C                   LC_TIME=French_France.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] affycoretools_1.58.2         pd.ht.hg.u133.plus.pm_3.12.0 DBI_1.0.0                   
 [4] RSQLite_2.1.4                oligo_1.50.0                 ggplot2_3.2.1               
 [7] Biostrings_2.54.0            XVector_0.26.0               IRanges_2.20.1              
[10] S4Vectors_0.24.1             oligoClasses_1.48.0          affy_1.64.0                 
[13] GEOquery_2.54.1              Biobase_2.46.0               BiocGenerics_0.32.0         
[16] BiocManager_1.30.10         

loaded via a namespace (and not attached):
  [1] backports_1.1.5             GOstats_2.52.0              Hmisc_4.3-0                
  [4] BiocFileCache_1.10.2        plyr_1.8.4                  lazyeval_0.2.2             
  [7] GSEABase_1.48.0             splines_3.6.1               BiocParallel_1.20.0        
 [10] GenomeInfoDb_1.22.0         digest_0.6.23               ensembldb_2.10.2           
 [13] foreach_1.4.7               htmltools_0.4.0             GO.db_3.10.0               
 [16] gdata_2.18.0                magrittr_1.5                checkmate_1.9.4            
 [19] memoise_1.1.0               BSgenome_1.54.0             cluster_2.1.0              
 [22] gcrma_2.58.0                limma_3.42.0                readr_1.3.1                
 [25] annotate_1.64.0             matrixStats_0.55.0          R.utils_2.9.2              
 [28] ggbio_1.34.0                askpass_1.1                 prettyunits_1.0.2          
 [31] colorspace_1.4-1            blob_1.2.0                  rappdirs_0.3.1             
 [34] xfun_0.11                   dplyr_0.8.3                 crayon_1.3.4               
 [37] RCurl_1.95-4.12             graph_1.64.0                genefilter_1.68.0          
 [40] zeallot_0.1.0               VariantAnnotation_1.32.0    survival_3.1-8             
 [43] iterators_1.0.12            glue_1.3.1                  gtable_0.3.0               
 [46] zlibbioc_1.32.0             DelayedArray_0.12.0         Rgraphviz_2.30.0           
 [49] scales_1.1.0                GGally_1.4.0                edgeR_3.28.0               
 [52] Rcpp_1.0.3                  xtable_1.8-4                progress_1.2.2             
 [55] htmlTable_1.13.3            foreign_0.8-72              bit_1.1-14                 
 [58] OrganismDbi_1.28.0          preprocessCore_1.48.0       Formula_1.2-3              
 [61] AnnotationForge_1.28.0      htmlwidgets_1.5.1           httr_1.4.1                 
 [64] gplots_3.0.1.1              RColorBrewer_1.1-2          ellipsis_0.3.0             
 [67] acepack_1.4.1               ff_2.2-14                   R.methodsS3_1.7.1          
 [70] pkgconfig_2.0.3             reshape_0.8.8               XML_3.98-1.20              
 [73] nnet_7.3-12                 dbplyr_1.4.2                locfit_1.5-9.1             
 [76] tidyselect_0.2.5            rlang_0.4.2                 reshape2_1.4.3             
 [79] AnnotationDbi_1.48.0        munsell_0.5.0               tools_3.6.1                
 [82] stringr_1.4.0               knitr_1.26                  bit64_0.9-7                
 [85] caTools_1.17.1.3            purrr_0.3.3                 AnnotationFilter_1.10.0    
 [88] RBGL_1.62.1                 R.oo_1.23.0                 xml2_1.2.2                 
 [91] biomaRt_2.42.0              compiler_3.6.1              rstudioapi_0.10            
 [94] curl_4.3                    affyio_1.56.0               PFAM.db_3.10.0             
 [97] tibble_2.1.3                geneplotter_1.64.0          stringi_1.4.3              
[100] GenomicFeatures_1.38.0      lattice_0.20-38             ProtGenerics_1.18.0        
[103] Matrix_1.2-18               vctrs_0.2.0                 pillar_1.4.2               
[106] lifecycle_0.1.0             data.table_1.12.6           bitops_1.0-6               
[109] rtracklayer_1.46.0          GenomicRanges_1.38.0        hwriter_1.3.2              
[112] R6_2.4.1                    latticeExtra_0.6-28         KernSmooth_2.23-16         
[115] gridExtra_2.3               affxparser_1.58.0           codetools_0.2-16           
[118] dichromat_2.0-0             gtools_3.8.1                assertthat_0.2.1           
[121] SummarizedExperiment_1.16.0 openssl_1.4.1               DESeq2_1.26.0              
[124] Category_2.52.1             ReportingTools_2.26.0       withr_2.1.2                
[127] GenomicAlignments_1.22.1    Rsamtools_2.2.1             GenomeInfoDbData_1.2.2     
[130] hms_0.5.2                   grid_3.6.1                  rpart_4.1-15               
[133] tidyr_1.0.0                 biovizBase_1.34.1           base64enc_0.1-3


ADD COMMENT • link updated 4.4 years ago by James W. MacDonald 65k • written 4.4 years ago by giroudpaul ▴ 40

score 2 · Accepted Answer · 2019-12-09


James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 32 minutes ago

United States

For most of the pdInfo packages, there is a file in the extdata directory that contains the annotation for that array. So if you ask for annotations from a pdInfo package, the function looks in the requisite place and tries to load it. So as an example, here is the Clariom D array:

> dir(system.file("extdata/", package = "pd.clariom.d.human"))
[1] "netaffxProbeset.rda"       "netaffxTranscript.rda"    
[3] "pd.clariom.d.human.sqlite"

And the file called netaffxTranscript.rda would be loaded. But for the package you are using, this is what is in that directory:

> dir(system.file("extdata/", package = "pd.ht.hg.u133.plus.pm"))
[1] "pd.ht.hg.u133.plus.pm.sqlite"

Which is why you get the error saying there isn't an annotation file in this package.
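The same check can be wrapped in a small function and run before calling annotateEset on any pdInfo package. This is only a sketch: the helper name has_netaffx_annotation is made up for illustration, and the file-name pattern is inferred from the Clariom D listing above rather than taken from the affycoretools source.

```r
# Hypothetical helper: does an installed package ship netaffx*.rda files
# in its extdata directory? The pattern is an assumption based on the
# Clariom D listing above.
has_netaffx_annotation <- function(pkg) {
  extdata <- system.file("extdata", package = pkg)
  # system.file() returns "" when the package or directory is missing
  if (!nzchar(extdata)) return(FALSE)
  length(list.files(extdata, pattern = "^netaffx.*\\.rda$")) > 0
}

# Given the directory listing above, this should be FALSE
has_netaffx_annotation("pd.ht.hg.u133.plus.pm")
```

Note that the function also returns FALSE when the package is not installed at all, so a FALSE result only says that no such file was found.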

I think this array has the same content as the hgu133plus2 array, so you might try

library(hgu133plus2.db)
anno <- do.call(cbind, lapply( c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"),
                       function(x) mapIds(x, featureNames(data.rma))))
fData(data.rma) <- AnnotatedDataFrame(data = anno)
validObject(data.rma)

ADD COMMENT • link 4.4 years ago James W. MacDonald 65k


Hi James,

Thank you for the explanation! You are right, it's the same chip as the hgu133plus2, but with only perfect match (PM) probes. However, I have not managed to make your solution work. I get the following message:

 Error in (function (classes, fdef, mtable)  : 
 unable to find an inherited method for function 'mapIds' for signature '"character"'

ADD REPLY • link 4.4 years ago giroudpaul ▴ 40


My bad. It should be

library(hgu133plus2.db)
anno <- do.call(cbind, lapply( c("PROBEID", "ENTREZID", "SYMBOL", "GENENAME"),
                       function(x) mapIds(hgu133plus2.db, featureNames(data.rma), x, "PROBEID")))
fData(data.rma) <- AnnotatedDataFrame(data = anno)
validObject(data.rma)

ADD REPLY • link 4.4 years ago James W. MacDonald 65k


Hello James,

I still ran into problems, because the probe names are not exactly the same in hgu133plus2.db and in my data:

> head(featureNames(data.rma))
[1] "1007_PM_s_at" "1053_PM_at"   "117_PM_at"    "121_PM_at"    "1255_PM_g_at" "1294_PM_at"  
> head(keys(hgu133plus2.db))
[1] "1007_s_at" "1053_at"   "117_at"    "121_at"    "1255_g_at" "1294_at"

I worked around it like this:

probes <- gsub("PM_", "", featureNames(data.rma))
anno <- do.call(cbind, lapply(c("ENTREZID", "SYMBOL", "GENENAME"),
                              function(x) mapIds(hgu133plus2.db, keys=probes,
                                                 column = x, keytype = "PROBEID")))

Also I removed "PROBEID" because it returned a memory error.
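The renaming can be checked on the six IDs printed above without loading any annotation package. The sketch below uses sub() with a fixed "_PM_" pattern instead of gsub("PM_", ...); this assumes the infix always appears as "_PM_" and is therefore slightly stricter, but it gives the same result on these IDs.

```r
# Probe IDs as printed by featureNames(data.rma) above
pm_ids <- c("1007_PM_s_at", "1053_PM_at", "117_PM_at",
            "121_PM_at", "1255_PM_g_at", "1294_PM_at")

# Replace the literal "_PM_" infix with "_" (first occurrence only)
plain_ids <- sub("_PM_", "_", pm_ids, fixed = TRUE)

# Keep the original IDs as names so results can be matched back
names(plain_ids) <- pm_ids
plain_ids
```

Keeping the original IDs as names makes it easy to map annotation results back to the array's own feature names later.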

But now I am stuck at the next step, just when I thought I had it all figured out. The next line returns:

fData(data.rma) <- AnnotatedDataFrame(data = anno)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function 'AnnotatedDataFrame' for signature '"matrix", "missing"'

So from what I understand, the problems are:

anno is a matrix
There are NA values in the anno matrix (10384 of 54715 probes have no annotation)

Is it normal that so many probes return no gene, given that it should be a "perfect match" only array? How do I solve this issue?

EDIT: I found the problem. I removed the "PM_" from the probe names in my feature data, but my assay data still has it, so I was trying to insert feature data with names like "1007_s_at" into an ExpressionSet whose featureNames look like "1007_PM_s_at". Instead of removing "PM_", I should add it. I will look into how to do this.

ADD REPLY • link 4.4 years ago giroudpaul ▴ 40


It works like this:

featureNames(data.rma) <- gsub("PM_", "", featureNames(data.rma))
anno <- do.call(cbind, lapply(c("ENTREZID", "SYMBOL", "GENENAME"),
                              function(x) mapIds(hgu133plus2.db, keys=featureNames(data.rma), column = x, keytype = "PROBEID")))
fData(data.rma) <- as.data.frame(anno)
validObject(data.rma)

I simply removed the "PM_" from the feature names of my data as well.
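Why the as.data.frame() step matters can be seen on a toy example in base R. The probe IDs and annotation values below are placeholders, not real mapIds() output. Binding named vectors with cbind() yields a character matrix, which is what AnnotatedDataFrame() rejected earlier in the thread, whereas a data.frame with the probe IDs as row names is the shape used in the working code.

```r
# Toy stand-ins for what mapIds() returns: named character vectors,
# with NA where a probe has no annotation (values are placeholders)
ids <- c("probe_a", "probe_b", "probe_c")
entrez <- setNames(c("111", NA, "333"), ids)
symbol <- setNames(c("GENE1", NA, "GENE3"), ids)

# cbind() on vectors gives a matrix; column names come from the
# argument names, row names from the vector names
anno <- cbind(ENTREZID = entrez, SYMBOL = symbol)
is.matrix(anno)

# Converting keeps the probe IDs as row names
anno_df <- as.data.frame(anno, stringsAsFactors = FALSE)
identical(rownames(anno_df), ids)

# Number of probes without annotation, per column
colSums(is.na(anno_df))
```

In the code above, the list passed to cbind is unnamed, so the resulting columns have no names and as.data.frame() labels them V1, V2, V3. Setting colnames(anno) to the requested column names before the conversion is an optional refinement.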

ADD REPLY • link 4.4 years ago giroudpaul ▴ 40