Search
Question: How to set custom Brainarray annotation for Biobase ExpressionSet object?
0
gravatar for lykhenko.olexandr
4 months ago by
lykhenko.olexandr0 wrote:

I have downloaded from ArrayExpress raw data for microarray experiment E-GEOD-14722 (GSE14722):

aeData = getAE("E-GEOD-14722", type = 'raw')

and turned it into an ExpressionSet object. In fact, a list of two object, since the experiment uses two platforms.

affyData = ae2bioc(mageFiles = aeData)

Now I want to set custom platform annotation for this ExpressionSet object and use it to get expression matrix:

affyData1 = affyData[[1]]
affyData1@annotation = "hgu133bhsentrezg.db"
eset.br = rma(affyData1)

But I get the following error:

Loading required package: AnnotationDbi
Loading required package: org.Hs.eg.db
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘pmindex’ for signature ‘"ChipDb"’
In addition: Warning message:
In rsqlite_fetch(res@ptr, n = n) :
  Don't need to call dbFetch() for statements, only for queries

while everything works fine with native Affymetrix annotation.

How to properly set custom annotation for ExpressionSet object?

Here is sessioninfo()

R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] hgu133bhsentrezg.db_22.0.0 org.Hs.eg.db_3.4.1         AnnotationDbi_1.38.1      
 [4] pd.hg.u133a_3.12.0         pd.hg.u133b_3.12.0         DBI_0.7                   
 [7] oligo_1.40.1               RSQLite_2.0                Biostrings_2.44.1         
[10] XVector_0.16.0             IRanges_2.10.2             S4Vectors_0.14.3          
[13] ArrayExpress_1.36.0        cowplot_0.7.0              ggfortify_0.4.1           
[16] ggplot2_2.2.1              stringr_1.2.0              sva_3.24.4                
[19] BiocParallel_1.10.1        genefilter_1.58.1          mgcv_1.8-17               
[22] nlme_3.1-131               affy_1.54.0                Biobase_2.36.2            
[25] BiocGenerics_0.22.0        oligoClasses_1.38.0       

loaded via a namespace (and not attached):
 [1] tidyr_0.6.3                bit64_0.9-7                splines_3.4.0             
 [4] foreach_1.4.3              assertthat_0.2.0           blob_1.1.0                
 [7] GenomeInfoDbData_0.99.0    lattice_0.20-35            glue_1.1.1                
[10] limma_3.32.2               digest_0.6.12              GenomicRanges_1.28.3      
[13] colorspace_1.3-2           preprocessCore_1.38.1      Matrix_1.2-10             
[16] plyr_1.8.4                 XML_3.98-1.9               pkgconfig_2.0.1           
[19] affxparser_1.48.0          zlibbioc_1.22.0            xtable_1.8-2              
[22] scales_0.4.1               affyio_1.46.0              ff_2.2-13                 
[25] tibble_1.3.3               annotate_1.54.0            SummarizedExperiment_1.6.3
[28] lazyeval_0.2.0             survival_2.41-3            magrittr_1.5              
[31] memoise_1.1.0              BiocInstaller_1.26.0       tools_3.4.0               
[34] matrixStats_0.52.2         munsell_0.4.3              DelayedArray_0.2.7        
[37] bindrcpp_0.2               compiler_3.4.0             GenomeInfoDb_1.12.2       
[40] rlang_0.1.1                grid_3.4.0                 RCurl_1.95-4.8            
[43] iterators_1.0.8            bitops_1.0-6               gtable_0.2.0              
[46] codetools_0.2-15           R6_2.2.2                   gridExtra_2.2.1           
[49] dplyr_0.7.1                bit_1.1-12                 bindr_0.1                 
[52] stringi_1.1.5              Rcpp_0.12.11

 

ADD COMMENTlink modified 4 months ago by James W. MacDonald45k • written 4 months ago by lykhenko.olexandr0
3
gravatar for James W. MacDonald
4 months ago by
United States
James W. MacDonald45k wrote:

You won't be able to do things that way if you want to use MBNI remapped cdfs. The ae2bioc function is hard-coded to get the Bioconductor pdInfo package for the respective array types. But you can get around it...

> aeData = getAE("E-GEOD-14722", type = 'raw')

> install.packages("http://mbni.org/customcdf/22.0.0/entrezg.download/pd.hgu133a.hs.entrezg_22.0.0.tar.gz", repos = NULL)

> install.packages("http://mbni.org/customcdf/22.0.0/entrezg.download/pd.hgu133b.hs.entrezg_22.0.0.tar.gz", repos = NULL)

> z <- ArrayExpress:::readPhenoData(aeData$sdrf, aeData$path)

> fllst <- split(aeData$rawFiles, pData(z)$Array.Design.REF)

> pkglst <- c("pd.hgu133b.hs.entrezg","pd.hgu133a.hs.entrezg")
> affyData <- lapply(1:2, function(x) read.celfiles(filenames = fllst[[x]], pkgname = pkglst[x]))
Platform design info loaded.
Reading in : GSM367803.CEL
Reading in : GSM367802.CEL
Reading in : GSM367801.CEL
Reading in : GSM367800.CEL
Reading in : GSM367799.CEL
Reading in : GSM367798.CEL
Reading in : GSM367797.CEL
Reading in : GSM367796.CEL
Reading in : GSM367795.CEL
Reading in : GSM367794.CEL
Reading in : GSM367793.CEL
Reading in : GSM367792.CEL
Reading in : GSM367791.CEL
Reading in : GSM367790.CEL
Reading in : GSM367789.CEL
Reading in : GSM367788.CEL
Reading in : GSM367787.CEL
Reading in : GSM367786.CEL
Reading in : GSM367785.CEL
Reading in : GSM367784.CEL
Reading in : GSM367783.CEL
Reading in : GSM367782.CEL
Reading in : GSM367781.CEL
Platform design info loaded.
Reading in : GSM367826.CEL
Reading in : GSM367825.CEL
Reading in : GSM367824.CEL
Reading in : GSM367823.CEL
Reading in : GSM367822.CEL
Reading in : GSM367821.CEL
Reading in : GSM367820.CEL
Reading in : GSM367819.CEL
Reading in : GSM367818.CEL
Reading in : GSM367817.CEL
Reading in : GSM367816.CEL
Reading in : GSM367815.CEL
Reading in : GSM367814.CEL
Reading in : GSM367813.CEL
Reading in : GSM367812.CEL
Reading in : GSM367811.CEL
Reading in : GSM367810.CEL
Reading in : GSM367809.CEL
Reading in : GSM367808.CEL
Reading in : GSM367807.CEL
Reading in : GSM367806.CEL
Reading in : GSM367805.CEL
Reading in : GSM367804.CEL
> affyEsets <- lapply(affyData, rma, target = "mps1")
Background correcting... OK
Normalizing... OK
Calculating Expression
Background correcting... OK
Normalizing... OK
Calculating Expression

> affyEsets
[[1]]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 9142 features, 23 samples
  element names: exprs
protocolData
  rowNames: GSM367803.CEL GSM367802.CEL ... GSM367781.CEL (23 total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM367803.CEL GSM367802.CEL ... GSM367781.CEL (23 total)
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hgu133b.hs.entrezg

[[2]]
ExpressionSet (storageMode: lockedEnvironment)
assayData: 12322 features, 23 samples
  element names: exprs
protocolData
  rowNames: GSM367826.CEL GSM367825.CEL ... GSM367804.CEL (23 total)
  varLabels: exprs dates
  varMetadata: labelDescription channel
phenoData
  rowNames: GSM367826.CEL GSM367825.CEL ... GSM367804.CEL (23 total)
  varLabels: index
  varMetadata: labelDescription channel
featureData: none
experimentData: use 'experimentData(object)'
Annotation: pd.hgu133a.hs.entrezg

 

ADD COMMENTlink written 4 months ago by James W. MacDonald45k

Thank you James W. MacDonald for suggested solution. I couldn't not find out however what was paremeter target="mps1" for in

affyEsets <- lapply(affyData, rma, target = "mps1")

The interpreter says that this parameter is unused.

 

ADD REPLYlink modified 4 months ago • written 4 months ago by lykhenko.olexandr0
1

Well, it didn't say that in the code I posted, so your best recourse would be to carefully compare what I did versus what you did, and figure out where you went wrong.

ADD REPLYlink written 4 months ago by James W. MacDonald45k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 114 users visited in the last hour