Failed to read SDRF
Ed Siefker ▴ 230
@ed-siefker-5136

I am trying to read RNAseq data from ArrayExpress with ArrayExpressHTS(). I am using Bioconductor 3.10.

I have prepared a transcriptome reference with prepareReference() under "reference/".

When I run ArrayExpressHTS(), it downloads the correct SDRF file and then fails to read it. This happens with multiple AE identifiers. I can confirm that the SDRF downloads correctly and is present at the path it fails to read.

> aehts <-ArrayExpressHTS("E-MTAB-6071", refdir="reference/")
Mon Mar  2 14:32:09 2020 [AEHTS] PSR: /home/hatta/retina-rnaseq-mouse/E-MTAB-6071/PSR
Mon Mar  2 14:32:09 2020 [AEHTS] Backing up /home/hatta/retina-rnaseq-mouse/E-MTAB-6071/PSR to /home/hatta/retina-rnaseq-mouse/E-MTAB-6071/PSR-backup-Mon-Mar--2-14:32:09-2020-1
Mon Mar  2 14:32:09 2020 [AEHTS] Creating /home/hatta/retina-rnaseq-mouse/E-MTAB-6071/PSR
Mon Mar  2 14:32:09 2020 [AEHTS] Master Memory Usage: Start
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells  7248623 387.2   12066822 644.5 10957081 585.2
Vcells 12373218  94.5   22929267 175.0 15800658 120.6
Mon Mar  2 14:32:10 2020 [AEHTS] creating projects
Mon Mar  2 14:32:10 2020 [AEHTS]    Searching /home/hatta/retina-rnaseq-mouse/reference/reference_genomes/ for supported organisms
Mon Mar  2 14:32:11 2020 [AEHTS] Downloading IDF http://www.ebi.ac.uk/arrayexpress/files/E-MTAB-6071/E-MTAB-6071.idf.txt
trying URL 'http://www.ebi.ac.uk/arrayexpress/files/E-MTAB-6071/E-MTAB-6071.idf.txt'
Content type 'text/plain' length 5084 bytes
==================================================
downloaded 5084 bytes

Mon Mar  2 14:32:14 2020 [AEHTS] Downloading SDRF http://www.ebi.ac.uk/arrayexpress/files/E-MTAB-6071/E-MTAB-6071.sdrf.txt
trying URL 'http://www.ebi.ac.uk/arrayexpress/files/E-MTAB-6071/E-MTAB-6071.sdrf.txt'
Content type 'text/plain' length 12504 bytes (12 KB)
==================================================
downloaded 12 KB

Mon Mar  2 14:32:16 2020 [AEHTS] *** ERROR ***
Mon Mar  2 14:32:16 2020 [AEHTS] 
Mon Mar  2 14:32:16 2020 [AEHTS]     E-MTAB-6071 Failed to read SDRF /home/hatta/retina-rnaseq-mouse/E-MTAB-6071/data/E-MTAB-6071.sdrf.txt
Mon Mar  2 14:32:16 2020 [AEHTS] 
Error in mapAEtoENAviaHTTP(accession, descriptors$sdrffname) : 
> aehts <-ArrayExpressHTS("E-MTAB-6071", refdir="reference/")
>
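
The claim that the SDRF file itself is intact can be checked independently of ArrayExpressHTS, since SDRF is a plain tab-delimited table. The following is only a sketch: it writes a made-up two-row stand-in file (the column names and run IDs are illustrative assumptions, not taken from E-MTAB-6071) and reads it back with base R. To test the real download, point `sdrf_path` at the path shown in the error message instead.

```r
# Made-up stand-in for an SDRF file; a real one has many more columns.
sdrf_path <- tempfile(fileext = ".sdrf.txt")
writeLines(c(
  "Source Name\tComment[ENA_RUN]",
  "sample 1\tERR0000001",
  "sample 2\tERR0000002"
), sdrf_path)

# SDRF is tab-delimited. check.names = FALSE keeps headers such as
# "Comment[ENA_RUN]" as they are instead of converting them to syntactic names.
sdrf <- read.delim(sdrf_path, check.names = FALSE, stringsAsFactors = FALSE)

file.exists(sdrf_path)  # is the file where we expect it?
dim(sdrf)               # rows and columns parsed
colnames(sdrf)          # header names as read
```

If this succeeds on the downloaded file, the file is readable and the failure lies in how the package reads it rather than in the download.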

Session info, in case it helps.

> library(BiocManager)
Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for help
> sessionInfo()
R version 3.6.3 RC (2020-02-21 r77847)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux bullseye/sid

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C             
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8    
 [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8   
 [7] LC_PAPER=en_US.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] BiocManager_1.30.10    ArrayExpressHTS_1.36.0 snow_0.4-3            
 [4] Rsamtools_2.2.3        Biostrings_2.54.0      XVector_0.26.0        
 [7] GenomicRanges_1.38.0   GenomeInfoDb_1.22.0    IRanges_2.20.2        
[10] S4Vectors_0.24.3       BiocGenerics_0.32.0    sampling_2.8          

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                matrixStats_0.55.0         
 [3] R2HTML_2.3.2                bit64_0.9-7                
 [5] RColorBrewer_1.1-2          progress_1.2.2             
 [7] httr_1.4.1                  tools_3.6.3                
 [9] backports_1.1.5             R6_2.4.1                   
[11] rpart_4.1-15                Hmisc_4.3-1                
[13] DBI_1.1.0                   lazyeval_0.2.2             
[15] colorspace_1.4-1            nnet_7.3-13                
[17] tidyselect_1.0.0            gridExtra_2.3              
[19] prettyunits_1.1.1           curl_4.3                   
[21] bit_1.1-15.2                compiler_3.6.3             
[23] sendmailR_1.2-1             Biobase_2.46.0             
[25] htmlTable_1.13.3            DelayedArray_0.12.2        
[27] scales_1.1.0                checkmate_2.0.0            
[29] genefilter_1.68.0           askpass_1.1                
[31] rappdirs_0.3.1              DESeq_1.38.0               
[33] stringr_1.4.0               digest_0.6.25              
[35] foreign_0.8-75              base64enc_0.1-3            
[37] jpeg_0.1-8.1                pkgconfig_2.0.3            
[39] htmltools_0.4.0             limma_3.42.2               
[41] dbplyr_1.4.2                htmlwidgets_1.5.1          
[43] rlang_0.4.5                 rstudioapi_0.11            
[45] RSQLite_2.2.0               hwriter_1.3.2              
[47] BiocParallel_1.20.1         acepack_1.4.1              
[49] dplyr_0.8.4                 RCurl_1.98-1.1             
[51] magrittr_1.5                GenomeInfoDbData_1.2.2     
[53] Formula_1.2-3               Matrix_1.2-18              
[55] Rcpp_1.0.3                  munsell_0.5.0              
[57] lifecycle_0.1.0             edgeR_3.28.1               
[59] stringi_1.4.6               MASS_7.3-51.5              
[61] SummarizedExperiment_1.16.1 zlibbioc_1.32.0            
[63] BiocFileCache_1.10.2        grid_3.6.3                 
[65] blob_1.2.1                  crayon_1.3.4               
[67] lattice_0.20-40             splines_3.6.3              
[69] annotate_1.64.0             hms_0.5.3                  
[71] locfit_1.5-9.1              knitr_1.28                 
[73] pillar_1.4.3                geneplotter_1.64.0         
[75] biomaRt_2.42.0              lpSolve_5.6.15             
[77] XML_3.99-0.3                glue_1.3.1                 
[79] ShortRead_1.44.3            latticeExtra_0.6-29        
[81] data.table_1.12.8           png_0.1-7                  
[83] vctrs_0.2.3                 gtable_0.3.0               
[85] openssl_1.4.1               purrr_0.3.3                
[87] assertthat_0.2.1            ggplot2_3.2.1              
[89] xfun_0.12                   xtable_1.8-4               
[91] svMisc_1.1.0                survival_3.1-8             
[93] tibble_2.1.3                rJava_0.9-11               
[95] GenomicAlignments_1.22.1    AnnotationDbi_1.48.0       
[97] memoise_1.1.0               cluster_2.1.0              
>

Sorry this isn't much help, but I just wanted to add that I get exactly the same error when I run that code, so it's not just your setup.


Some clues. These messages appear when I run 'library(ArrayExpressHTS)':

No methods found in package ‘Biobase’ for request: ‘read.AnnotatedDataFrame’ when loading ‘ArrayExpressHTS’
No methods found in package ‘IRanges’ for requests: ‘Rle’, ‘subseq’ when loading ‘ArrayExpressHTS’

prepareReference() fails without manually setting "location". Apparently getDefaultReferenceDir() is not defined.

I created a new site library and reinstalled Bioconductor from source. Nothing changed.


Does BiocManager::valid() provide any hints?


Unfortunately not; traceback() might, though.

Tue Mar  3 14:15:17 2020 [AEHTS] *** ERROR ***
Tue Mar  3 14:15:17 2020 [AEHTS] 
Tue Mar  3 14:15:17 2020 [AEHTS]     E-MTAB-6071 Failed to read SDRF /home/hatta/retina-rnaseq-mouse/E-MTAB-6071/data/E-MTAB-6071.sdrf.txt
Tue Mar  3 14:15:17 2020 [AEHTS] 
Error in mapAEtoENAviaHTTP(accession, descriptors$sdrffname) : 
> traceback()
4: stop()
3: mapAEtoENAviaHTTP(accession, descriptors$sdrffname)
2: createAEprojects(accession = accession, dir = dir, refdir = refdir, 
       localmode = getPipelineOption("ebilocalmode"), options = options)
1: ArrayExpressHTS("E-MTAB-6071", refdir = "reference/")
> BiocManager::valid() 
[1] TRUE
>

Stepping through the code, it fails inside readSDRF(sdrffname) with the error: could not find function "read.AnnotatedDataFrame".

I see the following message when the package loads:

No methods found in package ‘Biobase’ for request: ‘read.AnnotatedDataFrame’ when loading ‘ArrayExpressHTS’

So perhaps something has changed in Biobase?


Thanks, I created a new post tagging Biobase. Maybe that will get some attention.


Did some more digging: read.AnnotatedDataFrame is not a method, it's a plain function, but ArrayExpressHTS tries to import it using importMethodsFrom() in its NAMESPACE.

Switching to use just import gets past our original error, but then I end up with:
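
The distinction between a plain function and an S4 generic can be illustrated without Biobase installed. This is a sketch with made-up names (`plain_reader` and `describe` are hypothetical stand-ins, not part of any package): `is(x, "genericFunction")` reports whether an object is an S4 generic, which is the kind of object importMethodsFrom() is meant for.

```r
library(methods)

# A plain function, standing in for something like read.AnnotatedDataFrame.
plain_reader <- function(path) readLines(path)

# An S4 generic with one method, standing in for something like Rle.
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "numeric", function(x) "a number")

is(plain_reader, "genericFunction")  # plain function: not an S4 generic
is(describe, "genericFunction")      # S4 generic

# With Biobase installed, the same check could be applied to the real object
# (not run here):
# is(Biobase::read.AnnotatedDataFrame, "genericFunction")
```

For a plain function, the usual NAMESPACE directive is importFrom(pkg, name) (or import(pkg) for the whole package) rather than importMethodsFrom().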

Error in url(descriptorurl) : invalid 'description' argument
In addition: Warning message:
In readLines(rundocurl) :
  incomplete final line found on 'http://www.ebi.ac.uk/ena/data/view/ERR2124678&display=xml'
structure("Error in url(descriptorurl) : invalid 'description' argument\n", class = "try-error", condition = structure(list(
    message = "invalid 'description' argument", call = url(descriptorurl)), class = c("simpleError", 
"error", "condition")))

I'm beginning to think ArrayExpressHTS might need an overhaul!
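
For anyone wondering what the "invalid 'description' argument" message means: url() raises it when its first argument is not a single, non-missing character string. The sketch below reproduces the message under the assumption that the lookup producing `descriptorurl` came back empty. That is only one possible cause and has not been confirmed against the ArrayExpressHTS source; the value assigned here is a hypothetical stand-in.

```r
# Hypothetical stand-in: a lookup that found nothing and returned an empty vector.
descriptorurl <- character(0)

# url() needs exactly one non-NA string, so this fails before any network access.
res <- try(url(descriptorurl), silent = TRUE)

inherits(res, "try-error")                # did the call fail?
conditionMessage(attr(res, "condition"))  # the error message url() produced
```

If that is what happens in the package, the thing to look at would be whatever parses the ENA response and builds `descriptorurl`, rather than url() itself.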


Biobase hasn't changed recently, e.g., git blame NAMESPACE shows the last change to the relevant line of the NAMESPACE file to be in 2010.

1ef01a72 (Valerie Obenchain 2010-11-01 05:43:37 +0000 61)        readExpressionSet, read.AnnotatedDataFrame, read.MIAME, MIAME,

I don't think that this is an 'error' anyway, just a warning...
