Empty SQL query result for SRA run

Dear Jack and Sean,

I have tried to use the SRAdb package to retrieve FASTQ files for a study, e.g. run SRR2961981. Unfortunately, I run into an error when I try to retrieve the path to its FASTQ file:

This command succeeds:

listSRAfile("SRR2961981", sra_con, fileType = "sra")

         run     study     sample experiment
1 SRR2961981 SRP066728 SRS1180807 SRX1453139
1 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX145/SRX1453139/SRR2961981/SRR2961981.sra

But this command fails:

listSRAfile("SRR2961981", sra_con, fileType = "fastq")

Error in if (nchar(run1) < 10) { : missing value where TRUE/FALSE needed

Yet, the fastq file exists at the EBI:


The error is thrown by the getFASTQinfo function called by listSRAfile with fileType = "fastq":

getFASTQinfo("SRR2961981", sra_con)

Error in if (nchar(run1) < 10) { : missing value where TRUE/FALSE needed

because the SQL query

"SELECT * FROM fastq WHERE run_accession IN ('SRR2961981')"

returns no results (and the function doesn't check for that, hence the generic error message).
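
For illustration, the missing check could look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in for sra_con, with a hypothetical one-column fastq table (the real table has more columns). fastq_rows_or_stop is a made-up helper name, not part of SRAdb.

```r
library(DBI)
library(RSQLite)

# Stand-in for sra_con: an in-memory database with an empty `fastq` table.
# Assumption: only the run_accession column is modelled here.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE fastq (run_accession TEXT)")

# Hypothetical helper: return the fastq rows for a run, or stop with a
# message that names the run instead of the generic if() error.
fastq_rows_or_stop <- function(run, con) {
  sql <- sprintf("SELECT * FROM fastq WHERE run_accession IN ('%s')", run)
  rows <- dbGetQuery(con, sql)
  if (nrow(rows) == 0) {
    stop("Run ", run, " has no entry in the fastq table", call. = FALSE)
  }
  rows
}

# The table is empty, so this reports the missing run explicitly.
msg <- tryCatch(
  fastq_rows_or_stop("SRR2961981", con),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Running the same query against the real sra_con before calling listSRAfile(..., fileType = "fastq") would show whether a given run is simply absent from the fastq table.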

I am a bit confused, because the SRAdb SQLite database clearly knows about the run, as listSRAfile succeeds. Is it possible that some runs are missing from its fastq table?

Thanks a lot for any hints,


> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dplyr_0.5.0          argparse_1.0.1       proto_0.3-10         SRAdb_1.30.0        
 [5] RCurl_1.95-4.8       bitops_1.0-6         graph_1.50.0         BiocGenerics_0.18.0 
 [9] RSQLite_1.0.0        DBI_0.5-1            BiocInstaller_1.22.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7      XML_3.98-1.4     assertthat_0.1   R6_2.2.0         magrittr_1.5    
 [6] stats4_3.3.1     httr_1.2.1       lazyeval_0.2.0   getopt_1.20.0    RMySQL_0.10.9   
[11] rjson_0.2.15     tools_3.3.1      Biobase_2.32.0   findpython_1.0.1 tibble_1.2      
[16] GEOquery_2.38.4 
I am also getting this error for specific datasets. It seems somewhat random. SRP051830 is an example of one that doesn't work.


EDIT: What did work was setting fileType = 'sra', so perhaps it's an issue with fastq availability. The downside is obviously that you have to do the sra --> fastq conversion yourself.
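
In case it helps anyone, the conversion step could be wrapped from R along these lines. This is only a sketch: it assumes the SRA Toolkit's fastq-dump is installed and on the PATH, and sra_to_fastq is a hypothetical helper name, not an SRAdb function.

```r
# Hypothetical helper: convert a downloaded .sra file to FASTQ by calling
# the SRA Toolkit's fastq-dump command-line tool.
sra_to_fastq <- function(sra_file, outdir = ".") {
  exe <- Sys.which("fastq-dump")
  if (!nzchar(exe)) {
    stop("fastq-dump not found on PATH; install the SRA Toolkit first",
         call. = FALSE)
  }
  # --split-files writes paired reads to separate files; --gzip compresses them.
  args <- c("--split-files", "--gzip",
            "--outdir", shQuote(outdir), shQuote(sra_file))
  status <- system2(exe, args)
  if (status != 0) {
    stop("fastq-dump exited with status ", status, call. = FALSE)
  }
  invisible(status)
}
```

Usage would be something like sra_to_fastq("SRR2961981.sra", outdir = "fastq") after the .sra file has been downloaded.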


