Empty SQL query result for SRA run
@thomas-sandmann-6817

Dear Jack and Sean,

I have tried to use the SRAdb package to retrieve FASTQ files for a study, e.g. for run SRR2961981. Unfortunately, I run into an error when I try to retrieve the path to its FASTQ file:

This command succeeds:

listSRAfile("SRR2961981", sra_con, fileType = "sra")

         run     study     sample experiment
1 SRR2961981 SRP066728 SRS1180807 SRX1453139
                                                                                                               ftp
1 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX145/SRX1453139/SRR2961981/SRR2961981.sra

But this command fails:

listSRAfile("SRR2961981", sra_con, fileType = "fastq")

Error in if (nchar(run1) < 10) { : missing value where TRUE/FALSE needed

Yet, the fastq file exists at the EBI:

ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR296/001/SRR2961981
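For reference, the directory above can be derived from the run accession. The following is only a sketch of the layout visible in that URL (first six characters of the accession, then "00" plus its last digit); `ena_fastq_dir` is a hypothetical helper, not an SRAdb function, and the handling of accessions shorter than 10 characters is an assumption.

```r
# Hypothetical helper: guess the EBI FASTQ directory for a run accession.
# Assumes the layout seen in the URL above:
#   <base>/<first 6 characters>/<"00" + last digit>/<run>
ena_fastq_dir <- function(run, base = "ftp://ftp.sra.ebi.ac.uk/vol1/fastq") {
  stopifnot(is.character(run), length(run) == 1L, !is.na(run))
  prefix <- substr(run, 1, 6)
  n <- nchar(run)
  if (n < 10) {
    # Assumption: shorter accessions have no intermediate directory
    paste(base, prefix, run, sep = "/")
  } else if (n == 10) {
    paste(base, prefix, paste0("00", substr(run, n, n)), run, sep = "/")
  } else {
    stop("Layout for longer accessions is not covered by this sketch")
  }
}

ena_fastq_dir("SRR2961981")
# [1] "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR296/001/SRR2961981"
```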

The error is thrown by the getFASTQinfo function called by listSRAfile with fileType = "fastq":

getFASTQinfo("SRR2961981", sra_con)

Error in if (nchar(run1) < 10) { : missing value where TRUE/FALSE needed

because the SQL query

"SELECT * FROM fastq WHERE run_accession IN ('SRR2961981')"

returns no results (and the function doesn't check for that, hence the generic error message).
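To check up front which runs have a row in the fastq table, something like the sketch below might help. `runs_in_fastq_table` is a hypothetical helper, and the in-memory database is only a toy stand-in for the real SRAdb SQLite file; the only thing taken from the real schema is the `fastq` table and its `run_accession` column, as used in the query above.

```r
library(DBI)

# Hypothetical helper: for each run, report whether the fastq table has a row.
runs_in_fastq_table <- function(runs, con) {
  # Quote each accession so it can be placed safely inside the IN (...) list
  in_list <- paste(as.character(dbQuoteString(con, runs)), collapse = ", ")
  sql <- paste0(
    "SELECT run_accession FROM fastq WHERE run_accession IN (", in_list, ")"
  )
  found <- dbGetQuery(con, sql)$run_accession
  stats::setNames(runs %in% found, runs)
}

# Toy stand-in for the SRAdb database: a fastq table that lacks SRR2961981
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "fastq",
             data.frame(run_accession = "SRR000001", stringsAsFactors = FALSE))

present <- runs_in_fastq_table(c("SRR000001", "SRR2961981"), con)
dbDisconnect(con)
```

With the real database, passing `sra_con` as `con` and skipping the runs reported as FALSE would avoid the error from getFASTQinfo.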

I am a bit confused, because the SRAdb SQLite database clearly knows about the run, as listSRAfile succeeds. Is it possible that some runs are missing from its fastq table?

Thanks a lot for any hints,

Thomas

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dplyr_0.5.0          argparse_1.0.1       proto_0.3-10         SRAdb_1.30.0        
 [5] RCurl_1.95-4.8       bitops_1.0-6         graph_1.50.0         BiocGenerics_0.18.0 
 [9] RSQLite_1.0.0        DBI_0.5-1            BiocInstaller_1.22.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7      XML_3.98-1.4     assertthat_0.1   R6_2.2.0         magrittr_1.5    
 [6] stats4_3.3.1     httr_1.2.1       lazyeval_0.2.0   getopt_1.20.0    RMySQL_0.10.9   
[11] rjson_0.2.15     tools_3.3.1      Biobase_2.32.0   findpython_1.0.1 tibble_1.2      
[16] GEOquery_2.38.4 

I am also getting this error for specific datasets. It seems somewhat random; SRP051830 is an example of one that doesn't work.

EDIT: What did work was setting fileType = 'sra', so perhaps it's an issue with FASTQ availability. The downside is that you then have to do the SRA-to-FASTQ conversion yourself.
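The workaround can be wrapped so that the fastq lookup is tried first and the sra listing is used only if it fails. This is a rough sketch, not part of SRAdb: `list_with_fallback` is a hypothetical name, and `lister` is a parameter only so that the function can be tried without the full database.

```r
# Hypothetical wrapper: try fileType = "fastq" first, fall back to "sra".
# `lister` defaults to SRAdb::listSRAfile and is only looked up when needed.
list_with_fallback <- function(run, con, lister = SRAdb::listSRAfile) {
  tryCatch(
    lister(run, con, fileType = "fastq"),
    error = function(e) {
      message("No FASTQ listing for ", run, " (", conditionMessage(e),
              "); falling back to fileType = \"sra\"")
      lister(run, con, fileType = "sra")
    }
  )
}
```

With an open connection this would be called as `list_with_fallback("SRR2961981", sra_con)`. The result still points at .sra files in the fallback case, so the conversion step remains.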
