Question: Empty SQL query result for SRA run
Asked 13 months ago by Thomas Sandmann (USA):

Dear Jack and Sean,

I have tried to use the SRAdb package to retrieve FASTQ files for a study, e.g. run SRR2961981. Unfortunately, I run into an error when I try to retrieve the path to its FASTQ file:

This command succeeds:

listSRAfile("SRR2961981", sra_con, fileType = "sra")

         run     study     sample experiment
1 SRR2961981 SRP066728 SRS1180807 SRX1453139
                                                                                                               ftp
1 ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByExp/sra/SRX/SRX145/SRX1453139/SRR2961981/SRR2961981.sra

But this command fails:

listSRAfile("SRR2961981", sra_con, fileType = "fastq")

Error in if (nchar(run1) < 10) { : missing value where TRUE/FALSE needed

Yet, the fastq file exists at the EBI:

ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR296/001/SRR2961981
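
For reference, that path appears to follow the ENA directory convention: the first six characters of the accession, then (for accessions longer than nine characters) a zero-padded three-character subdirectory built from the remaining digits. Below is a minimal sketch under that assumption. `ena_fastq_dir` is a hypothetical helper, not an SRAdb function, and it only builds the directory; it does not guess the file names inside it.

```r
# Hypothetical helper (not part of SRAdb): build the ENA FASTQ directory
# for a single run accession, assuming the layout described above.
ena_fastq_dir <- function(run, base = "ftp://ftp.sra.ebi.ac.uk/vol1/fastq") {
  stopifnot(is.character(run), length(run) == 1, !is.na(run))
  n <- nchar(run)
  prefix <- substr(run, 1, 6)
  if (n < 10) {
    # Short accessions (e.g. SRR123456) sit directly under the prefix.
    return(paste(base, prefix, run, sep = "/"))
  }
  # Longer accessions: digits after position 9, zero-padded to width 3.
  suffix <- as.integer(substr(run, 10, n))
  subdir <- sprintf("%03d", suffix)
  paste(base, prefix, subdir, run, sep = "/")
}

ena_fastq_dir("SRR2961981")
```

For SRR2961981 this reproduces the directory shown above, so it could serve as a fallback when the local metadata has no FASTQ record for a run.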

The error is thrown by the getFASTQinfo function called by listSRAfile with fileType = "fastq":

getFASTQinfo("SRR2961981", sra_con)

Error in if (nchar(run1) < 10) { : missing value where TRUE/FALSE needed

because the SQL query

"SELECT * FROM fastq WHERE run_accession IN ('SRR2961981')"

returns no results (and the function doesn't check for that, hence the generic error message).
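
To make the failure mode explicit, here is a rough sketch of running the same query directly and checking for an empty result before going any further. `query_fastq_table` is a hypothetical wrapper, not part of SRAdb; it assumes `con` is a DBI connection to the SRAdb SQLite file (such as `sra_con` above) and that the table and column names match the query shown.

```r
library(DBI)

# Hypothetical wrapper (not part of SRAdb): query the fastq table for one
# run and stop with a readable message when the result is empty.
query_fastq_table <- function(run, con) {
  stopifnot(is.character(run), length(run) == 1, !is.na(run))
  # Same query as in the post; the accession is pasted in as a literal,
  # so only pass trusted accession strings.
  sql <- sprintf("SELECT * FROM fastq WHERE run_accession IN ('%s')", run)
  res <- DBI::dbGetQuery(con, sql)
  if (nrow(res) == 0) {
    stop("No rows in the 'fastq' table for run ", run,
         "; the metadata file may lack a FASTQ record for it.",
         call. = FALSE)
  }
  res
}
```

Called as `query_fastq_table("SRR2961981", sra_con)`, this should report the missing record directly instead of the `missing value where TRUE/FALSE needed` error.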

I am a bit confused, because the SRAdb SQLite database clearly knows about the run, as listSRAfile succeeds. Is it possible that some runs are missing from its fastq table?

Thanks a lot for any hints,

Thomas

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] dplyr_0.5.0          argparse_1.0.1       proto_0.3-10         SRAdb_1.30.0        
 [5] RCurl_1.95-4.8       bitops_1.0-6         graph_1.50.0         BiocGenerics_0.18.0 
 [9] RSQLite_1.0.0        DBI_0.5-1            BiocInstaller_1.22.3

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7      XML_3.98-1.4     assertthat_0.1   R6_2.2.0         magrittr_1.5    
 [6] stats4_3.3.1     httr_1.2.1       lazyeval_0.2.0   getopt_1.20.0    RMySQL_0.10.9   
[11] rjson_0.2.15     tools_3.3.1      Biobase_2.32.0   findpython_1.0.1 tibble_1.2      
[16] GEOquery_2.38.4 

I am also getting this error for specific datasets. It seems somewhat random. SRP051830 is an example of one that doesn't work.

EDIT: What did work was setting fileType = 'sra', so perhaps it's an issue with FASTQ availability. The downside is obviously that you have to do the SRA-to-FASTQ conversion yourself.

(Comment written 13 months ago by story.benjamin)