error when reading fastq file

Hi, I want to open a FASTQ file with readFastq(), but it gives me an error (shown below).

Is the invalid character a space in the file? If it is, how can I remove it?

> f<- readFastq("a.fastq")
Error: Input/Output
  message: invalid character '

> sessionInfo( )
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ShortRead_1.46.0            GenomicAlignments_1.24.0    SummarizedExperiment_1.18.2 DelayedArray_0.14.1         matrixStats_0.58.0         
 [6] Biobase_2.48.0              Rsamtools_2.4.0             GenomicRanges_1.40.0        GenomeInfoDb_1.24.2         Biostrings_2.56.0          
[11] XVector_0.28.0              IRanges_2.22.2              S4Vectors_0.26.1            BiocParallel_1.22.0         BiocGenerics_0.34.0        

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13        zlibbioc_1.34.0        lattice_0.20-41        jpeg_0.1-8.1           hwriter_1.3.2          tools_4.0.3           
 [7] grid_4.0.3             png_0.1-7              latticeExtra_0.6-29    crayon_1.4.1           Matrix_1.3-2           GenomeInfoDbData_1.2.3
[13] RColorBrewer_1.1-2     bitops_1.0-6           RCurl_1.98-1.2         compiler_4.0.3
swbarnes2 ▴ 680

Probably the simplest thing you can do is split the file in half and see which half fails to import, then repeat until you have isolated the record that triggers the error. Or, if even an 8-line FASTQ (the first two records) won't import, you have a pervasive problem.
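
As an alternative to bisecting by hand, a small base R scan can point at the offending records directly. This is only a sketch: `find_invalid_records` is a hypothetical helper, not part of ShortRead, and it assumes strict 4-line records (header, sequence, '+', quality), uppercase sequence letters, and a file small enough to read into memory.

```r
# Hypothetical helper (not a ShortRead function): list FASTQ records whose
# sequence line contains characters outside an allowed set.
# Assumes strict 4-line records; lowercase bases would be flagged too.
find_invalid_records <- function(path, allowed = "ACGTMRWSYKVHDBN.+-") {
  lines <- readLines(path, warn = FALSE)
  # Sequence lines are the 2nd line of every 4-line record.
  seq_idx <- which(seq_along(lines) %% 4 == 2)
  # Negated character class; '-' is last so it is taken literally.
  pattern <- paste0("[^", allowed, "]")
  bad <- seq_idx[grepl(pattern, lines[seq_idx])]
  data.frame(
    record = (bad + 2) / 4,   # 1-based record number
    line = bad,               # 1-based line number in the file
    invalid = vapply(
      regmatches(lines[bad], gregexpr(pattern, lines[bad])),
      function(x) paste(unique(x), collapse = ""),
      character(1)
    ),
    stringsAsFactors = FALSE
  )
}

# Example call on the file from the question:
# find_invalid_records("a.fastq")
```

If the `invalid` column shows the same character in every record, the problem is pervasive rather than a single corrupted line.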

Can I use readFastq() to read an RNA FASTQ file? I ask because when I remove the letter U from the file, the issue is resolved.


If you have 'U' in your FASTQ files, I think that will be a problem. I can barely understand C code, but the source of read_solexa_fastq begins:

SEXP read_solexa_fastq(SEXP files, SEXP withId)
    int i, nfiles, nrec = 0;
    const char *fname;
    SEXP ans = R_NilValue, nms = R_NilValue;

    if (!IS_CHARACTER(files))
        Rf_error("'%s' must be '%s'", "files", "character");
    if (!IS_LOGICAL(withId) || LENGTH(withId) != 1)
        Rf_error("'%s' must be '%s'", "withId", "logical(1)");

    nfiles = LENGTH(files);
    nrec = (int) _count_lines_sum(files) / LINES_PER_FASTQ_REC;
    PROTECT(ans = NEW_LIST(3));
    SET_VECTOR_ELT(ans, 0, _NEW_XSNAP(nrec, "DNAString"));      /* sread */

And if it's saying the sequence data have to be of class DNAString, then obviously a 'U' is a non-starter. The help page for readFastq says it uses the IUPAC alphabet from Biostrings, and in that package DNA_ALPHABET is

 [1] "A" "C" "G" "T" "M" "R" "W" "S" "Y" "K" "V" "H" "D" "B" "N" "-" "+" "."

So again, I think a 'U' in your FASTQ file will be problematic because it's not a DNA letter.
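
One possible workaround, sketched here in base R, is to rewrite the sequence lines with 'U' converted to 'T' rather than deleting the 'U's. Deleting them shortens the read so it no longer matches the length of its quality string, and 'U' can also be a legitimate quality character, so only the sequence line of each record should be touched. `rna_fastq_to_dna` is a hypothetical helper, not part of ShortRead or Biostrings, and it assumes strict 4-line records and a file that fits in memory.

```r
# Hypothetical helper (not a ShortRead/Biostrings function): copy a FASTQ
# file, converting U/u to T/t on sequence lines only.
# Assumes strict 4-line records: header, sequence, '+', quality.
rna_fastq_to_dna <- function(infile, outfile) {
  lines <- readLines(infile, warn = FALSE)
  # Sequence lines are the 2nd line of every 4-line record; header and
  # quality lines are left untouched.
  seq_idx <- which(seq_along(lines) %% 4 == 2)
  lines[seq_idx] <- chartr("Uu", "Tt", lines[seq_idx])
  writeLines(lines, outfile)
  invisible(outfile)
}

# Example usage (output file name is illustrative):
# rna_fastq_to_dna("a.fastq", "a_dna.fastq")
# f <- readFastq("a_dna.fastq")   # requires ShortRead
```

The converted file records the reads as DNA, so anything downstream that needs to know they were RNA has to track that separately.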
