No method for coercing this S4 class to a vector, as trying to import my own data rather than using SRA in the workflow RnaSeqGeneEdgeRQL
Entering edit mode
Raito92 ▴ 60
Last seen 6 weeks ago

Hello again, I'm working on the RnaSeqGeneEdgeRQL workflow at the moment, trying to edit it so that it uses my own data rather than importing published RNASeq data from SRA...

I was wondering if there is any easier way to do it.

The original instructions to create the vector were: (If I got it right, basically, I have to act so that my all.fastq object contains my own data rather than using SRA ones...)

   ##Just creating a targets file 

    targetsFile <- system.file("extdata", "targets.txt",
    targets <- read.delim(targetsFile, stringsAsFactors=FALSE)

##Retrieving data

    for (sra in targets$SRA) system(paste("fastq-dump", sra)) 
    all.fastq <- paste0(targets$SRA, ".fastq")

 ##Here is the next original function which requires the all.fastq object

    all.bam <- sub(".fastq", ".bam", all.fastq)
    align(index="mm10", readfile1=all.fastq, input_format="FASTQ", output_file=all.bam)

I looked for a proper solution online, then I tried with the readFastq function from ShortRead package, as follows:

fastqPath <- list.files("/home/genomica/DATA/dwarf/fastq", pattern = ".fastq$", full = TRUE)
reads <- readFastq(fastqPath)

And then:

all.bam <- sub(".fastq", ".bam", reads)
align(index="FASTA", readfile1=reads, input_format="FASTQ", 

That's what I did, also visualizing the contents of the reads and fastqPath objects.

It looks as if I managed to create the desider object, but in the wrong format (belonging to the ShortRead class rather than the required vector). As a matter of fact, when trying to go further, I get the No method for coercing this S4 class to a vector

How can I overcome this issue and use the pipeline with a proper vector object containing local data?

Here is my sessioninfo() output:

rnaseqgeneedgerql shortread Rsubread • 852 views
Entering edit mode
Aspie96 ▴ 60
Last seen 3.3 years ago

Hi, Raito.

First, you should call:

buildindex(basename = "index", reference="./Data.fna")

This generate the needed index files and the input must be in FASTA format. We will need these files later. The basename parameter ("index") indicates the prefix of the files.

Then, you call:

fastqPath <- list.files("./fastq", pattern="\\.fastq$", full=TRUE)

This creates an array of all names of all files in the "fastq" directory whose extension is ".fastq".

Then you need to call:

all.bam <- sub("\\.fastq$", ".bam", fastqPath)

The array all.bam is identical to fastqPath, except that the extension of the files is ".bam" instead of ".fastq".

Then you can call:

align(index="index", readfile1=fastqPath, input_format="FASTQ", output_file=all.bam)

It is important to understand the meaning of the parameters:

  • index is the prefix of the index files.
  • readfile1 is the array of input files (in our case ".fastq" files in the "fastq" directory).
  • input_format is the format of input files.
  • output_file is the list of output files. In our case, each output file has the name of an input file, with the exception of the extension (which is ".bam").

The problem with your code is that it doesn't use an array as readfile1 parameter, as it should be.

The full code is:


buildindex(basename = "index", reference="./fastq/GCF_002742605.1_O_europaea_v1_genomic.fna")

fastqPath <- list.files("./fastq", pattern="\\.fastq$", full=TRUE)

all.bam <- sub("\\.fastq$", ".bam", fastqPath)
align(index="index", readfile1=fastqPath, input_format="FASTQ", output_file=all.bam)
Entering edit mode
Last seen 5 hours ago
WEHI, Melbourne, Australia

Dear Raito,

The RnaSeqGeneEdgeRQL workflow is really intended to teach statistical analysis of the RNA-seq counts rather than teaching the alignment steps, although it does give complete code for all the steps.

It seems from your posts that you haven't succeeded yet in doing any alignments, and I'm guessing that you haven't built a genome index yet either. So you are getting stuck at the very first step, which in many ways is the hardest. You can't run align until you have first succeeded in downloading a reference genome and running buildindex.

You might like to see a simple reproducible example of building an index and doing an alignment here: . The example uses limma to do the differential expression analysis, but it could just as well have been edgeR.

Alternatively, I get the impression that you are not familiar with R, and in this case you might be more comfortable with doing things at the Unix prompt instead of using R. Here are some simple Subread examples using Unix command syntax: and

Also see the Subread home page at and the full Subread Users Guide at

Finally let me say it important to read the documentation for Bioconductor software. If you type help("align") it will tell you everything about the function, as James has hinted to you. With Bioconductor, searching for answers online is not the best way to proceed, except perhaps for searching this Support forum itself.

Entering edit mode

Hello, thanks for you answer and suggestions! Yes, I'm not familiar with R and it's my first time running a workflow.

I succedeed in building the genome index, because I already had a reference genome to use and I got no errors after the process.

However, I was unable to go ahead and make the alignments. This happened because I was unable to adapt the workflow code so that it uses my own RNASeq data rather than downloading from SRA, I know it may be very basic but I really don't know how to edit the code to do so. Can you help me here? Thanks in advance.

Entering edit mode

I solved the problem!

Entering edit mode
Last seen 2 hours ago
United States

I am not sure why you are using the ShortRead package here. If you look at the help for align, it says this:

readfile1: a character vector including names of files that include
          sequence reads to be aligned. For paired-end reads, this
          gives the list of files including first reads in each
          library. File format is FASTQ/FASTA by default. See
           input_format  option for more supported formats.

If you were to look at your reads object it would say something like this

> reads
class: ShortReadQ
length: 256 reads; width: 36 cycles

Which is obviously not a character vector of names of files! Whereas your fastqPath object should be a character vector of file names.

Entering edit mode

Hello, first of all thanks for your answer! It's my first time ever using R and a programming language in general, so I may be missing some basic concepts about the required objects.

May you suggest me the right code to go ahead and make the workflow acquire my own data rather than using other scripts from other packages to do the job?

Thanks in advance!

Entering edit mode

Solved, I just had to pass the fastqPath rather than the reads object. XD Sorry.


Login before adding your answer.

Traffic: 408 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6