RSubread 0% mapping

0

Entering edit mode

Jo ▴ 30

@jo-8608

Last seen 8.3 years ago

Hi,

I am trying to produce ReadCount values for a RNA-Seq dataset using the Rsubread package.

I first converted each SRA file from the GEO site for the dataset to a fastq file. I then concatenated all of the fastq files into one file. Then I gzipped the file.

I then used the rsubread package to produce readcount values:

Here is my code:

buildindex(basename="reference_indexreal",reference="hg19.fa", colorspace=TRUE)

align(index="reference_indexreal",readfile1="GSE55296.fastq.gz",output_file="alignResults55296.BAM")

#The align function had this output

Processed : 101042942 reads

Mapped : 15 reads (0.0%)

Indels : 0

Running time : 259.9 minutes

#I then used the featureCounts method to produce read count values

fc_SE <- featureCounts("alignResults55296.BAM",annot.inbuilt="hg19")

#This was the output

//================================= Running ==================================\\

|| ||

|| Load annotation file /Library/Frameworks/R.framework/Versions/3.2/Reso ... ||

|| Features : 225074 ||

|| Meta-features : 25702 ||

|| Chromosomes : 52 ||

|| ||

|| Process BAM file alignResults55296.BAM... ||

|| Single-end reads are included. ||

|| Assign reads to features... ||

|| Total reads : 101042942 ||

|| Successfully assigned reads : 8 (0.0%) ||

|| Running time : 2.38 minutes ||

|| ||

|| Read assignment finished. ||

|| ||

\\===================== http://subread.sourceforge.net/ ======================//

Then I tried to print the count values and got the result:

fc_SE$counts

alignResults55296.BAM

653635 0

100422834 0

645520 0

79501 0

Please advise on how I should fix this. Thanks in advance.

rsubread mapping featurecounts align • 1.7k views

ADD COMMENT • link 8.4 years ago Jo ▴ 30

0

Entering edit mode

Wei Shi is the one to help you. In the meantime, you say you used the sratoolkit to convert to fastq files. It would probably be helpful to see the commands you used. Much of what you have done doesn't make sense to me. Did you really take 32 samples and pile the data into one large fastq file? I have no idea why you would do that. These are paired end data, so I would instead expect you to have used fastq-dump to convert into 64 individual fastq files, not one. Is there a reason for doing that?

ADD REPLY • link 8.4 years ago James W. MacDonald 65k

Login before adding your answer.