DEXSeq-Error in FUN(X[[i]], ...) : subscript out of bounds
0
0
Entering edit mode
@mthabisi-moyo-9721
Last seen 24 months ago
United States

I am trying to use DEXSeq to analyze a mouse paired-end stranded RNA-seq data (poly-A RNA capture). I Aligned the reads using STAR and obtained the reference genome and annotation file (which I collapsed with dexseq_prepare_annotation.py) from GENCODE (release m11). I used HTSeq to count exons (using command "htseq-count -m union -s reverse -i gene_id -t exon) and over 80% mapped to a feature. 


I set up my countFiles, collapsed annotation file and sampleTable in R:
countFiles <- list.files(directory, pattern=".txt", full.names = TRUE)
sampleTable <- data.frame(row.names = c("WT_RNA_1", "WT_RNA_2", "WT_RNA_3", "KO_RNA_1", "KO_RNA_2", "KO_RNA_3"), condition = c("WT", "WT", "WT", "KO", "KO", "KO"), libType = c("paired-end", "paired-end", "paired-end", "paired-end", "paired-end", "paired-end"))

When I try to create a DEXSeqdataset it gives me an out of bounds error:
dxd <- DEXSeqDataSetFromHTSeq(countFiles, sampleData = sampleTable, design= ~sample + exon + condition:exon, flattenedfile = flattenedFile)
Error in FUN(X[[i]], ...) : subscript out of bounds

I have looked over the count files and the collapsed annotation and I can't seem to figure out where the mismatch between these files is occurring. What is curious is that I do not see this error when I use DEXSeq to analyze RNA-seq data from human cell lines. Is there something I am missing regarding alignment and counting of the mouse data set?

Thank you for your help

 

 

 

 

 

dexseq • 2.2k views
ADD COMMENT
0
Entering edit mode

Hi, it's probable that either the gtf file or that one or more of the count files are corrupted.  Could you check that count files have all the same number of lines and that the gtf file has the same number of exonic regions as the number of lines of the count files?

Alejandro

 

ADD REPLY
0
Entering edit mode

Hi Alejandro,

It turns out it was likely the GENCODE gtf file. Switched to the mouse annotation file on Ensembl and it worked well. It is still odd to me considering the count files were created using the GENCODE gtf so I thought it would still result with the same number of lines but I guess not.

Mthabisi

ADD REPLY

Login before adding your answer.

Traffic: 787 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6