Question

DEXSeq Error in FUN(X[[i]], ...) : subscript out of bounds not sure why

0

Entering edit mode

badribio • 0

@badribio-12724

Last seen 5.8 years ago

suppressPackageStartupMessages( library( "DEXSeq" ) )
setwd("pathto/htseq")
inDir = getwd()
countFiles = list.files(inDir, pattern = "htseq.txt$", full.names = TRUE)
flattenedFile = ("pathto/ensembl/Mus_musculus.NCBIM37.67.gtf")
mouseTable = data.frame(
  row.names = c("S75","S76","S77","S72","S73","S74"),
  condition = c("WT","WT","WT","MT","MT","MT"),
  libType = c("paired-end","paired-end","paired-end","paired-end","paired-end","paired-end"))
mouseTable
dxd = DEXSeqDataSetFromHTSeq(countFiles,sampleData = mouseTable,design = ~ sample + exon + condition:exon, flattenedfile = flattenedFile)

37996 ./76.htseq.txt

37996 ./72.htseq.txt

37996 ./77.htseq.txt

37996 ./75.htseq.txt

37996 ./73.htseq.txt

37996 ./74.htseq.txt

Number of lines are same in all the count files and none of them have any headers. I used the same gtf file for htseq.

not sure where am I going wrong due to which I get this error

Error in FUN(X[[i]], ...) : subscript out of bounds

Thank you in anticipation.

dexseq • 2.3k views

ADD COMMENT • link 8.1 years ago badribio • 0

0

Entering edit mode

Hi. Are you using the python scripts dexseq_prepare_annotation.py and dexseq_count.py) that come with the DEXSeq package?

ADD REPLY • link 8.1 years ago Alejandro Reyes ★ 1.9k

0

Entering edit mode

let me walk you through my steps may be you could please point out where i went wrong,

1) Bam files processed using Htseq count with parameters with parameters: --mode intersection-strict --minaqual 1 --stranded reverse --type exon --idattr gene_id using gtf (not gff) from mm9.37.67(ensembl)

2) load the output for 6 samples using "DEXSeqDataSetFromHTSeq" <- I am stuck here for now

3) also if i supply the gff from dexseq_prepare_annotation.py while using htseq, i get "Warning: No features of type 'exon' found" as an error.

4) I also have the results from dexseq_count.py as well as explained in the DEXSeq manual, but the paper I am referring to uses htseq counts.

Thank you, for replying!!! really appreciate it.

ADD REPLY • link 8.1 years ago badribio • 0

0

Entering edit mode

I think I did not understand the reason why you are not using the scripts from dexseq. But if you want to use htseq instead to count on exons, you might want to read the documentation from the --type and --idattr parameters.

Also, notice that the function DEXSeqDataSetFromHTSeq is expecting the output from the dexseq scripts, so it is likely that it won't work with the output of htseq-count. Nevertheless, you can still use the DEXSeqDataSet function to create a DEXSeqDataSet object.

Alejandro

ADD REPLY • link 8.1 years ago Alejandro Reyes ★ 1.9k

0

Entering edit mode

so if I am comprehending it correct although the function is DEXSeqDataSetFromHTSeq it still expects the input from dexseq_count.py? If so then could you please let me know which file would the right result to use for downstream analysis based on my sequencing type?

rna seq data paired end stranded (tophat option - fr-firststrand)

dexoptions

1) dexseq_count.py -p yes -s reverse -f bam

2) dexseq_count.py -p yes -s yes -f bam

dexout$ tail S72.dex.reverse.txt

ENSMUSG00000093784+ENSMUSG00000009292+ENSMUSG00000093443:058 25

ENSMUSG00000093785:001 0

ENSMUSG00000093786:001 0

ENSMUSG00000093787:001 0

ENSMUSG00000093788:001 0

_ambiguous 155274

_ambiguous_readpair_position 5427516

_empty 43670128

_lowaqual 0

_notaligned 0

dexout$ tail S72.dex.txt

ENSMUSG00000093784+ENSMUSG00000009292+ENSMUSG00000093443:058 0

ENSMUSG00000093785:001 0

ENSMUSG00000093786:001 0

ENSMUSG00000093787:001 5

ENSMUSG00000093788:001 0

_ambiguous 9089

_ambiguous_readpair_position 5427516

_empty 90229017

_lowaqual 0

_notaligned 0

Many thanks

ADD REPLY • link 8.1 years ago badribio • 0

0

Entering edit mode

so if I am comprehending it correct although the function is DEXSeqDataSetFromHTSeq it still expects the input from dexseq_count.py?

Yes

If so then could you please let me know which file would the right result to use for downstream analysis based on my sequencing type?

Not sure I can know that. Would recommend to have a loot at your bam files in a browser to get an idea.