Search
Question: DEXSeq Error in FUN(X[[i]], ...) : subscript out of bounds not sure why
0
gravatar for badribio
8 months ago by
badribio0
badribio0 wrote:
suppressPackageStartupMessages( library( "DEXSeq" ) )
setwd("pathto/htseq")
inDir = getwd()
countFiles = list.files(inDir, pattern = "htseq.txt$", full.names = TRUE)
flattenedFile = ("pathto/ensembl/Mus_musculus.NCBIM37.67.gtf")
mouseTable = data.frame(
  row.names = c("S75","S76","S77","S72","S73","S74"),
  condition = c("WT","WT","WT","MT","MT","MT"),
  libType = c("paired-end","paired-end","paired-end","paired-end","paired-end","paired-end"))
mouseTable
dxd = DEXSeqDataSetFromHTSeq(countFiles,sampleData = mouseTable,design = ~ sample + exon + condition:exon, flattenedfile = flattenedFile)

 37996 ./76.htseq.txt

  37996 ./72.htseq.txt

  37996 ./77.htseq.txt

  37996 ./75.htseq.txt

  37996 ./73.htseq.txt

  37996 ./74.htseq.txt

Number of lines are same in all the count files and none of them have any headers. I used the same gtf file for htseq. 

not sure where am I going wrong due to which I get this error

Error in FUN(X[[i]], ...) : subscript out of bounds

Thank you in anticipation.

ADD COMMENTlink written 8 months ago by badribio0

Hi. Are you using the python scripts dexseq_prepare_annotation.py and dexseq_count.py) that come with the DEXSeq package?

ADD REPLYlink written 8 months ago by Alejandro Reyes1.5k

let me walk you through my steps may be you could please point out where i went wrong,

1) Bam files processed using Htseq count with parameters with parameters: --mode intersection-strict --minaqual 1 --stranded reverse --type exon --idattr gene_id using gtf (not gff) from mm9.37.67(ensembl) 

2) load the output for 6 samples using  "DEXSeqDataSetFromHTSeq" <- I am stuck here for now

3) also if i supply the gff from dexseq_prepare_annotation.py while using htseq, i get "Warning: No features of type 'exon' found" as an error. 

4) I also have the results from dexseq_count.py as well as explained in the DEXSeq manual, but the paper I am referring to uses htseq counts. 

Thank you, for replying!!! really appreciate it. 

 

ADD REPLYlink written 8 months ago by badribio0

I think I did not understand the reason why you are not using the scripts from dexseq. But if you want to use htseq instead to count on exons, you might want to read the documentation from the --type and --idattr parameters.

Also, notice that the function DEXSeqDataSetFromHTSeq is expecting the output from the dexseq scripts, so it is likely that it won't work with the output of htseq-count. Nevertheless, you can still use the DEXSeqDataSet function to create a DEXSeqDataSet object.

Alejandro

ADD REPLYlink written 8 months ago by Alejandro Reyes1.5k

 so if I am comprehending it correct although the function is DEXSeqDataSetFromHTSeq it still expects the input from dexseq_count.py?  If so then could you please let me know which file would the right result to use for downstream analysis based on my sequencing type?

rna seq data paired end stranded (tophat option - fr-firststrand) 

dexoptions  

1) dexseq_count.py -p yes -s reverse -f bam
2) dexseq_count.py -p yes -s yes -f bam

 

dexout$ tail S72.dex.reverse.txt 

ENSMUSG00000093784+ENSMUSG00000009292+ENSMUSG00000093443:058    25

ENSMUSG00000093785:001    0

ENSMUSG00000093786:001    0

ENSMUSG00000093787:001    0

ENSMUSG00000093788:001    0

_ambiguous    155274

_ambiguous_readpair_position    5427516

_empty    43670128

_lowaqual    0

_notaligned    0

 

dexout$ tail S72.dex.txt 

ENSMUSG00000093784+ENSMUSG00000009292+ENSMUSG00000093443:058    0

ENSMUSG00000093785:001    0

ENSMUSG00000093786:001    0

ENSMUSG00000093787:001    5

ENSMUSG00000093788:001    0

_ambiguous    9089

_ambiguous_readpair_position    5427516

_empty    90229017

_lowaqual    0

_notaligned    0

 

Many thanks

ADD REPLYlink written 8 months ago by badribio0

so if I am comprehending it correct although the function is DEXSeqDataSetFromHTSeq it still expects the input from dexseq_count.py? 

Yes

If so then could you please let me know which file would the right result to use for downstream analysis based on my sequencing type?

Not sure I can know that. Would recommend to have a loot at your bam files in a browser to get an idea.

ADD REPLYlink written 8 months ago by Alejandro Reyes1.5k

Snap shot of one of the sample, in igv

ADD REPLYlink written 8 months ago by badribio0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 341 users visited in the last hour