Entering edit mode
                    Hello, Alejandro
Thank you very much for your previous guidance. I did what you
suggested, merging the count data for all samples to generate a matrix
and used this following code to get ecs.
> ecs <- newExonCountSet(countData=countdata, design=sampleTable,
geneIDs=geneid, exonIDs=exonid)
The estimateDispersions(ecs) step was "Done", but the problem came as
the following. Could you please give me some advice on dealing with
this error ??
.............................................
Done
> ecs <- fitDispersionFunction(ecs)
Error in fitDispersionFunction(ecs) :
  no CR dispersion estimations found, please first call
estimateDispersions function
I found a way from the forum to check. It seems that the dispersion
estimates are all NAs, which could also be seen from fData(ecs).
> sum(!is.na(fData(ecs)$dispBeforeSharing)) / nrow(fData(ecs))
[1] 0
> head(countdata)
                     SRR791052 SRR791054 SRR791056 SRR791057 SRR791058
ENSG00000000003:0001         0         0         0         0         0
ENSG00000000003:0002        75       120        67       108        93
ENSG00000000003:0003         0         0         0         0         0
ENSG00000000003:0004        78       126        59        58        74
ENSG00000000003:0005         0         0         0         0         0
ENSG00000000003:0006        67       114        37        47        54
                     SRR791059 SRR791061 SRR791062 SRR791066 SRR791067
ENSG00000000003:0001         0         0         0         0         2
ENSG00000000003:0002       126        85       108        43        41
ENSG00000000003:0003         0         1         0         0         0
ENSG00000000003:0004       131        67        98        37        14
ENSG00000000003:0005         0         0         0         0         0
ENSG00000000003:0006        94        56        80        37        26
                     SRR791068 SRR791069 SRR791070 SRR791071 SRR791072
ENSG00000000003:0001         0         0         0         0         0
ENSG00000000003:0002        34        64        55        62        64
ENSG00000000003:0003         1         0         0         0         0
ENSG00000000003:0004        22        26        32        54        47
ENSG00000000003:0005         0         0         0         0         0
ENSG00000000003:0006        33        36        42        60        40
                     SRR791073
ENSG00000000003:0001         0
ENSG00000000003:0002        50
ENSG00000000003:0003         0
ENSG00000000003:0004        18
ENSG00000000003:0005         1
ENSG00000000003:0006        29
> head(geneid)
[1] "ENSG00000000003" "ENSG00000000003" "ENSG00000000003"
"ENSG00000000003"
[5] "ENSG00000000003" "ENSG00000000003"
> head(exonid)
[1] "0001" "0002" "0003" "0004" "0005" "0006"
> sampleTable
                         condition
SRR791052     HER2+
SRR791054     HER2+
SRR791056     HER2+
SRR791057     HER2+
SRR791058     HER2+
SRR791059     HER2+
SRR791061     HER2+
SRR791062     HER2+
SRR791066    Benign
SRR791067    Benign
SRR791068    Benign
SRR791069    Benign
SRR791070    Benign
SRR791071    Benign
SRR791072    Benign
SRR791073    Benign
Note: I am using R 3.0.3 and DEXSeq 1.8.0.
Thanks,
Xiayu
-----Original Message-----
From: Rao,Xiayu
Sent: Wednesday, April 16, 2014 9:18 AM
To: 'Alejandro Reyes'; bioconductor at r-project.org
Subject: RE: dexseq read.HTSeqCounts error
Thank you, Alejandro!  That sounds to be a good way to go. I will do
what you suggested.
Thanks,
Xiayu
-----Original Message-----
From: Alejandro Reyes [mailto:alejandro.reyes@embl.de]
Sent: Wednesday, April 16, 2014 2:36 AM
To: Rao,Xiayu; bioconductor at r-project.org
Subject: Re: dexseq read.HTSeqCounts error
Hi Xiayu,
Please keep your answers in the mailing list, so it can be useful for
users encountering similar issues.
Yes, that is the problem, the function is expecting the same number of
lines for each file (it is expecting the users to use the same
annotation file or the same features as counting bins for all their
samples).  What you could do is read independently your files within
R, generate a matrix of the union of all possible counting bins (the
first column of your files) times samples (columns), and then fill
them in with the information from your files, leaving a 0 on the
corresponding missing values. Then use the function newExonCountSet to
generate your ecs file.
Best regards,
Alejandro
> Hi, Alejandro
>
> Thank you for your prompt response. Below is the output. I think my
files are OK, but they have different number of lines. So you think
that is the issue?? I am not counting reads that locate in exonic
regions, so I do not have a .gtf file. I am counting the reads within
some splice sites. Do you have any suggestions for me in my case?
>
>> all( file.exists(as.character(sampleTable$countfiles)) )
> [1] TRUE
>
> -sh-4.1$ wc -l *count.txt
>     222604 SRR791043_count.txt
>     215908 SRR791044_count.txt
>     232519 SRR791045_count.txt
>     227275 SRR791046_count.txt
>     241367 SRR791047_count.txt
>     253112 SRR791048_count.txt
>     234680 SRR791049_count.txt
>     199632 SRR791050_count.txt
>     208880 SRR791051_count.txt
>     195265 SRR791052_count.txt
>     .........
>
> Thanks,
> Xiayu
>
>
> -----Original Message-----
> From: Alejandro Reyes [mailto:alejandro.reyes at embl.de]
> Sent: Tuesday, April 15, 2014 11:44 AM
> To: Rao,Xiayu; 'bioconductor at r-project.org'
> Subject: Re: dexseq read.HTSeqCounts error
>
> Dear Xiayu Rao,
>
> What is the output of:
>
> all( file.exists(as.character(sampleTable$countfiles)) )
>
> ?
>
> Maybe one of your files is corrupted? Do they have all the same
number of lines?
>
> Best regards,
> Alejandro
>
>
>> Hi, Alejandro and others
>>
>> Thank you for your information on the forum, which is very helpful.
I now have a problem in running dexseq. I need your advice to solve
the problem. Any input would be very appreciated.
>>
>>> ecs <-
>>>
read.HTSeqCounts(countfiles=as.character(sampleTable$countFile),desi
>>> g
>>> n=sampleTable, flattenedfile=NULL)
>> Error in `rownames<-`(`*tmp*`, value = c("ENSG00000000003:001",
"ENSG00000000003:002",  :
>>     attempt to set 'rownames' on an object with no dimensions
>>
>> I then checked:
>>> dim(lf)
>> NULL
>>> dim(dcounts)
>> NULL
>>
>> I generated count files by myself instead of using dexseq_count.py.
I tried to follow the standard count file format:
>> -sh-4.1$ more SRR791043_count.txt
>> ENSG00000000003:001     284
>> ENSG00000000003:002     179
>> ENSG00000000003:003     275
>> ENSG00000000003:004     156
>> ENSG00000000003:005     177
>> ENSG00000000003:006     157
>> ENSG00000000003:007     9
>> ENSG00000000003:008     45
>> ENSG00000000003:009     4
>> ENSG00000000003+ENSG00000102362:001     3
>> ............
>> Intergene1:001       1
>> .........
>> intergene2097:001       11
>> intergene2098:001       99
>> intergene2099:001       54
>> intergene2100:001       3
>> intergene2101:001       3
>> intergene2102:001       20
>> intergene2103:001       1
>> intergene2104:001       2
>> intergene2105:001       1
>> intergene2106:001       10
>> _ambiguous      0
>> _empty  0
>> _lowaqual       0
>> _notaligned     0
>>
>>
>>> sampleTable
>>                                                 countFile condition
>> SRR791052 SRR791052_count.txt     HER2+
>> SRR791054 SRR791054_count.txt     HER2+
>> SRR791056 SRR791056_count.txt     HER2+
>> SRR791057 SRR791057_count.txt     HER2+
>> SRR791058 SRR791058_count.txt     HER2+
>> SRR791059 SRR791059_count.txt     HER2+
>> SRR791061 SRR791061_count.txt     HER2+
>> SRR791062 SRR791062_count.txt     HER2+
>> SRR791066 SRR791066_count.txt    Benign
>> SRR791067 SRR791067_count.txt    Benign
>> SRR791068 SRR791068_count.txt    Benign
>> SRR791069 SRR791069_count.txt    Benign
>> SRR791070 SRR791070_count.txt    Benign
>> SRR791071 SRR791071_count.txt    Benign
>> SRR791072 SRR791072_count.txt    Benign
>> SRR791073 SRR791073_count.txt    Benign
>>
>> Note: I am using R 3.0.3 and DEXSeq 1.8.0
>>
>>
>> Thanks,
>> Xiayu
>>
                    
                
                