Hanneke van Deutekom
Hi,
I used dexseq_count.py to generate the counts and read.HTSeqCounts to build an ExonCountSet.
I managed to go through the vignette just fine, but I noticed that a concatenated gene (ENSG+ENSG+ENSG) with 84 exons is not testable.
I have data for 6 controls and 4 patients. If I sum up their counts, the median over all exons is about 2000 and the mean about 5000.
This should be more than enough to make a gene testable. However, it was not.
This is in the vignette:
Before starting estimating the CR dispersion estimates, estimateDispersions first defines the "testable" counting bins. In this step, all bins are excluded with less than minCount reads. By default, minCount=10, i.e., the bin must have at least 10 reads, summed up across all samples, but other values can be specified when calling estimateDispersion. If a gene has only one testable counting bin, then this counting bin is then considered as not testable, either, because its usage cannot be compared to any other counting bins. The testable bins are marked in the column testable of the feature data.
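To make my reading of that passage concrete, here is a minimal Python sketch of the rule as the vignette words it. This is not DEXSeq code: the function name mark_testable, the bin ids and the toy counts are all made up for illustration, and I am assuming that "less than minCount" means a bin with exactly minCount reads is kept.

```python
from typing import Dict, List


def mark_testable(bin_counts: Dict[str, List[int]], min_count: int = 10) -> Dict[str, bool]:
    """Flag the counting bins of ONE gene as testable or not.

    bin_counts maps a (hypothetical) bin id to its per-sample read counts.
    """
    # Step 1: a bin is testable if its reads, summed over all samples,
    # reach min_count (assumption: exactly min_count is enough).
    testable = {bin_id: sum(counts) >= min_count
                for bin_id, counts in bin_counts.items()}

    # Step 2: a single testable bin has nothing to be compared against,
    # so in that case no bin of the gene is testable.
    if sum(testable.values()) < 2:
        testable = {bin_id: False for bin_id in testable}

    return testable


# Toy gene: two well-covered bins and two bins with 1 and 3 reads in total.
gene = {
    "E001": [300, 250, 410, 280],
    "E002": [0, 1, 0, 0],
    "E003": [1, 0, 2, 0],
    "E004": [900, 870, 1020, 950],
}
flags = mark_testable(gene)
print(flags)
```

Under this reading, the two low-count bins are dropped on their own and the well-covered bins stay testable, which is what I expected for my gene.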
With 82 testable bins, this gene should have turned out to be testable.
I removed the two exons with 1 and 3 reads in total, and reran the program. This time the gene turned out to be testable.
Now, I am highly interested in this particular gene, but I wonder whether I have to remove all exons with fewer than 10 reads in order for the program to include all genes that still might be interesting to look at. Or should I just set minCount to 1?
Hanneke