Hi, I am using DEXSeq to test for differential exon usage (DEU). The DEXSeqDataSet was converted from a SummarizedExperiment instance containing ranges and counts for exons. The only covariate in the design model is Treatment, which represents two conditions (DOX and NODOX) with three replicates each. I noticed some exons that I would expect to show differential usage: the counts in one condition are much larger than in the other, yet the p-value is almost one. I do not understand why. Could you give me some insights and suggestions?
Below is a chunk of my data:
> dxr[name,]

LRT p-value: full vs reduced

DataFrame with 3 rows and 13 columns
                 groupID   featureID exonBaseMean  dispersion         stat
             <character> <character>    <numeric>   <numeric>    <numeric>
7863:E292716        7863     E292716    69.867123 0.001561678 -0.054938652
5789:E214197        5789     E214197     4.986662 0.316687994  0.008410802
8215:E303543        8215     E303543   161.418531 0.000916123 -0.003169622
                pvalue      padj     NODOX       DOX log2fold_DOX_NODOX
             <numeric> <numeric> <numeric> <numeric>          <numeric>
7863:E292716 1.0000000         1 0.6176518  2.652094           2.102266
5789:E214197 0.9269281         1 0.5525945  1.190415           1.107172
8215:E303543 1.0000000         1 0.3627619  4.178818           3.526000
                             genomicData       countData transcripts
                               <GRanges>        <matrix>      <list>
7863:E292716 chr3:+:[93820935, 93821042] 114 134 155 ...    ########
5789:E214197 chr1:+:[30616742, 30616852]      7 2 20 ...    ########
8215:E303543 chr3:-:[94178072, 94178182] 315 288 324 ...    ########
The counts for NODOX (columns 4-6) are almost zero, whereas the DOX samples (columns 1-3) have many more counts:
> counts(dxd)[name, 1:6]
             [,1] [,2] [,3] [,4] [,5] [,6]
7863:E292716  114  134  155    0    0    0
5789:E214197    7    2   20    0    0    0
8215:E303543  315  288  324    0    0    0
The library sizes are sufficient:
> colSums(counts(dxd)[, 1:6])
[1] 19786058 17779995 18319056 15141153 22307765 20236315
Is there any way I can fix the problem by modifying DEXSeq parameters?
Thanks,
Chao-Jen
Hi Chao-Jen,
Could you include the output of plotDEXSeq with the parameter norCounts=TRUE for these genes?
Alejandro
Below is the model design. "exonic" is a SummarizedExperiment (SE) object with exonic counts and ranges.
The figure below (I hope it shows correctly) is for the gene in the first row of the example above.
Thanks,
Chao-Jen
Here is another plot for the third row of the example: