strange log2FoldChange estimates
1
0
Entering edit mode
Shi, Tao ▴ 720
@shi-tao-199
Last seen 7.1 years ago
hi Michael, I have a simple RNAseq experiment: a cell line treated with a compound at various doses and time. ?Here are the htseq counts for one of gene below along with the treatment conditions and sizeFactor estimated by DESeq2. > tmp ? count sizeFactor time dose 1 11396 1.0069598 8 0 2 12306 1.0913848 8 0 3 11761 1.0611031 8 0.1 4 12177 1.0764237 8 0.1 5 12017 1.1675793 8 1 6 10667 1.0375095 8 1 7 11423 1.0693406 8 10 8 11529 1.1097800 8 10 9 7845 0.9187615 24 0 10 8600 0.9620073 24 0 11 3200 0.9502428 24 0.1 12 2964 0.8491865 24 0.1 13 1079 0.8893242 24 1 14 1391 1.0884646 24 1 15 1261 0.9130893 24 10 16 1451 1.0182247 24 10 The design matrix I used is : ~time + dose + time:dose Here are the coefficient estimates for this particular gene: baseMean 7.266687e+03 baseVar 1.854839e+07 allZero 0.000000e+00 dispGeneEst 1.000000e-08 dispFit 6.694323e-04 dispersion 4.994657e-04 dispIter 1.000000e+01 dispOutlier 0.000000e+00 dispMAP 4.994657e-04 Intercept 1.238851e+01 time8 1.007756e+00 time24 -9.953678e-01 dose0 8.800724e-01 dose0.1 2.014315e-01 dose1 -5.854584e-01 dose10 -4.836570e-01 time8.dose0 -8.127168e-01 time24.dose0 8.201522e-01 time8.dose0.1 -1.467029e-01 time24.dose0.1 1.484047e-01 time8.dose1 5.176693e-01 time24.dose1 -5.226157e-01 time8.dose10 4.502645e-01 time24.dose10 -4.543508e-01 And some of the contrasts I tested: for time24.dose10 vs time24.dose0 baseMean log2FoldChange lfcSE stat pvalue padj 7266.687 -1.274503 0.02810044 -45.35527 0 0 for time8.dose10 vs time8.dose0 baseMean log2FoldChange lfcSE stat pvalue padj 7266.687 1.262981 0.02804676 45.03128 0 0? The thing that puzzles me a lot is that, for the log2FC of the 2nd contrast, I can't see how it can be ~1.26 given the sizefactor corrected mean counts for the gene for these two conditions are 11296.41 and 10535.41, respectively. To me, the log2FC should be negative or close to 0. Attach is the tmp matrix you saw above. ?I know you can't replicate all the estimates using the tmp matrix alone without other genes in the dataset, but that's all I can provide for confidentiality. ?Hopefully, you can figure out where the problem is. I'm using DESeq2 v1.4.5, R3.1.0, under Win 7. Thank you very much! Tao
RNASeq DOSE DESeq2 RNASeq DOSE DESeq2 • 1.0k views
0
Entering edit mode
Shi, Tao ▴ 720
@shi-tao-199
Last seen 7.1 years ago
Thank you, Mike, for your quick response! ?That explains. ?Do you mind in your user guide on page 21 in the R code to make this more explicit, even I know you have mentioned it right below the R chunk. Thanks, again! Tao On Thursday, August 14, 2014 2:29 PM, Michael Love <michaelisaiahlove at="" gmail.com=""> wrote: hi Tao, If you want to generate the log fold change of the two groups you are focusing on (dose 10 and 0 for group 8), this is not just the interaction term, but the main effect (dose 10 vs 0) plus the interaction term (time8.dose10 vs time8.dose0). ?See the last example in ?results for an example of this ("the condition effect for group Z"). You should use the list() argument to contrast, something like: results(dds, contrast=list(c("dose10","time8.dose10"), c("dose0","time8.dose0"))) Given the parameters, it looks like this would be: > (-.48 - .88) + 1.26 [1] -0.1 Mike On Thu, Aug 14, 2014 at 5:06 PM, Shi, Tao <shidaxia at="" yahoo.com=""> wrote: hi Michael, > >I have a simple RNAseq experiment: a cell line treated with a compound at various doses and time. ?Here are the htseq counts for one of gene below along with the treatment conditions and sizeFactor estimated by DESeq2. > >> tmp >? count sizeFactor time dose >1 ?11396 ?1.0069598 ? ?8 ? ?0 >2 ?12306 ?1.0913848 ? ?8 ? ?0 >3 ?11761 ?1.0611031 ? ?8 ?0.1 >4 ?12177 ?1.0764237 ? ?8 ?0.1 >5 ?12017 ?1.1675793 ? ?8 ? ?1 >6 ?10667 ?1.0375095 ? ?8 ? ?1 >7 ?11423 ?1.0693406 ? ?8 ? 10 >8 ?11529 ?1.1097800 ? ?8 ? 10 >9 ? 7845 ?0.9187615 ? 24 ? ?0 >10 ?8600 ?0.9620073 ? 24 ? ?0 >11 ?3200 ?0.9502428 ? 24 ?0.1 >12 ?2964 ?0.8491865 ? 24 ?0.1 >13 ?1079 ?0.8893242 ? 24 ? ?1 >14 ?1391 ?1.0884646 ? 24 ? ?1 >15 ?1261 ?0.9130893 ? 24 ? 10 >16 ?1451 ?1.0182247 ? 24 ? 10 > >The design matrix I used is : ~time + dose + time:dose > >Here are the coefficient estimates for this particular gene: > >baseMean ? ? ? ? ? ? ? ? ? ? ? 7.266687e+03 >baseVar ? ? ? ? ? ? ? ? ? ? ? ?1.854839e+07 >allZero ? ? ? ? ? ? ? ? ? ? ? ?0.000000e+00 >dispGeneEst ? ? ? ? ? ? ? ? ? ?1.000000e-08 >dispFit ? ? ? ? ? ? ? ? ? ? ? ?6.694323e-04 >dispersion ? ? ? ? ? ? ? ? ? ? 4.994657e-04 >dispIter ? ? ? ? ? ? ? ? ? ? ? 1.000000e+01 >dispOutlier ? ? ? ? ? ? ? ? ? ?0.000000e+00 >dispMAP ? ? ? ? ? ? ? ? ? ? ? ?4.994657e-04 >Intercept ? ? ? ? ? ? ? ? ? ? ?1.238851e+01 >time8 ? ? ? ? ? ? ? ? ? ? ? ? ?1.007756e+00 >time24 ? ? ? ? ? ? ? ? ? ? ? ?-9.953678e-01 >dose0 ? ? ? ? ? ? ? ? ? ? ? ? ?8.800724e-01 >dose0.1 ? ? ? ? ? ? ? ? ? ? ? ?2.014315e-01 >dose1 ? ? ? ? ? ? ? ? ? ? ? ? -5.854584e-01 >dose10 ? ? ? ? ? ? ? ? ? ? ? ?-4.836570e-01 >time8.dose0 ? ? ? ? ? ? ? ? ? -8.127168e-01 >time24.dose0 ? ? ? ? ? ? ? ? ? 8.201522e-01 >time8.dose0.1 ? ? ? ? ? ? ? ? -1.467029e-01 >time24.dose0.1 ? ? ? ? ? ? ? ? 1.484047e-01 >time8.dose1 ? ? ? ? ? ? ? ? ? ?5.176693e-01 >time24.dose1 ? ? ? ? ? ? ? ? ?-5.226157e-01 >time8.dose10 ? ? ? ? ? ? ? ? ? 4.502645e-01 >time24.dose10 ? ? ? ? ? ? ? ? -4.543508e-01 > >And some of the contrasts I tested: > >for time24.dose10 vs time24.dose0 >baseMean log2FoldChange ? ? ?lfcSE ? ? ?stat ? ?pvalue ? ? ?padj >7266.687 ? ? ?-1.274503 0.02810044 -45.35527 ? ? ? ? 0 ? ? ? ? 0 > >for time8.dose10 vs time8.dose0 >baseMean log2FoldChange ? ? ?lfcSE ? ? ?stat ? ?pvalue ? ? ?padj >7266.687 ? ? ? 1.262981 0.02804676 ?45.03128 ? ? ? ? 0 ? ? ? ? 0? > > >The thing that puzzles me a lot is that, for the log2FC of the 2nd contrast, I can't see how it can be ~1.26 given the sizefactor corrected mean counts for the gene for these two conditions are 11296.41 and 10535.41, respectively. To me, the log2FC should be negative or close to 0. > >Attach is the tmp matrix you saw above. ?I know you can't replicate all the estimates using the tmp matrix alone without other genes in the dataset, but that's all I can provide for confidentiality. ?Hopefully, you can figure out where the problem is. > >I'm using DESeq2 v1.4.5, R3.1.0, under Win 7. > >Thank you very much! > >Tao