The support.bioconductor.org editor has been updated to markdown! Please see more info at: Tutorial: Updated Support Site Editor

Question: DESeq2: Extremely low pvalues
0
gravatar for szenitha
2.6 years ago by
szenitha0
szenitha0 wrote:

Dear all,

I have a question regarding DESeq2 package. Please help. I am using this package to analyze a RNA seq dataset. The dataset has  3 different groups and my goal is to find out the DE genes between groups. I used the test 'LRT' to find out genes that are differentially expressed between all the groups. The problem is that I am getting extremly low p-values and do not understand why. I am getting p-values as low as 3.067392e-314.

Code:

dds.fc <- DESeqDataSetFromMatrix(dat0,colData=pheno,design=~ State) #condition
#relevel groups
dds.fc@colData$State<-factor(dds.fc@colData$State,levels=c("non-infected spleen","infected lung","infected spleen"))
dds<-DESeq(dds.fc,test = "LRT",reduced = ~1)

res<-results(dds,name =  "State_infected.lung_vs_non.infected.spleen")

#order with respect to p-value

temp<-res[with(res,order(res[,5])),]

#get result with pvalue below 0.05

sub <- subset(temp,temp$pvalue<0.05)

Result:

head(sub,25)
log2 fold change (MLE): State infected.lung vs non.infected.spleen 
LRT p-value: '~ State' vs '~ 1' 
DataFrame with 25 rows and 6 columns
          baseMean log2FoldChange     lfcSE      stat        pvalue          padj
         <numeric>      <numeric> <numeric> <numeric>     <numeric>     <numeric>
Arg1     2337.0486      14.430641 1.1909019  2463.520             0             0
Ccl12     817.8881      11.012091 0.7409714  1723.197             0             0
Ccl7      855.1171      12.086512 1.0304499  2447.651             0             0
Cd38      881.1564       6.760928 0.2047603  1766.141             0             0
Cd72      533.9823       7.131620 0.2767360  1654.121             0             0
...            ...            ...       ...       ...           ...           ...
Irg1     2944.5393       9.067204 0.2761566  1443.782 3.067392e-314 1.831233e-311
Htr7      393.6399       8.571863 0.4775459  1436.052 1.462933e-312 8.336721e-310
C3ar1    1772.5935       6.709288 0.2091927  1432.418 9.002270e-312 4.907020e-309
Rasgef1b  805.7408       4.032636 0.1431151  1407.784 2.011923e-306 1.050978e-303
Ets2     3110.7434       5.978289 0.1886460  1364.882 4.164787e-297 2.088558e-294

Session Info

sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pheatmap_1.0.8             data.table_1.9.6           DESeq2_1.12.3             
 [4] rgl_0.95.1441              corrplot_0.77              RColorBrewer_1.1-2        
 [7] ffpe_1.16.0                TTR_0.23-1                 RUVSeq_1.6.2              
[10] edgeR_3.14.0               limma_3.28.14              EDASeq_2.6.2              
[13] ShortRead_1.30.0           GenomicAlignments_1.8.4    Rsamtools_1.24.0          
[16] Biostrings_2.40.2          XVector_0.12.0             BiocParallel_1.6.2        
[19] RSkittleBrewer_1.1         zebrafishRNASeq_0.106.2    BiocInstaller_1.22.3      
[22] scales_0.4.0               ggplot2_2.1.0              SummarizedExperiment_1.2.3
[25] Biobase_2.32.0             GenomicRanges_1.24.2       GenomeInfoDb_1.8.2        
[28] IRanges_2.6.1              S4Vectors_0.10.1           BiocGenerics_0.18.0       
[31] sva_3.20.0                 genefilter_1.54.2          mgcv_1.8-12               
[34] nlme_3.1-128              

 

deseq2 • 850 views
ADD COMMENTlink modified 2.6 years ago by Steve Lianoglou12k • written 2.6 years ago by szenitha0
Answer: DESeq2: Extremely low pvalues
2
gravatar for Michael Love
2.6 years ago by
Michael Love21k
United States
Michael Love21k wrote:

Why don't you plot an example of one of these genes using plotCounts. I'd guess that these are all "off" in one group and have high expression in another. Note that you are testing that there are no differences across State, so if there are any large differences it makes sense that the probability of the data under the null is very small.

The p-value can be arbitrarily low when the null hypothesis is false. In fact it becomes not so interesting as a statistic when the null is clearly false, and other statistics become more interesting, such as the effect size: the log2 fold change between groups. Actually, we discuss just this topic in the DESeq2 paper.

ADD COMMENTlink written 2.6 years ago by Michael Love21k

Thank you michael. Yes I took the plotCounts for some genes and found that they were expressed in one condition and off in the other. I also went through the paper. I hope that I have understood it correctly.
By default we check if there is no differences across condition(null hypothesis). But if there is a clear biological difference across condition then even genes with very small foldchange will be flagged as significant because the null hypothesis is clearly false. 

In such a scenario I would need to focus on the magnitude of the fold change.So if I have to find out the significant DE genes between two conditions then I would need to specify a composite null hypothesis wherein the null hypothesis would be that genes with fold change > or < threshold is not significant.  So if I am interested in finding DE genes between two groups  "infected lung" and  "non-infected spleen" that have a fold change > 2 or <-2 then is this correct?

res<- results(dds,contrast=c("State", "infected lung", "non-infected spleen"), altHypothesis="greaterAbs",lfcThreshold = 1)

res[which(res$padj<0.05),]

ADD REPLYlink written 2.6 years ago by szenitha0

exactly.

one correction is that lfcThreshold=1 and altHypothesis="greaterAbs" corresponds to a fold change of > 2 or < 1/2.

ADD REPLYlink written 2.6 years ago by Michael Love21k

Yes. I understand now. Thank you for helping me out. 

ADD REPLYlink written 2.5 years ago by szenitha0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 208 users visited in the last hour