Question

DESeq2 question: why I got discrepancy results for DEGs when using different ways setting up the models

0

Entering edit mode

Ming ▴ 380

@ming-yi-6363

Last seen 19 months ago

United States

Enter the body of text here

Hi, there:

I used DESeq2 a while back, and just recently I started to use it again for a project, and I bet there might be some updates or changes, since I am trying to understand why I got different DEGs results with discrepancy values use two ways of model settings that supposed to be no difference to derive DEGs unless my codes have some issues.

Code should be placed in three backticks as shown below

##the first way: the main code for setting the model and contrast as below
> show(head(tar3,4))
                         DepMap_ID   Lines        Types RAS
AUTO_GANGLIA.Mut.CHP212 ACH.000120  CHP212 AUTO_GANGLIA    Mut
AUTO_GANGLIA.Mut.SKNSH  ACH.000149   SKNSH AUTO_GANGLIA   Mut
AUTO_GANGLIA.Mut.SKNAS  ACH.000260   SKNAS AUTO_GANGLIA   Mut
AUTO_GANGLIA.WT.MHHNB11 ACH.000078 MHHNB11 AUTO_GANGLIA   WT
...more lines not shown

> mydata3<-mydata3[,tar3$DepMap_ID];
> show("all(colnames(mydata3)==tar3$DepMap_ID)")
[1] "all(colnames(mydata3)==tar3$DepMap_ID)"
> show(all(colnames(mydata3)==tar3$DepMap_ID))
[1] TRUE

tar3$TypesRas<-paste(tar3$Types,tar3$RAS,sep=".");
tar3$TypesRas <- factor(tar3$TypesRas)

mydata3<-round(mydata3)
library("DESeq2")
dds <- DESeqDataSetFromMatrix(countData = mydata3,
                              colData = tar3,
                              design = ~TypesRas)
colData(dds)$TypesRas <- factor(colData(dds)$TypesRas,
levels=c("AUTO_GANGLIA.Mut", "AUTO_GANGLIA.WT", 
"BILE.Mut","BILE.WT",......
"URINARY.Mut", "URINARY.WT"));
dds$TypesRas <- relevel(dds$TypesRas, ref = "AUTO_GANGLIA.WT")
dds<- DESeq(dds);
show(resultsNames(dds)) # lists the coefficients
# [1] "Intercept"
 #[2] "TypesRas_AUTO_GANGLIA.Mut_vs_AUTO_GANGLIA.WT"
 #[3] "TypesRas_BILE.Mut_vs_AUTO_GANGLIA.WT"
# [4] "TypesRas_BILE.WT_vs_AUTO_GANGLIA.WT"
# [5] "TypesRas_BREAST.Mut_vs_AUTO_GANGLIA.WT"
 #[6] "TypesRas_BREAST.WT_vs_AUTO_GANGLIA.WT"
 ........(more here not shown)
#[33] "TypesRas_URINARY.Mut_vs_AUTO_GANGLIA.WT"
#[34] "TypesRas_URINARY.WT_vs_AUTO_GANGLIA.WT"

##set up wanted contrasts
myConL<-list();
kk<-1;
myConL[[kk]]<-c("TypesRas","AUTO_GANGLIA.Mut", "AUTO_GANGLIA.WT");
names(myConL)[kk]<-"AUTO_GANGLIA.Mut_WT";
kk<-kk+1;
myConL[[kk]]<-c("TypesRas","BILE.Mut", "BILE.WT");
names(myConL)[kk]<-"BILE.Mut_WT";
kk<-kk+1;
myConL[[kk]]<-c("TypesRas","BREAST.Mut", "BREAST.WT");
names(myConL)[kk]<-"BREAST.Mut_WT";
....
##for question here, only retrieve DEGs for the first three contrasts
for(i in 1:3){
res <- results(dds, contrast=myConL[[i]]);
res<-as.data.frame(res[order(res$pvalue),]);
}


##the 2nd way: the main code for setting the model and contrast as below
> show(head(tar3,4))
                         DepMap_ID   Lines        Types RAS
AUTO_GANGLIA.Mut.CHP212 ACH.000120  CHP212 AUTO_GANGLIA    Mut
AUTO_GANGLIA.Mut.SKNSH  ACH.000149   SKNSH AUTO_GANGLIA   Mut
AUTO_GANGLIA.Mut.SKNAS  ACH.000260   SKNAS AUTO_GANGLIA   Mut
AUTO_GANGLIA.WT.MHHNB11 ACH.000078 MHHNB11 AUTO_GANGLIA   WT
...more lines not shown

> mydata3<-mydata3[,tar3$DepMap_ID];
> show("all(colnames(mydata3)==tar3$DepMap_ID)")
[1] "all(colnames(mydata3)==tar3$DepMap_ID)"
> show(all(colnames(mydata3)==tar3$DepMap_ID))
[1] TRUE

mydata3<-round(mydata3)
dds <- DESeqDataSetFromMatrix(countData = mydata3,
                              colData = tar3,
                            design = ~Types+RAS+Types:RAS);
dds$RAS <- relevel(dds$RAS, ref = "WT")
dds$Types <- relevel(dds$Types, ref = "AUTO_GANGLIA")
dds<- DESeq(dds)
show(resultsNames(dds)) # lists the coefficients
#[1] "Intercept"                           "Types_BILE_vs_AUTO_GANGLIA"
# [3] "Types_BREAST_vs_AUTO_GANGLIA"       
......(more here not shown)
#[17] "Types_URINARY_vs_AUTO_GANGLIA"       "RAS_Mut_vs_WT"
#[19] "TypesBILE.RASMut"                    "TypesBREAST.RASMut"
#[21] "TypesCNS.RASMut"                     "
.......
#[33] "TypesUPPER_DIGEST.RASMut"            "TypesURINARY.RASMut"
##set up wanted contrasts
myConL<-list();
kk<-1;
myConL[[kk]]<-c("RAS_Mut_vs_WT");
names(myConL)[kk]<-"AUTO_GANGLIA.Mut_WT";
kk<-kk+1;
myConL[[kk]]<-c("RAS_Mut_vs_WT","TypesBILE.RASMut");
names(myConL)[kk]<-"BILE.Mut_WT";
kk<-kk+1;
myConL[[kk]]<-c("RAS_Mut_vs_WT","TypesBREAST.RASMut");
names(myConL)[kk]<-"BREAST.Mut_WT";
...
##for question here, only retrieve DEGs for the first three contrasts
for(i in 1:3){
res <- results(dds, contrast=myConL[[i]]);
res<-as.data.frame(res[order(res$pvalue),]);
}

# include your problematic code here with any corresponding output 
For the above two ways of contrast setting for the model, when I retreived the DEGs, say, adjusted p-value <0.05, I not only get different numbers of DEGs, but also when I compared some genes that were DEGs from both setting ways for the same contrast: for example, for contrast AUTO_GANGLIA.Mut_WT, I got the following stats for the same gene:

the first way of model setting:
IDs baseMean    log2FoldChange  lfcSE   stat    pvalue  padj
KRT8P14..ENSG00000214282.   8.109762759 17.28097998 1.586815863 10.8903499  1.28E-27    4.30E-23

the 2nd  way of model setting:
IDs baseMean    log2FoldChange  lfcSE   stat    pvalue  padj
KRT8P14..ENSG00000214282.   8.109762759 11.83411087 1.588355537 7.450542774 9.30E-14    1.37E-09

As you can see, the same gene got different stats for log2FoldChange, pvalue, and padj, although baseMean is the same, and lfcSE is very close but different. 

I checked other contrasts, I had similar observations.

Can someone or Michael can point out anything wrong with my codes or any reason(s) that led to this difference that I observed?

# please also include the results of running the following in an R session 

sessionInfo( )
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS:   /mnt/nasapps/development/R/4.0.2/lib64/R/lib/libRblas.so
LAPACK: /mnt/nasapps/development/R/4.0.2/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] DESeq2_1.28.1               SummarizedExperiment_1.20.0
 [3] Biobase_2.50.0              MatrixGenerics_1.2.1
 [5] matrixStats_0.59.0          GenomicRanges_1.42.0
 [7] GenomeInfoDb_1.26.7         IRanges_2.24.1
 [9] S4Vectors_0.28.1            BiocGenerics_0.36.1
[11] limma_3.46.0                reshape_0.8.8
[13] reshape2_1.4.4              ggrepel_0.9.1
[15] ggplot2_3.3.5

loaded via a namespace (and not attached):
 [1] locfit_1.5-9.4         Rcpp_1.0.6             lattice_0.20-44
 [4] assertthat_0.2.1       utf8_1.2.1             R6_2.5.0
 [7] plyr_1.8.6             RSQLite_2.2.7          pillar_1.6.1
[10] zlibbioc_1.36.0        rlang_0.4.11           annotate_1.66.0
[13] blob_1.2.1             Matrix_1.3-4           splines_4.0.2
[16] BiocParallel_1.24.1    geneplotter_1.66.0     stringr_1.4.0
[19] RCurl_1.98-1.3         bit_4.0.4              munsell_0.5.0
[22] DelayedArray_0.16.3    compiler_4.0.2         pkgconfig_2.0.3
[25] tidyselect_1.1.1       tibble_3.1.2           GenomeInfoDbData_1.2.4
[28] XML_3.99-0.6           fansi_0.5.0            crayon_1.4.1
[31] dplyr_1.0.7            withr_2.4.2            bitops_1.0-7
[34] grid_4.0.2             xtable_1.8-4           gtable_0.3.0
[37] lifecycle_1.0.0        DBI_1.1.1              magrittr_2.0.1
[40] scales_1.1.1           stringi_1.6.2          cachem_1.0.5
[43] XVector_0.30.0         genefilter_1.70.0      ellipsis_0.3.2
[46] generics_0.1.0         vctrs_0.3.8            RColorBrewer_1.1-2
[49] tools_4.0.2            bit64_4.0.5            glue_1.4.2
[52] purrr_0.3.4            survival_3.2-11        fastmap_1.1.0

DESeq2 DEGs • 2.3k views

ADD COMMENT • link 2.8 years ago • updated 2.7 years ago Ming ▴ 380

0

Entering edit mode

Just for people's info and discussion purpose, I did try different variants that used interaction terms (the second way of model and contrast setting that I posted initially) and also I tried to avoid the way of DESeq2 specific way of setting contrasts that I used as posted initally for the second way of setting, but used more general ways instead as shown in example below. It seems that I can get rough close but always slightly different numbers of DEGs for the same contrast (with almost majority of DEGs are overlapped between these variants and the results of the first way of setting), but the issue is: many stats of the DEGs such as log2FoldChange, pvalue, padj are very different for the same DEGs of the same contrast, although basemMean and lfcSE are identical. As curiosity, I did the same exercise using edgeR, and I can get exactly the same numbers of DEGs between the two ways of model and contrast settings for the same contrast, not only the ranking or order of DEGs (if sorted by p-values etc) are the same but every statistics of the DEGs are exactly the same (e.g., logFC, logCPM, F,PValue,FDR). Just wonder if the DESeq2 have some minor issues for such a bit more complicated model and contrast settings or the way to derive the log2FoldChange etc would be subjected to change if changing the way of model and contrast settings, for which the authors of DESeq2 may have the best answers or what I did is not what intended to be in DESeq2.

Below I just give some examples of how I setting up the models and contrasts:

dds <- DESeqDataSetFromMatrix(countData = mydata3, colData = tar3, design = ~0+RAS+RAS:Types)

dds$RAS <- relevel(dds$RAS, ref = "WT") .....

show(resultsNames(dds)) # lists the coefficients

[1] "RASWT" "RASMut"

[3] "RASWT.TypesBILE" "RASMut.TypesBILE"

[5] "RASWT.TypesBREAST" "RASMut.TypesBREAST"

........

[33] "RASWT.TypesURINARY" "RASMut.TypesURINARY"

myConL<-list();

kk<-1;

myConL[[kk]]<-c(-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

names(myConL)[kk]<-"AUTO_GANGLIA.Mut_WT";

kk<-kk+1;

myConL[[kk]]<-c(-1,1,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

names(myConL)[kk]<-"BILE.Mut_WT";

kk<-kk+1;

myConL[[kk]]<-c(-1,1,0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

names(myConL)[kk]<-"BREAST.Mut_WT";

for(i in 1:3){

res <- results(dds, contrast=myConL[[i]]);

currTypes<-gsub(".Mut_WT","",names(myConL)[i])

}

this is one of the example variants I tried, which has most overlapping DEGs for the same contrast with those from my original first model/contrast setting using combined strings in my initial post. just give one example of rough idea: 1834 vs 1846 (overlapped 1830) for BILE. From the numbers, one would say not bad at all, but the issue is: for the same contrast, many statistics including log2FoldChange for the same gene are very different, which is bothering me. for example, one gene for the two settings look like:

  SYMBOL            ENSID                          IDs baseMean

2 AL035541.1 ENSG00000227297. AL035541.1..ENSG00000227297. 65.83473

log2FoldChange lfcSE stat pvalue padj

2 -42.03294 1.684618 -24.95103 2.081179e-137 3.59555e-133

   SYMBOL            ENSID                          IDs baseMean

23 AL035541.1 ENSG00000227297. AL035541.1..ENSG00000227297. 65.83473

log2FoldChange lfcSE stat pvalue padj

23 -24.60017 1.684421 -14.60452 2.628189e-48 3.948225e-45

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

Here the LFC is not finite. Because we use convergence to stop for infinite LFC you get a different number. I’d recommend to just stick with a sensible coding of reference levels.

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Hi, Michael:

Thx for the input. Can you be more specific on what you suggested: "I’d recommend to just stick with a sensible coding of reference levels"? Do you mean that in my second way of model setting, try different ways? Indeed, Idid try: design = ~0+RAS+RAS:Types, design =~RAS+RAS:Types design = ~0+RAS+Types+RAS:Types, or design = ~RAS+Types+RAS:Types, and also tried dds$RAS <- relevel(dds$RAS, ref = "WT") with or without dds$Types <- relevel(dds$Types, ref = "AUTO_GANGLIA"), still the similar observations. Let me know if you have alternative ways or what you really meant here.

Also not just log2FoldChange, other stats including p-values except baseMean are all different, which seem with larger scheme of difference in handling contrasts in models than just LFC-related shrinkage or convergence issue.

Alternatively, in both ways of my model/contrast settings, DESeq2 handles the same contrast in different ways that may have led to different final results. Since when I tried the similar ways setting up models in edgeR, I did not see any difference for the same contrast in the result stats from different model/contrast settings (the same way as I did in DESEq2), as observed in results of DESeq2. This seems suggesting some potential handling issues or limitation in DESeq2 for relatively complicated model settings unless my codes have some issues.

Also I did try lfcShrink(dds, type="apeglm") function. however, I did find that for my model and contrast setting, it is kind of hard for the ways of my both ways of model settings in my initial post to get this function to work for all of my contrasts, since ‘apeglm’ requires use of ‘coef’ but not "contrast=" , which is really limiting my ability to use lfcShrink() function unless I have to set up models for each pair of contrast.

This brings up my alternative question for your expertise opinions: my model setting is kind of including all contrasts in the same roof of one global model as I showed (for both ways of my model settings), although I can set up a model per contrast that I am interested in. Do you think which way is the better at least from point of DEseq2? I read from somewhere that setting up a global model for all contrasts altogether usually is better than setting one model for each contrast. Do you agree on this? This would be of course a big topic, since it may be related to how different in the different contrast in data etc. But at least for this case setting up a model per contrast,
I shall be able to use lfcShrink(dds, type="apeglm") for each contrast by using "coef=".

I would really appreciate your expertise input on these aspects. Thx so much in advance!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

I mean to set the reference levels from the start to a meaningful baseline like WT and then use this throughout.

Note that for lfcShrink you can use ashr and contrast, this combination works well.

I don’t have a preference on your last question.

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Hi, Michael:

Really appreciated your input and advice!

For the first question, I did used "WT" for my second way of modle setting although I did try with or without dds$Types <- relevel(dds$Types, ref = "AUTO_GANGLIA"). But for my first way of mode setting, since I used combined string, the reference level has to be "AUTO_GANGLIA.WT" as I did as below: dds$TypesRas <- relevel(dds$TypesRas, ref = "AUTO_GANGLIA.WT"). These efforts seem not able to make the consensus results in DESeq2 so far between the two ways of model settings as I had showed, although I was able to make consensus results in edegR between the two ways of model settings. My guess is my first way of model setting may have to be preferentially used in DESeq2. Hope you have any advice on this aspect.

I will try ashr with my contrast as your suggested for situations using contrast in the model.

Thx again for your great input!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

I would set reference levels for all factors in the design.

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Thx for the advice about the reference levels, in which I did do as you suggested! In fact, I even tried more options just want to see why I had discrepancy between the resuts of my first way of model setting vs second way of mdoel settngs.

Any advice on the discrepancy between the resuts of my first way of model setting vs second way of mdoel settngs that my post originally asked about?

Since these efforts seem not able to make the consensus results beyond just LFC in DESeq2 but also for other stats including p-values etc that are all different, which were observed so far between the two ways of model settings for DESeq2 as I had showed, although I was able to make complete consensus results in edegR between the two ways of mbodel settings

Therefore, Just to make it clear since too many questions have been asked in this post, hereby I reiterate below the two ways of model and contrast settings and corresponding results that I did in DESeq2: (I only listed the critical code here; and for the 2nd way of model setting, I only showed the options I tried, which gives the most overlapped DEGs with the 1st way for the same contrast)

the first way: the main code for setting the model and contrast as below:

tar3$TypesRas<-paste(tar3$Types,tar3$RAS,sep=".");

tar3$TypesRas <- factor(tar3$TypesRas)

dds <- DESeqDataSetFromMatrix(countData = mydata3, colData = tar3, design = ~TypesRas)

colData(dds)$TypesRas <- factor(colData(dds)$TypesRas, levels=c("AUTO_GANGLIA.Mut", "AUTO_GANGLIA.WT", "BILE.Mut","BILE.WT",...... "URINARY.Mut", "URINARY.WT"));

dds$TypesRas <- relevel(dds$TypesRas, ref = "AUTO_GANGLIA.WT")

dds<- DESeq(dds);

show(resultsNames(dds)) # lists the coefficients

[1] "Intercept"

[2] "TypesRas_AUTO_GANGLIA.Mut_vs_AUTO_GANGLIA.WT"

[3] "TypesRas_BILE.Mut_vs_AUTO_GANGLIA.WT"

[4] "TypesRas_BILE.WT_vs_AUTO_GANGLIA.WT"

[5] "TypesRas_BREAST.Mut_vs_AUTO_GANGLIA.WT"

[6] "TypesRas_BREAST.WT_vs_AUTO_GANGLIA.WT"
.....

[33] "TypesRas_URINARY.Mut_vs_AUTO_GANGLIA.WT"

[34] "TypesRas_URINARY.WT_vs_AUTO_GANGLIA.WT"

myConL<-list();

kk<-1;

myConL[[kk]]<-c("TypesRas","AUTO_GANGLIA.Mut", "AUTO_GANGLIA.WT");

names(myConL)[kk]<-"AUTO_GANGLIA.Mut_WT";

kk<-kk+1;

myConL[[kk]]<-c("TypesRas","BILE.Mut", "BILE.WT");

names(myConL)[kk]<-"BILE.Mut_WT";

kk<-kk+1;

myConL[[kk]]<-c("TypesRas","BREAST.Mut", "BREAST.WT");

names(myConL)[kk]<-"BREAST.Mut_WT"; ....

for(i in 1:3){

res <- results(dds, contrast=myConL[[i]]);

res<-as.data.frame(res[order(res$pvalue),]);

}

the 2nd way: the main code for setting the model and contrast as below

tar3$Types <- factor(tar3$Types)

tar3$RAS <- factor(tar3$RAS)

dds <- DESeqDataSetFromMatrix(countData = mydata3, colData = tar3, design = ~0+RAS+RAS:Types);

dds$RAS <- relevel(dds$RAS, ref = "WT")

dds$Types <- relevel(dds$Types, ref = "AUTO_GANGLIA")

dds<- DESeq(dds)

show(resultsNames(dds)) # lists the coefficients

[1] "RASWT" "RASMut"
[3] "RASWT.TypesBILE" "RASMut.TypesBILE"
[5] "RASWT.TypesBREAST" "RASMut.TypesBREAST"
...... [31] "RASWT.TypesUPPER_DIGEST" "RASMut.TypesUPPER_DIGEST" [33] "RASWT.TypesURINARY" "RASMut.TypesURINARY"

myConL<-list();

kk<-1;

myConL[[kk]]<-c(-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

names(myConL)[kk]<-"AUTO_GANGLIA.Mut_WT";

kk<-kk+1;

myConL[[kk]]<-c(-1,1,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

names(myConL)[kk]<-"BILE.Mut_WT";

kk<-kk+1;

myConL[[kk]]<-c(-1,1,0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0);

names(myConL)[kk]<-"BREAST.Mut_WT"; ......

for(i in 1:3){

res <- results(dds, contrast=myConL[[i]]);

res<-as.data.frame(res[order(res$pvalue),]);

}

When I compared the DEGs between the two ways of settings for the same contrasts, I did see difference between,

for contrast: "AUTO_GANGLIA.Mut_WT": 1479 vs 873 at padj<0.05 with overalpped 849

for contrast: "BILE.Mut_WT": 1834 vs 1846 at padj<0.05 with overalpped 1830

for contrast: "BREAST.Mut_WT": 1763 vs 1784 at padj<0.05 with overalpped 1758

Considering the numbers of overalapped genes, these are rather closed results except for the first contrast "AUTO_GANGLIA.Mut_WT": But still very different in terms of the actual statistics of these DEGs as shown below:

Below I showed the top DEGs from both ways of model seetings for the same contrast "BILE.Mut_WT" here FYI (I took out some anotation and baseMean columns)

first way of model setting:

 SYMBOL log2FoldChange    lfcSE      stat        pvalue          padj

1 AP005229.1 48.05928 1.654480 29.04796 1.632705e-185 5.641484e-181

2 AL035541.1 -42.03294 1.684618 -24.95103 2.081179e-137 3.595550e-133

3 HTR3E -44.55963 1.895428 -23.50900 3.299678e-122 3.800459e-118

4 AC243772.3 -40.67986 1.734676 -23.45098 1.291749e-121 1.115845e-117

second way of model setting:

 SYMBOL log2FoldChange    lfcSE      stat        pvalue          padj

1 AP005229.1 48.00919 1.654480 29.01769 3.936211e-185 1.360040e-180

2 AC022031.1 -39.05947 1.573384 -24.82513 4.800262e-136 8.292933e-132

3 HTR3E -41.68154 1.895425 -21.99060 3.542587e-107 4.080116e-103

4 MS4A14 39.83830 1.931396 20.62668 1.581351e-94 1.365971e-90

I notiiced that log2FoldChange,stat, pvalue and padj are different for the same genes, which make the ranking different. I undertsand as you mentioned log2FoldChange may be impacted by convergence to stop for infinite LFC (which may be impacted more or less by different model setting?), and so maybe p-value and padj as well, I guess?

But why contrast: "AUTO_GANGLIA.Mut_WT" did the worst? could be due to more unbalanced data since much less mut samples in this tissue type than BILE and BREAST? But different model settings all used the same data.

Also what would be the possible reasons I can not get exactly results between the two ways of model and contrast settings like I did using edgeR is still the question that I am wondering.

Any input for these aspects would hghly appreciated!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

What is the correlation of the LFC for your two contrasts of mut vs wt in auto ganglia? Does the scatterplot of log2 fold changes look like these are the same contrast?

The above changes in bile mut vs wt are what I would expect with running equivalent models but with different matrices. Due to how the models converge the values will not be exactly the same. E.g. for AP005229 the test statistic varies by 0.1%, and AC022031 by 0.5%. Even seeing higher differences in LFC like in HTR3E which is ~6% would be expected when the LFC is not finite (the algorithm is stopping as the coefficient goes to positive or negative infinity). All of these are examples you provide are cases where the LFC is not finite.

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Plot for log2 fold changes from auto ganglia (G), and from BILE (B) between first way (0), and second way (D).

Thanks a lot for the insights. Also I plotted the scatter plots for log2FC as shown. B0 and BD, also G0 and GD are indeed from the same contrast but different ways of model settings (0 vs D), and they look much better than between B(BILE) and G(auto ganglia). This is promising. Just wonder why not exactly identical. Your interpretation kind of making sense. Thx again!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

For the horizontal lines, I think it is that DESeq2 "zeros" out the LFC when you are doing a two group comparison with contrast, when the two groups under comparison have all zero counts. So this approach (using two group comparison and contrast) is preferred for setting those LFC to exactly zero. For complex reasons, those LFC can drift away from zero when you have complex designs, and then the Wald statistic can even be nonzero. In the case of a multi-group design, with some groups having all zeros, the LRT is superior to Wald test, because it doesn't have the same issue undefined LFC and SE. So in short, you can either use the two group contrast approach here which sets those LFC to exactly zero, or you can create a model matrix and perform LRT in DESeq2 by providing full and reduced matrices to full and reduced arguments of DESeq() with test="LRT".

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Thx so much for the diagnosis and advice. Indeed, in my data, some genes have all 0 for all sampes of a contrast for both "Mut" and "WT" groups, although for some other contrasts we have samples with non-zero counts for the same genes. So when you say: "So in short, you can either use the two group contrast approach here which sets those LFC to exactly zero,", Do you mean I just set up model and contrast for subset of the data e.g.,just using the sampes of these two groups of contrast (AUTO_GANGLIA.Mut, and AUTO_GANGLIA.WT) for the contrast AUTO_GANGLIA.Mut_WT for example, and deal with each contrast one model at a time, is my undersatnding correct?

also when you mentioned " or you can create a model matrix and perform LRT in DESeq2 by providing full and reduced matrices to full and reduced arguments of DESeq() with test="LRT". " I think you mean use DESeq(dds, test="LRT") instead of default test="Wald", if I use a global model to cover all contrasts in the same model as shown in the two ways of my model settings as I posted? I am just not sure what "full and reduced matrices to full and reduced arguments" you meant here.

In edgeR, I did use their filterbyExpr(y,group=Group) to get rid of those cases, but for DESeq2, since it is generally not recommended for filtering as I am aware of, although we can do something like:

keep <- rowSums(counts(dds) >= 3) >= 3;

dds <- dds[keep,]

but I did not do that yet according to the general not filtering advice to help something like dispersion assessment as I understood. So based on what you just mentoned, shall I do that here for this dataset? since in fact in some of the contrasts, e.g., for contrast AUTO_GANGLIA.Mut_WT where I actually got one group (say the Mut group) with only three samples, but only one sample with non-zero count and two other samples are with 0 count for a gene X, and the other group (i.e. WT group) has all samples as 0 counts for gene X, and indeed this gene X was picked up by DESeq2 as a DEG, which indeed I do not like, and so maybe the filtering like mentioned above would help this situation, Do you agree?

Thx again in advance!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

Q1: No I meant just use your results from above that you called "the first way". Not sub-setting, as you will have worse dispersion estimates that way.

Don't worry about the LRT suggestion if you are unfamiliar because you are set with "the first way".

I would recommend filtering if you have a heterogenous dataset like this. I would use counts(dds) >= 10 and then perhaps >= 6 samples. If you have a gene with all zeros except for three samples, it's hard to assess the LFC anyway. Filtering will help here. It doesn't hurt dispersion estimation.

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Thx so much for your great advice! Really appreciated your kind follow-up along the way!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

Hi, Michael:

Thx a lot for the great help of my previous questions.

Sorry to bother you again since I want to follow up a few questions that are relevant to what I asked before in terms of model settings but this time is related to the derived DEGs that I intended to validate the DEGs results with their TPM values and raw reads values at analysis level.

In a similar model setting that I have asked you previously with multiple groups under the roof of the same model, then in one of the contrasts (GA vs A) that I am checking for the corresponding DEGs, I have the top 5 DEGs with the following statistics from output of results(dds, contrast=…) sorted by pvalue:

SYMBOL baseMean log2FoldChange lfcSE stat pvalue padj

Gm27177 155.421521 9.904237824 2.070825923 4.782748 1.73E-06 0.05207

Entpd4b 614.582446 0.669852004 0.158722096 4.220282 2.44E-05 0.367374

Xist 5543.798443 8.487655879 2.083273481 4.074192 4.62E-05 0.463483

Hs3st3b1 190.0771603 1.324582889 0.339567831 3.90079 9.59E-05 0.721803

Bid 737.1657088 -0.480701471 0.126551375 -3.79847 0.000146 0.876846

Then below we pull out these genes’ original Raw Read (expected counts of RSEM) with two samples for A and two samples for GA:

SYMBOL A.14 A.17 GA.20 GA.23

Gm27177 0 1 499 442

Entpd4b 369 459 638 659

Xist 52 61 38924 34

Hs3st3b1 71 70 195 152

Bid 727 737 482 553

Then below we also pull out these genes’ TPM values with two samples for A and two samples for GA:

SYMBOL A.14 A.17 GA.20 GA.23

Gm27177 0 0.02 10.37 9.18

Entpd4b 5.08 5.64 9.86 9.42

Xist 0.25 0.14 166.29 0.11

Hs3st3b1 0.69 0.58 2.04 1.44

Bid 17.66 15.28 12.94 13.11

If we look at the first table, no single gene with padj<0.05, although all with decent pvalue, however, if we checked the TPM values and even raw reads of these 5 genes between the GA vs A groups, they look not bad at all except for Xist gene although only two samples in each group. I understand this is multiple-test issue as for the high-throughput RNAseq data. And unfortunately, we only have two samples in each group of this contrast even under the same roof of model with many other groups and many other samples.

The result of this contrast apparently shows no DEGs at padj 0.05 level. Also even the PCA plot kind of support this trend that there shall be not much DEGs between GA and A groups. My questions are: Do we really believe in the judgement of the used method for the obtained result: no single DEGs at padj 0.05 even with quite some genes showed actual raw read and TPM data with obvious difference? Do you have experience that even in such situation, you might be able to validate experimentally in some cases that these genes indeed expressed differentially given the TPM and raw reads between the groups. Do you think other data under the same roof may have impacted DESeq2 to give these genes bad padj in general?

For the last question, indeed, I used only the subset of data to set up the model (only used the two samples for A and two samples for GA for the contrast GA vs A) and Top DEGs sort by pvalue shown below:

SYMBOL baseMean log2FoldChange lfcSE stat pvalue padj

Gm16867 465.599518 8.882917552 0.787147 11.28495 1.56E-29 2.31E-25

Gm12671 140.0266586 10.59748901 1.514742 6.996235 2.63E-12 1.95E-08

Gm27177 238.9063234 9.924949846 1.468001 6.760861 1.37E-11 6.79E-08

Tmem267 179.963209 -1.373050954 0.316497 -4.33827 1.44E-05 0.049749

H19 1758.066822 0.817829696 0.19 4.304356 1.67E-05 0.049749

Gm38378 13.71554853 7.245713788 1.800326 4.024667 5.71E-05 0.121064

Comparing to above DEGs result, only one gene GM27177 is in common. But this setting create more DEGs at padj<0.05 as seen from above. And the reads and TPM values also kind of support.

Raw Reads:

SYMBOL A.14 A.17 GA.20 GA.23

Gm16867 4 0 726 1120

Gm12671 0 0 346 203

Gm27177 0 1 499 442

Tmem267 254 275 99 99

H19 1321 1268 2270 2157

Gm38378 0 0 30 24

TPM values:

SYMBOL A.14 A.17 GA.20 GA.23

Gm16867 0.06 0 12.38 17.39

Gm12671 0 0 21.02 11.16

Gm27177 0 0.02 10.37 9.18

Tmem267 9.89 8.78 3.96 3.76

H19 29.92 25.32 55.49 50.34

Gm38378 0.01 0 0.72 0.53

Any advices would be really appreciated!

ADD REPLY • link 2.7 years ago Ming ▴ 380

0

Entering edit mode

Do we really believe in the judgement of the used method for the obtained result: no single DEGs at padj 0.05 even with quite some genes showed actual raw read and TPM data with obvious difference?

The p-values from DESeq2 are small. I don't know what you are expecting a procedure to do here. DESeq2 is a statistical inference procedure while you are just eyeballing TPM values and thinking that the p-value should somehow be lower than the very low values that are given. I don't have a response for you here.

Do you have experience that even in such situation, you might be able to validate experimentally in some cases that these genes indeed expressed differentially given the TPM and raw reads between the groups.

Note that the top three have FDR < 0.5. This means that, even given the very low amount of data provided, the model thinks about 1.5 out of these 3 would validate.

Do you think other data under the same roof may have impacted DESeq2 to give these genes bad padj in general?

No I think the p-values are small and you need more data.

I'll be working on BioC conference for the next week, but I think you may want to discuss power tradeoffs with local bioinformaticians to calibrate expectations here.

ADD REPLY • link 2.7 years ago Michael Love 41k

0

Entering edit mode

Thanks a lot for your kind and prompt response! Apreciated very much!

Indeed, I was just wondering if there exist general validatable alpha level that applied to your DEGs lists in your experience, although I noticed that the default DESeq2 alpha level is 0.1.

Also as mentioned in my last question, although sure as you suggested that we need more data to convince the results, but for the given available data, when I just used the subset of data to set up the model in this case, I seem to be able to derive more DEGs for the same contrast comparing to when I used all data to set up the model in the same roof. I know you mentioned before you have no preference between these two ways of model settings. But I just want to understand the underlying reasons that at least may help us to understand what the DESeq2 did behind the scene in these two scenarios and determine whether to proceed with these DEGs from biology point of views. My guess is the difference may be due to the dispersion assessment of subset vs whole set of data?

ADD REPLY • link 2.7 years ago Ming ▴ 380

score 0 · Answer 1 · 2021-07-08

0

Entering edit mode

Kevin Blighe ★ 3.9k

@kevin

Last seen 7 days ago

Republic of Ireland

To understand why you are getting different results with different design formulae, having a fundamental understanding of regression formulae would help.

If we look at these examples:

geneX ~ condition, p-value for condition in relation to geneX
geneX ~ condition + age, p-value for condition in relation to geneX, adjusted for age
geneX ~ condition + age + BMI, p-value for condition in relation to geneX, adjusted for age and BMI

Kevin

ADD COMMENT • link 2.8 years ago Kevin Blighe ★ 3.9k

0

Entering edit mode

Hi, Kevin: Thanks so much for your input! But I am not catching what you suggested here unless you can be more specific. Indeed, I am interested in the contrasts: AUTO_GANGLIA.Mut_WT (actually AUTO_GANGLIA.Mut vs AUTO_GANGLIA.WT) and so on. I personally think my first way shall be the correct one. But I tried to use another way (the 2nd way) to validate it. Based on the Examples from result() of DESeq2 package, actuallty Example 3 "Using interaction terms" (similar to my 2nd way) and "Using a grouping variable" (similar to my first way), I basically derive the 2nd way inferred from that. But somehow the results not consistent with what I got from the first way, kind of confused to me, although in the Example 3 of result(), I did see consistent results there. Hope you can clarify what you suggested here.

ADD REPLY • link 2.8 years ago Ming ▴ 380

score 0 · Answer 2 · 2021-07-08

0

Entering edit mode

swbarnes2 ★ 1.3k

@swbarnes2-14086

Last seen 8 hours ago

San Diego

It is not very clear from your code exactly what two comparisons you think should be identical.

But it can be tricky to get the comparison you want from the ~ Type + Ras + Type:Ras design.

ADD COMMENT • link 2.8 years ago swbarnes2 ★ 1.3k

0

Entering edit mode

Thx a lot for your input as well, swbarnes2! as I mentioned, I am interested in the contrasts: AUTO_GANGLIA.Mut_WT (actually AUTO_GANGLIA.Mut vs AUTO_GANGLIA.WT), BILE.Mut_WT ( actually BILE.Mut vs BILE.WT), and so on. Yes, I understand what you meant by "tricky" here using the ~ Type + Ras + Type:Ras design. But I tried the same thing on Example 3 "Using interaction terms" (similar to my 2nd way) and "Using a grouping variable" (similar to my first way),which indeed works. Not sure why for my data, it is not workinhg. Any way, any insight would be really appreciated!

ADD REPLY • link 2.8 years ago Ming ▴ 380