Question

Number of genes differs between two design formulas in DESeq2

Mohamed Malek • 0


Hello, I'm fairly new to DESeq2 and would like to understand the following.

I wanted to test two factors (meiosis stage and temperature) separately, each with its own design formula:

code

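library(tximport)
library(DESeq2)

# Sample sheet; the 'run' column identifies each sample
samples <- read.table(file = "samples_WheatOmics.txt", header = TRUE)
rownames(samples) <- samples$run

# Design for the meiosis stage analysis ('stade' is the stage column)
model <- ~ stade
design <- paste0(model[2])

# Import Salmon quantifications at the transcript level
files <- file.path("WheatOmics-salmon/salmon", samples$run, "quant.sf")
names(files) <- samples$run
txi <- tximport(files, type = "salmon", txOut = TRUE, ignoreAfterBar = TRUE)

ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = model)
dds <- DESeq(ddsTxi)

# Results at alpha = 0.05, sorted by adjusted p-value, rows containing NA dropped
res_dds_padj <- results(dds, alpha = 0.05)
res_dds_padj <- res_dds_padj[order(res_dds_padj$padj), ]
res_dds_padj <- na.omit(res_dds_padj)
summary(res_dds_padj)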

Summary (Meiosis Stage factor)

out of 124677 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 16853, 14%
LFC < 0 (down)     : 9652, 7.7%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 1)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

Summary (temperature factor)

out of 91748 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 779, 0.85%
LFC < 0 (down)     : 1036, 1.1%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 8)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

Can you explain why the total is 124677 genes in one analysis and 91748 in the other? Thanks in advance.


Michael Love · Answer 1 · 2021-06-08


You removed a different set of genes in each analysis with the na.omit call; see the vignette on the meaning of NA in the results table.
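As a minimal illustration of this point, the base R sketch below uses two made-up results tables for the same five genes (the gene names and all values are hypothetical, not taken from the poster's data). `na.omit` drops every row that has an NA in any column, so the number of rows that survive depends on how many p-values or adjusted p-values were set to NA in that particular analysis.

```r
# Toy results tables for two hypothetical analyses of the same five genes.
# All values are invented for illustration.
res_stage <- data.frame(
  row.names = paste0("gene", 1:5),
  baseMean  = c(120, 35, 4, 0, 800),
  pvalue    = c(0.001, 0.20, 0.04, NA, 0.0005),
  padj      = c(0.004, 0.25, 0.06, NA, 0.003)
)

# Pretend the second analysis filtered out two more low-count genes,
# so they keep a p-value but lose their adjusted p-value.
res_temp <- res_stage
res_temp$padj[c(2, 3)] <- NA

nrow(na.omit(res_stage))  # 4 rows survive
nrow(na.omit(res_temp))   # 2 rows survive
```

Calling `summary()` on the object before `na.omit` should report the same total for both designs, since the input counts are identical.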

ADD COMMENT • link 2.9 years ago Michael Love 41k

Thank you for the fast answer, but I still don't understand the difference, since the same samples are used in each analysis. I'd appreciate further information.

Note on p-values set to NA: some values in the results table can be set to NA for one of the following reasons:

- If within a row, all samples have zero counts, the baseMean column will be zero, and the log2 fold change estimates, p value and adjusted p value will all be set to NA.
- If a row contains a sample with an extreme count outlier then the p value and adjusted p value will be set to NA. These outlier counts are detected by Cook’s distance. Customization of this outlier filtering and description of functionality for replacement of outlier counts and refitting is described below.
- If a row is filtered by automatic independent filtering, for having a low mean normalized count, then only the adjusted p value will be set to NA. Description and customization of independent filtering is described below.
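The three cases in this excerpt can be told apart from the `baseMean`, `pvalue` and `padj` columns alone. The sketch below is one way to label each row; `classify_na` is a hypothetical helper (not part of DESeq2), the toy table is invented, and it assumes these three cases are the only sources of NA.

```r
# Label each row of a results table by the reason its p-values are NA.
# Hypothetical helper; assumes columns baseMean, pvalue and padj exist.
classify_na <- function(res) {
  reason <- rep("tested", nrow(res))
  # p-value present but adjusted p-value missing: independent filtering
  reason[!is.na(res$pvalue) & is.na(res$padj)] <- "independent filtering"
  # p-value missing although the gene has counts: Cook's distance outlier
  reason[is.na(res$pvalue) & res$baseMean > 0] <- "Cook's outlier"
  # no counts in any sample
  reason[res$baseMean == 0] <- "all zero counts"
  reason
}

# Invented example with one row per case
res <- data.frame(
  baseMean = c(0, 50, 3, 200),
  pvalue   = c(NA, NA, 0.3, 0.01),
  padj     = c(NA, NA, NA, 0.02)
)
table(classify_na(res))
```

Running the same tabulation on the unfiltered output of `results()` for each design would show which category accounts for the difference in row counts.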

ADD REPLY • link updated 2.9 years ago by Michael Love 41k • written 2.9 years ago by Mohamed Malek • 0

The independent filtering depends on the test, so if you change the test, the set of NAs changes. You can turn this off and do your own filtering if you want.

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#independent-filtering-of-results
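A minimal sketch of that suggestion, using the simulated data set that ships with DESeq2 as a stand-in for the poster's data. The pre-filter cutoffs (at least 10 reads in at least 3 samples) are an assumption for illustration; the key point is that the filter is applied to the counts once, independently of the design, so every design starts from the same set of genes.

```r
library(DESeq2)

# Simulated counts (stand-in data): 1000 genes, 6 samples, design ~ condition
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)

# Manual pre-filter that does not depend on the design formula.
# The cutoffs below are assumed values, adjust them to your experiment.
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep, ]

dds <- DESeq(dds)

# With independent filtering turned off, padj is NA only where pvalue is NA
res <- results(dds, alpha = 0.05, independentFiltering = FALSE)
summary(res)
```

Rows flagged as Cook's distance outliers can still receive NA p-values with this setting; the `cooksCutoff` argument of `results()` controls that behaviour separately.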

ADD REPLY • link 2.9 years ago Michael Love 41k

score 0 · Answer 2 · 2021-06-08

Mohamed Malek • 0


Thank you so much, have a great day.
