Number of genes differs between two design formulas in DESeq2
@2e651efe

Hello, I'm fairly new to DESeq2 and would like to understand the following result.

I wanted to test for two factors (meiosis stage and temperature) separately:

library(tximport)
library(DESeq2)

samples <- read.table(file = "samples_WheatOmics.txt", header = TRUE)

# `model` (the design formula) is defined elsewhere and is not shown in this post
design = paste0(model[2])

rownames(samples) <- samples$run
files <- file.path("WheatOmics-salmon/salmon", samples$run, "quant.sf")
names(files) <- samples$run

txi <- tximport(files, type = "salmon", txOut = TRUE, ignoreAfterBar = TRUE)
ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = model)
dds <- DESeq(ddsTxi)

res_dds_padj <- results(dds, alpha = 0.05)
res_dds_padj <- res_dds_padj[order(res_dds_padj$padj), ]



Summary (Meiosis Stage factor)

out of 124677 with nonzero total read count
LFC > 0 (up)       : 16853, 14%
LFC < 0 (down)     : 9652, 7.7%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 1)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results


Summary (temperature factor)

out of 91748 with nonzero total read count
LFC > 0 (up)       : 779, 0.85%
LFC < 0 (down)     : 1036, 1.1%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 8)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results


Can you explain why the totals differ (124677 vs 91748)? Thanks in advance.

@mikelove

You removed a different set of genes with the na.omit call; see the vignette on the meaning of NA in the results.


Thank you for the fast answer, but I still don't understand the difference, since the same samples are used in each analysis. I'd appreciate further information.

Note on p-values set to NA: some values in the results table can be set to NA for one of the following reasons:

- If within a row, all samples have zero counts, the baseMean column will be zero, and the log2 fold change estimates, p value and adjusted p value will all be set to NA.
- If a row contains a sample with an extreme count outlier then the p value and adjusted p value will be set to NA. These outlier counts are detected by Cook’s distance. Customization of this outlier filtering and description of functionality for replacement of outlier counts and refitting is described below.
- If a row is filtered by automatic independent filtering, for having a low mean normalized count, then only the adjusted p value will be set to NA. Description and customization of independent filtering is described below.
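To see which of these three cases applies to a given results table, the NA entries can be tallied per column. The sketch below is only an illustration: `res` is a small hand-made data frame with invented values standing in for a real DESeq2 results table, so that it runs in base R on its own. On real data the same logical tests should work on the object returned by `results(dds)`.

```r
# Toy stand-in for a DESeq2 results table; all values are made up for illustration.
res <- data.frame(
  baseMean       = c(0,  250,   3,   120,  40),
  log2FoldChange = c(NA, 1.2,   0.4, -2.1, 0.1),
  pvalue         = c(NA, 0.001, 0.30, NA,  0.04),
  padj           = c(NA, 0.004, NA,   NA,  0.08),
  row.names      = paste0("tx", 1:5)
)

# Case 1: all samples have zero counts, so baseMean is 0 and everything is NA.
all_zero <- res$baseMean == 0
# Case 2: count outlier (Cook's distance), so pvalue and padj are NA.
outlier  <- res$baseMean > 0 & is.na(res$pvalue)
# Case 3: independent filtering, so only padj is NA.
filtered <- !is.na(res$pvalue) & is.na(res$padj)

na_counts <- c(all_zero = sum(all_zero),
               outlier  = sum(outlier),
               filtered = sum(filtered))
print(na_counts)

# na.omit() drops every row that contains any NA, so rows from all three
# cases disappear and the row count of the table shrinks accordingly.
res_complete <- na.omit(res)
print(nrow(res_complete))
```

Because the third case depends on the test being run, a table passed through `na.omit()` can keep a different number of rows for each design formula even though the samples are identical.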


The independent filtering depends on the test. So if you change the test, the NAs change. You can turn this off and do your own filtering if you want.

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#independent-filtering-of-results
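A minimal sketch of what turning the filtering off and doing your own could look like. It uses simulated data from `makeExampleDESeqDataSet()` rather than the wheat dataset in the question, and the pre-filter cutoffs (at least 10 counts in at least 3 samples) are assumptions chosen for illustration, not a recommendation for this experiment.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 2000 rows, 6 samples in two conditions (illustration only).
dds <- makeExampleDESeqDataSet(n = 2000, m = 6)

# Fixed pre-filter applied once, independent of any test:
# keep rows with at least 10 counts in at least 3 samples (assumed cutoffs).
keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep, ]

dds <- DESeq(dds)

# Default: independent filtering picks a mean-count threshold per results() call,
# so the set of rows with padj = NA can change from one test to another.
res_default <- results(dds, alpha = 0.05)

# Filtering switched off: padj is NA only where the p value itself is NA.
res_nofilter <- results(dds, alpha = 0.05, independentFiltering = FALSE)

n_na_default  <- sum(is.na(res_default$padj))
n_na_nofilter <- sum(is.na(res_nofilter$padj))
print(c(default = n_na_default, no_filter = n_na_nofilter))
```

With the fixed pre-filter and `independentFiltering = FALSE`, every design formula is tested on the same set of rows, so the totals reported for the two factors stay comparable. The trade-off is that the automatic threshold, which is chosen to increase the number of rejections at the given `alpha`, is no longer applied.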

@2e651efe

Thank you so much, have a great day.