Number of genes differs between two design formulas in DESeq2
@2e651efe

Hello, I'm fairly new to DESeq2 and would like to understand the following.

I wanted to test two factors (meiosis stage and temperature) separately:

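Code:

# Load the packages used below
library(tximport)
library(DESeq2)

# Sample table, with run IDs as row names
samples <- read.table(file="samples_WheatOmics.txt", header=TRUE)
rownames(samples) <- samples$run

# Design formula: meiosis stage only
model <- ~ stade
design=paste0(model[2])

# Import transcript-level salmon quantifications
files <- file.path("WheatOmics-salmon/salmon", samples$run, "quant.sf")
names(files) <- samples$run
txi <- tximport(files, type="salmon", txOut=TRUE, ignoreAfterBar = TRUE)

# Fit the model and extract results at alpha = 0.05, sorted by adjusted p-value
ddsTxi <- DESeqDataSetFromTximport(txi, colData = samples, design = model )
dds <- DESeq(ddsTxi)
res_dds_padj <- results(dds, alpha=0.05)
res_dds_padj <- res_dds_padj[order(res_dds_padj$padj),]

# Drop every row that contains an NA, then summarize
res_dds_padj <- na.omit(res_dds_padj)
summary(res_dds_padj)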

Summary (Meiosis Stage factor)

out of 124677 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 16853, 14%
LFC < 0 (down)     : 9652, 7.7%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 1)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

Summary (temperature factor)

out of 91748 with nonzero total read count
adjusted p-value < 0.05
LFC > 0 (up)       : 779, 0.85%
LFC < 0 (down)     : 1036, 1.1%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 8)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

Can you explain why the totals differ (124677 vs 91748)? Thanks in advance.

@mikelove

You removed a different set of genes in each analysis with the na.omit call. See the vignette on the meaning of NA in the results table.
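A toy illustration of this point, in base R with made-up numbers rather than real DESeq2 output: the two data frames below stand in for results tables from two different tests on the same five genes, and the `padj` values (including which ones are NA) are invented for the example. Because `na.omit` drops any row that contains an NA, the number of surviving rows depends on which test produced the table.

```r
# Hypothetical results for the same five genes under two different tests.
# padj is NA wherever the gene was filtered out for that particular test.
res_stage <- data.frame(
  gene     = paste0("g", 1:5),
  baseMean = c(0.5, 3, 12, 40, 150),
  padj     = c(NA, 0.20, 0.01, 0.03, 0.60)
)
res_temp <- data.frame(
  gene     = paste0("g", 1:5),
  baseMean = c(0.5, 3, 12, 40, 150),
  padj     = c(NA, NA, NA, 0.04, 0.70)
)

# na.omit removes every row with at least one NA
n_stage <- nrow(na.omit(res_stage))  # 4 rows remain
n_temp  <- nrow(na.omit(res_temp))   # 2 rows remain
```

The same genes and the same samples go in, yet a different number of rows comes out, which is the pattern seen in the two summaries.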


Thank you for the fast answer, but I still don't understand the difference, since the same samples are used for each analysis. I'd appreciate further information.

Note on p-values set to NA: some values in the results table can be set to NA for one of the following reasons:

- If within a row, all samples have zero counts, the baseMean column will be zero, and the log2 fold change estimates, p value and adjusted p value will all be set to NA.
- If a row contains a sample with an extreme count outlier then the p value and adjusted p value will be set to NA. These outlier counts are detected by Cook’s distance. Customization of this outlier filtering and description of functionality for replacement of outlier counts and refitting is described below.
- If a row is filtered by automatic independent filtering, for having a low mean normalized count, then only the adjusted p value will be set to NA. Description and customization of independent filtering is described below.


Independent filtering depends on the test, so if you change the test, the set of NAs changes. You can turn it off (see the 'independentFiltering' argument of ?results) and do your own filtering if you want.

https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#independent-filtering-of-results
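A minimal sketch of turning off independent filtering and applying a fixed filter instead. It uses simulated data from `makeExampleDESeqDataSet` as a stand-in for the real experiment, and the count threshold of 10 is only an illustrative assumption, not a recommendation from this thread.

```r
library(DESeq2)

# Simulated data standing in for the real experiment (assumption: 1000 genes, 12 samples)
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Own filter, chosen once so it can be reused unchanged for every design
# (the threshold of 10 total counts is illustrative)
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep, ]

dds <- DESeq(dds)

# Default: independent filtering chooses a mean-count threshold specific to this test
res_default <- results(dds, alpha = 0.05)

# Independent filtering off: padj is no longer set to NA because of a low mean count
res_fixed <- results(dds, alpha = 0.05, independentFiltering = FALSE)

# Compare how many adjusted p-values are NA in each table
n_na_default <- sum(is.na(res_default$padj))
n_na_fixed   <- sum(is.na(res_fixed$padj))
```

With the filter fixed up front and independent filtering disabled, the remaining NAs come only from all-zero rows or Cook's distance outliers, so the gene set should be much easier to compare between two designs.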

@2e651efe

Thank you so much, have a great day.
