Incorrect DESeq2 Output
1
0
Entering edit mode
umajs • 0
@umajs-21424
Last seen 4.7 years ago

Hi all,

I am wondering if I have missed any essential components in my script for differential gene expression analysis using DESeq2. I first filtered my data by adjusted p value (0.05) and ranked those by fold change.

The following is an example of my output:

  1. Gene x
  2. baseMean 17.26376
  3. log2FC -21.8639
  4. lfcSE 3.90958
  5. stat -5.59239
  6. pvalue 2.24E-08
  7. padj 4.03E-06
  8. Sample 1 (group x) 0
  9. Sample 2 (group x) 0
  10. Sample 3 (group x) 0
  11. Sample 4 (group y) 0
  12. Sample 5 (group y) 0
  13. Sample 6 (group y) 103.5826

I don't understand how group y compared to group x, for this gene, is calculated to have a statistically significant padj value, when all but one have a normalised read count value > 0? Should I be filtering something else? Have I missed something?

Here is the R script:

##Load DESeq2 R package
library(DESeq2)

##Load in raw read count file 
Gene_Counts <- read_excel("N:/path/to/file")

##Set working directory
setwd("N:/path/to/directory")

##Identify the counts within the file
head(Gene_Counts)
counts <- Gene_Counts[, 2:7]

##Determine the meta-data and row names
meta_data <- data.frame(condition = c("x", "x", "x", "y", "y", "y"), sample = colnames(counts)

##Perform the differential gene expression analysis
dds <- DESeqDataSetFromMatrix(countData = counts, colData = meta_data, design = ~condition)

dds <- DESeq(dds)
res <- results(dds)

##Exporting
resdata <- merge(as.data.frame(res), as.data.frame(counts(dds, normalized=TRUE)), by="row.names", sort=FALSE)
head(resdata)

write.csv(resdata, file = "Diff_Expr_Results.csv")

Thanks!

deseq2 • 663 views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 26 minutes ago
United States

These can get a small p-value if the average dispersion of the dataset is small. It's hard to know if it's real or not based on 6 counts, so information sharing across genes is used. If you prefer to filter these genes out from the beginning, you can use:

keep <- rowSums(counts(dds) >= 10) >= 3
dds <- dds[keep,]
ADD COMMENT

Login before adding your answer.

Traffic: 965 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6