Contrasting LogFCs in DESeq2
1
0
Entering edit mode
GlycineMax • 0
@user-24655
Last seen 7 months ago
United States

Dear community,

a quick question on how fold changes are calculated in DESeq2. Below I have the counts for a gene X in 5 treatments and a control, with 4 replicates:

Control    :       1    0   0   0
Treatment 1:       1    3   475 0
Treatment 2:       0    0   0   0
Treatment 3:       1    0   0   0
Treatment 4:       1    1   0   0
Treatment 5:       0    0   0   0

After running DESeq2 (code below), these are my results comparing each treatment to the control:

           baseMean      log2FoldChange     lfcSE      stat       pvalue    padj
treat1     22.81609          9.069945      3.5469    2.557085       NA         NA
treat2      22.8160         -15.59239      3.69023   -4.225318      NA         NA
treat3       22.81609       0.0142733      3.6902    0.003867862    NA         NA
treat4       22.81609        0.9338621     3.6900    0.253077       NA         NA
treat5      22.81609        -15.42883      3.69023  -4.180993       NA         NA

I understand the NA p-values, as DESeq is probably taking that "475" as an outlier. My question is: when log2FC is calculated in a treatment1_vs_control comparison for example, is it based directly on the counts of treat 1 and control, or does it involve the overall counts of that gene across treatments? And any idea of why the logFC is so contrasting in treatments 2, 3, 4 and 5, even though their counts were somewhat similar?


#Adding data
cts <- (read.csv("count_matrix.csv",sep=",",check.names=FALSE,row.names="gene_id"))
coldata <- read.csv("metadata.csv", row.names=1)
coldata$treatment <- factor(coldata$treatment)

head(cts,5)

coldata

##Check rownames
rownames(coldata) <- sub("fb", "", rownames(coldata))
all(rownames(coldata) %in% colnames(cts))

all(rownames(coldata) == colnames(cts))

cts <- cts[, rownames(coldata)]
all(rownames(coldata) == colnames(cts))

##Create a DESeq Data Set
dds <- DESeqDataSetFromMatrix(countData = cts,
                              colData = coldata,
                              design = ~ treatment)
dds

featureData <- data.frame(gene=rownames(cts))
mcols(dds) <- DataFrame(mcols(dds), featureData)
mcols(dds)

##Pre-filtering
keep <- rowSums(counts(dds)) >= 1
dds <- dds[keep,]

##Setting levels
dds$treatment <- relevel(dds$treatment, ref = "control")

##Running Deseq
dds <- DESeq(dds)
res <- results(dds)
res
###Analyzing Treatment vs Control (1 analysis per Treatment)
res_treatment1 <- results(dds,  contrast=c("treatment","treatment1","control"))

##Log fold change shrinkage for visualization and ranking
resultsNames(dds)

resLFC_treatment1 <- lfcShrink(dds, coef="treatment_treatment1_vs_mock", type="apeglm")
resLFC_treatment1

####p-values and adjusted p-values
#We can order our results table by the smallest p value:

resOrdered <- res_treatment1L[order(res_treatment1$pvalue),]

summary(res_treatment1)
DESeq2 RNASeqData • 952 views
ADD COMMENT
0
Entering edit mode
swbarnes2 ★ 1.3k
@swbarnes2-14086
Last seen 11 hours ago
San Diego

Honestly, I wouldn't agonize about trying to understand what's going on in a gene with such marginal counts. You can't draw good conclusions from such low counts, and the NA values alert you to that.

ADD COMMENT
0
Entering edit mode

I agree with you, no conclusions to be made here, I was looking at this gene just because those contrasting FC were making it stand out. So I was trying to understand how FCs results were calculated.

ADD REPLY
0
Entering edit mode

Well, the fold change between not zero and zero should be pretty high, right? So it's pretty high. I think doing LFCshrink would attenuate that effect somewhat.

ADD REPLY

Login before adding your answer.

Traffic: 1075 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6