Understanding the Log2FoldChange calculated with DESeq2 and according to formula log2(treated/control)
1
0
Entering edit mode
User000 • 0
@ea03770f
Last seen 18 months ago
Italy

Hello,

I have tried to understand how DESeq2 calculates the Log2FoldChange. I extracted the normalised counts from dds like below, calculated the mean of treated and tried to find the log2FC according to the formula: log2(treated/control). But the log2FC I get using this method is different the one I get using DESeq2. Moreover, it seems like log2FC reported with DESeq2 is calculated based of difference of mean treated and control, like this: log2FC=mean(treated)-mean(control). Why is that? Am i missing something?

newdata <- as.data.frame(cbind(dds@rowRanges@elementMetadata@listData$conditionTREATED1,dds@rowRanges@elementMetadata@listData$conditionTREATED2,dds@rowRanges@elementMetadata@listData$conditionTREATED3,dds@rowRanges@elementMetadata@listData$conditionCTRL1))
newdata$mean <- apply(newdata[,1:3],1, mean) 
newdata$log <- log2(newdata$mean/newdata$conditionCTRL1)

For example using the example below:

 CTRL1 mean_treated
 12.30 12.44

With my calculation I get log2(12.44/12.30)=0.016
DESeq2 gives this value 0.14, which is the difference between 12.44-12.30=0.14

This is my script:

setwd("path/to/work_dir/")
sampledata <- data.frame(run = c("RNA_1","RNA_2","RNA_3","RNA_4","RNA_5","RNA_6","RNA_7","RNA_8","RNA_9","RNA_10","RNA_11","RNA_12"), condition = c("TREATED1","TREATED1","TREATED1","TREATED2","TREATED2","TREATED2","TREATED3","TREATED3","TREATED3","CTRL1","CTRL1","CTRL1"))
sampledata$condition <- factor(sampledata$condition)
table(sampledata$condition)
#Read all RNAseq files
dir <- ("/path/to/mydir")
list.files(dir)
samples <- read.table(file.path(dir, "samples.txt"), header = TRUE)
files <- file.path(dir, samples$run, "quant.sf")
names(files) <- c(1:12)
all(file.exists(files))
#Create a tx2gene file
txdb <- makeTxDbFromGFF(file="gencode.v39.annotation.gtf")
saveDb(x=txdb, file = "gencode.v39.annotation.TxDb")
k <- keys(txdb, keytype = "TXNAME")
tx2gene <- AnnotationDbi::select(txdb, k, "GENEID", "TXNAME")
head(tx2gene)
#Create differential expression
txi.g <- tximport(files, type="salmon", tx2gene=tx2gene)
tcs <- txi.g$counts
mm = model.matrix(~0+condition, sampledata)
dds <- DESeqDataSetFromTximport(txi.g, colData=sampledata, design=mm)
nm <- assays(dds)[["avgTxLength"]]
sf <- estimateSizeFactorsForMatrix(counts(dds)/nm)
#Filter for genes/transcripts that have >= 10 reads
keep <- rowSums(counts(dds)) >= 10
dds <- dds[keep,]
sapply(split(colSums(counts(dds))/1e6, sampledata$condition), summary)
#Calculate dds
dds <- DESeq(dds)
res <- DESeq2::results(dds, 
                       contrast = list(c("conditionTREATED1","conditionTREATED2","conditionTREATED3"), "conditionCTRL1"), 
                       listValues = c(1/3,-1)
)
contrast = (list(c("conditionTREATED1","conditionTREATED2","conditionTREATED3"),( "conditionCTRL1",listValues = c(1/3,-1)))
resLFC <- lfcShrink(dds, contrast=contrast, res=res, type="ashr")
DESeq2 log2FC • 4.3k views
ADD COMMENT
0
Entering edit mode
swbarnes2 ★ 1.3k
@swbarnes2-14086
Last seen 1 day ago
San Diego

If DESeq just returned a simple log of the ratio, you could do that in Excel. DESeq is doing something more complex. If your contrast is made correctly, the value you calculate should be close to what you could calculate yourself, but not the same.

ADD COMMENT
0
Entering edit mode

I added an example, may be my explanation was not completely clear.

ADD REPLY
0
Entering edit mode

I'm not sure what those numbers you posted are, but the point of logs is to avoid long division. You say you are doing doing log(a/b), but log(a) - log(b) is the same thing, so if those numbers are the average of the logged counts, then subtracting one from the other will give you the ratio.

ADD REPLY
0
Entering edit mode

Thank you very much for taking your time and answering. I did not write that the difference is between logs. For me It is obvious that log(a/b) and log(a)-log(b) is the same thing. If you could I suggest you to read better the question, if it is not clear please just ask me clarifications. I really need to understand the problem I posted above. Thanks again.

ADD REPLY

Login before adding your answer.

Traffic: 494 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6