I am experimenting with different normalization techniques to account for batch effects on an RNAseq dataset. I am using DESeq2 for DE analysis, and RUVseq for normalization. For plotting purposes, I want to be able to extract the post-normalization counts (pseudocounts?) from the DESeqDataSet.
#countsTable is a gene x count matix samples<-colnames(countsTable) condition<-c(rep("ctrl",2),rep("A",2),rep("B",2),rep("C",2),rep("D",2), rep("E",2)) pData0 = data.frame(cbind(samples, condition), row.names = samples)# pData<-pData(set4) ## from RUV analysis # x conditions W_1 #ctrl_1 ctrl ctrl -0.2864384 #ctrl_2 ctrl ctrl 0.2952550 #A_1 A A -0.1725057 #A_2 A A 0.1767493 #B_1 B B -0.3269523 #B_2 B B 0.3252628 #C_1 C C -0.4057045 #C_2 C C 0.3960620 #D_1 D D -0.2868378 #D_2 D D 0.2795934 #E_1 E E -0.1910980 #E_2 E E 0.1966142 ddsfm <- DESeqDataSetFromMatrix(countData = filtered, colData=pData0, design=~conditions) ddsfm1 <- DESeqDataSetFromMatrix(countData = filtered, colData=pData, design= ~W_1+conditions) dds<-DESeq(ddsfm1) dds0<-DESeq(ddsfm) head(counts(dds, normalized=T)) head(counts(dds0, normalized=T))
When I do those last calls, the normalized counts look the same for both, even though the log2FoldChanges reflect the normalization. It seems like the counts(dds, normalized=TRUE) call only extracts the counts based on the internal normalization that DESeq2 uses. Is there a way to extract (or compute) post-normalization counts from scaling factors provided in the "design=" section from the creation of the DESeqDataSet?
Thanks in advance!