I have seen many posts about batch effect correction on RNA-seq data. But I'm little confused in using the code for differential analysis. So, asking it here.
I have been using 8 rna-seq samples, 4 NOTCH silenced samples and 4 Controls. The setup is like this
Samples type days 1 Control day1 2 Control day1 3 Control day2 4 Control day2 5 shNotch day2 6 shNotch day2 7 shNotch day1 8 shNotch day1
For my data I did the batch correction and then used the data for an MDS plot to look how they clustered.
library(edgeR) group <- factor(paste0(columndata$type)) dge <- DGEList(counts=countdata, group=group) keep <- rowSums(cpm(dge) > 1) >= 3 dge <- dge[keep, , keep.lib.sizes=FALSE] dge <- calcNormFactors(dge, method = "TMM") logCPM <- cpm(dge, log=TRUE, prior.count=2) logCPM_bc <- removeBatchEffect(logCPM, batch=columndata$days) plotMDS(logCPM_bc)
For differential analysis first I created design matrix like following:
design <- model.matrix(~ 0 + group + columndata$days) colnames(design) <- c("Control","shNotch", "Days") contrast.matrix <- makeContrasts(shNotch-Control, levels=design)
And now I was wondering whether to use DGElist object
logCPM_bc (Batch corrected logCPM) for estimating dispersion
Any help? thanks a lot