I would like to correct a batch effect using DESeq2 in order to analyze about one hundred tumor RNA-Seq samples, without any experimental design (there are no conditions to compare). 80 tumors were sequenced in 2018 and 20 in 2019. I can see a strong batch effect between the 2018 and 2019 tumors when I plot a PCA, either on the raw read counts or on the data after DESeq.
Usually I use:
coldata <- as.data.frame(rep(TRUE, each=100))
rownames(coldata) <- colnames(COUNT)
colnames(coldata) <- c("group")
dds <- DESeqDataSetFromMatrix(COUNT, coldata, design=~1)
I read that DESeq can correct batch effect with this kind of command:
dds <- DESeqDataSet(COUNT, design = ~ batch + condition)
But in my case I have no condition. I tried "design=~batch", but it had no visible effect on the PCA. I can remove the batch effect efficiently with ComBat, with a very good result on the PCA plot, but then I get negative values in my matrix, which is a problem for further analysis.
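For reference, here is a minimal sketch of the "design=~batch" attempt. The count matrix is a simulated stand-in for my real COUNT, and I am assuming the first 80 columns are the 2018 tumors and the last 20 the 2019 tumors; the batch labels "b2018" and "b2019" are names made up for this example. As far as I understand, the design is only used when the model is fitted, and the transformed values themselves are not batch-corrected, which may be why the PCA looks unchanged.

```r
library(DESeq2)

# Simulated stand-in for the real count matrix (assumption: genes in rows,
# 100 tumors in columns, 2018 samples first, then 2019 samples).
set.seed(1)
COUNT <- matrix(rnbinom(2000 * 100, mu = 100, size = 2), nrow = 2000,
                dimnames = list(paste0("gene", 1:2000), paste0("tumor", 1:100)))

# One row per sample, with the sequencing year as a batch factor.
coldata <- data.frame(
  batch = factor(rep(c("b2018", "b2019"), times = c(80, 20))),
  row.names = colnames(COUNT)
)

# Batch is the only term in the design, since there is no condition.
dds <- DESeqDataSetFromMatrix(countData = COUNT, colData = coldata,
                              design = ~ batch)

# Variance-stabilized values for the PCA; the batch term informs the
# dispersion trend here but is not subtracted from the data.
vsd <- vst(dds, blind = FALSE)
plotPCA(vsd, intgroup = "batch")
```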
Which solution can I try?
Thank you for any suggestion.