Hi everyone, I need some opinions on what might have gone wrong in my DESeq2 analysis of pseudobulked single-cell RNA-seq samples. After pseudobulking per sample, I am testing for DE genes with DESeq2 between 2 groups (ctr, n=6 and treatment, n=7). I am correcting for sex and run batches. After correction I realised that 2 of my treatment samples have identical normalised gene expression for all genes. These 2 samples have the same sex but different runs/batches. Has anyone experienced a similar issue before? Many thanks in advance for any help.
Thanks Michael for the quick reply. Actually, I also questioned this, but later I saw that the issue also happens within another cluster and in different ctr and trt comparisons, with more than 2 samples. I realised this while looking at DEGs in a heatmap using rlog values. Before batch correction, assay(rlog) has different values for these samples, while after batch correction they flatten out and get the same values for all genes. Is there a way to control this? Or can the DEGs be trusted in this case?
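To pin down which samples collapse onto each other after correction, a quick check like the one below could help. It is only a minimal sketch in Python with NumPy and is not part of DESeq2 or of the batch correction method used here. The helper find_identical_samples, the toy matrix, and the sample names are all hypothetical.

```python
from itertools import combinations

import numpy as np


def find_identical_samples(matrix, sample_names, tol=0.0):
    """Return pairs of sample names whose columns match across all genes.

    matrix is genes x samples; tol is an absolute tolerance (0.0 = exact match).
    """
    pairs = []
    for i, j in combinations(range(matrix.shape[1]), 2):
        if np.allclose(matrix[:, i], matrix[:, j], rtol=0.0, atol=tol):
            pairs.append((sample_names[i], sample_names[j]))
    return pairs


# Toy genes-by-samples matrix with made-up values;
# columns "trt2" and "trt5" are duplicated on purpose.
corrected = np.array([
    [5.1, 6.3, 6.3, 4.8],
    [2.0, 3.7, 3.7, 2.2],
    [9.4, 8.1, 8.1, 9.9],
])
names = ["ctr1", "trt2", "trt5", "ctr3"]

identical_pairs = find_identical_samples(corrected, names)
print(identical_pairs)
```

Running the same check on the matrix before and after correction would show whether the duplication is introduced by the correction step itself.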
Oh, I didn't realize there is another method in the mix here.
I would not necessarily trust this "batch correction" method, and I would be suspicious of downstream DE using these corrected values.
Thanks Michael, so is there a way to work around this issue? Can I somehow make sure I remove these batch effects without overcorrecting these samples?
I'm adding my comment to the threaded section ...
I don't have any suggestions here, but whatever method you're using is not appropriate upstream of DESeq2 if it's creating counts. Check out our workflow for recommendations on batch correction: