I am trying to visualise gene counts for a few interesting genes from an RNA-seq study. DESeq2's plotCounts gives me the counts against the conditions. For example,
I have 3 replicates, and there is a pronounced batch effect, as can be seen above. Firstly, is there a way to "cancel out" the batch effect in the plot. For the purpose of differential expression p-values, I have used a design matrix of ~ replicate + condition.
Secondly, should I instead plot the variance stabilized transformed counts? I don't think this has anything to do with batch effects, but if I am plotting a number of genes together, will it show them in the correct scale according to variance? Is there a way to plot the data from rlog(dds) for a gene?
Thank you! Just a small correction. Currently, plotCounts plots the normalized counts, rather than the log2(normalized counts) unless I am missing something.
To do the batch correction, I ended up using removeBatchEffect(assay(rld), batch= replicate_vector) from limma. The plots look pretty similar before and after batch correction to me. Perhaps this is because these genes were identified as differentially expressed across conditions. I should look at genes d.e. w.r.t. replicates. They should look different before and after removing batch effects.
sorry, yes, you're right, it plots the normalized counts with a log-scale axis.