I need some advice. I'm looking at PCA plots of RNA-seq data and am trying to understand whether my data has batch effects. I performed alignment using STAR, and then obtained gene counts and gene TPM values.
I used `prcomp` to find the principal components and plotted PC1 vs PC2 and PC2 vs PC3 for:
- (a) Raw counts
- (b) Log2(counts + 1)
- (c) TPM values
- (d) Log2(TPM + 1)
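
For anyone who wants to see the transform-then-PCA step spelled out, here is a minimal sketch in plain Python (not the R code I ran). The toy count matrix, the samples-in-rows layout, and the function names are all made up for illustration. It applies log2(x + 1), centers each gene, and gets the PC1 scores from the sample-by-sample Gram matrix by power iteration, which should match the first column of `prcomp(...)$x` up to sign when `prcomp` is run with its default centering and no scaling.

```python
import math

def log2p1(rows):
    """Apply log2(x + 1) to every value (rows = samples, columns = genes)."""
    return [[math.log2(v + 1.0) for v in row] for row in rows]

def center_columns(rows):
    """Subtract each gene's mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def pc1_scores(rows, iterations=200):
    """Return (PC1 score per sample, fraction of variance explained by PC1)."""
    x = center_columns(rows)
    n = len(x)
    # Sample-by-sample Gram matrix; small because there are few samples.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    # Power iteration from an arbitrary non-constant starting vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Rayleigh quotient gives the top eigenvalue (vec has unit length).
    eigenvalue = sum(
        vec[i] * sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    total = sum(gram[i][i] for i in range(n))
    scores = [v * math.sqrt(eigenvalue) for v in vec]
    return scores, eigenvalue / total

# Hypothetical toy data: 4 samples x 3 genes; samples 3 and 4 mimic a
# second batch in which genes 1 and 3 are shifted upward.
toy_counts = [
    [10, 200, 5],
    [12, 180, 7],
    [100, 210, 60],
    [110, 190, 55],
]

scores, explained = pc1_scores(log2p1(toy_counts))
print("PC1 scores:", [round(s, 3) for s in scores])
print("Variance explained by PC1:", round(explained, 3))
```

In this toy example the first two samples and the last two samples land on opposite sides of PC1, which is the kind of separation I am trying to interpret in my real plots.
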
I am showing the PCA plots below (these are links to images on Google Drive).
It seems that there could be a batch effect, but I'm not 100% sure, since I'm doing this for the first time.
- Can anyone advise on whether this is really a batch effect?
- If there is a batch effect, could it be mitigated with ComBat or SVA, or by adjusting for batch in the linear model?
Thank you!