RNAseq PCA resembles subject-specific effect in a paired study design
guanwang179 ▴ 10

Hi,

In a recent RNAseq analysis, I am puzzled that the PCA plot clusters the repeated measures of each human subject together rather than reflecting the actual biological condition groupings. Two RNAseq pipelines have been used:

  1. The standard pipeline: FastQC - HISAT - HTSeq - DESeq2 (all exploratory and results plots look fine).
  2. The second pipeline: FastQC - HISAT - bedtools (converting BAM files to FASTQ) - Salmon (selective alignment with the --seqBias and --gcBias flags on; the alignment rate averages around 75%) - tximport - DESeq2 (this is where the problematic PCA plot was observed). The PCA plot was generated from the variance-stabilized data, i.e. vsd <- vst(dds, blind=FALSE), with dds obtained from DESeqDataSetFromTximport().
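
For reference, the PCA step described in item 2 can be sketched roughly as follows. This is only a minimal sketch on simulated counts: makeExampleDESeqDataSet() stands in for the dds that came from DESeqDataSetFromTximport(), and the subject column, the 6-subjects-by-2-conditions layout, and the ~ subject + condition design are assumptions for illustration, not details given in the post.

```r
library(DESeq2)

set.seed(1)
# Simulated stand-in for the real data: 2000 genes x 12 samples.
# In the thread, dds came from DESeqDataSetFromTximport() instead.
dds <- makeExampleDESeqDataSet(n = 2000, m = 12)

# Hypothetical paired layout: 6 subjects, each measured under both conditions.
# makeExampleDESeqDataSet() already provides a two-level 'condition' factor.
dds$subject <- factor(rep(paste0("S", 1:6), times = 2))
design(dds) <- ~ subject + condition   # assumed design, not stated in the post

# Variance-stabilizing transformation as in the post
# (nsub is lowered only because the toy data set is small)
vsd <- vst(dds, blind = FALSE, nsub = 500)

# PCA coordinates plus the percent variance explained by PC1 and PC2
pcaData <- plotPCA(vsd, intgroup = c("condition", "subject"), returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"), 1)
print(percentVar)

# Colour by subject and by condition to see which factor drives the clustering
plotPCA(vsd, intgroup = "subject")
plotPCA(vsd, intgroup = "condition")
```

Colouring the same PCA by subject and then by condition makes it easier to see which of the two factors the leading PCs are tracking.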

The same samples, sequenced on a different sequencing platform, have also been analysed with the second pipeline above (the alignment rate following Salmon is around 30-35%), and all plots look fine. I must have done something incorrectly in the most recent analysis, the one with the PCA in question. I have tried to troubleshoot where the problem may lie but have had no luck figuring it out. I am wondering whether anyone has had a similar experience and could give me a heads-up. Many thanks.

Guan

Kevin Blighe ★ 3.9k

Are you concluding that there is a problem based solely on the PCA bi-plots that you have generated? There is not necessarily any problem: sample-specific effects can often be greater than the effect of your biological condition of interest. Take a look at the percent explained variation on your PCs, primarily PC1 and PC2, to see what might be happening. Also look at other PC bi-plot comparisons to check whether your condition of interest is segregated on a PC 'of lesser importance', such as PC8, PC10, or some other PC. You can check this via, for example, a pairsplot or eigencorplot from PCAtools (my own package).
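
The original code snippet did not survive in this copy of the thread. As a rough sketch of the checks described above, something along these lines should work. The matrix is simulated as a stand-in for assay(vsd), and the sample layout, effect sizes, and the numeric columns condition_num and subject_num are all made up for illustration.

```r
library(PCAtools)

set.seed(1)
# Simulated stand-in for assay(vsd): 500 genes x 12 samples
# (hypothetical layout: 6 subjects, each measured under 2 conditions)
subject   <- factor(rep(paste0("S", 1:6), times = 2))
condition <- factor(rep(c("control", "treated"), each = 6))
mat <- matrix(rnorm(500 * 12), nrow = 500,
              dimnames = list(paste0("gene", 1:500), paste0("sample", 1:12)))

# Strong subject effect on every gene, weaker condition effect on 100 genes
subj_eff <- matrix(rnorm(500 * 6, sd = 2), nrow = 500)
mat <- mat + subj_eff[, as.integer(subject)]
mat[1:100, condition == "treated"] <- mat[1:100, condition == "treated"] + 3

# Row names of the metadata must match the column names of the matrix
metadata <- data.frame(subject, condition,
                       condition_num = as.numeric(condition),
                       subject_num   = as.numeric(subject),
                       row.names = colnames(mat))

p <- pca(mat, metadata = metadata)

# Percent explained variation per PC
print(round(p$variance[1:5], 1))
screeplot(p, components = getComponents(p, 1:10))

# Bi-plots for all pairs among the first 5 PCs, coloured by condition
pairsplot(p, components = getComponents(p, 1:5), colby = "condition")

# Correlation of each PC with the (numerically coded) metadata columns.
# Integer codes for subject are only a crude summary of a nominal factor.
eigencorplot(p, components = getComponents(p, 1:10),
             metavars = c("condition_num", "subject_num"))
```

If the condition only shows up on a later PC in the pairs plot or the correlation plot, the subject effect is simply the larger source of variation, not a sign that the pipeline failed.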

Kevin


Thanks Kevin. You are right that the sample-specific effects may be expected. I removed the subject effect by treating the subjects as batches in limma::removeBatchEffect() and re-plotted the PCA, which now shows the actual biological effect (PC1 44% vs PC2 10%).

Also, thanks for introducing PCAtools; it is now in my toolkit.
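
In case it helps others, the subject-removal step looks roughly like the sketch below. It runs on a simulated matrix standing in for assay(vsd); the 6-subjects-by-2-conditions layout and the effect sizes are assumptions for illustration, not my real data. Passing the condition design to removeBatchEffect() is meant to protect the condition effect while the subject effect is removed.

```r
library(limma)

set.seed(1)
# Simulated stand-in for assay(vsd): 500 genes x 12 samples
# (hypothetical layout: 6 subjects, each measured under 2 conditions)
subject   <- factor(rep(paste0("S", 1:6), times = 2))
condition <- factor(rep(c("control", "treated"), each = 6))
mat <- matrix(rnorm(500 * 12), nrow = 500,
              dimnames = list(paste0("gene", 1:500), paste0("sample", 1:12)))
subj_eff <- matrix(rnorm(500 * 6, sd = 2), nrow = 500)
mat <- mat + subj_eff[, as.integer(subject)]
mat[1:100, condition == "treated"] <- mat[1:100, condition == "treated"] + 3

# Remove the subject effect while preserving the condition effect
design  <- model.matrix(~ condition)
mat_adj <- removeBatchEffect(mat, batch = subject, design = design)

# PCA on the adjusted matrix (samples in rows for prcomp)
pc  <- prcomp(t(mat_adj))
pct <- round(100 * pc$sdev^2 / sum(pc$sdev^2), 1)

plot(pc$x[, 1], pc$x[, 2], col = condition, pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"))
legend("topright", legend = levels(condition), col = seq_along(levels(condition)), pch = 19)
```

The adjusted matrix is only for visualisation. For the differential expression test itself, the subject term stays in the model design rather than being removed from the data beforehand.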

