
Question: Interpreting results of sample-to-sample PCA/clustering and changing assignment of condition levels
8 months ago, fl wrote:


I have a dataset consisting of 16 pooled libraries sequenced on three lanes (2x125 bp, 350 bp fragment size, 40M reads per library). I isolated RNA from the same type of tissue across different individuals. There are three levels for one condition ("behavior"), with 4-6 biological replicates per level. I assessed the quality of the data by using DESeq2 to calculate sample-to-sample distances on the VST-transformed data for PCA and hierarchical clustering, and I noticed that one of the "level2" replicates clusters with the "level1" replicates. What would be the best way to proceed in this case?

"Level1" individuals become "level2" individuals because they change behavior over their lifespan. Perhaps that "level2" individual had only recently transitioned from "level1", although we followed the same criteria for collecting all "level2" individuals in the field. Would it be advisable to treat that "level2" individual as "level1", or to combine "level1" and "level2" individuals into a single category and compare it against "level3", since I'm mostly interested in the genes up- and down-regulated in "level3"?

I really appreciate any help.

PCA result:

Heatmap of the sample-to-sample distances:

Answer: Interpreting results of sample-to-sample PCA/clustering and changing assignment
8 months ago, Michael Love (United States) wrote:

"I noticed that one of the level2 replicates clusters with the level1 replicates."

I wouldn't worry about the PCA plot, and I wouldn't reclassify the samples.

The PCA plot is just a two-dimensional summary, so a lot of information is necessarily lost (it is the 2D summary that loses the least "information" in terms of total variance, but information must be lost nevertheless). You may still have numerous genes with statistically significant differences between level 1 and level 2.
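
To make the point about lost variance concrete, here is a minimal sketch in Python with NumPy. It is not DESeq2 code and not the analysis from this thread: the random matrix is a hypothetical stand-in for a samples-by-genes matrix of transformed counts, and the function name `variance_explained` is made up for illustration. It computes how much of the total variance the first two principal components retain; whatever remains is what a 2D PCA plot cannot show.

```python
import numpy as np


def variance_explained(matrix):
    """Return the fraction of total variance captured by each principal component.

    `matrix` is samples x genes (rows are samples).
    """
    # Center each gene (column) so that the SVD corresponds to PCA.
    centered = matrix - matrix.mean(axis=0)
    # Squared singular values are proportional to the variance along each PC.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Hypothetical toy data: 16 samples x 500 genes of random noise,
# standing in for variance-stabilized expression values.
rng = np.random.default_rng(0)
data = rng.normal(size=(16, 500))

frac = variance_explained(data)
top2 = frac[:2].sum()
print(f"PC1 + PC2 retain {top2:.1%} of the total variance")
print(f"{1 - top2:.1%} of the variance is not visible in the 2D plot")
```

On real data the first two components usually capture more than they do for pure noise, but the remainder is still where differences between two levels can sit without showing up as separate clusters in the plot.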


Thank you, Michael!
