PCA subset of genes vs DE
@andrebolerbarros-16788
Portugal

Hi everyone,

I am working with one of the ImmGen RNA-seq datasets, focusing on a specific subset of genes that belong to a family of interest.

First, I subset the counts to the genes I wanted (~1357 genes) and ran a PCA on the samples using only those genes (I had to add a pseudocount for it to run). The results are below (I faceted them by organ to make any pattern in the other subgroups easier to see):

PCA Results
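For reference, the subset-then-PCA step described above can be sketched roughly as follows. This is a NumPy illustration on simulated counts, not the DESeq2 workflow or the ImmGen data from this post; the matrix dimensions, the randomly chosen gene indices, and the `log2(count + 1)` pseudocount transform are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical count matrix: 5000 genes x 12 samples (simulated, not ImmGen).
counts = rng.poisson(lam=50, size=(5000, 12))

# Hypothetical row indices standing in for the ~1357 genes of interest.
genes_of_interest = rng.choice(counts.shape[0], size=1357, replace=False)
subset = counts[genes_of_interest, :]

# Assumed transform: pseudocount of 1 so zero counts survive the log.
log_subset = np.log2(subset + 1.0)

# PCA with samples as observations: center each gene, then take the SVD.
X = log_subset.T                       # samples x genes
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

scores = U * S                         # sample coordinates on each PC
explained = S**2 / np.sum(S**2)        # fraction of variance per PC

print(scores[:, :2].shape)             # PC1/PC2 coordinates, one row per sample
print(explained[:2])
```

Because the PCA only ever sees the rows in `subset`, the sample layout reflects the largest sources of variance within those genes and nothing else.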

However, when I run DE on the full dataset (without any prior filtering), I get many differences in the genes of interest between two groups that appear together in the clustering (Spleen and Peritoneal Cavity):

DE Results

I faceted these results by the higher of the two groups' base means in each pairwise comparison. What I would really like to know is whether this makes sense to you. I fear that, by subsetting the data for the PCA, I might be introducing artifacts in the clustering.

Thanks in advance.

EDIT: I added some information relevant to understanding my problem, specifically the number of genes of interest and the fact that the DE results shown are also for that subset (I ran the DE on the whole dataset and then filtered the results to the genes of interest).

DESeq2 RNASeq • 1.1k views
ATpoint ★ 4.5k
@atpoint-13662
Germany

PCA assesses separation based on what you give it. It may well be that the samples are very different overall, with lots of DE genes, while the subset of genes you used is essentially identical in expression, so there is no separation in the early PCs. I see no conflict here.
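A toy simulation may help make this concrete. The sketch below is an assumption-laden illustration, not an analysis of the ImmGen data: it simulates three hypothetical sample groups on a log scale, where groups A and B differ modestly in 50 genes while group C differs strongly in 200 other genes. It uses a plain Welch t statistic as a stand-in for a per-gene test, which is not what DESeq2 computes. The point it tries to show is that two groups can sit together on PC1 and still have many genes with clear per-gene differences, because the early PCs follow the largest source of variance in the input.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes, n_per_group = 500, 6

# Hypothetical baseline log-expression per gene (toy values).
base = rng.normal(8.0, 1.0, size=n_genes)

def simulate(shift):
    """Simulate one group: baseline + group-specific shift + small noise."""
    noise = rng.normal(0.0, 0.2, size=(n_per_group, n_genes))
    return base + shift + noise

# B differs from A by a modest shift in 50 genes; C by a large shift in 200.
shift_b = np.zeros(n_genes)
shift_b[:50] = 1.0
shift_c = np.zeros(n_genes)
shift_c[50:250] = 4.0

A = simulate(np.zeros(n_genes))
B = simulate(shift_b)
C = simulate(shift_c)

# PCA on all samples together (samples x genes, genes centered).
X = np.vstack([A, B, C])
Xc = X - X.mean(axis=0)
U, S, _ = np.linalg.svd(Xc, full_matrices=False)
pc1 = U[:, 0] * S[0]

# Distance between group means along PC1.
gap_ab = abs(pc1[:6].mean() - pc1[6:12].mean())
gap_ac = abs(pc1[:6].mean() - pc1[12:].mean())

# Per-gene Welch t statistic for A vs B (a stand-in, not the DESeq2 test).
mean_diff = B.mean(axis=0) - A.mean(axis=0)
se = np.sqrt(A.var(axis=0, ddof=1) / n_per_group
             + B.var(axis=0, ddof=1) / n_per_group)
t_stat = mean_diff / se
n_detected = int(np.sum(np.abs(t_stat[:50]) > 5))

print(f"PC1 gap A vs B: {gap_ab:.2f}, A vs C: {gap_ac:.2f}")
print(f"shifted genes with |t| > 5 between A and B: {n_detected} of 50")
```

In this setup A and B are expected to fall close together on PC1, since that axis is dominated by group C, even though most of the 50 shifted genes show a large per-gene statistic. Whether the same thing is happening with Spleen and Peritoneal Cavity would need checking on the real data, for example by looking at later PCs or by running the PCA on those two groups alone.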


Hey! Thanks for your answer. I just added some missing information that should make my problem easier to understand. The DE results are filtered for the subset of genes of interest (even though the analysis was run on the full dataset).

Thanks!
