PCA clusters by replicates not by study groups
@stolarekir-10090

I have a dataset like this:

rep1_group1 rep1_group2 rep1_group3 rep2_group1 rep2_group2 rep2_group3 rep3_group1 rep3_group2 rep3_group3
    18.26426    18.50355    17.87981    18.14181    18.12318    18.37539    17.54155    17.62264    17.21371
    21.10751    21.88614    21.26385    21.42588    21.42358    21.48596    21.18138    21.64957    21.56978
    19.95816    19.93991    19.17141    19.23463    19.49048    19.69481    19.99466    20.27674    19.83937
    15.77427    15.28338    15.56018    14.74557    15.12376    14.87215    17.58013    17.51229    17.24869
    18.55157    18.75156    18.51595    18.69129    18.45551     18.9907    18.31092    18.28075    18.00218
    24.40756     24.3009     24.0354    23.87117    24.03002    24.39447    24.45595    24.40041    24.03842
     20.6223    20.62194    21.19045    20.85316    20.24748    20.99583    21.70248    20.83252      21.417
    18.53522    18.20705    17.84586    18.45471    18.03112    18.24859    17.71512    17.46969    17.20132
    17.87237    17.80663    15.99771    16.63991    17.51884    17.11533    18.12308    17.90783    18.29576

Simply put, the rows contain measurements and the columns contain 3 study groups, each in 3 replicates (rep1, rep2, rep3).

When I apply my usual transformations to the data to obtain the PCA (strictly, a PCoA on Bray-Curtis dissimilarities):

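library(ape)         # pcoa()
library(data.table)
library(vegan)       # vegdist()
tran <- t(data)                       # samples in rows, measurements in columns
tran.pr.b <- vegdist(tran, "bray")    # Bray-Curtis dissimilarity between samples
tran.pcoa.b <- pcoa(tran.pr.b)        # principal coordinates analysis (PCoA)
plot(tran.pcoa.b$vectors[, 1:2], main = "pcoa, method=bray")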

The result is that the samples on the plot are grouped by replicate and not by study group. How can I work this out?

Kind regards

pcamethods r • 1.8k views

This means that the variance between your reps is larger than the variance between your groups.

I don't fully understand what your research question is. Do you want to find differences (genes?) between groups? And what about your design: are these technical replicates or biological replicates? Is it a paired-sample design?

You might want to explain more about your experiment.


These are biological replicates, and my question is exactly about the design: where in the code do I put this, and how?

rep1, rep2 and rep3 are biological replicates, and group1, group2 and group3 are study groups.


PCA shows the variance, or similarity, within your data. From your post, I assume that in your case it shows that samples from the same replicate are more similar to each other than samples from the same group.
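To check which factor actually drives the clustering, it can help to label the points instead of judging the plot by eye. Below is a minimal sketch in base R, not a definitive analysis. It uses the data posted in the question, stored as `dat` (a stand-in for `data`), and swaps the Bray-Curtis PCoA for an ordinary PCA via `prcomp()`. The names `rep_id`, `group_id` and `ss` are made up for illustration.

```r
# Data copied from the question: rows are measurements, columns are samples.
dat <- matrix(c(
  18.26426, 18.50355, 17.87981, 18.14181, 18.12318, 18.37539, 17.54155, 17.62264, 17.21371,
  21.10751, 21.88614, 21.26385, 21.42588, 21.42358, 21.48596, 21.18138, 21.64957, 21.56978,
  19.95816, 19.93991, 19.17141, 19.23463, 19.49048, 19.69481, 19.99466, 20.27674, 19.83937,
  15.77427, 15.28338, 15.56018, 14.74557, 15.12376, 14.87215, 17.58013, 17.51229, 17.24869,
  18.55157, 18.75156, 18.51595, 18.69129, 18.45551, 18.9907,  18.31092, 18.28075, 18.00218,
  24.40756, 24.3009,  24.0354,  23.87117, 24.03002, 24.39447, 24.45595, 24.40041, 24.03842,
  20.6223,  20.62194, 21.19045, 20.85316, 20.24748, 20.99583, 21.70248, 20.83252, 21.417,
  18.53522, 18.20705, 17.84586, 18.45471, 18.03112, 18.24859, 17.71512, 17.46969, 17.20132,
  17.87237, 17.80663, 15.99771, 16.63991, 17.51884, 17.11533, 18.12308, 17.90783, 18.29576
), nrow = 9, byrow = TRUE)
colnames(dat) <- paste0("rep", rep(1:3, each = 3), "_group", rep(1:3, times = 3))

# Split names such as "rep1_group2" into a replicate label and a group label.
rep_id   <- factor(sub("_.*", "", colnames(dat)))
group_id <- factor(sub(".*_", "", colnames(dat)))

# Ordinary PCA on the samples (assumption: used here in place of the PCoA).
pca <- prcomp(t(dat), center = TRUE, scale. = FALSE)

# Colour encodes study group, plotting symbol encodes replicate.
plot(pca$x[, 1:2], col = as.integer(group_id), pch = as.integer(rep_id) + 14,
     main = "PCA: colour = group, symbol = replicate")
legend("topright", legend = levels(group_id), col = 1:3, pch = 16)

# Share of the PC1 sum of squares attributed to each factor.
fit <- anova(lm(pca$x[, 1] ~ rep_id + group_id))
ss  <- setNames(fit[["Sum Sq"]], rownames(fit))
print(round(ss / sum(ss), 3))
```

If the `rep_id` share is much larger than the `group_id` share, the replicate effect dominates the first component, which would match what you see on your plot.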

If you want to know the differences between your groups, you should do a statistical analysis and use tests. For this you need a proper design (e.g., are your replicates the same in all groups, which would make it a paired-sample analysis, or not?).

But I don't know what your research question is!
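As for where the design goes in the code: one common option is to put the replicate into the model as a blocking factor. Here is a rough sketch in base R, assuming that samples sharing a rep label really belong together (same batch, donor or day); if they don't, the blocking term makes no sense. It uses only the first four rows of the posted data to keep it short, and the names `dat`, `rep_id`, `group_id`, `group_p` and `adjusted` are made up for illustration.

```r
# First four rows of the data from the question (rows = measurements, columns = samples).
dat <- matrix(c(
  18.26426, 18.50355, 17.87981, 18.14181, 18.12318, 18.37539, 17.54155, 17.62264, 17.21371,
  21.10751, 21.88614, 21.26385, 21.42588, 21.42358, 21.48596, 21.18138, 21.64957, 21.56978,
  19.95816, 19.93991, 19.17141, 19.23463, 19.49048, 19.69481, 19.99466, 20.27674, 19.83937,
  15.77427, 15.28338, 15.56018, 14.74557, 15.12376, 14.87215, 17.58013, 17.51229, 17.24869
), nrow = 4, byrow = TRUE)
colnames(dat) <- paste0("rep", rep(1:3, each = 3), "_group", rep(1:3, times = 3))
rep_id   <- factor(sub("_.*", "", colnames(dat)))
group_id <- factor(sub(".*_", "", colnames(dat)))

# Blocked model per measurement: rep_id is the block, group_id is the effect of interest.
group_p <- apply(dat, 1, function(y) {
  anova(lm(y ~ rep_id + group_id))["group_id", "Pr(>F)"]
})
print(round(group_p, 4))

# For visualisation only: subtract each replicate's mean from its samples, row by row.
adjusted <- dat
for (r in levels(rep_id)) {
  cols <- rep_id == r
  adjusted[, cols] <- dat[, cols] - rowMeans(dat[, cols])
}

# PCA on the replicate-centred values; remaining structure is no longer due to replicate means.
pca_adj <- prcomp(t(adjusted))
plot(pca_adj$x[, 1:2], col = as.integer(group_id), pch = 16,
     main = "PCA after centring within replicate")
```

With many measurements the p-values would need a multiple-testing correction (e.g. `p.adjust()`), and the centred values should only be used for plotting, not fed back into the tests. For omics data the same idea is usually expressed as a design matrix such as `model.matrix(~ rep_id + group_id)` passed to a package like limma.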
