Comparing discrete variable to PCA
@kvittingseerup-7956
European Union

Is there a Bioconductor solution for comparing discrete variables to PCs to identify associations? I'm looking for the discrete version of PCAtools' eigencorplot.

Cheers Kristoffer

EDA pcaExplorer PCAtools PCA • 1.5k views
Kevin Blighe
@kevin
Republic of Ireland

Hey Kristoffer, I developed PCAtools, as you know - is eigencorplot not what you need? PCAtools has been in Bioconductor for > 1 year.

Note that these give exactly the same value:

  1. The squared Pearson correlation coefficient of continuous X versus categorical Y, with Y encoded numerically
  2. The R-squared value from a linear regression of the form X ~ Y

As I show here:

continuous <- c(45, 67, 12, 65, 75, 3, 44, 90)
categorical <- factor(c(0,0,0,0,1,1,1,1))

cor(continuous, as.numeric(categorical)) ^ 2
[1] 0.01024737

summary(lm(continuous ~ categorical))$r.squared
[1] 0.01024737

[source: https://www.biostars.org/p/349397/#349493]

Kevin


Hi Kevin

I did try eigencorplot() but got this error when using a categorical (text or factor) variable:

Error in cor(xvals, yvals, use = corUSE, method = corFUN) : 
  'y' must be numeric
In addition: Warning message:
In eigencorplot(myPca, metavars = c("barcode")) :
  barcode is not numeric - please check the source data as everything will be converted to a matrix

It turns out this can be solved by converting the categorical variables to numeric values before calling pca().
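
A minimal sketch of that conversion step, assuming the categorical variable sits in a sample metadata data frame. The `meta` data frame, its values, the `barcode_num` column name and the `mat` expression matrix mentioned in the comments are hypothetical and only for illustration:

```r
# Hypothetical sample metadata; 'barcode' stands in for any text or factor column.
meta <- data.frame(
  barcode = c("A", "A", "B", "B", "C", "C"),
  row.names = paste0("sample", 1:6),
  stringsAsFactors = FALSE
)

# Recode the text column as numeric codes via a factor
# (levels are sorted, so here A = 1, B = 2, C = 3).
meta$barcode_num <- as.numeric(factor(meta$barcode))

# The numeric column can then be used in place of the original one, e.g.
# (assuming 'mat' is an expression matrix whose columns match rownames(meta)):
# p <- PCAtools::pca(mat, metadata = meta)
# PCAtools::eigencorplot(p, metavars = "barcode_num")
```

Note that the numeric codes simply follow the order of the factor levels, which matters once there are more than two categories.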

Thanks for pointing out it could be done.

Cheers Kristoffer


Oh, I thought that issue was addressed in the previous Bioc release, i.e., it should automatically convert factors to numeric and give a warning, like above. I have not yet come across the other error thrown by cor() internally.


Would that same approach hold true when we have multiple categories?

Let's say "group_1", "group_2" and "group_3". In this case, each will be assigned to an integer (0, 1, 2); although they are ordered categories, they do not exactly match this transformation. Am I right?

If so, any suggestions for working with multi-category variables?
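
To illustrate the concern, here is a small base R sketch (not a PCAtools feature) comparing integer coding with fitting the factor directly in `lm()`. The `pc1` scores and group labels are made-up values, chosen so that the middle group differs from the other two:

```r
# Hypothetical PC1 scores for 12 samples in three groups.
pc1 <- c(1, 2, 1, 2, 9, 10, 9, 10, 1, 2, 1, 2)
group <- factor(rep(c("group_1", "group_2", "group_3"), each = 4))

# Integer coding (1, 2, 3) assumes equal, ordered steps between the groups.
r2_integer <- cor(pc1, as.numeric(group))^2

# Fitting the factor directly uses one dummy variable per non-reference level,
# so no ordering or spacing between groups is assumed.
fit <- lm(pc1 ~ group)
r2_factor <- summary(fit)$r.squared

# Overall F-test p-value for the association between PC1 and group.
p_value <- anova(fit)[["Pr(>F)"]][1]

print(c(integer_coding = r2_integer, factor_coding = r2_factor))
```

With these values the integer-coded correlation misses an association that the factor-based R-squared picks up, because the group means do not change monotonically with the codes. For a two-level variable the two approaches agree.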
