Is there a Bioconductor solution for comparing discrete variables to PCs to identify associations? I'm looking for the discrete version of PCAtools' eigencorplot.

Cheers Kristoffer

Comparing discrete variable to PCA

Hey Kristoffer, I developed *PCAtools*, as you know - is eigencorplot not what you need? *PCAtools* has been in Bioconductor for > 1 year.

Note that these give exactly the same result for a two-level variable:

- The squared Pearson correlation coefficient of continuous X versus categorical Y, with Y encoded numerically
- The R-squared value from a linear regression of the form X ~ Y

As I show here:

```
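# Example data: a continuous variable and a two-level categorical variable
continuous <- c(45, 67, 12, 65, 75, 3, 44, 90)
categorical <- factor(c(0,0,0,0,1,1,1,1))

# Squared Pearson correlation, with the factor encoded numerically
cor(continuous, as.numeric(categorical)) ^ 2
# [1] 0.01024737

# R-squared from the equivalent linear regression
summary(lm(continuous ~ categorical))$r.squared
# [1] 0.01024737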
```

[*source: https://www.biostars.org/p/349397/#349493*]

Kevin

Hi Kevin

I did try eigencorplot() but got this error when using a categorical (text or factor) variable:

It turns out this can be solved by converting them to numerical values before using `pca()`. Thanks for pointing out it could be done.
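For anyone hitting the same problem, the conversion described above might look roughly like the sketch below. It uses base R only; the `metadata` data frame and its column names are hypothetical, and the converted table would then be supplied as the sample metadata when calling `pca()`.

```r
# Hypothetical sample metadata; the column names are made up for illustration.
metadata <- data.frame(
  row.names = paste0("sample", 1:6),
  condition = c("control", "control", "control", "treated", "treated", "treated"),
  batch = factor(c("A", "B", "A", "B", "A", "B")),
  stringsAsFactors = FALSE
)

# Find the columns stored as text or factor.
is_categorical <- vapply(
  metadata,
  function(x) is.character(x) || is.factor(x),
  logical(1)
)

# Replace each of them with numeric codes (1, 2, ...).
# Codes follow the factor level order, which is alphabetical by default.
metadata[is_categorical] <- lapply(
  metadata[is_categorical],
  function(x) as.numeric(factor(x))
)

str(metadata)
```

Note that the codes carry an implied order, which is harmless for a two-level variable but matters once there are three or more levels.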

Cheers Kristoffer

Oh, I thought that issue was addressed in the previous Bioc release, i.e., it should automatically convert factors to numeric and give a warning, like above. I have not yet come across the other error thrown by `cor()` internally.

Would that same approach hold true when we have multiple categories?

Let's say "group_1", "group_2" and "group_3". In this case, each will be assigned an integer (0, 1, 2); although they are ordered categories, they do not exactly match this transformation. Am I right?

If so, any suggestions for working with multi-category variables?
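One possible way to handle this, offered as a general statistical sketch rather than as a PCAtools feature: keep the variable as a factor and take the R-squared from `lm()`, which fits one mean per group and so does not depend on how the levels are numbered. The data below are simulated, and `pc_scores` is only a stand-in for the sample scores of a single principal component.

```r
set.seed(1)

# Simulated stand-in for one PC's sample scores (not real data).
group <- factor(rep(c("group_1", "group_2", "group_3"), each = 10))
pc_scores <- rnorm(30) + c(0, 2, 0)[as.integer(group)]  # only group_2 is shifted

# Integer encoding: the result depends on the arbitrary order of the levels.
r2_order_a <- cor(pc_scores, as.numeric(group))^2
reordered <- factor(group, levels = c("group_2", "group_1", "group_3"))
r2_order_b <- cor(pc_scores, as.numeric(reordered))^2

# Factor encoding: lm() uses dummy variables, so R-squared is the same
# whatever the level order.
fit <- lm(pc_scores ~ group)
r2_factor <- summary(fit)$r.squared
r2_factor_reordered <- summary(lm(pc_scores ~ reordered))$r.squared

# One-way ANOVA F-test for the association.
p_anova <- anova(fit)[["Pr(>F)"]][1]

# Rank-based alternative that does not assume normally distributed residuals.
p_kruskal <- kruskal.test(pc_scores ~ group)$p.value

print(c(order_a = r2_order_a, order_b = r2_order_b, factor = r2_factor))
```

The integer-coded values change when the levels are reordered, while the factor-based R-squared stays the same. For a two-level variable the two approaches coincide, which is why the numeric conversion is safe in that case.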