The data are part of the tissuesGeneExpression data set and look like this:
          GSM92242.CEL.gz GSM92243.CEL.gz GSM92244.CEL.gz GSM92245.CEL.gz
1007_s_at       10.373213       11.395477       10.822040       10.077308
1053_at          6.523120        6.280099        6.377340        6.287809
117_at           7.625365        7.829470        7.461025        7.806562
121_at          10.644904       10.669002       10.332522       10.880915
1255_g_at        5.168378        5.207066        5.152687        4.929143
We are given:
s = svd(e)
m = rowMeans(e)
Now I need to find the correlation between u and m, where u is one of the components returned by svd() and typeof(m) is "double". This is the start of u:
$u
[,1] [,2] [,3] [,4] [,5]
[1,] -0.009117322 4.556496e-03 -1.273478e-02 -2.513035e-03 -1.781687e-02
[2,] -0.005432838 -2.245454e-03 -1.797830e-03 -2.290583e-03 2.775834e-03
[3,] -0.006952054 -3.190183e-03 6.364520e-03 -2.584162e-03 -4.524650e-03
[4,] -0.009712986 -8.782845
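For orientation, here is a minimal sketch of what svd() returns, run on a small made-up matrix rather than on e. The names e_toy and s_toy are only illustrative. It shows the order of the components and how their dimensions relate to the rows and columns of the input.

```r
# Toy stand-in for e: 6 "genes" (rows) by 4 "samples" (columns); values are made up.
set.seed(1)
e_toy <- matrix(rnorm(24, mean = 8), nrow = 6, ncol = 4)

s_toy <- svd(e_toy)

names(s_toy)     # components come back in the order d, u, v
dim(s_toy$u)     # one row per row of e_toy
dim(s_toy$v)     # one row per column of e_toy
length(s_toy$d)  # min(nrow, ncol) singular values

# Positional and named indexing refer to the same components
identical(s_toy[[2]], s_toy$u)
identical(s_toy[[3]], s_toy$v)
```

Indexing by name (s_toy$u) rather than by position is probably the safer habit, since it does not depend on remembering the order of the list.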
OK, so we have:
> s[[3]][,1]
[1] -0.07324862 -0.07325096 -0.07295323 -0.07331382 -0.07307976 -0.07300219
[7] -0.07288423 -0.07296209 -0.07322670 -0.07312508 -0.07331067 -0.07287291
[13] -0.07321938 -0.072962
which should be reasonable, and finally we have m (the last lines of its truncated printout):
201447_at 201448_at 201449_at 201450_s_at 201451_x_at 201452_at
8.158574 8.385202 8.229309 7.934801 6.288633 6.007999 6.565995
201453_x_at 201454_s_at 201455_s_at 201456_s_at 201457_x_at 201458_s_at 201459_at
10.736726 8.276057 9.193640 8.021232 9.102173 8.360764 9.023536
201460_at 201461_s_at 201462_at 201463_s_at 201464_x_at 201465_s_at 201466_s_at
9.468738 5.424650 9.649986 10.614343 10.839478 7.494861 7.873803
201467_s_at 201468_s_at 201469_s_at 201470_at 201471_s_at 201472_at
7.367181 8.164352 7.058134 10.627766 11.126659 9.561431
[ reached getOption("max.print") -- omitted 21215 entries ]
I did some experimenting:
> m[1]
1007_s_at
10.2631
> m[1,]
Error in m[1, ] : incorrect number of dimensions
> m[1:5]
1007_s_at 1053_at 117_at 121_at 1255_g_at
10.263097 6.115715 7.827447 10.934732 5.223437
> m[1]+5
1007_s_at
15.2631
> corr = cor(s[[3]][,1], m)
Error in cor(s[[3]][, 1], m) : incompatible dimensions
> length(s[[3]])
[1] 35721
> length(s[[3]][1,])
[1] 189
> length(s[[3]][,1])
[1] 189
> length(n)
Error: object 'n' not found
> length(m)
[1] 22215
>
From the output above, the problem seems to be that the two vectors have different lengths (189 versus 22215), so maybe I am not taking the right component of the SVD. Do both vectors need to have the same number of elements, or can I work around that?
How do I calculate the correlation between u and m?
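One possible reading of the lengths above: svd() returns its list in the order d, u, v, so s[[3]] would be v, and 189 x 189 = 35721 matches length(s[[3]]). If that is right, the column to pair with m comes from s$u, which has one row per row of e. A minimal sketch on a made-up matrix (e_toy, s_toy and m_toy are illustrative names, not my real data):

```r
# Toy stand-in for e: 6 rows ("genes") by 4 columns ("samples"); values are made up.
set.seed(1)
e_toy <- matrix(rnorm(24, mean = 8), nrow = 6, ncol = 4)
s_toy <- svd(e_toy)
m_toy <- rowMeans(e_toy)

length(s_toy$v[, 1])  # one entry per column of e_toy
length(s_toy$u[, 1])  # one entry per row of e_toy
length(m_toy)         # one entry per row of e_toy

# cor() needs two vectors of equal length, so m_toy is paired with a column of u
cor(s_toy$u[, 1], m_toy)
```

On the real data the analogous call would presumably be cor(s$u[, 1], m), assuming s and m are defined as above.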