Dear Bioconductor community,
After normalizing and filtering a microarray expression dataset, 11658 rows (probesets) and 26 columns (samples) remained. I then tried to run a PCA. First I transposed the data: data <- t(exprs(eset)).
But when I called prcomp(data, scale=TRUE), I got the following error:
"Error in prcomp.default(data, scale = TRUE) : cannot rescale a constant/zero column to unit variance"
Afterwards, when I left out the scale argument, prcomp worked, but when I then used the ggbiplot function I got another error:
"Error: invalid rot value"
Is this error also related to the scale argument? How can I fix it? I also searched other similar threads about removing columns with zero variance, but that didn't work. Any ideas or suggestions?
First, thank you kindly for answering my question. I didn't intend to post inappropriate questions in the support forum, but I also didn't know of any other forum suited to the background of the question. Initially, I wrongly thought that, because of the transposition, the first error was related to the samples being the rows. Moreover, your observation about my dataset is true and makes sense, because I didn't filter based on variance but on present/absent calls, and I think it is a general assumption that a substantial proportion of genes don't change expression in a typical microarray experiment.
While it is a general assumption that a large proportion of genes don't change expression, this doesn't mean that the values you get from a microarray should be identical for all samples. In other words, there will always be some error in our measurements, so the expectation is that the expression values for a gene that is probably not changing expression will be very similar across samples, but not identical.
The fact that you have identical values is probably because you have a small number of replicates (and if these are Affy arrays, an odd number of samples, like 3 or 5 or 7, which due to the medianpolish algorithm can give rise to identical values for all samples).
Anyway, a gene with no variability is de facto uninteresting, and removing those genes is a good idea.
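As a minimal sketch of what removing those genes could look like before the PCA: the matrix below is a small made-up stand-in for t(exprs(eset)), with samples in rows and probesets in columns, and the names data_filtered, col_sd and keep are just illustrative. One probeset is set to the same value in every sample to reproduce the situation that triggers the prcomp error.

```r
# Toy stand-in for t(exprs(eset)): 6 samples (rows) x 5 probesets (columns).
# The values are simulated, not real expression data.
set.seed(1)
data <- matrix(rnorm(30), nrow = 6,
               dimnames = list(paste0("sample", 1:6), paste0("probe", 1:5)))
data[, 3] <- 8   # a probeset with identical values in every sample

# Standard deviation of each column (probeset) across samples;
# a constant column has a standard deviation of exactly 0 here.
col_sd <- apply(data, 2, sd)
keep <- col_sd > 0

# Drop the constant columns, then scaling to unit variance is possible.
data_filtered <- data[, keep, drop = FALSE]
pca <- prcomp(data_filtered, scale. = TRUE)
summary(pca)
```

On real data it may be safer to compare against a small tolerance rather than exactly 0, since the right threshold depends on how the values were produced.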
Yes, I have the Affymetrix hgu133a platform and I have biological replicates: 13 patients, each with 2 samples, one control and one cancer sample. Regarding the other important point you raised: because I ran limma afterwards, I hesitated to use a genefilter based on variance prior to limma, motivated by the paper from Bourgon et al., 2010.