Question

Error "object and replacement value dimnames differ" when changing column names of assayData of an ExpressionSet (although the dimensions are the same)

salamandra ▴ 20

I retrieved raw expression values from the GEO series supplementary files and normalized them:

treatment1Samples <- c("GSM437316","GSM437286","GSM437305","GSM437297","GSM437277","GSM437282","GSM437269","GSM437302")
treatment2Samples <- c("GSM437147","GSM437138","GSM437311","GSM437218","GSM437114","GSM437234","GSM437165","GSM437299")
sampleNames <- c(treatment1Samples, treatment2Samples)
library(oligo)  # provides read.celfiles() and rma()
rawData <- read.celfiles(pathSampleCELfiles)
normData <- rma(rawData)

As the resulting ExpressionSet didn't have the complete sample metadata (phenoData), I retrieved it from GEO with:

library(GEOquery)
seriesMatrixFilesList <- getGEO(GEOId, GSEMatrix = TRUE)

# Select the data for the platform we want (this experiment has two):
for (i in seq_along(seriesMatrixFilesList)) {
  if (as.character(unique(seriesMatrixFilesList[[i]]$platform_id)) == GPLId) {
    seriesMatrixFile <- seriesMatrixFilesList[[i]]
  }
}

pheno <- phenoData(seriesMatrixFile) # gets complete phenoData of the study
pheno <- pheno[sampleNames,] # pheno has metadata for all samples, so we select just the samples we want

and then attached it to the normalized ExpressionSet:

phenoData(normData) <- pheno

The row names of phenoData are the sample names:

rownames(phenoData(normData))
[1] "GSM437316" "GSM437286" "GSM437305" "GSM437297" "GSM437277" "GSM437282" "GSM437269" "GSM437302" "GSM437147" "GSM437138" "GSM437311"
[12] "GSM437218" "GSM437114" "GSM437234" "GSM437165" "GSM437299"

But the column names of assayData are file names:

colnames(exprs(normData))
[1] "GSM437114.CEL.gz" "GSM437138.CEL.gz" "GSM437147.CEL.gz" "GSM437165.CEL.gz" "GSM437218.CEL.gz" "GSM437234.CEL.gz" "GSM437269.CEL.gz"
[8] "GSM437277.CEL.gz" "GSM437282.CEL.gz" "GSM437286.CEL.gz" "GSM437297.CEL.gz" "GSM437299.CEL.gz" "GSM437302.CEL.gz" "GSM437305.CEL.gz"
[15] "GSM437311.CEL.gz" "GSM437316.CEL.gz"

I tried to change the assayData column names to sample names:

newNames <- unlist(lapply(strsplit(colnames(exprs(normData)), '(\\.)|(_)'), function(x) x[1]))
newNames
[1] "GSM437114" "GSM437138" "GSM437147" "GSM437165" "GSM437218" "GSM437234" "GSM437269" "GSM437277" "GSM437282" "GSM437286" "GSM437297"
[12] "GSM437299" "GSM437302" "GSM437305" "GSM437311" "GSM437316"

colnames(exprs(normData)) <- newNames

And got this error:

Error in (function (od, vd)  : 
  object and replacement value dimnames differ

although newNames has the same number of elements as colnames(exprs(normData)).

How can I solve this?
Also, does the order of column names in assayData have to be the same as the order of row names in phenoData?

Note: Above I didn't do colnames(exprs(normData)) <- rownames(phenoData(normData)) because I wanted to keep the assayData column order, which was different from the row order in phenoData.

oligo GEOquery ExpressionSet

updated 4.5 years ago by James W. MacDonald 65k • written 4.5 years ago by salamandra ▴ 20

score 3 · Accepted Answer · 2019-10-21

You are recursing too deeply into the bowels of the ExpressionSet. Put a different way, in order for the colnames for the exprs slot of the ExpressionSet to be valid, they have to match up with the rownames of the phenoData slot. If you just change one, you get an error because they both have to be consistent. If you instead change the colnames of the ExpressionSet itself, the rownames of the phenoData slot get changed as well, and all is good.

> library(Biobase)
> data(sample.ExpressionSet)
> colnames(exprs(sample.ExpressionSet))
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
> colnames(sample.ExpressionSet)
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
[20] "T" "U" "V" "W" "X" "Y" "Z"
> colnames(exprs(sample.ExpressionSet)) <- letters
Error in (function (od, vd)  : 
  object and replacement value dimnames differ
> colnames(sample.ExpressionSet) <- letters
> colnames(exprs(sample.ExpressionSet))
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
> colnames(sample.ExpressionSet)
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
[20] "t" "u" "v" "w" "x" "y" "z"
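
Applied to the situation in the question, the same idea can be sketched as follows. This is only an illustration under stated assumptions: it uses the first six samples of sample.ExpressionSet as a stand-in for normData, fakes the ".CEL.gz" file-name suffix, and the names eset, pheno and cleanNames are made up for the example. It also reorders the metadata with match() before attaching it, because renaming the samples of the whole object works by position, so metadata rows in a different order than the expression columns would most likely end up labelled with the wrong sample.

```r
library(Biobase)
data(sample.ExpressionSet)

# Hypothetical stand-in for normData: six samples only.
eset <- sample.ExpressionSet[, 1:6]

# Hypothetical stand-in for the GEO metadata: same samples, different order.
pheno <- phenoData(eset)[c("F", "E", "D", "C", "B", "A"), ]

# Mimic column names that are file names, as produced when reading CEL files.
sampleNames(eset) <- paste0(sampleNames(eset), ".CEL.gz")

# Strip the file suffix to recover the sample identifiers.
cleanNames <- sub("\\.CEL\\.gz$", "", sampleNames(eset))

# Rename at the level of the whole object, not exprs(), so that assayData
# and phenoData are changed together.
sampleNames(eset) <- cleanNames

# Put the metadata rows in the same order as the expression columns,
# then attach them.
pheno <- pheno[match(cleanNames, sampleNames(pheno)), ]
phenoData(eset) <- pheno

# Stops with an error if sample names are inconsistent across the object.
validObject(eset)
```

Here sampleNames<- is used instead of colnames<- on the ExpressionSet; as far as I know the two should have the same effect on the sample names. The match() step keeps the assayData column order and moves the metadata to fit it, which is what the note in the question asked for.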