Folks,
I need some help here. Why is an eset (ExpressionSet) so error-prone and hard to deal with?
# read in the .cel files, normalize them, then try to attach annotation
library(affy)            # ReadAffy(), rma()
library(hgu133plus2.db)  # probe-to-gene annotation, used via select()
mydata<-ReadAffy()
mydata.rma<-rma(mydata)
allprobe<-row.names(exprs(mydata.rma))
all.gs <- select(hgu133plus2.db, allprobe, c("SYMBOL"), keytype="PROBEID")
featureData(mydata.rma) <- new("AnnotatedDataFrame", data=all.gs)
However, the row counts do not match:
> length(allprobe)
[1] 54675
> dim(all.gs)
[1] 58608 2
Is there a way to attach gene annotation information to the whole eset?
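The extra rows are what one would expect when some probesets map to more than one symbol, since select() returns one row per probe/symbol pair. Below is a minimal sketch in base R of one way to get back to one row per probe in the eset's order. The probe IDs and symbols are made up for illustration, and keeping only the first symbol per probe is an arbitrary assumption, not a recommendation.

```r
# Toy stand-ins (hypothetical IDs): probe order as it would be in the eset,
# and a select()-style table in which one probe maps to two symbols.
allprobe <- c("p1_at", "p2_at", "p3_at")
all.gs <- data.frame(
  PROBEID = c("p2_at", "p2_at", "p1_at", "p3_at"),
  SYMBOL  = c("GENEB1", "GENEB2", "GENEA", NA),
  stringsAsFactors = FALSE
)

# Keep the first row for each probe (an arbitrary choice),
# then reorder the rows so they follow the eset's probe order.
one.per.probe <- all.gs[!duplicated(all.gs$PROBEID), ]
aligned <- one.per.probe[match(allprobe, one.per.probe$PROBEID), ]
rownames(aligned) <- aligned$PROBEID

# Fail loudly if the annotation rows are not in the same order as the probes.
stopifnot(identical(rownames(aligned), allprobe))
```

With the real objects, a table aligned like this has the same number of rows as the eset, in the same order, so it could be wrapped in new("AnnotatedDataFrame", data=aligned) and assigned as featureData.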
If I want to do this on a small scale:
mygene.probe <-read.table(file="myprobesets.txt", blabla)
gs<-select(hgu133plus2.db, mygene.probe, c("SYMBOL", "ENTREZID"))
subset.for.heatmap <- mydata.rma[featureNames(mydata.rma) %in% gs$PROBEID,]
featureData(subset.for.heatmap) <-new("AnnotatedDataFrame",data=gs)
heatmap.2(exprs(subset.for.heatmap), scale="row", trace="none",
          col=colorpanel(100, "green", "white", "red"),
          labCol=pData(subset.for.heatmap)$treatment_time,
          ColSideColors=pData(subset.for.heatmap)$color,
          labRow=fData(subset.for.heatmap)[[2]])
Then the probeset IDs and gene symbols are mixed up on the heatmap; they do not match. I guess this is like attaching pData: I need to check that the rows match before doing the attachment.
Apparently there is no error checking, and a lot of mistakes could result from this.
Does anyone have an easy and elegant way of solving this, or a pointer to a good resource?
Thanks
Thanks, James. I think I have found a workaround.
It seems this has to be done for individual genes:
mygene.probe <-read.table(file="myprobesets.txt", blabla)
gs<-select(hgu133plus2.db, mygene.probe, c("SYMBOL", "ENTREZID"))
gs<-gs[order(gs$PROBEID), ] # this takes care of it, given that the probe IDs in the eset are sorted
subset.for.heatmap <- mydata.rma[featureNames(mydata.rma) %in% gs$PROBEID,]
featureData(subset.for.heatmap) <-new("AnnotatedDataFrame",data=gs)
heatmap.2(exprs(subset.for.heatmap), scale="row", trace="none",
          col=colorpanel(100, "green", "white", "red"),
          labCol=pData(subset.for.heatmap)$treatment_time,
          ColSideColors=pData(subset.for.heatmap)$color,
          labRow=fData(subset.for.heatmap)[[2]])
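The sort only works as long as the eset's probe IDs really are in sorted order. A sketch of a variant that does not rely on that is to index the eset by the probe IDs themselves and check the alignment before attaching. The ExpressionSet, probe IDs and symbols here are toy stand-ins, and the sketch assumes gs has exactly one row per probe.

```r
library(Biobase)

# Toy expression matrix with made-up probe IDs, deliberately unsorted.
mat <- matrix(rnorm(8), nrow = 4,
              dimnames = list(c("p3_at", "p1_at", "p4_at", "p2_at"),
                              c("s1", "s2")))
eset <- ExpressionSet(assayData = mat)

# select()-style annotation for the probes of interest, in input order.
# Assumption: one row per probe (no one-to-many mappings left in gs).
gs <- data.frame(PROBEID = c("p1_at", "p2_at", "p3_at"),
                 SYMBOL  = c("GENEA", "GENEB", "GENEC"),
                 stringsAsFactors = FALSE)

# Subset by name, so the eset rows follow the order of gs
# instead of depending on any sort order.
sub <- eset[gs$PROBEID, ]
rownames(gs) <- gs$PROBEID

# Refuse to attach the annotation unless the rows line up.
stopifnot(identical(featureNames(sub), rownames(gs)))
featureData(sub) <- AnnotatedDataFrame(data = gs)
validObject(sub)  # re-run the class validity checks
```

After this, fData(sub)$SYMBOL is in the same row order as exprs(sub), so it can be passed as labRow to heatmap.2.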
An alternative is to read the annotation file in .csv format into an object and attach it to the whole eset as fData.
Thanks again