Dear Aaron,
Thank you for your immediate response. First, I want to pinpoint two important things before asking some further questions about the annotations:
1. When I use the function merge from the package inSilicoMerging, it keeps only the probesets common to the two platforms. So, in your opinion, should these common probesets have the same annotation (i.e. SYMBOL) in both the hgu133a and hgu133plus2 platforms?
2. Secondly, regarding your comment "Note that only the featureData of the first ExpressionSet seems to be preserved by inSilicoMerging": it seems that when I create a list of the datasets I want to merge, the first dataset in that list is the one whose featureData is preserved, as you mention.
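Regarding point 1, I suppose the agreement could be checked directly rather than assumed. Below is a minimal sketch of such a check, assuming both hgu133a.db and hgu133plus2.db are installed; the object names (common, symA, symP, agree) are only illustrative, and taking the first SYMBOL for one-to-many mappings is a simplification.

```r
library(AnnotationDbi)
library(hgu133a.db)
library(hgu133plus2.db)

# Probesets present on both platforms
common <- intersect(keys(hgu133a.db, keytype = "PROBEID"),
                    keys(hgu133plus2.db, keytype = "PROBEID"))

# First SYMBOL per probeset in each annotation package (NA if unmapped)
symA <- mapIds(hgu133a.db, keys = common, column = "SYMBOL",
               keytype = "PROBEID", multiVals = "first")
symP <- mapIds(hgu133plus2.db, keys = common, column = "SYMBOL",
               keytype = "PROBEID", multiVals = "first")

# Count a probeset as agreeing if both symbols match or both are missing
agree <- (symA == symP) | (is.na(symA) & is.na(symP))
agree[is.na(agree)] <- FALSE

table(agree)
head(common[!agree])  # probesets whose symbols differ, if any
```

If table(agree) shows no FALSE entries, using the hgu133plus2 annotation for the merged set would give the same symbols as hgu133a for these probesets.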
Thus, if my points above are in the right direction, and regarding your proposal of first annotating and then merging the two datasets, should I follow the lines of code below, using the annotation of the hgu133plus2 dataset?
library(hgu133plus2.db)
gns <- select(hgu133plus2.db,featureNames(agcrma),"SYMBOL")
gns <- gns[!duplicated(gns[,1]),] # naive handling of one-to-many mappings: keep only the first SYMBOL returned for each probeset
head(gns)
PROBEID SYMBOL
1 1007_s_at DDR1
3 1053_at RFC2
4 117_at HSPA6
5 121_at PAX8
6 1255_g_at GUCA1A
7 1294_at UBA7
featureData(agcrma) <- new("AnnotatedDataFrame", data=gns) # possible way of annotating the ExpressionSet first(?)
agcrma
ExpressionSet (storageMode: lockedEnvironment)
assayData: 54675 features, 34 samples
element names: exprs
protocolData
sampleNames: St_1_WL57.CEL St_2_WL57.CEL ... St_T_WM60.CEL (34 total)
varLabels: ScanDate
varMetadata: labelDescription
phenoData
sampleNames: St_1_WL57.CEL St_2_WL57.CEL ... St_T_WM60.CEL (34 total)
varLabels: meta factor tissue
varMetadata: labelDescription
featureData
featureNames: 1 3 ... 58608 (54675 total)
fvarLabels: PROBEID SYMBOL
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: hgu133plus2
And then, after merging:
eset_COMBAT
ExpressionSet (storageMode: lockedEnvironment)
assayData: 22277 features, 60 samples
element names: exprs
protocolData: none
phenoData
sampleNames: St_1_WL57.CEL St_2_WL57.CEL ... 1554_03_Gemmer_1.CEL (60 total)
varLabels: Disease meta factor ... tissue (5 total)
varMetadata: labelDescription
featureData
featureNames: NA NA.1 ... NA.22276 (22277 total)
fvarLabels: PROBEID SYMBOL
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
Annotation: hgu133plus2 hgu133a
So is this the appropriate way of moving forward? Or are the functions above for annotating one ExpressionSet first wrong? I'm asking because I have never annotated an ExpressionSet first, only the results after differential expression with limma.
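One detail in the printouts above may matter here: the featureData of agcrma lists featureNames as 1 3 ... 58608 (the row numbers of gns), and the merged set shows NA NA.1 ... NA.22276. A possible variant, sketched below on toy data, keys the rows of gns by probe ID and orders them like the expression matrix before assigning them. The object eset and its values are hypothetical stand-ins for agcrma and the select() result, and I have not confirmed that this changes what inSilicoMerging returns.

```r
library(Biobase)

# Toy stand-in for agcrma: three probesets, two samples (made-up values)
m <- matrix(rnorm(6), nrow = 3,
            dimnames = list(c("1007_s_at", "1053_at", "117_at"),
                            c("S1", "S2")))
eset <- ExpressionSet(assayData = m)

# Toy stand-in for the select() output, with one one-to-many mapping
gns <- data.frame(PROBEID = c("1007_s_at", "1007_s_at", "1053_at", "117_at"),
                  SYMBOL  = c("DDR1", "MIR4640", "RFC2", "HSPA6"),
                  stringsAsFactors = FALSE)
gns <- gns[!duplicated(gns$PROBEID), ]

# Key the rows by probe ID and align them with the expression matrix
rownames(gns) <- gns$PROBEID
gns <- gns[featureNames(eset), ]

featureData(eset) <- new("AnnotatedDataFrame", data = gns)
validObject(eset)   # checks that feature names agree across slots
fData(eset)
```

With the row names set this way, featureNames in featureData should print as probe IDs rather than row numbers.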
Finally, if the functions above are OK, how can I access my annotations (gene symbols) after merging and performing differential expression with limma?
Thank you for your consideration on this matter!
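As far as I understand, when lmFit is given an ExpressionSet it copies the featureData columns into fit$genes, and topTable then reports them next to the statistics. A minimal sketch on toy data follows; the expression values, sample names and two-group design are made up for illustration, and the real call would use eset_COMBAT with its own design matrix.

```r
library(Biobase)
library(limma)

set.seed(1)
probes  <- c("1007_s_at", "1053_at", "117_at", "121_at", "1255_g_at", "1294_at")
symbols <- c("DDR1", "RFC2", "HSPA6", "PAX8", "GUCA1A", "UBA7")

# Toy expression matrix: six probesets, six samples (made-up values)
m <- matrix(rnorm(36), nrow = 6,
            dimnames = list(probes, paste0("S", 1:6)))

# featureData keyed by probe ID, as in an annotated ExpressionSet
fd <- data.frame(PROBEID = probes, SYMBOL = symbols,
                 row.names = probes, stringsAsFactors = FALSE)
eset <- ExpressionSet(assayData = m,
                      featureData = new("AnnotatedDataFrame", data = fd))

# Hypothetical two-group comparison
group  <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

fit <- eBayes(lmFit(eset, design))
tt  <- topTable(fit, coef = 2, number = Inf)

head(fit$genes)   # annotation carried over from fData(eset)
colnames(tt)      # includes PROBEID and SYMBOL alongside logFC, P.Value, ...
```

If the merged ExpressionSet keeps a usable featureData, the same columns should appear in the topTable output without a separate annotation step.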
featureNames(agcrma) and gns$PROBEID are identical after removing duplicates.
featureData(eset_COMBAT)$SYMBOL