Dear all,
Do you know which packages can be used to annotate the GPL2507 Sentrix Human-6 Expression BeadChip? I can't find the right one. Thanks!
Let's say you are interested in GSE3188, which has some data from that array.
> library(GEOquery)
> z <- getGEO("GSE3188")
< stuff happens>
> z[1]
$`GSE3188-GPL2507_series_matrix.txt.gz`
ExpressionSet (storageMode: lockedEnvironment)
assayData: 47293 features, 18 samples
element names: exprs
protocolData: none
phenoData
sampleNames: GSM71605 GSM71607 ... GSM71670 (18 total)
varLabels: title geo_accession ... data_row_count (34 total)
varMetadata: labelDescription
featureData
featureNames: GI_10047089-S GI_10047091-S ... trpF (47293 total)
fvarLabels: ID SequenceSource ... SPOT_ID (5 total)
fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
pubMedIds: 16565084
Annotation: GPL2507
So the first ExpressionSet in the list is the GPL2507 data. Do note that it comes with some annotation by default:
> head(fData(z[[1]]))
ID SequenceSource GB_ACC Annotation Date SPOT_ID
GI_10047089-S GI_10047089-S RefSeq NM_014332.1 NA NA
GI_10047091-S GI_10047091-S RefSeq NM_013259.1 NA NA
GI_10047093-S GI_10047093-S RefSeq NM_016299.1 NA NA
GI_10047099-S GI_10047099-S RefSeq NM_016303.1 NA NA
GI_10047103-S GI_10047103-S RefSeq NM_016305.1 NA NA
GI_10047105-S GI_10047105-S RefSeq NM_016352.1 NA NA
This gives you the probe ID and the RefSeq accession (GB_ACC), which we can use to map to other identifiers:
> library(org.Hs.eg.db)
> ids <- sapply(strsplit(fData(z[[1]])[,3], "\\."), "[", 1)
> head(ids)
[1] "NM_014332" "NM_013259" "NM_016299" "NM_016303" "NM_016305" "NM_016352"
## there are NA values, so replace them with the character string "NA"
> ids[is.na(ids)] <- "NA"
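As an aside, the version suffix can also be stripped with a single regular expression instead of strsplit(). This is only a minimal sketch in base R; the accessions in `acc` are made-up stand-ins for the GB_ACC column, and `ids2` and `ids_split` are names chosen for illustration:

```r
# Hypothetical accessions standing in for fData(z[[1]])[, 3]
acc <- c("NM_014332.1", "NM_013259.1", NA, "NM_016299.12")

# Drop a trailing ".<digits>" version suffix; NA stays NA
ids2 <- sub("\\.[0-9]+$", "", acc)

# Replace missing values with the string "NA", as in the step above
ids2[is.na(ids2)] <- "NA"

# The strsplit() approach gives the same result on this input
ids_split <- sapply(strsplit(acc, "\\."), "[", 1)
ids_split[is.na(ids_split)] <- "NA"
stopifnot(identical(ids2, ids_split))
```

Either form should be fine here; the regex version just avoids building an intermediate list.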
> annot <- lapply(c("ENTREZID","SYMBOL","GENENAME"), function(x) mapIds(org.Hs.eg.db, ids, x, "ACCNUM"))
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
> annotdf <- data.frame(PROBEID = fData(z[[1]])[,1], ACCNUM = ids, ENTREZID = annot[[1]], SYMBOL = annot[[2]], GENENAME = annot[[3]])
> head(annotdf)
PROBEID ACCNUM ENTREZID SYMBOL
1 GI_10047089-S NM_014332 23676 SMPX
2 GI_10047091-S NM_013259 29114 TAGLN3
3 GI_10047093-S NM_016299 51182 HSPA14
4 GI_10047099-S NM_016303 51186 TCEAL9
5 GI_10047103-S NM_016305 51188 SS18L2
6 GI_10047105-S NM_016352 51200 CPA4
GENENAME
1 small muscle protein X-linked
2 transgelin 3
3 heat shock protein family A (Hsp70) member 14
4 transcription elongation factor A like 9
5 SS18 like 2
6 carboxypeptidase A4
> fData(z[[1]]) <- annotdf
And now the featureData slot of the ExpressionSet has the annotation, and if you use limma to analyze the data (which you probably should), then the topTable output will be annotated with the data we just put in the ExpressionSet.
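To illustrate how the annotation carries through to topTable, here is a rough sketch on a small simulated ExpressionSet rather than the real GSE3188 data. Everything in it is an assumption made for illustration: the expression values, the `group` factor, the gene symbols, and the two-group design. It also sets the row names of the annotation data.frame to the featureNames, which seems the cautious choice so that featureData and the expression matrix stay aligned.

```r
library(Biobase)
library(limma)

set.seed(1)

# Simulated stand-in for z[[1]]: 100 probes x 6 samples (hypothetical data)
mat <- matrix(rnorm(600), nrow = 100,
              dimnames = list(paste0("probe", 1:100), paste0("S", 1:6)))
eset <- ExpressionSet(assayData = mat)

# Hypothetical two-group design, three samples per group
eset$group <- factor(rep(c("ctrl", "trt"), each = 3))

# Annotation data.frame with row names matching featureNames(eset)
fData(eset) <- data.frame(PROBEID = featureNames(eset),
                          SYMBOL = paste0("GENE", 1:100),
                          row.names = featureNames(eset))

# Fit the linear model; lmFit picks up fData(eset) as the gene annotation
design <- model.matrix(~ group, data = pData(eset))
fit <- eBayes(lmFit(eset, design))

# The annotation columns appear alongside the statistics
tt <- topTable(fit, coef = 2, number = 5)
print(tt)
```

With the real data you would replace `eset` with z[[1]] and build the design matrix from whichever phenoData column describes your experimental groups.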
Nice! Thank you very much!!