Question: AnnotationData Packages for GPL2507 Sentrix Human-6 Expression BeadChip
12 weeks ago, kankejia0703 wrote:

Dear all,

Do you know which annotation packages can be used for the GPL2507 Sentrix Human-6 Expression BeadChip? I can't find the right one. Thanks!

Tags: microarray, annotation
Answer: AnnotationData Packages for GPL2507 Sentrix Human-6 Expression BeadChip
12 weeks ago, James W. MacDonald (United States) wrote:

Let's say you are interested in GSE3188, which has some data from that array.

> library(GEOquery)
> z <- getGEO("GSE3188")
< stuff happens>
> z[1]
$`GSE3188-GPL2507_series_matrix.txt.gz`
ExpressionSet (storageMode: lockedEnvironment)
assayData: 47293 features, 18 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: GSM71605 GSM71607 ... GSM71670 (18 total)
  varLabels: title geo_accession ... data_row_count (34 total)
  varMetadata: labelDescription
featureData
  featureNames: GI_10047089-S GI_10047091-S ... trpF (47293 total)
  fvarLabels: ID SequenceSource ... SPOT_ID (5 total)
  fvarMetadata: Column Description labelDescription
experimentData: use 'experimentData(object)'
  pubMedIds: 16565084 
Annotation: GPL2507 

So the first ExpressionSet is the one from GPL2507. Do note that this comes with some annotation by default:

> head(fData(z[[1]]))
                         ID SequenceSource      GB_ACC Annotation Date SPOT_ID
GI_10047089-S GI_10047089-S         RefSeq NM_014332.1              NA      NA
GI_10047091-S GI_10047091-S         RefSeq NM_013259.1              NA      NA
GI_10047093-S GI_10047093-S         RefSeq NM_016299.1              NA      NA
GI_10047099-S GI_10047099-S         RefSeq NM_016303.1              NA      NA
GI_10047103-S GI_10047103-S         RefSeq NM_016305.1              NA      NA
GI_10047105-S GI_10047105-S         RefSeq NM_016352.1              NA      NA

Here you have the probe ID and the RefSeq accession (the GB_ACC column), which we can use to map to other identifiers:

> library(org.Hs.eg.db)
> ids <- sapply(strsplit(fData(z[[1]])[,3], "\\."), "[", 1)
> head(ids)
[1] "NM_014332" "NM_013259" "NM_016299" "NM_016303" "NM_016305" "NM_016352"
## there are NA values, so replace them with the string "NA" before using them as keys
> ids[is.na(ids)] <- "NA"
> annot <- lapply(c("ENTREZID","SYMBOL","GENENAME"), function(x) mapIds(org.Hs.eg.db, ids, x, "ACCNUM"))
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
'select()' returned 1:1 mapping between keys and columns
> annotdf <- data.frame(PROBEID = fData(z[[1]])[,1], ACCNUM = ids, ENTREZID = annot[[1]], SYMBOL = annot[[2]], GENENAME = annot[[3]])
> head(annotdf)
        PROBEID    ACCNUM ENTREZID SYMBOL
1 GI_10047089-S NM_014332    23676   SMPX
2 GI_10047091-S NM_013259    29114 TAGLN3
3 GI_10047093-S NM_016299    51182 HSPA14
4 GI_10047099-S NM_016303    51186 TCEAL9
5 GI_10047103-S NM_016305    51188 SS18L2
6 GI_10047105-S NM_016352    51200   CPA4
                                       GENENAME
1                 small muscle protein X-linked
2                                  transgelin 3
3 heat shock protein family A (Hsp70) member 14
4      transcription elongation factor A like 9
5                                   SS18 like 2
6                           carboxypeptidase A4
> fData(z[[1]]) <- annotdf
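
One thing that may be worth checking around this assignment: as far as I know, an ExpressionSet expects the row names of its featureData to match featureNames(), whereas annotdf as printed above has row names 1, 2, 3, and so on. A minimal base-R sketch of keying the table by probe ID and counting unmapped probes follows. The three rows are toy values standing in for featureNames(z[[1]]) and the mapIds() results (the NA for trpF is an assumption for illustration, not taken from the real data).

```r
# Toy stand-ins (hypothetical values) for featureNames(z[[1]]) and the mapIds() output
probe_ids <- c("GI_10047089-S", "GI_10047091-S", "trpF")
ids       <- c("NM_014332", "NM_013259", "NA")
symbol    <- c("SMPX", "TAGLN3", NA)

# Use the probe IDs as row names so the table is keyed the same way as the features
annotdf <- data.frame(PROBEID = probe_ids, ACCNUM = ids, SYMBOL = symbol,
                      row.names = probe_ids, stringsAsFactors = FALSE)

# How many probes did not get a gene symbol?
n_unmapped <- sum(is.na(annotdf$SYMBOL))

# Stop early if the rows are not in the same order as the features they describe
stopifnot(identical(rownames(annotdf), probe_ids))
```

With the real object, the same idea would be to pass row.names = featureNames(z[[1]]) to data.frame() and compare against featureNames(z[[1]]) in the final check.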

And now the featureData slot of the ExpressionSet has the annotation. If you use limma for the analysis (which you probably should), the topTable output will be annotated with the data we just put in the ExpressionSet.
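
To illustrate how the featureData columns carry through to topTable, here is a minimal sketch. It does not use GSE3188: the expression values are simulated, the probe and gene names are made up, and the two-group design is hypothetical. Only the lmFit, eBayes, and topTable calls are the usual limma workflow.

```r
library(Biobase)
library(limma)

set.seed(1)

# Simulated expression matrix: 20 made-up probes x 6 made-up samples
probes <- paste0("probe", 1:20)
mat <- matrix(rnorm(20 * 6), nrow = 20,
              dimnames = list(probes, paste0("S", 1:6)))

# Annotation table shaped like annotdf, with row names matching the probes
annot <- data.frame(PROBEID = probes,
                    SYMBOL = paste0("GENE", 1:20),
                    row.names = probes, stringsAsFactors = FALSE)

eset <- ExpressionSet(assayData = mat,
                      featureData = AnnotatedDataFrame(annot))

# Hypothetical two-group design; replace with the real sample grouping
group <- factor(rep(c("A", "B"), each = 3))
design <- model.matrix(~ group)

# lmFit() copies the featureData into the fit, so topTable() returns it
fit <- eBayes(lmFit(eset, design))
tt <- topTable(fit, coef = 2)
head(tt)
```

With the real data you would pass z[[1]] to lmFit() and build the design from the sample information in pData(z[[1]]).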


Nice! Thank you very much!!

Reply written 12 weeks ago by kankejia0703