Hello
I'm new to Affymetrix analysis and have been following several tutorials to analyze microarrays.
I have to analyze a specific microarray (GSE33948 - GPL11180) that used the HT MG430PM array plate, but I can't find any annotation package for this specific plate.
Hi,
It is actually quite straightforward to do using the package AnnotationForge.
The example below uses the (extracted) file from the link you provided: "HT_MG-430_PM.na36.annot.csv".
> # set working directory to location of the csv file
> setwd("E:\\working\\dir")
>
> # install/update required (annotation) libraries.
> # The ChipDb that will be generated will link to the
> # content of the OrgDb "org.Mm.eg.db", so better be
> # up-to-date. Idem for GO.db.
>
> BiocManager::install("AnnotationForge")
> BiocManager::install(c("mouse.db0", "org.Mm.eg.db", "GO.db"))
>
> # load required library
> library("AnnotationForge")
>
> # generate ChipDb (in working directory)
> # note: you can ignore the warnings!
> makeDBPackage("MOUSECHIP_DB",
+ affy = TRUE,
+ prefix="htmg430pm",
+ fileName="HT_MG-430_PM.na36.annot.csv",
+ baseMapType="gbNRef",
+ outputDir=".",
+ version = "3.12.0", #current BioC version
+ manufacturer="Affymetrix",
+ chipName = "htmg430pm",
+ manufacturerUrl="http://www.affymetrix.com")
baseMapType is gb or gbNRef
Prepending Metadata
Creating Genes table
Appending Probes
Found 45124 Probe Accessions
Appending Gene Info
Found 37609 Gene Names
Found 37609 Gene Symbols
Appending Chromosomes
Appending RefSeq
Appending Pubmed
Appending Unigene
Appending ChrLengths
Appending 3 GO tables
Appending 3 GO ALL tables
Appending KEGG
Appending EC
Appending Chromosome Locations
Appending Pfam
Appending Prosite
Appending Alias
Appending Ensembl
Appending Uniprot
Appending MGI
Appending Metadata
simplifying probes table
dropping redundant data
Creating package in ./htmg430pm.db
There were 50 or more warnings (use warnings() to see the first 50)
>
>
> # Done, but the generated package still needs to be installed in R!
> install.packages("./htmg430pm.db", repos=NULL, type = "source")
* installing *source* package 'htmg430pm.db' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
converting help for package 'htmg430pm.db'
finding HTML links ... done
htmg430pmACCNUM html
htmg430pmALIAS2PROBE html
htmg430pmBASE html
htmg430pmCHR html
htmg430pmCHRLENGTHS html
htmg430pmCHRLOC html
htmg430pmENSEMBL html
htmg430pmENTREZID html
htmg430pmENZYME html
htmg430pmGENENAME html
htmg430pmGO html
htmg430pmMAPCOUNTS html
htmg430pmMGI html
htmg430pmORGANISM html
htmg430pmPATH html
htmg430pmPFAM html
htmg430pmPMID html
htmg430pmPROSITE html
htmg430pmREFSEQ html
htmg430pmSYMBOL html
htmg430pmUNIGENE html
htmg430pmUNIPROT html
htmg430pm_dbconn html
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (htmg430pm.db)
>
> # check
> library(htmg430pm.db)
Loading required package: org.Mm.eg.db
>
> # which keytypes can be queried in database?
> # note: the default keytype is 'PROBEID'
> keytypes(htmg430pm.db)
[1] "ACCNUM" "ALIAS" "ENSEMBL" "ENSEMBLPROT" "ENSEMBLTRANS"
[6] "ENTREZID" "ENZYME" "EVIDENCE" "EVIDENCEALL" "GENENAME"
[11] "GO" "GOALL" "IPI" "MGI" "ONTOLOGY"
[16] "ONTOLOGYALL" "PATH" "PFAM" "PMID" "PROBEID"
[21] "PROSITE" "REFSEQ" "SYMBOL" "UNIGENE" "UNIPROT"
>
> # which annotation info can be retrieved?
> columns(htmg430pm.db)
[1] "ACCNUM" "ALIAS" "ENSEMBL" "ENSEMBLPROT" "ENSEMBLTRANS"
[6] "ENTREZID" "ENZYME" "EVIDENCE" "EVIDENCEALL" "GENENAME"
[11] "GO" "GOALL" "IPI" "MGI" "ONTOLOGY"
[16] "ONTOLOGYALL" "PATH" "PFAM" "PMID" "PROBEID"
[21] "PROSITE" "REFSEQ" "SYMBOL" "UNIGENE" "UNIPROT"
>
> # Let's use it! Obtain annotation info!
> k <- keys(htmg430pm.db)
>
> res <- AnnotationDbi::select(htmg430pm.db,
+ keys=k[1:10], #limit to first 10 probesets
+ keytype="PROBEID",
+ columns=c("ENTREZID", "SYMBOL", "GO")
+ )
'select()' returned 1:many mapping between keys and columns
> res
PROBEID ENTREZID SYMBOL GO EVIDENCE ONTOLOGY
1 1415670_PM_at 54161 Copg1 GO:0000139 IBA CC
2 1415670_PM_at 54161 Copg1 GO:0000139 IDA CC
3 1415670_PM_at 54161 Copg1 GO:0000139 ISO CC
4 1415670_PM_at 54161 Copg1 GO:0005198 IEA MF
5 1415670_PM_at 54161 Copg1 GO:0005737 IEA CC
6 1415670_PM_at 54161 Copg1 GO:0005783 IBA CC
7 1415670_PM_at 54161 Copg1 GO:0005793 IBA CC
8 1415670_PM_at 54161 Copg1 GO:0005794 IBA CC
9 1415670_PM_at 54161 Copg1 GO:0005794 ISO CC
10 1415670_PM_at 54161 Copg1 GO:0006886 IEA BP
11 1415670_PM_at 54161 Copg1 GO:0006888 IBA BP
12 1415670_PM_at 54161 Copg1 GO:0006891 IBA BP
13 1415670_PM_at 54161 Copg1 GO:0009306 IBA BP
14 1415670_PM_at 54161 Copg1 GO:0015031 IEA BP
15 1415670_PM_at 54161 Copg1 GO:0016020 IBA CC
16 1415670_PM_at 54161 Copg1 GO:0016192 IEA BP
17 1415670_PM_at 54161 Copg1 GO:0030117 IEA CC
18 1415670_PM_at 54161 Copg1 GO:0030126 IBA CC
19 1415670_PM_at 54161 Copg1 GO:0031410 IEA CC
20 1415670_PM_at 54161 Copg1 GO:0051683 ISO BP
21 1415670_PM_at 54161 Copg1 GO:0072384 IBA BP
22 1415670_PM_at 54161 Copg1 GO:0072384 ISO BP
23 1415671_PM_at 11972 Atp6v0d1 GO:0005515 IPI MF
24 1415671_PM_at 11972 Atp6v0d1 GO:0005765 IBA CC
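As the message from select() says, the mapping is 1:many, so each probeset appears on several rows (one per GO term and evidence code). If one row per probeset is more convenient, the result can be collapsed afterwards. Below is a minimal sketch in base R, not part of the original workflow; it uses a few rows copied from the output above, and the names `res_small` and `collapsed` are only illustrative.

```r
# A few rows copied from the select() output above
# (res_small is an illustrative name, not an object from the session)
res_small <- data.frame(
  PROBEID  = c("1415670_PM_at", "1415670_PM_at", "1415670_PM_at", "1415671_PM_at"),
  ENTREZID = c("54161", "54161", "54161", "11972"),
  SYMBOL   = c("Copg1", "Copg1", "Copg1", "Atp6v0d1"),
  GO       = c("GO:0000139", "GO:0000139", "GO:0005198", "GO:0005515"),
  stringsAsFactors = FALSE
)

# Collapse to one row per probeset: duplicated GO IDs (same term,
# different evidence code) are dropped, the rest are joined with ";"
collapsed <- aggregate(GO ~ PROBEID + ENTREZID + SYMBOL,
                       data = res_small,
                       FUN = function(x) paste(unique(x), collapse = ";"))
collapsed
```

The same call should work on the full `res` object. If only a single value per probeset is needed (for example the gene symbol), AnnotationDbi::mapIds() with its `multiVals` argument is an alternative to collapsing after select().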