ACCNUM match zero using AnnotationDBI package


swang ▴ 120


Hi Nianhua,

I am using AnnotationDbi and it is much better than AnnBuilder (which I used previously). The package I recently built with AnnotationDbi reports zero mapped keys for ACCNUM. Here is what I did:

1. My code:

    source("http://bioconductor.org/biocLite.R")
    biocLite("mouse.db0")

    makeMOUSECHIP_DB(affy=FALSE,
                     prefix="Rosetta",
                     fileName='Rosettabasefile.txt',
                     baseMapType="eg",
                     outputDir = getwd(),
                     version="3.0.0",
                     manufacturer = "Rosetta",
                     chipName = "Mouse custom Array",
                     manufacturerUrl = "http://www.rii.com/")

2. My base file (example):

    10024408304    NA
    10024412833    78124
    10024395853    50766
    10024401691    327766
    10024407521    NA
    10024397162    192651
    10024402992    216395
    10024414142    69334
    10024410918    105203
    10024410918    105203
    10024416230    19159
    10024416583    22312

I noticed that I have duplicates in both columns.

3. The information I got is in the R session output at the end of this message.

4. Notice that RosettaACCNUM has zero mapped keys. How does this happen?

Thanks,
Shiliang

    R version 2.7.2 (2008-08-25)
    Copyright (C) 2008 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

      Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > library(Rosetta.db)
    Loading required package: AnnotationDbi
    Loading required package: Biobase
    Loading required package: tools

    Welcome to Bioconductor

      Vignettes contain introductory material. To view, type
      'openVignette()'. To cite Bioconductor, see
      'citation("Biobase")' and for packages 'citation(pkgname)'.

    Loading required package: DBI
    Loading required package: RSQLite
    > Rosetta()
    Quality control information for Rosetta:

    This package has the following mappings:

    RosettaACCNUM has 0 mapped keys (of 23574 keys)
    RosettaALIAS2PROBE has 65400 mapped keys (of 65400 keys)
    RosettaCHR has 19189 mapped keys (of 23574 keys)
    RosettaCHRLENGTHS has 21 mapped keys (of 21 keys)
    RosettaCHRLOC has 17558 mapped keys (of 23574 keys)
    RosettaENSEMBL has 18414 mapped keys (of 23574 keys)
    RosettaENSEMBL2PROBE has 17096 mapped keys (of 17096 keys)
    RosettaENTREZID has 19207 mapped keys (of 23574 keys)
    RosettaENZYME has 1922 mapped keys (of 23574 keys)
    RosettaENZYME2PROBE has 791 mapped keys (of 791 keys)
    RosettaGENENAME has 19207 mapped keys (of 23574 keys)
    RosettaGO has 16380 mapped keys (of 23574 keys)
    RosettaGO2ALLPROBES has 8447 mapped keys (of 8447 keys)
    RosettaGO2PROBE has 6152 mapped keys (of 6152 keys)
    RosettaMAP has 18233 mapped keys (of 23574 keys)
    RosettaMGI has 19053 mapped keys (of 23574 keys)
    RosettaMGI2PROBE has 17459 mapped keys (of 17459 keys)
    RosettaPATH has 4072 mapped keys (of 23574 keys)
    RosettaPATH2PROBE has 195 mapped keys (of 195 keys)
    RosettaPFAM has 18761 mapped keys (of 23574 keys)
    RosettaPMID has 19063 mapped keys (of 23574 keys)
    RosettaPMID2PROBE has 114988 mapped keys (of 114988 keys)
    RosettaPROSITE has 18761 mapped keys (of 23574 keys)
    RosettaREFSEQ has 18781 mapped keys (of 23574 keys)
    RosettaSYMBOL has 19207 mapped keys (of 23574 keys)
    RosettaUNIGENE has 18950 mapped keys (of 23574 keys)

    Additional Information about this package:

    DB schema: MOUSECHIP_DB
    DB schema version: 1.0
    Organism: Mus musculus
    Date for NCBI data: 2008-Apr2
    Date for GO data: 200803
    Date for KEGG data: 2008-Apr1
    Date for Golden Path data: 2007-Aug22
    Date for IPI data: 2008-Mar19

    > sessionInfo()
    R version 2.7.2 (2008-08-25)
    i386-pc-mingw32

    locale:
    LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

    attached base packages:
    [1] tools     stats     graphics  grDevices utils     datasets  methods   base

    other attached packages:
    [1] Rosetta.db_3.0.0    AnnotationDbi_1.2.2 RSQLite_0.7-0       DBI_0.2-4           Biobase_2.0.1
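The duplicate and NA rows mentioned in the question can be counted before building the package. The following is a minimal base R sketch, assuming the base file is tab-delimited with two columns and no header; `summarise_base_file` is a hypothetical helper written for illustration and is not part of AnnotationDbi.

```r
# Hypothetical helper (not an AnnotationDbi function): summarise a
# two-column base file of probe IDs and Entrez Gene IDs.
# Assumes tab-delimited input with no header row.
summarise_base_file <- function(path) {
  base <- read.delim(path, header = FALSE,
                     col.names = c("probe", "entrez"),
                     colClasses = "character",
                     na.strings = "NA")
  list(
    rows            = nrow(base),
    unique_probes   = length(unique(base$probe)),
    unmapped_probes = sum(is.na(base$entrez)),  # probes with no Entrez Gene ID
    duplicated_rows = sum(duplicated(base))     # exact repeats of an earlier row
  )
}

# The twelve example rows from the post, written to a temporary file.
example_rows <- c(
  "10024408304\tNA",     "10024412833\t78124",
  "10024395853\t50766",  "10024401691\t327766",
  "10024407521\tNA",     "10024397162\t192651",
  "10024402992\t216395", "10024414142\t69334",
  "10024410918\t105203", "10024410918\t105203",
  "10024416230\t19159",  "10024416583\t22312"
)
tmp <- tempfile(fileext = ".txt")
writeLines(example_rows, tmp)

base_summary <- summarise_base_file(tmp)
print(base_summary)
```

Probes whose second column is NA have nothing to join on, which is consistent with RosettaENTREZID covering fewer keys than the total in the QC report.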

GO ChipName AnnBuilder AnnotationDbi

Written 17.3 years ago by swang; updated 17.3 years ago by Marc Carlson


Marc Carlson ★ 7.2k


Hi Shiliang,

The SQLForge codebase matches things up to each other based on a range of different gene ID types, so depending on what you feed in, you can have no ACCNUMs and everything can still be fine.

The ACCNUM mapping is a special case: it is meant to store the accessions that were used to design the probes in the package, not to list every GenBank accession that maps to a particular gene. If you don't give SQLForge any accessions when you make the package, it won't put any in, because we don't want to make assumptions when creating these packages.

For the package you made, you used only an Entrez Gene mapping and did not feed in any GenBank accessions, so SQLForge has no way to know which accessions, if any, should be associated with your probes.

Marc
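Following the explanation above, one way to end up with a populated ACCNUM mapping would be to rebuild from a base file keyed on GenBank accessions, using `baseMapType = "gb"`. The sketch below is a sketch only and assumes the accessions the probes were designed from are available. The accession strings and the file name `Rosettabasefile_gb.txt` are placeholders, not real data for this array, and the build call is left commented out because it needs AnnotationDbi and mouse.db0 installed.

```r
# Placeholder probe-to-accession pairs. The accession strings are NOT real;
# replace them with the GenBank accessions the probes were designed from.
probe_to_acc <- data.frame(
  probe     = c("10024412833", "10024395853", "10024401691"),
  accession = c("ACCESSION_1", "ACCESSION_2", "ACCESSION_3"),
  stringsAsFactors = FALSE
)

# Write a two-column, tab-separated file with no header
# (the base file layout assumed here). The file name is hypothetical.
gb_file <- file.path(tempdir(), "Rosettabasefile_gb.txt")
write.table(probe_to_acc, gb_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# The build call would mirror the one in the question, changing only
# fileName and baseMapType:
# makeMOUSECHIP_DB(affy = FALSE, prefix = "Rosetta", fileName = gb_file,
#                  baseMapType = "gb", outputDir = getwd(), version = "3.0.0",
#                  manufacturer = "Rosetta", chipName = "Mouse custom Array",
#                  manufacturerUrl = "http://www.rii.com/")
```

After installing and loading the rebuilt package, calling `Rosetta()` again would show whether RosettaACCNUM now reports mapped keys.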

Written 17.3 years ago by Marc Carlson
