problem with IGH genes in org.Hs.egCHRLOC
@francois-pepin-4892
Last seen 9.7 years ago
Hi everyone,

I'm trying to get information about the IGH alleles and I'm getting strange results from org.Hs.eg.db.

For example, if we use IGHV1-3 (located on chr14, starting around 106,471,246 on the NCBI & Ensembl websites):

> library(org.Hs.eg.db)
> get("28473",org.Hs.egSYMBOL)
[1] "IGHV1-3"
> get("28473",org.Hs.egCHRLOC)
[1] NA

I guess I could use biomart for these, but I'm surprised that org.Hs.egCHRLOC would not work for these genes. A quick test with other genes shows results that are consistent with the data on the NCBI & Ensembl websites. I've also tested it on the devel version downloaded today (org.Hs.eg.db_2.5.0) with the same results.

> sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] org.Hs.eg.db_2.5.0 RSQLite_0.9-4 DBI_0.2-5
[4] AnnotationDbi_1.14.1 Biobase_2.12.2

François Pepin
Scientist
Sequenta, Inc.
Marc Carlson ★ 7.2k
@marc-carlson-2264
Last seen 7.8 years ago
United States
Hi Francois,

The CHRLOC mappings are based on data from UCSC. To find a match for "28473", the UCSC knownToLocusLink table that we download from UCSC every release has to contain an entry for "28473". Unfortunately, it doesn't. In fact, none of the NCBI-linked UCSC resources seem to have any entries for Entrez Gene "28473", which means that you are probably stuck looking for this value in another resource (one that is not based on UCSC). So I think you really do want to try biomart in this particular case.

Marc

On 10/31/2011 05:53 PM, Francois Pepin wrote:
> I guess I could use biomart for these, but I'm surprised that org.Hs.egCHRLOC would not work for these genes.
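Since a CHRLOC entry exists only when the underlying UCSC table has a row for the Entrez Gene ID, it can help to check up front which IDs are mapped at all. The following is a minimal sketch, assuming org.Hs.egCHRLOC is available as a Bimap as in the package version used in this thread (newer releases may deprecate or restructure it). The second ID, "7157" (TP53), is not from the thread and is included only as a comparison.

```r
library(org.Hs.eg.db)

# Entrez Gene IDs to check: "28473" is IGHV1-3 from this thread;
# "7157" (TP53) is an illustrative comparison, not from the thread.
ids <- c("28473", "7157")

# mappedkeys() returns only the keys that have a mapping in the Bimap,
# so membership tells us which IDs have a chromosomal location.
has_chrloc <- ids %in% mappedkeys(org.Hs.egCHRLOC)
names(has_chrloc) <- ids
print(has_chrloc)

# mget() with ifnotfound = NA returns NA instead of raising an error
# for IDs that are not keys of the map at all.
locs <- mget(ids, org.Hs.egCHRLOC, ifnotfound = NA)
print(locs)
```

IDs that come back unmapped here are the ones that need a lookup in a resource not based on UCSC.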
On Nov 1, 2011, at 15:08, Marc Carlson wrote:
> So I think you really do want to try biomart in this particular case.

Thanks Marc,

I hadn't realized UCSC was used for that, since the NCBI website also has the chromosomal position.

I'm not quite sure why UCSC doesn't like these genes. They're not regular genes, because they need to be rearranged to make functional antibodies, but they're otherwise well known and well annotated.

BiomaRt works fine in this case.

Francois
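For readers who want to try the biomaRt route mentioned above, here is a rough sketch of what such a query might look like. It is not the exact code used in this thread. It assumes network access to Ensembl and assumes the Entrez filter and attribute are named "entrezgene_id", as in recent Ensembl releases (older releases, including those from 2011, used "entrezgene"). listAttributes(mart) and listFilters(mart) show the names valid for a given release.

```r
library(biomaRt)

# Connect to the Ensembl human gene dataset (requires network access).
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Assumption: "entrezgene_id" is the Entrez filter/attribute name in the
# Ensembl release being queried; older releases used "entrezgene".
pos <- getBM(
  attributes = c("entrezgene_id", "hgnc_symbol", "chromosome_name",
                 "start_position", "end_position", "strand"),
  filters    = "entrezgene_id",
  values     = "28473",   # IGHV1-3, the gene from this thread
  mart       = mart
)
print(pos)
```

The coordinates returned depend on the genome assembly behind the Ensembl release being queried, so they may not match the position quoted in the original question.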