retrieve gene symbol/description
array chip ▴ 420
Hi,

I am trying to retrieve gene symbol/description with GenBank/EMBL IDs using biomaRt. I was successful with some IDs, but not with others. For example:

> library(biomaRt)
> ensembl = useMart("ensembl", dataset="hsapiens_gene_ensembl")
> getBM(attributes=c('embl', 'description', 'hgnc_symbol'), filters = 'embl', values = c('AF133587','AA456140'), mart = ensembl)
      embl                                                          description hgnc_symbol
1 AF133587 rhabdoid tumor deletion region gene 1 [Source:HGNC Symbol;Acc:13437]       RTDR1

As you can see, the first ID returns a gene symbol and description, but the second one does not. Why does the second ID return nothing? Are there other ways to get it to work?

Thanks

John
@sean-davis-490
On Wed, Apr 18, 2012 at 3:08 PM, array chip wrote:
> As you can see, the first ID returns gene symbol/description successfully,
> but the 2nd one did not. What is the reason for the 2nd one not working?
> Are there other ways to get it to work?

Hi, John.

This query is working as expected. The GenBank accession "AA456140" is not associated with any gene in the Ensembl gene collection. Try typing your two accessions into the Ensembl search box. You'll note that only the first is associated with a gene, while the second is simply a genomic alignment (and not associated with a gene).

Sean
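When running many accessions through a query like the one in this thread, it can help to list which IDs came back with no row at all. Here is a minimal base-R sketch, assuming the result is a data frame with an `embl` column as in the output quoted in this thread. `unmapped_ids` is a hypothetical helper, and the data frame below is typed in by hand from that output rather than fetched from Ensembl:

```r
# Hypothetical helper: report which queried accessions are absent from a
# getBM()-style result data frame.
unmapped_ids <- function(query_ids, result, id_col = "embl") {
  setdiff(unique(query_ids), unique(result[[id_col]]))
}

query_ids <- c("AF133587", "AA456140")

# Stand-in for the one-row result shown in the thread (not a live query).
result <- data.frame(
  embl = "AF133587",
  description = "rhabdoid tumor deletion region gene 1 [Source:HGNC Symbol;Acc:13437]",
  hgnc_symbol = "RTDR1",
  stringsAsFactors = FALSE
)

# Accessions that returned no row and need another mapping source.
missing_ids <- unmapped_ids(query_ids, result)
print(missing_ids)
```

With a real query, `result` would be whatever `getBM()` returns. The IDs left in `missing_ids` are the ones to follow up through another resource.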
Thank you Sean. You are right that there is no annotation of this gene in GenBank or Ensembl. But if we dig deeper, both GenBank (section "Reference sequence information" on the right panel) and EMBL ("Ensembl Genes" in the "Navigation" section) point to the gene Pannexin 3 (PANX3) for this clone, and BLAST confirms that this clone aligns 100% to PANX3.

Is there a package/function in Bioconductor that still allows me to retrieve the gene information for this ID? I have a bunch of GenBank/EMBL IDs in this situation and just want to automate the retrieval if possible.

Thanks

John
On Wed, Apr 18, 2012 at 3:50 PM, array chip wrote:
> Is there a package/function in bioconductor that still allows me to retrieve
> the gene information for this ID? I have a bunch of GenBank/EMBL IDs in this
> situation, just want to automate the retrieval if possible.

I do not know of a single resource that is complete in this regard. You could try using the AnnotationDbi package to build an annotation package if that is your use case. Otherwise, you might try using UniGene or NCBI Entrez Gene to recover more of the mappings.

Sean
Ok, thank you Sean!

John
Sean is right. If you have better information about how a set of IDs matches up with known gene IDs (Entrez Gene IDs), then you can create a custom chip package using the instructions in the SQLForge vignette found here:

http://www.bioconductor.org/packages/2.10/bioc/html/AnnotationDbi.html

Marc
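As a rough sketch of the preparation step for such a custom package, the snippet below writes a curated accession-to-Entrez-Gene table to a two-column, tab-delimited file with no header. That is the input layout I understand SQLForge to expect, so please check the vignette before relying on it. The column names, the output file name, and the `ENTREZ_ID_HERE` value are placeholders introduced for illustration. Substitute IDs you have verified yourself (for example from the BLAST hits mentioned earlier in the thread).

```r
# Hypothetical curated mapping: one row per accession.
# "ENTREZ_ID_HERE" is a placeholder, not a real Entrez Gene ID.
curated <- data.frame(
  accession = c("AA456140"),
  entrez_id = c("ENTREZ_ID_HERE"),
  stringsAsFactors = FALSE
)

# Drop rows without a gene ID and any duplicated accession/ID pairs.
keep <- !is.na(curated$entrez_id) & nzchar(curated$entrez_id)
curated <- unique(curated[keep, ])

# Write two tab-separated columns with no header and no quoting.
out_file <- file.path(tempdir(), "custom_ids.txt")  # hypothetical file name
write.table(curated, file = out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the file back to confirm its shape (rows, columns).
check <- read.delim(out_file, header = FALSE, stringsAsFactors = FALSE)
print(dim(check))
```

The resulting file would then be handed to the package-building step described in the SQLForge vignette.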