Question: retrieve gene symbol/description
5.6 years ago by array chip
array chip wrote:
Hi,

I am trying to retrieve gene symbols and descriptions for GenBank/EMBL IDs using biomaRt. It works for some IDs but not for others. For example:

    > library(biomaRt)
    > ensembl = useMart("ensembl", dataset="hsapiens_gene_ensembl")
    > getBM(attributes=c('embl', 'description', 'hgnc_symbol'), filters = 'embl', values = c('AF133587','AA456140'), mart = ensembl)
          embl                                                                  description hgnc_symbol
    1 AF133587 rhabdoid tumor deletion region gene 1 [Source:HGNC Symbol;Acc:13437]       RTDR1

As you can see, the first ID returns a gene symbol and description, but the second one does not. Why does the second one return nothing? Are there other ways to get it to work?

Thanks
John
modified 5.6 years ago by Sean Davis • written 5.6 years ago by array chip
5.6 years ago by Sean Davis, United States
Sean Davis wrote:
On Wed, Apr 18, 2012 at 3:08 PM, array chip wrote:
> As you can see, the first ID returns gene symbol/description successfully, but the 2nd one did not. What is the reason for the 2nd one not working? Is there other ways to get it to work?

Hi, John.

This query is working as expected. The GenBank accession "AA456140" is not associated with any gene in the Ensembl gene collection. Try typing your two accessions into the Ensembl search box. You'll note that only the first is associated with a gene, while the second is simply a genomic alignment (and not associated with a gene).

Sean
written 5.6 years ago by Sean Davis
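Since the query drops unmatched accessions without any message, it can help to list which inputs came back empty before looking for another source. A minimal sketch, assuming the result is a data frame with an `embl` column as in the question. `unmapped_ids` is a hypothetical helper name, not part of biomaRt, and the result table is typed in by hand from the output above so the example runs without a network connection.

```r
# Hypothetical helper (not part of biomaRt): report which query IDs are
# absent from a getBM()-style result data frame.
unmapped_ids <- function(query_ids, result, id_column = "embl") {
  setdiff(unique(query_ids), unique(result[[id_column]]))
}

# Stand-in for the getBM() output shown in the question, built by hand.
result <- data.frame(
  embl = "AF133587",
  description = "rhabdoid tumor deletion region gene 1 [Source:HGNC Symbol;Acc:13437]",
  hgnc_symbol = "RTDR1",
  stringsAsFactors = FALSE
)

ids <- c("AF133587", "AA456140")
missing_ids <- unmapped_ids(ids, result)
print(missing_ids)
```

With a live query, `result` would simply be the value returned by `getBM()`; the IDs reported here are the ones that need a different annotation source.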
Thank you, Sean. You are right that there is no annotation of this gene in GenBank or Ensembl. But if we dig a little deeper, both GenBank (section "Reference sequence information" on the right panel) and EMBL ("Ensemble Genes" in the "Navigation" section) point to the gene Pannexin 3 (PANX3) for this clone, and BLAST confirms that this clone aligns 100% to PANX3.

Is there a package or function in Bioconductor that would still let me retrieve the gene information for this ID? I have a bunch of GenBank/EMBL IDs in this situation and would like to automate the retrieval if possible.

Thanks
John
written 5.6 years ago by array chip
On Wed, Apr 18, 2012 at 3:50 PM, array chip wrote:
> Is there a package/function in bioconductor that still allows me to retrieve the gene information for this ID? I have a bunch of GenBank/EMBL IDs in this situation, just want to automate the retrieval if possible.

I do not know of a single resource that is complete in this regard. You could try using the AnnotationDbi package to build an annotation package if that is your use case. Otherwise, you might try using UniGene or NCBI Entrez Gene to recover some more of the mappings.

Sean
written 5.6 years ago by Sean Davis
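Because no single resource is complete, one way to automate this is to query a primary source first and fill the gaps from a second table. A minimal sketch, assuming both sources have already been reduced to data frames with `id` and `symbol` columns; `annotate_with_fallback` is a hypothetical helper, and the fallback table could come from Entrez Gene or UniGene lookups or from manual curation. The PANX3 entry below is taken from the BLAST result reported in this thread.

```r
# Hypothetical helper: take symbols from `primary` first, then fill the
# remaining gaps from `fallback`. Both are data frames with columns
# `id` and `symbol`.
annotate_with_fallback <- function(ids, primary, fallback) {
  symbol <- primary$symbol[match(ids, primary$id)]
  gaps <- is.na(symbol) | symbol == ""
  symbol[gaps] <- fallback$symbol[match(ids[gaps], fallback$id)]
  data.frame(id = ids, symbol = symbol, stringsAsFactors = FALSE)
}

# Primary source: the one hit returned by the Ensembl query in the question.
primary <- data.frame(id = "AF133587", symbol = "RTDR1",
                      stringsAsFactors = FALSE)

# Fallback source: hand-curated entry based on the BLAST result in this thread.
fallback <- data.frame(id = "AA456140", symbol = "PANX3",
                       stringsAsFactors = FALSE)

annotated <- annotate_with_fallback(c("AF133587", "AA456140"),
                                    primary, fallback)
print(annotated)
```

IDs found in neither table keep `NA` as their symbol, so they remain easy to spot for a further round of lookups.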
Ok, thank you Sean!

John
written 5.6 years ago by array chip
Sean is right. If you have better information about how a set of IDs matches up with known gene IDs (Entrez Gene IDs), then you can create a custom chip package using the instructions in the SQLForge vignette found here: http://www.bioconductor.org/packages/2.10/bioc/html/AnnotationDbi.html

Marc

On 04/18/2012 03:12 PM, array chip wrote:
> Ok, thank you Sean!
>
> John
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor@r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
written 5.6 years ago by Marc Carlson
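For anyone following the custom chip package route, the input is a two-column, tab-delimited file with no header: your own ID in the first column and the gene ID in the second. A rough sketch of preparing that file follows. The IDs are placeholders rather than real accessions, the `customchip` prefix and other metadata are made up for illustration, and the `makeDBPackage()` call is switched off by default because it needs AnnotationForge (where, as far as I know, the SQLForge code now lives) plus the `human.db0` package installed. Please check the vignette for the exact arguments in your Bioconductor version.

```r
# Placeholder mapping from custom IDs to Entrez Gene IDs; replace both
# columns with curated values before building anything.
id_map <- data.frame(
  probe_id  = c("ACC_0001", "ACC_0002"),  # hypothetical accession IDs
  entrez_id = c("1001", "1002"),          # hypothetical Entrez Gene IDs
  stringsAsFactors = FALSE
)

# Write the two-column, tab-delimited file with no header and no quoting.
map_file <- file.path(tempdir(), "custom_ids.txt")
write.table(id_map, file = map_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Building the package requires AnnotationForge and human.db0, so the call
# only runs when this flag is set to TRUE.
build_package <- FALSE
if (build_package) {
  AnnotationForge::makeDBPackage(
    "HUMANCHIP_DB",
    affy = FALSE,
    prefix = "customchip",       # made-up package prefix
    fileName = map_file,
    baseMapType = "eg",          # second column holds Entrez Gene IDs
    outputDir = tempdir(),
    version = "1.0.0",
    manufacturer = "custom",
    chipName = "custom ID set",
    manufacturerUrl = "http://www.example.org"
  )
}
```

Once built and installed, such a package can be queried like any other chip annotation package, which makes the lookup repeatable for a large batch of IDs.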