GenomeGraphs/biomaRt/getBM on older genome builds
Mark Robinson ★ 1.1k
@mark-robinson-2171
Last seen 10.2 years ago
Hi all.

I would like to use GenomeGraphs (specifically, a "GeneRegion" object plotted with gdPlot()) ... but I have coordinates from an older genome build. When I try to access the older Ensembl mart, I get an error in getBM().

Is this even possible? I would be delighted if it is. Of course, it does give a warning (see below) that some biomaRt functions will not work, so perhaps this is futile. Is there another alternative?

My commands:

--------
library(GenomeGraphs)

mart <- useMart(biomart="ensembl", dataset="mmusculus_gene_ensembl")
ds <- listDatasets(mart)
ds[grep("mus",ds$desc),]

# RPLP1 on mm8 (i.e. not recent) build
# this will run, but obviously won't find my gene
gr <- new("GeneRegion", chromosome = "9",
          start = 61711290, end = 61712548, strand="-",
          biomart = mart)

print(gr)

# try the archived version
ensembl46 <- useMart(biomart="ensembl_mart_46",
                     dataset="mmusculus_gene_ensembl", archive=TRUE)
ds46 <- listDatasets(ensembl46)
ds46[grep("mus",ds46$desc),]

gr46 <- new("GeneRegion", chromosome = "9",
            start = 61711290, end = 61712548, strand="-",
            biomart = ensembl46)
--------

My output:

--------
> library(GenomeGraphs)
Loading required package: biomaRt
Loading required package: grid
>
> mart=useMart(biomart="ensembl", dataset="mmusculus_gene_ensembl")
Checking attributes ... ok
Checking filters ... ok
> ds <- listDatasets(mart)
> ds[grep("mus",ds$desc),]
                  dataset                  description version
43 mmusculus_gene_ensembl Mus musculus genes (NCBIM37) NCBIM37
>
> # RPLP1 on mm8 (i.e. not recent) build
> # this will run, but obviously won't find my gene
> gr <- new("GeneRegion", chromosome = "9",
+           start = 61711290, end = 61712548, strand="-",
+           biomart = mart)
>
> print(gr)
Object of class 'GeneRegion':
Start:61709290
End:61714548
Chromosome: 9
Exons in Ensembl:
   ensembl_gene_id ensembl_transcript_id ensembl_exon_id exon_chrom_start
NA            <NA>                  <NA>            <NA>             <NA>
   exon_chrom_end rank strand biotype
NA           <NA> <NA>   <NA>    <NA>

There are 0 more rows>
>
> # try the archived version
> ensembl46=useMart(biomart="ensembl_mart_46",
+                   dataset="mmusculus_gene_ensembl", archive=TRUE)
Checking attributes ... ok
Checking filters ... ok
Warning messages:
1: In bmAttrFilt("attributes", mart) :
  biomaRt warning: looks like we're connecting to an older version of BioMart suite. Some biomaRt functions might not work.
2: In bmAttrFilt("filters", mart) :
  biomaRt warning: looks like we're connecting to an older version of BioMart suite. Some biomaRt functions might not work.
> ds46 <- listDatasets(ensembl46)
> ds46[grep("mus",ds46$desc),]
                  dataset                  description version
34 mmusculus_gene_ensembl Mus musculus genes (NCBIM36) NCBIM36
>
> gr46 <- new("GeneRegion", chromosome = "9",
+             start = 61711290, end = 61712548, strand="-",
+             biomart = ensembl46)
Error in getBM(c("ensembl_gene_id", "ensembl_transcript_id", "ensembl_exon_id", :
  Invalid attribute(s): ensembl_exon_id
Please use the function 'listAttributes' to get valid attribute names
>
> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-apple-darwin8.11.1

locale:
en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] GenomeGraphs_1.3.5 biomaRt_2.0.0

loaded via a namespace (and not attached):
[1] RCurl_0.94-1 XML_2.3-0
--------

Thanks,
Mark

------------------------------
Mark Robinson, PhD (Melb)
Epigenetics Laboratory, Garvan
Bioinformatics Division, WEHI
e: m.robinson at garvan.org.au
e: mrobinson at wehi.edu.au
p: +61 (0)3 9345 2628
f: +61 (0)3 9347 0852
Mus musculus biomaRt GenomeGraphs • 1.7k views
@steffenstatberkeleyedu-2907
Last seen 10.2 years ago
Hi Mark,

GenomeGraphs contains hard-coded filter and attribute names, which biomaRt uses to retrieve gene information. This can result in compatibility issues with old archived Ensembl databases. In addition, some internal representations have changed since Ensembl 51, I think, and this causes some extra compatibility issues. Archived Ensembl versions >=51 should work, but it looks like your gene is not in those either.

If there is any other way for you to retrieve the data in this area (e.g. the Ensembl web interface on version 46), you could fill in the NAs in the data.frame that sits in gr@ens in your example. You should then be able to plot them with gdPlot.

Cheers,
Steffen
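A minimal sketch of the workaround described above, assuming the exon annotation has been obtained by some other route (for example, copied by hand from the Ensembl 46 web interface). The column names are taken from the print(gr) output in the question. The helper make_exon_table() is hypothetical and not part of GenomeGraphs, and every ID and coordinate below is a placeholder rather than real Ensembl annotation.

```r
# Build a data.frame with the columns shown in the print(gr) output.
# make_exon_table() is a hypothetical helper, not a GenomeGraphs function.
make_exon_table <- function(gene_id, transcript_id, exon_id,
                            start, end, rank, strand, biotype) {
  # Basic sanity checks on the hand-entered coordinates.
  stopifnot(length(start) == length(end), all(start <= end))
  data.frame(
    ensembl_gene_id       = gene_id,
    ensembl_transcript_id = transcript_id,
    ensembl_exon_id       = exon_id,
    exon_chrom_start      = start,
    exon_chrom_end        = end,
    rank                  = rank,
    strand                = strand,
    biotype               = biotype,
    stringsAsFactors      = FALSE
  )
}

# PLACEHOLDER values only: replace with what Ensembl 46 reports for the region.
exons <- make_exon_table(
  gene_id       = "GENE_ID_PLACEHOLDER",
  transcript_id = "TRANSCRIPT_ID_PLACEHOLDER",
  exon_id       = c("EXON_A_PLACEHOLDER", "EXON_B_PLACEHOLDER"),
  start         = c(61711290, 61712400),
  end           = c(61711500, 61712548),
  rank          = c(2, 1),
  strand        = -1,
  biotype       = "protein_coding"
)

print(exons)

# With GenomeGraphs installed and `gr` created as in the question
# (not run here; whether gdPlot accepts the object depends on the
# installed GenomeGraphs version):
# gr@ens <- exons
# gdPlot(gr)
```

The last two lines are left commented out because they need GenomeGraphs and the gr object from the question. Before filling in the table by hand, it may also be worth calling listAttributes() on the archived mart, as the error message suggests, to see which exon attribute names that release exposes.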