biomaRt bug - shuffling getBM headers and content
@nenad-bartonicek-3293
Last seen 10.2 years ago
Hi,

I would like to report a bug in the biomaRt service. If I use the function getBM to retrieve UTRs or coding sequence together with identifiers, the columns come back in scrambled order: the content of the columns does not correspond to their headers.

library(biomaRt)
ensembl = useMart("ensembl", dataset = "mmusculus_gene_ensembl")
getBM(attributes = c("ensembl_gene_id", "ensembl_transcript_id", "coding"),
      filters = "ensembl_gene_id",
      values = c("ENSMUSG00000028661"),
      mart = ensembl)

This will result in something like:

ensembl_gene_id
ATGGCCCCCGCCCGGGCCCGCCTGTCCCCCGCTCTCTGGGTCGTCACGGCCGCGGCGGCGGCCACCTGCG
TGTCCGCGGGGCGCGGCGAAGTGAACTTGTTGGATACATCAACCATCCACGGAGACTGGGGCTGGCTCAC
GTATCCCGCTCATGGGTGGGACTCCATCAACGAGGTAGACGAGTCCTTCCGGCCCATCCACACGTACCA
....
ensembl_transcript_id coding
ENSMUSG00000028661 ENSMUST00000030420

Hope this helps somebody.

Cheers,
Nenad

Nenad Bartonicek
PhD student, Enright group
European Bioinformatics Institute
Hinxton
Cambridge CB10 1SD
United Kingdom

> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices datasets utils methods base

other attached packages:
 [1] BSgenome_1.16.5     seqinr_2.0-9        RMySQL_0.7-4
 [4] DBI_0.2-5           RColorBrewer_1.0-2  biomaRt_2.4.0
 [7] ShortRead_1.6.2     Rsamtools_1.0.5     lattice_0.18-8
[10] Biostrings_2.16.7   GenomicRanges_1.0.5 IRanges_1.6.8
[13] R.utils_1.4.3       R.oo_1.7.3          R.methodsS3_1.2.0

loaded via a namespace (and not attached):
[1] Biobase_2.8.0 grid_2.11.0 hwriter_1.2 RCurl_1.4-2 tools_2.11.0
[6] XML_3.1-0
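Until the cause is tracked down, one possible stopgap is to relabel the columns by looking at what they actually contain. The sketch below is only an illustration and not part of biomaRt: the functions guess_attribute and fix_headers are hypothetical names, the ID patterns assume mouse Ensembl identifiers as in the report above, and the toy data frame mimics the scrambled output as I read it (with the sequence truncated) rather than being a real query result.

```r
# Sketch: detect and repair a getBM()-style result whose headers do not
# match the column contents. Both function names are hypothetical.

# Guess which attribute a column holds from its values.
# Assumes mouse Ensembl IDs (ENSMUSG/ENSMUST followed by 11 digits).
guess_attribute <- function(x) {
  x <- as.character(x)
  if (all(grepl("^ENSMUSG[0-9]{11}$", x))) return("ensembl_gene_id")
  if (all(grepl("^ENSMUST[0-9]{11}$", x))) return("ensembl_transcript_id")
  if (all(grepl("^[ACGTN]+$", x))) return("coding")
  NA_character_
}

# Rename columns by content. The data frame is returned unchanged when a
# column cannot be classified or two columns receive the same guess.
fix_headers <- function(df) {
  guessed <- vapply(df, guess_attribute, character(1))
  if (any(is.na(guessed)) || anyDuplicated(guessed) > 0) {
    warning("could not classify every column; headers left unchanged")
    return(df)
  }
  names(df) <- unname(guessed)
  df
}

# Toy data mimicking the scrambled output (sequence truncated for brevity).
scrambled <- data.frame(
  ensembl_gene_id       = "ATGGCCCCCGCCCGGGCCCGCCTG",
  ensembl_transcript_id = "ENSMUSG00000028661",
  coding                = "ENSMUST00000030420",
  stringsAsFactors      = FALSE
)

fixed <- fix_headers(scrambled)
print(names(fixed))
```

Checking content rather than trusting position means the same helper is harmless on a correctly labelled result: the guessed names then simply equal the existing ones.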
Steffen ▴ 500
@steffen-2351
Last seen 10.2 years ago
Hi Nenad,

Thanks for reporting this. I'll look into what is causing this. Most likely it is a bug at the Ensembl BioMart server side and not the biomaRt package.

Cheers,
Steffen

On Thu, Jul 8, 2010 at 9:24 AM, Nenad Bartonicek <nenad@ebi.ac.uk> wrote:
> I would like to report a bug in biomaRt service.
> If I use the function getBM and I am searching for UTRs or coding
> sequence as well as identifiers, I get them in scrambled order.
> Meaning, the content of columns does not correspond to their headers