[biomaRt] getBM: Error when obtaining mim_morbid_description
@tibor-fulop-7138
Finland

I am trying to obtain "mim_morbid_description" (among other attributes) from the Homo sapiens gene Ensembl mart using the getBM function, but I am getting this error:

> getBM(attributes=c("ensembl_gene_id","mim_morbid_description"), filters=c("hgnc_symbol"), values=c("brca2","lct"),mart=genemart, verbose=TRUE)
<?xml version='1.0' encoding='UTF-8'?><!DOCTYPE Query><Query  virtualSchemaName = 'default' uniqueRows = '1' count = '0' datasetConfigVersion = '0.6' header='0' requestid= 'biomaRt'> <Dataset name = 'hsapiens_gene_ensembl'><Attribute name = 'ensembl_gene_id'/><Attribute name = 'mim_morbid_description'/><Filter name = 'hgnc_symbol' value = 'brca2,lct' /></Dataset></Query>
#################
Results from server:
[1] "ENSG00000139618\t PANCREATIC CANCER, SUSCEPTIBILITY TO, 2\n;;PNCA2\n\nENSG00000139618\t GLIOMA SUSCEPTIBILITY 3; GLM3\n\nENSG00000139618\t BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 2; BROVCA2\nBREAST CANCER, FAMILIAL, SUSCEPTIBILITY TO, 2, INCLUDED;;\nOVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 2, INCLUDED\n\nENSG00000139618\t FANCONI ANEMIA, COMPLEMENTATION GROUP D1; FANCD1\n;;FAD1\n\nENSG00000139618\t FANCONI ANEMIA, COMPLEMENTATION GROUP A; FANCA\n;;FANCONI ANEMIA; FA\nFANCONI ANEMIA, ESTREN-DAMESHEK VARIANT, INCLUDED;;\nESTREN-DAMESHEK VARIANT OF FANCONI ANEMIA, INCLUDED;;\nESTREN-DAMESHEK VARIANT OF FANCONI PANCYTOPENIA, INCLUDED\n\nENSG00000139618\t BREAST CANCER\n;;BREAST CANCER, FAMILIAL\nBREAST CANCER, FAMILIAL MALE, INCLUDED\n\nLRG_293\t PANCREATIC CANCER, SUSCEPTIBILITY TO, 2\n;;PNCA2\n\nLRG_293\t GLIOMA SUSCEPTIBILITY 3; GLM3\n\nLRG_293\t BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 2; BROVCA2\nBREAST CANCER, FAMILIAL, SUSCEPTIBILITY TO, 2, INCLUDED;;\nOVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 2, INCLUDED\n\nLRG_293\t FANCONI ANEMIA, COMPLEMENTATION GROUP D1; FANCD1\n;;FAD1\n\nLRG_293\t FANCONI ANEMIA, COMPLEMENTATION GROUP A; FANCA\n;;FANCONI ANEMIA; FA\nFANCONI ANEMIA, ESTREN-DAMESHEK VARIANT, INCLUDED;;\nESTREN-DAMESHEK VARIANT OF FANCONI ANEMIA, INCLUDED;;\nESTREN-DAMESHEK VARIANT OF FANCONI PANCYTOPENIA, INCLUDED\n\nLRG_293\t BREAST CANCER\n;;BREAST CANCER, FAMILIAL\nBREAST CANCER, FAMILIAL MALE, INCLUDED\n\nENSG00000115850\t LACTASE DEFICIENCY, CONGENITAL\n;;ALACTASIA, CONGENITAL;;\nDISACCHARIDE INTOLERANCE II\n\n"
attr(,"Content-Type")
             
"text/plain"
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  line 2 did not have 2 elements

I guess the problem is caused by the newline characters in the mim_morbid_description strings in the results from the server, but I do not know how to resolve it.

The sessionInfo() output:

R version 3.1.2 (2014-10-31)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BiocInstaller_1.16.1 biomaRt_2.22.0      

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.28.1 Biobase_2.26.0       BiocGenerics_0.12.1  bitops_1.0-6         DBI_0.3.1            GenomeInfoDb_1.2.4   IRanges_2.0.1        parallel_3.1.2       RCurl_1.95-4.5       RSQLite_1.0.0        S4Vectors_0.4.0      stats4_3.1.2        
[13] tools_3.1.2          XML_3.98-1.1   
Julian Gehring ★ 1.3k
@julian-gehring-5818

You pinpointed the problem correctly: the newline characters make the import fail. The data is returned from the server as a tab-separated file, which is then passed on to `read.table`; however, this requires the input to be table-like, with one record per line. The newlines break that.
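To illustrate the failure described above, here is a minimal sketch in base R that does not contact the server. The string `raw` is a shortened copy of the response shown in the question, and the `read.table` arguments are an assumption about how such a response might be read, not a copy of biomaRt's internal call.

```r
# Shortened copy of the server response from the question: the first
# record's description continues on a second physical line.
raw <- paste0(
  "ENSG00000139618\t PANCREATIC CANCER, SUSCEPTIBILITY TO, 2\n;;PNCA2\n\n",
  "ENSG00000139618\t GLIOMA SUSCEPTIBILITY 3; GLM3\n\n"
)

# Assumed reading step: a plain tab-separated import without quoting.
# The continuation line ";;PNCA2" has only one field, so the import fails.
result <- tryCatch(
  read.table(text = raw, sep = "\t", header = FALSE,
             quote = "", comment.char = "", stringsAsFactors = FALSE),
  error = function(e) e
)

# Show the captured error message instead of stopping the script.
print(conditionMessage(result))
```

The import should stop with an error of the same kind as the one reported in the question, because the second physical line does not contain a tab.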

As a user, there is unfortunately not much you can do right now. Here are some ideas:

Short-term solutions:

1) If your query is fairly simple, as in the example, you can go to the Ensembl BioMart site, construct your query interactively, and save the data in a file. Then you can import that file into R manually, which should boil down to a version of `read.table`.

Longer-term solutions:

1) Contact the Ensembl helpdesk and ask them to remove the newline characters. There seems to be no use in having them.

2) Contact the biomaRt maintainer and ask for a fix. Please note that this can be harder, since the input data from Ensembl behaves strangely here, in my opinion.
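If the raw text is available, for example from a file saved from the BioMart site as in the short-term suggestion, the manual import could be sketched roughly as follows. `parse_multiline_tsv` is a hypothetical helper, not part of biomaRt, and it assumes that records are separated by a blank line and that only the second column contains newlines, as in the output shown in the question. A saved file may be laid out differently, so please check before relying on it.

```r
# Hypothetical helper (not part of biomaRt): parse a two-column,
# tab-separated response whose second column contains embedded newlines.
parse_multiline_tsv <- function(raw, col_names = c("id", "description")) {
  # Assumption: records are separated by a blank line.
  records <- strsplit(raw, "\n\n", fixed = TRUE)[[1]]
  records <- records[nzchar(gsub("\\s", "", records))]

  # Split each record at its first tab only; records without a tab
  # would end up with an empty id and are not handled specially here.
  tab_pos <- regexpr("\t", records, fixed = TRUE)
  id <- substr(records, 1, tab_pos - 1)
  desc <- substr(records, tab_pos + 1, nchar(records))

  # Collapse embedded newlines into single spaces and trim the ends.
  desc <- gsub("\\s*\n\\s*", " ", desc)
  desc <- gsub("^\\s+|\\s+$", "", desc)

  out <- data.frame(id, desc, stringsAsFactors = FALSE)
  names(out) <- col_names
  out
}

# Usage with a shortened copy of the response from the question.
raw <- paste0(
  "ENSG00000139618\t GLIOMA SUSCEPTIBILITY 3; GLM3\n\n",
  "ENSG00000115850\t LACTASE DEFICIENCY, CONGENITAL\n",
  ";;ALACTASIA, CONGENITAL;;\nDISACCHARIDE INTOLERANCE II\n\n"
)
parsed <- parse_multiline_tsv(raw)
print(parsed)
```

For a saved file, the whole content could be read into one string first, for example with `paste(readLines(path), collapse = "\n")`, where `path` is a placeholder for the file name.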

Hi Tibor and Julian,

This is the format in which the data has been supplied to us. We are in contact with OMIM about this issue. Unfortunately, it is too late to address this in the upcoming e79 release, due next week, but hopefully we'll fix it in e80.

Sorry for the inconvenience.

Cheers,
Amonida

--
Amonida Zadissa
Ensembl Production team
EMBL-EBI
Wellcome Trust Genome Campus
Hinxton CB10 1SD
England

On 27/02/2015 08:57, Julian Gehring [bioc] wrote:
> You pinpointed the problem correctly, the newline characters make the import fail. [...]

Hello Amonida.

Thank you for the information. 

Best regards

Tibor

@tibor-fulop-7138
Finland

Thank you for your advice. The short-term solution is not very suitable, so I will try to contact the mentioned parties.
