I would have liked to contribute this as a pull request, but the repo is read-only. The getBM function in biomaRt.R uses read.table to build the data frame it returns after querying BioMart. getBM gives the user no way to change the quote argument passed to read.table, which is fixed at quote = "\"". When a queried BioMart field contains a single, unmatched double quote character, read.table treats everything after that character as part of one quoted string instead of parsing it as the rest of the data frame.
Here is an SSCCE:
library("biomaRt")
ensembl = useMart("ensembl")
ensembl = useDataset("btaurus_gene_ensembl",mart=ensembl)
ensembl_ids = c("ENSBTAG00000045922", "ENSBTAG00000045923")
biomart_info = getBM(attributes=c('ensembl_gene_id', 'description'), filters='ensembl_gene_id', values=ensembl_ids, mart=ensembl)
biomart_info
[1] ensembl_gene_id description
<0 rows> (or 0-length row.names)
Here, Ensembl gene ID "ENSBTAG00000045923" has the following description, which contains an unmatched double quote:
MHC class II antigen; Putative MHC class II antigen"; Uncharacterized protein [Source:UniProtKB/TrEMBL;Acc:Q70IB5]
To fix this, line 379 in biomaRt.R should be:
getBM <- function(attributes, filters = "", values = "", mart, curl = NULL, checkFilters = TRUE, verbose=FALSE, uniqueRows=TRUE, bmHeader=FALSE, quote="\""){
and line 530 in biomaRt.R should be:
result = read.table(con, sep="\t", header=bmHeader, quote = quote, comment.char = "", check.names = FALSE, stringsAsFactors=FALSE)
This will allow users to pass quote = "" when necessary to handle unmatched double quote characters.
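
The effect of the quote argument can be checked offline, without querying BioMart. The following is a minimal sketch, not the biomaRt code itself: the input is synthetic tab-separated text (the first description is a placeholder, the second reuses the description above), and parse_tsv is a hypothetical helper that only mirrors the read.table arguments from the proposed line 530. What read.table does with quoting enabled may vary with the data and R version, so that call is wrapped in tryCatch and no particular outcome is assumed.

```r
# Synthetic tab-separated input (an assumption, not real BioMart output).
# The second description contains one unmatched double quote.
tsv <- paste(
  "ENSBTAG00000045922\tplaceholder description",
  "ENSBTAG00000045923\tMHC class II antigen; Putative MHC class II antigen\"; Uncharacterized protein",
  sep = "\n"
)

# Hypothetical helper (not part of biomaRt) that mirrors the read.table
# arguments from the proposed line 530.
parse_tsv <- function(txt, quote) {
  read.table(text = txt, sep = "\t", header = FALSE, quote = quote,
             comment.char = "", check.names = FALSE, stringsAsFactors = FALSE)
}

# Quoting enabled: capture any warning or error rather than assume the outcome.
with_quotes <- tryCatch(parse_tsv(tsv, quote = "\""),
                        warning = function(w) conditionMessage(w),
                        error = function(e) conditionMessage(e))
print(with_quotes)

# Quoting disabled: the stray quote is kept as an ordinary character,
# so both rows and both columns are parsed.
no_quotes <- parse_tsv(tsv, quote = "")
print(dim(no_quotes))
print(no_quotes[2, 2])
```

With quote = "", both rows come back intact and the double quote stays inside the description field, which is the behaviour the proposed argument would expose to getBM users.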