Re : "Testing" the connection of getBM() ?
G M ▴ 20
Dear Wolfgang,

Thanks for answering so quickly. I'm not sure I understand the "vectorized queries". Actually, I submit vectors too in my getBM() command...

By the way, I used martDisconnect(ensembl) only to see whether the connection problem came from the way the program exits the connection. Apparently not, and martDisconnect() is, according to the documentation, obsolete.

I finally overcame the issue thanks to your advice: I now use try(), which is enough for me, and my script works fine. It also handles my local connection issues, which is great.

My final script, which gets the list of genes from NimbleGen chip data for a chromosome 9 region:

    library(biomaRt)

    fenetre = 255
    chr = 9
    file = "/myfile"
    db = useMart("ensembl")
    ensembl = useDataset("hsapiens_gene_ensembl", mart = db)
    filtres = c("chromosome_name", "start", "end")
    attributs = c("ensembl_gene_id", "embl", "description", "start_position", "end_position")

    # Query one interval and return the getBM() result
    query <- function(chr, start, end) {
      valeurs = list(chr, start, end)
      getBM(mart = ensembl, filters = filtres, values = valeurs, attributes = attributs)
    }

    # One query per line of the file; try() keeps the loop going after a failed connection
    genes <- function(file) {
      data <- read.table(file, sep = "\t", header = TRUE)
      results <- vector("list", nrow(data))
      for (i in seq_len(nrow(data))) {
        results[[i]] <- try(query(chr, data[i, 1], data[i, 2]))
      }
      results
    }

Quite simple...

Best wishes,
Marion

________________________________
From: Wolfgang Huber <huber@ebi.ac.uk>
Cc: bioconductor@stat.math.ethz.ch
Sent: Wednesday, 8 April 2009, 18:00:33
Subject: Re: [BioC] "Testing" the connection of getBM() ?

Dear Marion,

thanks for the feedback. Please have a look at the documentation of the biomaRt package (its vignette, and the man page of the getBM function), where you can learn that getBM supports vectorized queries, i.e. queries in which the "values" argument of getBM has multiple elements per filter. This will be much faster and more resource-efficient than the way you propose. Note that the BioMart webservice that you are connecting to is a public service that is meant to be shared by many people.
What is your rationale for calling martDisconnect(ensembl) within your query function? Also, see ?tryCatch for a general mechanism for catching errors without stopping the interpreter.

Best wishes,
Wolfgang

------------------------------------------------
Wolfgang Huber, EMBL, http://www.ebi.ac.uk/huber

G M wrote:
> Hi all,
>
> I'm currently trying to write a script that retrieves some information about several sequence intervals (about 18000 per file). My query works, but I can't run it over my whole file.
>
> Basically, I send a getBM request for each line of my file, but after a while the connection drops and I get this error message:
> Error in postForm(paste(martHost(mart), "?", sep = ""), query = xmlQuery) : couldn't connect to host
>
> Is there a way to "test" the postForm method or the connection with biomaRt from R before sending the request?
>
> Here is my script:
>
> file="/myfile"
> chr=9
> db=useMart("ensembl")
> ensembl=useDataset("hsapiens_gene_ensembl",mart=db)
> filtres = c("chromosome_name","start","end")
> attributs = c("ensembl_gene_id","embl","description","start_position","end_position")
>
> data <- read.table(file, sep="\t", header=T)
>
> query <- function (chr,start, end) {
>   valeurs = list(chr, start, end)
>   genes=getBM(mart=ensembl, filters=filtres,values=valeurs,attributes=attributs)
>   martDisconnect(ensembl)
> }
>
> genes=c()
> for(i in 1:nrow(data)){
>   genes=query(chr,data[i,1],data[i,2])
> }
>
> Thanks for your help,
>
> Marion
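As an illustration of the tryCatch mechanism mentioned in this exchange, here is a minimal sketch of a retry wrapper. The names `with_retry` and `flaky` are hypothetical and not part of biomaRt; `flaky` only simulates a connection that fails twice before succeeding, so the example runs without network access.

```r
# Hypothetical helper (not part of biomaRt): call f() and retry when it errors.
# Note: a NULL return value from f() is treated as a failure.
with_retry <- function(f, max_tries = 3, wait = 0) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(
      f(),
      error = function(e) {
        message("Attempt ", attempt, " failed: ", conditionMessage(e))
        NULL  # tells the loop that this attempt failed
      }
    )
    if (!is.null(result)) return(result)
    if (attempt < max_tries) Sys.sleep(wait)  # pause before the next attempt
  }
  NULL  # every attempt failed
}

# Stand-in for a flaky getBM() call: fails twice, then returns a data frame.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("couldn't connect to host")
  data.frame(ensembl_gene_id = "ENSG_placeholder")  # placeholder, not real data
}

res <- with_retry(flaky, max_tries = 5)
```

With the script from this thread, the call inside the loop might look like `with_retry(function() query(chr, data[i, 1], data[i, 2]), wait = 5)`, where the 5-second pause is an arbitrary choice intended to be gentle on the public server.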
Caroline ▴ 10
Hello,

> I'm not sure I understand the "vectorized queries". Actually, I submit vectors too in my getBM() command....

I thought I understood this, and now I think I don't. Can someone clarify for me?

I was under the impression that if I used a filter, the results I got back would be those for which *any* of the values of that filter were true. For example, using the filters ensembl_gene_id and affy_hg_u133_plus_2 with values list(c(ensid_1, ensid_2), c(affyid_1, affyid_2)), the results I'd get back would have either of the affy IDs and either of the ensembl IDs. The values aren't paired, so a gene with ensid_1 and affyid_2 is a valid result?

I know BioMart interprets the combination of the filters chromosome_name, start and end as a range filter, but I'm not sure what it means to have multiple values for start and end. Elements falling within any of the start or end values? A quick test suggests that it returns anything falling between the smallest start and the largest end position.

So, how do you vectorize a range filter? I've looked at both the vignette and the getBM docs and am none the wiser.

Cheers,
Cass.
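One possible way to batch interval queries, sketched under assumptions: the Ensembl mart may expose a single filter named chromosomal_region that takes strings of the form "chromosome:start:end" (check listFilters(ensembl) to confirm it is available). If so, each interval becomes one value and several can be sent per request. The helpers `make_regions`, `chunk` and `fetch_batch` are hypothetical, the intervals are made up, and whether multiple regions are combined as a union is worth verifying on a small test.

```r
# Hypothetical helper: build one "chromosome:start:end" string per interval.
# format(..., scientific = FALSE) avoids strings such as "1e+05".
make_regions <- function(chr, start, end) {
  stopifnot(length(start) == length(end))
  paste(chr,
        format(start, scientific = FALSE, trim = TRUE),
        format(end, scientific = FALSE, trim = TRUE),
        sep = ":")
}

# Hypothetical helper: split a vector into chunks of at most `size` elements.
chunk <- function(x, size) {
  split(x, ceiling(seq_along(x) / size))
}

# Made-up intervals on chromosome 9, for illustration only.
intervals <- data.frame(start = c(1000, 50000, 120000),
                        end   = c(2000, 60000, 130000))
regions <- make_regions(9, intervals$start, intervals$end)
batches <- chunk(regions, size = 2)

# Defined but not called here: needs the biomaRt package and network access.
# Assumes the "chromosomal_region" filter exists in the chosen dataset.
fetch_batch <- function(batch, mart) {
  biomaRt::getBM(attributes = c("ensembl_gene_id", "start_position", "end_position"),
                 filters = "chromosomal_region",
                 values = batch,
                 mart = mart)
}
```

With a mart object such as the `ensembl` dataset from this thread, the batches could then be fetched with `do.call(rbind, lapply(batches, fetch_batch, mart = ensembl))`. Sending a moderate number of regions per request, rather than one request per interval, is in the spirit of the advice about sharing the public service.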