Trouble querying pubmed on strings
Ken Termiso ▴ 250
@ken-termiso-1087
hi all,

i'm trying to get a function working that queries pubmed with any string and returns pubMedAbst objects corresponding to the pubmed article hits from the query string...

this is my code so far, based partly on annotate's 'query.pdf' and also on the perl script from NCBI at http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html :

    library(annotate)
    library(XML)

    query <- "trk"

    pmSrch <- function(query) {
        utils <- "http://www.ncbi.nlm.nih.gov/entrez/eutils"
        esearch <- paste(utils, "/esearch.fcgi?",
                         "report=xml&mode=text&tool=bioconductor&",
                         "db=Pubmed&retmax=1&usehistory=y&term=", query)
        esearch <- gsub(" ", "", esearch)
        cat(esearch, "\n")
        #return(esearch) # returns URL
        return(.handleXML(esearch))
    }

    pms <- pmSrch(query)
    a <- xmlRoot(pms)
    numAbst <- length(xmlChildren(a))
    numAbst

    arts <- vector("list", length = numAbst)
    absts <- rep(NA, numAbst)
    for (i in 1:numAbst) {
        arts[[i]] <- buildPubMedAbst(a[[i]])
        absts[i] <- abstText(arts[[i]])
    }

i don't know perl, and i end up with numAbst = 8 (regardless of the search string) and

    esearch = http://www.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?report=xml&mode=text&tool=bioconductor&db=Pubmed&retmax=1&usehistory=y&term=trk

but typing:

    > arts[1]
    [[1]]
    An object of class 'pubMedAbst':
    Title: No Title Provided
    PMID: No PMID Provided
    Authors: No Author Information Provided
    Journal: No Journal Provided
    Date: Month Year

simply gives me empty objects...

i'd appreciate any help anyone can give. i am not familiar with XML...

thanks in advance,
ken
Seth Falcon ★ 7.4k
@seth-falcon-992
Hi Ken,

On 4 Nov 2005, jerk_alert at hotmail.com wrote:
> hi all,
>
> i'm trying to get a function working that queries pubmed with any
> string and returns pubMedAbst objects corresponding to the pubmed
> article hits from the query string...
>
> this is my code so far, based partly from annotate's 'query.pdf' and
> also from the perl script from NCBI at
> http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

> pmSrch <- function(query)
> {
>     utils <- "http://www.ncbi.nlm.nih.gov/entrez/eutils"
>
>     esearch <- paste(utils, "/esearch.fcgi?" ,
>                      "report=xml&mode=text&tool=bioconductor&",
>                      "db=Pubmed&retmax=1&usehistory=y&term=", query)
>     esearch <- gsub(" ", "", esearch)

You might find the sep and collapse arguments to paste useful here. With sep="" there is no need for the gsub, and the query string becomes a bit easier to read.

> i don't know perl and i end up with numAbst = 8 (regardless of the
> search string) and esearch =

Look at what you get back:

    lapply(xmlChildren(xmlRoot(pms)), xmlValue)

The esearch reply contains search metadata (including WebEnv and QueryKey), not the articles themselves, which is why buildPubMedAbst gives you empty objects. If you also look at the last part of the Perl example [1], you will see that the search results have to be fetched in two steps.

Here is a very rough cut of a function to fetch results after the first query:

    pmExtract <- function(pmSrchResult) {
        dom <- xmlRoot(pmSrchResult)
        searchData <- lapply(xmlChildren(dom), xmlValue)
        webEnv <- searchData$WebEnv
        queryKey <- searchData$QueryKey

        utils <- "http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?"
        args <- c("rettype=abstract",
                  "retmode=xml",
                  "retstart=0",
                  "retmax=3",
                  "db=pubmed",
                  paste("query_key", queryKey, sep="="),
                  paste("WebEnv", webEnv, sep="="))
        args <- paste(args, collapse="&")
        utils <- paste(utils, args, sep="")
        cat(utils, "\n")
        return(.handleXML(utils))
    }

So then you would do:

    res1 <- pmSrch("trk")
    res2 <- pmExtract(res1)
    ## process res2 to extract the XML abstracts, etc

Hope that helps to get you going.

Best,

+ seth

[1] http://eutils.ncbi.nlm.nih.gov/entrez/query/static/eutils_example.pl
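To make the paste suggestion concrete, here is a minimal sketch that builds both the esearch and the efetch URL from named parameter vectors, using sep and collapse instead of gsub. The helper names eutilsUrl, esearchUrl and efetchUrl are made up for illustration (they are not part of annotate), the parameter values are copied from the code in this thread, and the use of URLencode to escape the search term is an added assumption. Nothing here touches the network; it only constructs the strings you would then pass on to be fetched and parsed.

```r
# Hypothetical helper: join a script name with name=value pairs.
# 'args' is a named vector; numeric values are coerced to character by c().
eutilsUrl <- function(script, args) {
    base <- "http://www.ncbi.nlm.nih.gov/entrez/eutils/"
    pairs <- paste(names(args), args, sep = "=")      # "db=Pubmed", ...
    paste(base, script, "?", paste(pairs, collapse = "&"), sep = "")
}

# Step 1: URL for the search; usehistory=y asks NCBI to keep the result set.
esearchUrl <- function(query, retmax = 1) {
    eutilsUrl("esearch.fcgi",
              c(report = "xml", mode = "text", tool = "bioconductor",
                db = "Pubmed", retmax = retmax, usehistory = "y",
                # assumption: escape spaces etc. rather than stripping them
                term = URLencode(query, reserved = TRUE)))
}

# Step 2: URL for fetching records, given WebEnv and QueryKey from step 1.
efetchUrl <- function(webEnv, queryKey, retstart = 0, retmax = 3) {
    eutilsUrl("efetch.fcgi",
              c(rettype = "abstract", retmode = "xml", retstart = retstart,
                retmax = retmax, db = "pubmed", query_key = queryKey,
                WebEnv = webEnv))
}

cat(esearchUrl("trk"), "\n")
cat(efetchUrl("someWebEnv", "1"), "\n")
```

One side effect of building the term separately is that a multi-word query such as "trk receptor" keeps its space (encoded as %20), whereas the gsub(" ", "", ...) approach silently glues the words together.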