anamaria
Hello,
I have two files (each with 300 lines) like this:
head 1g.txt
rs6792369
rs1414517
rs16857712
rs16857703
rs12239392
...
head 1n.txt
rs1042779
rs2360630
rs10753597
rs7549096
rs2343491
...
For each pair of rs# from those two files, I can run this command in R:
library(httr)
library(jsonlite)
library(xml2)
server <- "http://rest.ensembl.org"
ext <- "/ld/human/pairwise/rs6792369/rs1042779?population_name=1000GENOMES:phase_3:KHV"
r <- GET(paste(server, ext, sep = ""), content_type("application/json"))
stop_for_status(r)
head(fromJSON(toJSON(content(r))))
  d_prime  r2       variation1 variation2 population_name
1 0.975513 0.951626 rs6792369  rs1042779  1000GENOMES:phase_3:KHV
What I would like to do is run this command for every SNP in one list (1g.txt) against each SNP in the other list (1n.txt), where the SNP IDs are the rs# values, and write every line of the results to list.txt.
The process is illustrated in the attachment. https://imgur.com/a/adpCskU
Thanks! I was hoping that someone who uses Bioconductor packages has encountered the same problem, or knows of a Bioconductor package that does this.
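For illustration, the pairing step described above could be sketched in base R as follows. This is only a sketch: `build_ld_paths` is a hypothetical helper name, not part of any package, and it only builds the request paths for every (1g, 1n) combination without contacting Ensembl.

```r
# Hypothetical helper (not from any package): build one pairwise-LD request
# path per combination of a SNP from the first list and a SNP from the second.
build_ld_paths <- function(snps_g, snps_n,
                           population = "1000GENOMES:phase_3:KHV") {
  # expand.grid enumerates every combination of the two vectors
  pairs <- expand.grid(variation1 = snps_g, variation2 = snps_n,
                       stringsAsFactors = FALSE)
  pairs$ext <- paste0("/ld/human/pairwise/", pairs$variation1, "/",
                      pairs$variation2, "?population_name=", population)
  pairs
}

# With the real files one would use readLines("1g.txt") and readLines("1n.txt");
# two IDs from each list are used here to keep the example small.
pairs <- build_ld_paths(c("rs6792369", "rs1414517"),
                        c("rs1042779", "rs2360630"))
nrow(pairs)    # 2 x 2 = 4 combinations
pairs$ext[1]   # same path as the single-pair example above
```

With the full lists this yields 300 x 300 = 90,000 rows, one request path per row.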
All you are doing is making repeated GET requests using the Ensembl API. I suppose somebody might have coded that up in a package, but I have no idea why one might do such a thing. Do note that you are contemplating around 90,000 GET requests (300 * 300), which is an exceedingly large number, and which, if you don't space them out in time, will almost surely get your IP address banned by somebody at Ensembl. Put a different way, there has to be a way to get these data that doesn't involve something as inefficient as what you propose.
Which is why I suggested Biostars.org, which is a better venue for questions like this. I would be surprised if Kevin Blighe or ATpoint haven't already answered something very similar over there.
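To make the point about spacing requests concrete, here is a minimal sketch of a throttled loop in R. The names `run_spaced` and `fetch_ld` are hypothetical, and the one-second delay is an assumption rather than Ensembl's documented limit, so check their published rate limits before running anything like this. `fetch_ld` assumes the httr and jsonlite packages are installed.

```r
# Hypothetical runner: call fetch_one() on each request path, pausing between
# calls, and row-bind whatever data frames come back. Failed calls are skipped.
run_spaced <- function(exts, fetch_one, delay_sec = 1) {
  results <- vector("list", length(exts))
  for (i in seq_along(exts)) {
    res <- tryCatch(fetch_one(exts[i]), error = function(e) NULL)
    if (!is.null(res)) results[[i]] <- res
    if (i < length(exts)) Sys.sleep(delay_sec)  # space the requests out
  }
  do.call(rbind, Filter(is.data.frame, results))
}

# Hypothetical fetcher for one pairwise-LD path; returns NULL on an HTTP error.
fetch_ld <- function(ext, server = "https://rest.ensembl.org") {
  r <- httr::GET(paste0(server, ext), httr::content_type("application/json"))
  if (httr::http_error(r)) return(NULL)
  jsonlite::fromJSON(httr::content(r, as = "text", encoding = "UTF-8"))
}

# Usage (not run here; exts is a character vector of request paths):
# ld <- run_spaced(exts, fetch_ld, delay_sec = 1)
# write.table(ld, "list.txt", quote = FALSE, row.names = FALSE, sep = "\t")
```

Even at one request per second, 90,000 pairs would take about 25 hours, which illustrates why a bulk route is preferable to this loop.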
anamaria, have you not contacted Ensembl directly about this? They have a great support team.
Yes, I do understand this is a terrible solution. I will ask at Biostars.org.