Accession ID to Chromosome Name and Start-End
@gundala-viswanath-2872
Dear experts,

Given accession IDs such as these, how can I extract the "chromosome name", "start" and "end" position of each ID with Bioconductor?

AB002292
AB002296
AB002298
AB002303
..
EF565109
K03493
L36149
M16404
X80391
Z25470

I tried this, but it gives me a great many sets of coordinates instead of just 3 (one for each queried ID).

> library(biomaRt)
> acc <- c("AB002292", "X80391", "Z25470")
> mart <- useMart("ensembl")
> mart <- useDataset("hsapiens_gene_ensembl", mart)
> t <- getBM(attributes=c("chromosome_name", "start_position", "end_position"), values=acc, mart=mart)
> t

Please kindly advise.

- Gundala Viswanath
Jakarta - Indonesia
@sean-davis-490
On Fri, Oct 3, 2008 at 3:29 AM, Gundala Viswanath <gundalav at gmail.com> wrote:
> Dear experts,
>
> Given the accession IDs such as these:
>
> How can I extract the "chromosome name", "start" and "end" position
> of each ID, with BioConductor.
>
> AB002292
> AB002296
> AB002298
> AB002303
> ..
> EF565109
> K03493
> L36149
> M16404
> X80391
> Z25470
>
> I tried this, but it gives me so many coordinates sets instead of just
> 3 (corresponding to query).
>
>> library(biomaRt)
>> acc <- c("AB002292", "X80391", "Z25470")
>> mart <- useMart("ensembl")
>> mart <- useDataset("hsapiens_gene_ensembl", mart)
>> t <- getBM(attributes=c("chromosome_name", "start_position", "end_position"), values=acc, mart=mart)

Close. You simply need to specify a filter on the EMBL ID:

    t <- getBM(attributes=c("chromosome_name", "start_position", "end_position", "embl"), filters=c("embl"), values=acc, mart=mart)

I also added a column to the output that shows the EMBL ID.

Sean
@james-w-macdonald-5106
Hi Gundala,

Gundala Viswanath wrote:
> Dear experts,
>
> Given the accession IDs such as these:
>
> How can I extract the "chromosome name", "start" and "end" position
> of each ID, with BioConductor.
>
> AB002292
> AB002296
> AB002298
> AB002303
> ..
> EF565109
> K03493
> L36149
> M16404
> X80391
> Z25470
>
> I tried this, but it gives me so many coordinates sets instead of just
> 3 (corresponding to query).
>
>> library(biomaRt)
>> acc <- c("AB002292", "X80391", "Z25470")
>> mart <- useMart("ensembl")
>> mart <- useDataset("hsapiens_gene_ensembl", mart)
>> t <- getBM(attributes=c("chromosome_name", "start_position", "end_position"), values=acc, mart=mart)

You need a filter argument as well. In addition, I usually like to include the input ID among the attributes, so that you can line things up if certain IDs don't return a result.

    > getBM(c("embl", "chromosome_name", "start_position", "end_position"), "embl", acc, mart)
          embl chromosome_name start_position end_position
    1 AB002292               8        1759549      1894206
    2   X80391              17        3141679      3142644
    3   Z25470              18       13815543     13816861

In this case it isn't necessary, but it might be for your whole vector of IDs.

Best,

Jim

>> t
>
> Please kindly advice.
>
> - Gundala Viswanath
> Jakarta - Indonesia

--
James W. MacDonald, M.S.
Biostatistician
Hildebrandt Lab
8220D MSRB III
1150 W. Medical Center Drive
Ann Arbor MI 48109-0646
734-936-8662
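To illustrate the point about lining results up with the input, here is a minimal base R sketch. It does not query Ensembl: the data frame `res` is simply typed in from the getBM() output shown above, and "XX000000" is a made-up accession used only to stand in for an ID that returns no result. The names `res`, `aligned` and `missing_ids` are likewise just illustrative.

```r
# Result rows copied by hand from the getBM() output shown in this thread.
res <- data.frame(
  embl            = c("AB002292", "X80391", "Z25470"),
  chromosome_name = c("8", "17", "18"),
  start_position  = c(1759549, 3141679, 13815543),
  end_position    = c(1894206, 3142644, 13816861),
  stringsAsFactors = FALSE
)

# Query vector. "XX000000" is a hypothetical ID assumed to have no hit.
acc <- c("AB002292", "XX000000", "X80391", "Z25470")

# match() returns, for each query ID, the row of res that holds it (NA if
# absent). Indexing by it keeps the query order and yields an all-NA row
# for every ID without a result.
aligned <- res[match(acc, res$embl), ]
aligned$embl <- acc          # restore the ID on the all-NA rows
rownames(aligned) <- NULL

# IDs that came back empty.
missing_ids <- acc[is.na(aligned$chromosome_name)]

print(aligned)
print(missing_ids)
```

Note that match() keeps only the first row per ID. If an accession can map to several locations, something like merge(data.frame(embl = acc), res, all.x = TRUE) would keep every hit, at the cost of reordering the rows.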