Question: biomaRt query: retrieve exon locations etc
6.3 years ago, Tim Smith wrote:
Hi All,

Sorry for the naive question! I was trying to retrieve some coordinates (start and end positions) from biomaRt and I'm not sure if I'm doing things right.

Problem definition: For a gene, retrieve the 5'UTR and exon coordinates for the most common isoform of the gene.

***************
library(biomaRt)
ensembl = useMart("ensembl", dataset="hsapiens_gene_ensembl")

getAtt <- c('chromosome_name', 'start_position', 'end_position', 'strand',
            'exon_chrom_start', 'exon_chrom_end',
            '5_utr_start', '5_utr_end', '3_utr_start', '3_utr_end')

elocs <- getBM(attributes=getAtt, filters="hgnc_symbol", values="ZMYM4", mart=ensembl)
print(elocs)
**************

However, this gives me the coordinates for all isoforms, and it is difficult to pick out the coordinates for the most common isoform. How can I identify the most common isoform?

Many thanks!
biomart • 2.0k views
Answer: biomaRt query: retrieve exon locations etc
6.3 years ago, Steffen Durinck wrote:
Hi Tim,

There is no filter for getting the canonical transcript only. Something that gets close is filtering for transcripts that have a CCDS ID. Adding this filter to your query returns only one transcript:

elocs <- getBM(attributes=c(getAtt, "ensembl_transcript_id"),
               filters=c("hgnc_symbol", "with_ccds"),
               values=list("ZMYM4", TRUE),
               mart=ensembl)

Cheers,
Steffen

On Fri, Jan 25, 2013 at 12:46 PM, Tim Smith wrote:
> [...]
> However, this would give me the coordinates for all isoforms and it would
> be difficult to get the coordinates for the most common isoform. How can I
> identify the most common isoform?
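Because "ensembl_transcript_id" is included in the attributes, it is easy to check how many transcripts a query actually returned and to pull out the exon and 5'UTR rows for one of them. The following is only a minimal base-R sketch under stated assumptions: the data frame is a toy stand-in for a getBM() result, the transcript IDs ("TX_A", "TX_B") and all coordinates are made up, and it assumes that exon rows without a 5'UTR carry NA in the UTR columns.

```r
# Toy stand-in for a getBM() result: one row per exon, two transcripts.
# All IDs and coordinates are made up for illustration.
elocs <- data.frame(
  ensembl_transcript_id = c("TX_A", "TX_A", "TX_A", "TX_B", "TX_B"),
  exon_chrom_start      = c(100, 300, 500, 100, 500),
  exon_chrom_end        = c(200, 400, 600, 200, 650),
  `5_utr_start`         = c(100, NA, NA, 100, NA),
  `5_utr_end`           = c(150, NA, NA, 150, NA),
  check.names = FALSE,      # keep column names that start with a digit
  stringsAsFactors = FALSE
)

# Split the exon rows by transcript so each isoform can be inspected alone.
by_tx <- split(elocs, elocs$ensembl_transcript_id)

# Transcript IDs present in the result; length 1 means a single isoform.
tx_ids <- names(by_tx)

# Number of exon rows returned for each transcript.
n_exons <- vapply(by_tx, nrow, integer(1))

# Exons of one transcript, ordered by start coordinate.
tx <- by_tx[["TX_A"]]
tx <- tx[order(tx$exon_chrom_start), ]

# Rows that carry 5'UTR coordinates (assumed NA elsewhere).
utr5 <- tx[!is.na(tx$`5_utr_start`), c("5_utr_start", "5_utr_end")]

print(n_exons)
print(utr5)
```

With a real query result, `length(tx_ids)` shows whether a filter such as "with_ccds" left exactly one transcript; if several remain, a further choice among them still has to be made.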