Retrieve gene summary start/end in biomart query
rbenel ▴ 40
Israel

Hi,

I have a getBM query with which I want to retrieve all unique mouse genes. However, I get "duplicate" entries, because one row is returned for each start/end site that exists in the database.

The rows are technically unique because the start/end sites differ, but they share the same gene ID, external gene name, MGI symbol, etc.

Is there a way to get a "gene summary" per gene? In other words, a single row per gene with its outermost start and end positions on the chromosome?

# `mouse` is assumed to be a Mart object for the Ensembl mouse gene dataset
MouseAnnotations <- getBM(attributes = c("ensembl_gene_id", "mgi_symbol", "ensembl_transcript_id",
                                         "chromosome_name", "external_gene_name",
                                         "gene_biotype", "strand", "transcript_start", "transcript_end"),
                          mart = mouse,
                          uniqueRows = TRUE)

Thanks!

biomaRt • 1.1k views
Mike Smith ★ 6.5k
EMBL Heidelberg

I'm not able to test this comprehensively today, but I think you're seeing this behaviour because you're asking for transcript level information, so you get start and end points for all transcripts of each gene.

Maybe try asking only for gene-level attributes, e.g. something like:

# start_position / end_position are gene-level coordinates, one pair per gene
MouseAnnotations2 <- getBM(attributes = c("ensembl_gene_id", "mgi_symbol",
                                          "chromosome_name", "external_gene_name",
                                          "gene_biotype", "strand", "start_position", "end_position"),
                           mart = mouse,
                           uniqueRows = TRUE)
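
If some genes still appear on more than one row after a gene-level query, it may help to check which column actually differs between those rows. The following is only a rough sketch in base R: `res` is a small made-up data frame standing in for a real getBM() result, and the gene IDs and symbols in it are hypothetical.

```r
# Mock of a getBM() result; all values are invented for illustration.
res <- data.frame(
  ensembl_gene_id = c("G1", "G1", "G2"),
  mgi_symbol      = c("SymA", "SymB", "SymC"),
  start_position  = c(100L, 100L, 500L),
  end_position    = c(900L, 900L, 800L),
  stringsAsFactors = FALSE
)

# Gene IDs that occur on more than one row
dup_ids <- unique(res$ensembl_gene_id[duplicated(res$ensembl_gene_id)])

# Restrict to those genes and, for every other column, take the largest
# number of distinct values seen within any single gene
dup_rows <- res[res$ensembl_gene_id %in% dup_ids, ]
varying <- sapply(dup_rows[-1], function(col) {
  max(tapply(col, dup_rows$ensembl_gene_id, function(x) length(unique(x))))
})

# Columns with more than one distinct value per gene cause the extra rows
culprits <- names(varying)[varying > 1]
culprits
```

In this toy data the coordinates are identical within each gene, so any column reported in `culprits` is an annotation column rather than a position column. On a real result the same check would show whether the extra rows come from the coordinates or from something else.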

Hi Mike,

I have tried the query without the transcript attributes, and I believe there are still cases where I get multiple rows with different start/end sites for the same gene. Is there a way to ask for the longest or shortest isoform in the query itself?
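
One possible workaround, sketched here as an assumption rather than a confirmed biomaRt feature, is to keep the transcript-level query and pick the longest or shortest isoform per gene locally afterwards. The data frame `tx` below is a made-up stand-in for a getBM() result with hypothetical IDs, and "length" here means the genomic span (end minus start, introns included), not the spliced transcript length.

```r
# Mock transcript-level result; all values are invented for illustration.
tx <- data.frame(
  ensembl_gene_id       = c("G1", "G1", "G1", "G2"),
  ensembl_transcript_id = c("T1", "T2", "T3", "T4"),
  transcript_start      = c(100L, 150L, 120L, 500L),
  transcript_end        = c(900L, 400L, 950L, 800L),
  stringsAsFactors = FALSE
)

# Genomic span of each transcript (introns included)
tx$span <- tx$transcript_end - tx$transcript_start + 1L

# Longest isoform: sort by gene, then by decreasing span, keep first row per gene
tx_desc <- tx[order(tx$ensembl_gene_id, -tx$span), ]
longest <- tx_desc[!duplicated(tx_desc$ensembl_gene_id), ]

# Shortest isoform: same idea with increasing span
tx_asc <- tx[order(tx$ensembl_gene_id, tx$span), ]
shortest <- tx_asc[!duplicated(tx_asc$ensembl_gene_id), ]

# Gene-level "summary": outermost coordinates across all transcripts of a gene
gene_start <- tapply(tx$transcript_start, tx$ensembl_gene_id, min)
gene_end   <- tapply(tx$transcript_end, tx$ensembl_gene_id, max)
```

Ties are resolved by whichever transcript happens to sort first, so a real analysis might want an explicit tie-breaking rule.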
