Hi,
I have a getBM query from which I want to retrieve all unique mouse genes. However, I receive "duplicate" entries, because I get one row for each start/end site that exists in the database.
These are technically unique entries because the start/end sites differ, but they share the same gene ID, external gene name, MGI symbol, etc.
Is there a way to get a "gene summary" per gene? In other words, the maximum span of each gene on the chromosome?
MouseAnnotations <- getBM(
  attributes = c("ensembl_gene_id", "mgi_symbol", "ensembl_transcript_id",
                 "chromosome_name", "external_gene_name",
                 "gene_biotype", "strand", "transcript_start", "transcript_end"),
  mart = mouse,
  uniqueRows = TRUE)
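To make concrete what I mean by a "gene summary", here is a minimal sketch in base R that collapses transcript rows to one row per gene after the query. The toy data frame, its IDs, and its coordinates are made up (they are not real Ensembl entries), and this is only a post-processing workaround under those assumptions, not something getBM does for me.

```r
# Toy rows standing in for getBM() output; IDs and coordinates are invented.
toy <- data.frame(
  ensembl_gene_id  = c("GENE_A", "GENE_A", "GENE_B"),
  transcript_start = c(100L, 150L, 500L),
  transcript_end   = c(900L, 1200L, 800L),
  stringsAsFactors = FALSE
)

# Smallest transcript start and largest transcript end per gene.
starts <- aggregate(transcript_start ~ ensembl_gene_id, data = toy, FUN = min)
ends   <- aggregate(transcript_end ~ ensembl_gene_id, data = toy, FUN = max)

# One row per gene, covering the full span of its transcripts.
gene_summary <- merge(starts, ends, by = "ensembl_gene_id")
names(gene_summary) <- c("ensembl_gene_id", "gene_start", "gene_end")
print(gene_summary)
```

With the real result, `toy` would be replaced by MouseAnnotations, and the other gene-level columns would need to be carried along separately.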
Thanks!
Hi Mike,
I have tried the query without the transcript ID, and I believe there are cases where I will still receive distinct transcript start/end sites for the same gene. Is there a way to ask for the longest or shortest isoform during the query?
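For reference, the kind of post-query filtering I am hoping to avoid would look roughly like the sketch below. The rows and IDs are invented for illustration, and "longest" is assumed here to mean the widest genomic span (transcript_end - transcript_start + 1), which is not the same thing as the spliced transcript length.

```r
# Toy rows standing in for getBM() output; IDs and coordinates are invented.
toy <- data.frame(
  ensembl_gene_id       = c("GENE_A", "GENE_A", "GENE_B"),
  ensembl_transcript_id = c("TX_A1", "TX_A2", "TX_B1"),
  transcript_start      = c(100L, 150L, 500L),
  transcript_end        = c(900L, 1200L, 800L),
  stringsAsFactors = FALSE
)

# Genomic span of each transcript (introns included, so not spliced length).
toy$span <- toy$transcript_end - toy$transcript_start + 1L

# Sort so the widest transcript of each gene comes first,
# then keep only the first row per gene.
by_span <- toy[order(toy$ensembl_gene_id, -toy$span), ]
longest <- by_span[!duplicated(by_span$ensembl_gene_id), ]
print(longest)
```

Dropping the minus sign in the `order()` call would keep the shortest isoform per gene instead.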