How can I extract the human paralogues from a list of genes using BioMart?
Jack • 0

Hi, I am new to R and am having difficulty extracting the human gene paralogues from the Ensembl database. I have a list of genes for which I have already retrieved various other pieces of information, but I can't work out how to get the paralogues. It would be very helpful if somebody could help me out.

 

Many thanks 

BioMart paralog genes • 3.2k views
Mike Smith ★ 6.5k
EMBL Heidelberg

First we'll set it up so we are using the Ensembl human genes mart:

library(biomaRt)
human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")

Now create a vector with the names of the genes we're interested in.  In this example we'll look for paralogs of a single gene; if you've got more than one, you can provide all of them at this step.  I'm using the 'external gene name' here.  If your list of genes uses HGNC symbols, Entrez IDs, etc., you'll have to choose the corresponding field for the filters argument in the final step.
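If your identifiers are not external gene names, one way to keep track of which field to search is a small lookup table. This is only a sketch: pick_filter is a hypothetical helper, and the filter names below are assumptions based on recent Ensembl releases, so check them against the output of listFilters(human) before relying on them.

```r
# Map an ID type to the BioMart filter to search on.
# ASSUMPTION: these filter names match your Ensembl release;
# confirm them with listFilters(human).
id_filters <- c(
  gene_name = "external_gene_name",
  hgnc      = "hgnc_symbol",
  entrez    = "entrezgene_id",
  ensembl   = "ensembl_gene_id"
)

# Hypothetical helper: return the filter name for one ID type,
# stopping with a clear message if the type is not in the table.
pick_filter <- function(id_type) {
  if (!id_type %in% names(id_filters)) {
    stop("Unknown ID type: ", id_type)
  }
  unname(id_filters[id_type])
}

pick_filter("hgnc")
```

The returned string can then be passed as the filters argument of getBM in place of "external_gene_name".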

gene_id <- "ZPBP"

Then we submit the query to Ensembl BioMart.  filters is the field we want to search, values are the specific entries we want to look for, and attributes specifies the fields we want to get back.  So here we are searching for the gene called ZPBP and getting back its gene name and Ensembl ID, plus the gene name and Ensembl ID of anything that is annotated as being a paralog.  If there is more than one paralog we will get more than one row in the result.  If there are no paralogs then you'll get nothing back.

results <- getBM(attributes = c("ensembl_gene_id", 
                                "external_gene_name",
                                "hsapiens_paralog_ensembl_gene", 
                                "hsapiens_paralog_associated_gene_name"),
                 filters = "external_gene_name",
                 values = gene_id,
                 mart = human)

We can look at the result to see what was returned.

results
  ensembl_gene_id external_gene_name hsapiens_paralog_ensembl_gene hsapiens_paralog_associated_gene_name
1 ENSG00000042813               ZPBP               ENSG00000186075                                 ZPBP2
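With a longer gene list it can help to collapse the rows into one entry per query gene and to see which genes came back without any paralog. The following is a minimal base R sketch; it rebuilds the single row shown above by hand so that it runs without contacting Ensembl, and the names paralogs_by_gene and no_paralogs are just illustrative.

```r
# Rebuild the result shown above as a plain data frame so the sketch
# runs offline; in practice, use the data frame returned by getBM.
results <- data.frame(
  ensembl_gene_id = "ENSG00000042813",
  external_gene_name = "ZPBP",
  hsapiens_paralog_ensembl_gene = "ENSG00000186075",
  hsapiens_paralog_associated_gene_name = "ZPBP2",
  stringsAsFactors = FALSE
)
gene_id <- "ZPBP"

# Defensive step: keep only rows that actually carry a paralog ID.
has_paralog <- !is.na(results$hsapiens_paralog_ensembl_gene) &
  results$hsapiens_paralog_ensembl_gene != ""
paralog_rows <- results[has_paralog, ]

# One list element per query gene, each holding its paralog names.
paralogs_by_gene <- split(
  paralog_rows$hsapiens_paralog_associated_gene_name,
  paralog_rows$external_gene_name
)

# Query genes for which no paralog rows were returned.
no_paralogs <- setdiff(gene_id, names(paralogs_by_gene))

paralogs_by_gene
```

When gene_id holds several genes, paralogs_by_gene gets one named element per gene that has paralogs, and no_paralogs lists the rest.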