Trouble using biomaRt to pull pathway genes from GO
cstorer • 0
@e1a78c66

I am trying to retrieve full gene lists for a given GO id (like "GO:0060973"), limiting to human genes. The R code I am using is shown below. When run, just two genes are returned, but the AmiGO2 page for that pathway https://amigo.geneontology.org/amigo/term/GO:0060973 returns 20 or so genes.

I think this is because my query is only returning genes with a "GO class (direct)" exactly matching the queried pathway ("cell migration involved in heart development"), but I would like to also include the genes from sub-pathways shown on the AmiGO2 page (like "cell migration involved in coronary vasculogenesis").

I know I must be able to specify this in the filters somehow, but I have been searching for quite a while and just can't figure it out.

library("biomaRt")

ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

gene.data <- getBM(attributes = c("hgnc_symbol", "ensembl_transcript_id", "go_id"),
                   filters = "go", values = "GO:0060973", mart = ensembl)

unique(gene.data$hgnc_symbol)

[1] "BVES"  "NDRG4"
@james-w-macdonald-5106

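Using biomaRt, filter on "go_parent_term" instead of "go"; that also returns genes annotated to child terms of the queried GO ID: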

> mart <- useEnsembl("ensembl","hsapiens_gene_ensembl")
> getBM(c("ensembl_gene_id","hgnc_symbol", "go_id"), "go_parent_term", "GO:0060973", mart)
   ensembl_gene_id hgnc_symbol      go_id
1  ENSG00000101144        BMP7 GO:1905312
2  ENSG00000125378        BMP4 GO:1905312
3  ENSG00000078401        EDN1 GO:0003253
4  ENSG00000122691      TWIST1 GO:0003253
5  ENSG00000112276        BVES GO:0060973
6  ENSG00000019549       SNAI2 GO:0003273
7  ENSG00000075223      SEMA3C GO:1905312
8  ENSG00000151617       EDNRA GO:0003253
9  ENSG00000164107       HAND2 GO:0003253
10 ENSG00000113721      PDGFRB GO:0060981
11 ENSG00000148400      NOTCH1 GO:0003273
12 ENSG00000125848       FLRT3 GO:0003345
13 ENSG00000089225        TBX5 GO:0060980
14 ENSG00000110195       FOLR1 GO:0003147
15 ENSG00000110195       FOLR1 GO:0003253
16 ENSG00000166823       MESP1 GO:0003259
17 ENSG00000166823       MESP1 GO:0003260
18 ENSG00000166823       MESP1 GO:0060975
19 ENSG00000166341       DCHS1 GO:0003273
20 ENSG00000164093       PITX2 GO:0003253
21 ENSG00000103034       NDRG4 GO:0060973
22 ENSG00000106991         ENG GO:0003273
23 ENSG00000070831       CDC42 GO:0003253
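
The result has one row per gene/GO term pair, so FOLR1 and MESP1 appear more than once. As a minimal sketch of collapsing it to a plain gene list, the snippet below rebuilds the symbol and go_id columns by hand from the output above (the data frame name `res` is just for illustration; in practice you would assign the getBM() result to it) and then de-duplicates the symbols:

```r
# Symbol and GO term columns copied by hand from the getBM() output above.
res <- data.frame(
  hgnc_symbol = c("BMP7", "BMP4", "EDN1", "TWIST1", "BVES", "SNAI2", "SEMA3C",
                  "EDNRA", "HAND2", "PDGFRB", "NOTCH1", "FLRT3", "TBX5", "FOLR1",
                  "FOLR1", "MESP1", "MESP1", "MESP1", "DCHS1", "PITX2", "NDRG4",
                  "ENG", "CDC42"),
  go_id = c("GO:1905312", "GO:1905312", "GO:0003253", "GO:0003253", "GO:0060973",
            "GO:0003273", "GO:1905312", "GO:0003253", "GO:0003253", "GO:0060981",
            "GO:0003273", "GO:0003345", "GO:0060980", "GO:0003147", "GO:0003253",
            "GO:0003259", "GO:0003260", "GO:0060975", "GO:0003273", "GO:0003253",
            "GO:0060973", "GO:0003273", "GO:0003253"),
  stringsAsFactors = FALSE
)

# Unique gene symbols across the parent term and all of its child terms.
genes <- sort(unique(res$hgnc_symbol))
length(genes)

# Genes annotated directly to the queried term only.
direct <- res$hgnc_symbol[res$go_id == "GO:0060973"]
direct
```

The two directly annotated genes are the same two that the "go" filter returned in the question; the rest come in through child terms.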

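Or, much faster, use the org.Hs.eg.db annotation package. Its GOALL key maps a GO term to the genes annotated to that term or any of its child terms: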

> library(org.Hs.eg.db)
> select(org.Hs.eg.db, "GO:0060973", "SYMBOL", "GOALL")
'select()' returned 1:many mapping between keys and columns
        GOALL EVIDENCEALL ONTOLOGYALL SYMBOL
1  GO:0060973         IEA          BP   BMP4
2  GO:0060973         IEA          BP   BMP7
3  GO:0060973         IEA          BP  CDC42
4  GO:0060973         IEA          BP   EDN1
5  GO:0060973         IEA          BP  EDNRA
6  GO:0060973         IEA          BP    ENG
7  GO:0060973         ISS          BP  FOLR1
8  GO:0060973         ISS          BP NOTCH1
9  GO:0060973         ISS          BP PDGFRB
10 GO:0060973         ISS          BP  PITX2
11 GO:0060973         ISS          BP  SNAI2
12 GO:0060973         TAS          BP   TBX5
13 GO:0060973         IEA          BP TWIST1
14 GO:0060973         IMP          BP  DCHS1
15 GO:0060973         IEA          BP  HAND2
16 GO:0060973         ISS          BP SEMA3C
17 GO:0060973         IEA          BP   BVES
18 GO:0060973         ISS          BP  FLRT3
19 GO:0060973         ISS          BP  MESP1
20 GO:0060973         IEA          BP  NDRG4
21 GO:0060973         IDA          BP MIR1-1
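
To check how far the two approaches agree, the symbol lists can be compared with setdiff(). This is only a sketch: both vectors are copied by hand from the outputs shown above, and the names `biomart_genes` and `orgdb_genes` are illustrative, not anything either package creates.

```r
# Unique symbols from the biomaRt "go_parent_term" query above.
biomart_genes <- c("BMP7", "BMP4", "EDN1", "TWIST1", "BVES", "SNAI2", "SEMA3C",
                   "EDNRA", "HAND2", "PDGFRB", "NOTCH1", "FLRT3", "TBX5", "FOLR1",
                   "MESP1", "DCHS1", "PITX2", "NDRG4", "ENG", "CDC42")

# Symbols from the org.Hs.eg.db GOALL lookup above.
orgdb_genes <- c("BMP4", "BMP7", "CDC42", "EDN1", "EDNRA", "ENG", "FOLR1",
                 "NOTCH1", "PDGFRB", "PITX2", "SNAI2", "TBX5", "TWIST1", "DCHS1",
                 "HAND2", "SEMA3C", "BVES", "FLRT3", "MESP1", "NDRG4", "MIR1-1")

# Symbols found by one approach but not the other.
only_orgdb   <- setdiff(orgdb_genes, biomart_genes)
only_biomart <- setdiff(biomart_genes, orgdb_genes)
only_orgdb
only_biomart
```

With the outputs shown here, the lists differ only by MIR1-1, which appears in the org.Hs.eg.db result but not in the biomaRt one. The exact overlap may change with the Ensembl release and annotation package version in use.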