Question: biomaRT biotype TFs only
rbronste wrote (20 months ago):

Hi, I am trying to retrieve from biomaRt a GRanges of only TFs (transcription factors), together with their gene identifier metadata, and am not sure how to do this. I assume it means filtering on a particular biotype, but I am not sure whether that applies to genes or to specific transcripts. Thank you!

Answer: biomaRT biotype TFs only
Mike Smith (EMBL Heidelberg / de.NBI) wrote (20 months ago):

I'm not sure biotype is quite the right field to query.  Perhaps a particular GO annotation would be more appropriate.  GO:0003700 is the term for 'DNA-binding transcription factor activity'.  Being annotated with that term doesn't necessarily mean that something is a transcription factor, but I guess it would be hard for a transcription factor to fall outside that classification.

You can query for genes annotated directly with that term in biomaRt with something like:

# Connect to the Ensembl BioMart and select the human gene dataset
library(biomaRt)
ensembl_mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

results <- getBM(attributes = c("ensembl_gene_id", 
                                "external_gene_name",
                                "chromosome_name", 
                                "start_position",
                                "end_position"),
                 filters = "go",
                 values = "GO:0003700",
                 mart = ensembl_mart)

If you want genes annotated with that term, or with any term below it in the ontology, the query is slightly different and uses the "go_parent_term" filter instead:

results <- getBM(attributes = c("ensembl_gene_id", 
                                "external_gene_name",
                                "chromosome_name", 
                                "start_position",
                                "end_position"),
                 filters = "go_parent_term",
                 values = "GO:0003700",
                 mart = ensembl_mart)
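
Since the original question asked for a GRanges, here is a minimal sketch of one way to convert a getBM() result of this shape, assuming the GenomicRanges package is installed. The small data.frame stands in for the real query result so the example runs offline; its gene IDs, names, and coordinates are made up. Restricting to the primary chromosomes is an optional assumption, not something the query requires.

```r
library(GenomicRanges)

# Hypothetical stand-in for the data.frame returned by getBM();
# the IDs, names and coordinates below are invented for illustration.
results <- data.frame(
  ensembl_gene_id    = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003"),
  external_gene_name = c("GENE_A", "GENE_B", "GENE_C"),
  chromosome_name    = c("1", "X", "CHR_HSCHR6_MHC_COX_CTG1"),
  start_position     = c(1000L, 5000L, 200L),
  end_position       = c(2000L, 9000L, 800L),
  stringsAsFactors   = FALSE
)

# Optional: drop patches and alternative haplotypes, keep primary chromosomes
results <- results[results$chromosome_name %in% c(1:22, "X", "Y", "MT"), ]

# Build the GRanges; the gene ID and gene name are kept as metadata columns
tf_ranges <- makeGRangesFromDataFrame(
  results,
  seqnames.field     = "chromosome_name",
  start.field        = "start_position",
  end.field          = "end_position",
  keep.extra.columns = TRUE
)
```

With no strand column supplied, the ranges are unstranded. The gene identifiers are then available through mcols(tf_ranges).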

Very helpful, thank you!

My aim here is to filter a full HTSeq table of counts for factors with DNA-binding activity, keeping only those above some cutoff, for example just the DNA-binding factors with absolute counts above 100. Do you think importing that list into R as a data.frame and doing a simple intersect/matching operation would do it? Thanks again.

Reply written 20 months ago by rbronste

I guess by filtering one list against the other on "external_gene_name".
Reply written 20 months ago by rbronste
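
The matching step described in these replies could look roughly like the sketch below, in base R only. Everything in it is hypothetical: the count table, the gene IDs, and the column names gene_id and count. It assumes the counts are keyed by Ensembl gene ID, which is often the case for HTSeq output produced from an Ensembl GTF; if the table is keyed by gene symbol, match against "external_gene_name" instead.

```r
# Hypothetical HTSeq-style count table: one row per gene
counts <- data.frame(
  gene_id = c("ENSG00000000001", "ENSG00000000002",
              "ENSG00000000003", "ENSG00000000004"),
  count   = c(250L, 40L, 1200L, 500L),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for results$ensembl_gene_id from the getBM() query
tf_ids <- c("ENSG00000000001", "ENSG00000000002", "ENSG00000000004")

# Cutoff from the example in the thread: keep counts strictly above 100
min_count <- 100

is_tf     <- counts$gene_id %in% tf_ids   # gene is in the GO-derived list
above_cut <- counts$count > min_count     # gene passes the count cutoff
tf_counts <- counts[is_tf & above_cut, ]
```

Using %in% keeps the rows of the count table in their original order and avoids the duplicated rows that merge() can produce when a gene appears more than once in the annotation.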