Question: Use biomaRt to get pre-calculated transcription factor binding sites (TFBS) for GRCh37
3 months ago, goldberg.jm wrote:

Hi All,

I wish to use Bioconductor/biomaRt to get pre-calculated transcription factor binding site (TFBS) results for GRCh37.

To do this (for GRCh38) at the Ensembl BioMart interface (http://www.ensembl.org/biomart/martview/), under "-CHOOSE DATABASE-" I select "ENSEMBL REGULATION 92", and under "-CHOOSE DATASET-" I select "Human Binding Motifs (GRCh38.p12)".

As a convenient "Filter", I check "Multiple regions..." and enter "1:0:20000". For this test I leave "Attributes" at the default.

The result is:
http://www.ensembl.org/biomart/martview/19ba1c438ef96a5100531e91647ab2b5?VIRTUALSCHEMANAME=default&ATTRIBUTES=hsapiens_motif_feature.default.binding_motifs.binding_matrix_id|hsapiens_motif_feature.default.binding_motifs.chromosome_name|hsapiens_motif_feature.default.binding_motifs.chromosome_start|hsapiens_motif_feature.default.binding_motifs.chromosome_end|hsapiens_motif_feature.default.binding_motifs.score|hsapiens_motif_feature.default.binding_motifs.feature_type_name&FILTERS=hsapiens_motif_feature.default.filters.chromosomal_region."1:0:20000"&VISIBLEPANEL=resultspanel

Here is my specific question: how do I write a Bioconductor/biomaRt query to get me to the equivalent of "ENSEMBL REGULATION 92/Human Binding Motifs" for GRCh37?

Thank you!

Jon

modified 3 months ago by James W. MacDonald • written 3 months ago by goldberg.jm

I do know how to use biomaRt to access archived versions of "Ensembl Genes..." (see code below), just not for "Ensembl Regulation..."

library(biomaRt)
mart <- useMart(host='grch37.ensembl.org', biomart='ENSEMBL_MART_ENSEMBL', dataset='hsapiens_gene_ensembl')
written 3 months ago by goldberg.jm
3 months ago, James W. MacDonald (United States) wrote:

If you can't run the query directly on the Ensembl BioMart site (and so far as I can tell, you can't), then you can't run it with biomaRt either. biomaRt is just a programmatic interface to that site, so it won't offer anything that isn't available there.

written 3 months ago by James W. MacDonald
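
The point above, that biomaRt can only reach what the BioMart service itself exposes, can be checked from R. The following is a minimal sketch rather than a definitive recipe. It assumes the GRCh37 archive is reachable at https://grch37.ensembl.org and that the Regulation mart, where it exists, carries the internal name "ENSEMBL_MART_FUNCGEN" as it does on the main Ensembl site. It makes no claim about what the listing will contain.

```r
library(biomaRt)

# Assumed host for the GRCh37 archive; recent biomaRt releases ask for
# the https:// prefix on Ensembl hosts.
grch37_host <- "https://grch37.ensembl.org"

# List every mart this BioMart service exposes (columns: biomart, version).
marts <- listMarts(host = grch37_host)
print(marts)

# Assumption: "ENSEMBL_MART_FUNCGEN" is the internal name of the
# Ensembl Regulation mart. If it is listed, show its datasets.
if ("ENSEMBL_MART_FUNCGEN" %in% marts$biomart) {
  regulation <- useMart(biomart = "ENSEMBL_MART_FUNCGEN", host = grch37_host)
  print(listDatasets(regulation))
} else {
  message("No regulation mart is exposed on ", grch37_host)
}
```

If the binding-motif dataset does not appear in this listing, no biomaRt query can return it for that host.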

That said, you could consider using liftOver to convert the GRCh38 TFBS to the GRCh37 coordinates.

written 3 months ago by James W. MacDonald
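
The liftOver suggestion could be sketched roughly as follows. The mart name "ENSEMBL_MART_FUNCGEN" is an assumption. The dataset, attribute, and filter names are taken from the result URL in the question, so they reflect Ensembl 92 and may differ in other releases. The helper names fetch_grch38_motifs and lift_motifs_to_grch37 are hypothetical, and the chain file (for example UCSC's hg38ToHg19.over.chain, downloaded and uncompressed beforehand) is assumed to exist locally.

```r
library(biomaRt)
library(GenomicRanges)
library(GenomeInfoDb)
library(rtracklayer)

# Hypothetical helper: fetch GRCh38 binding motifs for one region.
# Dataset, attribute and filter names come from the URL in the question.
fetch_grch38_motifs <- function(region = "1:0:20000") {
  mart <- useMart(biomart = "ENSEMBL_MART_FUNCGEN",   # assumed mart name
                  dataset = "hsapiens_motif_feature")
  getBM(
    attributes = c("binding_matrix_id", "chromosome_name",
                   "chromosome_start", "chromosome_end",
                   "score", "feature_type_name"),
    filters = "chromosomal_region",
    values  = region,
    mart    = mart
  )
}

# Hypothetical helper: convert the motif table to GRanges and lift it
# from GRCh38 to GRCh37 using a local chain file.
lift_motifs_to_grch37 <- function(motifs, chain_path) {
  gr38 <- makeGRangesFromDataFrame(
    motifs,
    keep.extra.columns = TRUE,
    seqnames.field = "chromosome_name",
    start.field    = "chromosome_start",
    end.field      = "chromosome_end"
  )
  # UCSC chain files name chromosomes "chr1", Ensembl uses "1".
  seqlevelsStyle(gr38) <- "UCSC"
  chain <- import.chain(chain_path)
  # liftOver() returns a GRangesList; flatten it to a single GRanges.
  unlist(liftOver(gr38, chain))
}

# Usage (needs network access and the chain file on disk):
# motifs38 <- fetch_grch38_motifs("1:0:20000")
# motifs37 <- lift_motifs_to_grch37(motifs38, "hg38ToHg19.over.chain")
```

Features that fall in regions without a GRCh37 counterpart are dropped by liftOver, and some may be split, so the lifted set is worth comparing against the input before use.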

Thanks James. Before I try liftOver, I'll see if I can use TFBSTools (https://bioconductor.org/packages/release/bioc/html/TFBSTools.html) to calculate the sites by applying PSSMs to the sequence.

Best,

Jon

written 3 months ago by goldberg.jm
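
Scanning a sequence with a PSSM in TFBSTools might look like the sketch below. The count matrix is a toy example with consensus CACGTG, the sequence is made up, and the 80% relative-score threshold is arbitrary. A real analysis would presumably take matrices from a curated source such as JASPAR and GRCh37 sequence from a genome package, neither of which is shown here.

```r
library(Biostrings)
library(TFBSTools)

# Toy position frequency matrix (hypothetical, consensus CACGTG).
# Rows are A, C, G, T; each column sums to 20 counts.
pfm <- PFMatrix(
  ID = "toy_ebox", name = "toyEbox",
  profileMatrix = matrix(
    c( 4L, 19L,  0L,  0L,  0L,  0L,   # A
      16L,  0L, 20L,  0L,  0L,  0L,   # C
       0L,  1L,  0L, 20L,  0L, 20L,   # G
       0L,  0L,  0L,  0L, 20L,  0L),  # T
    byrow = TRUE, nrow = 4,
    dimnames = list(c("A", "C", "G", "T"))
  )
)

# Convert counts to a position weight matrix (log-odds scores).
pwm <- toPWM(pfm)

# Made-up sequence containing one CACGTG site.
subject <- DNAString("GAATTCTCACGTGTTGTAGTCTCTTGACAAAATG")

# Scan both strands, keeping hits at or above 80% of the maximum score.
sites <- searchSeq(pwm, subject, seqname = "toy_seq",
                   min.score = "80%", strand = "*")

# Tabulate hits with positions, strand and scores.
print(as(sites, "data.frame"))
```

The min.score threshold largely determines how many sites are reported, so results from this approach will not necessarily match Ensembl's pre-calculated motif features.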