Gene biotype for a given symbol
adai • 0

How can I get the gene biotype (gene_biotype) for a given gene symbol in R, please? Preferably using Ensembl (which in turn uses the Havana annotation).

Thank you.

Johannes Rainer ★ 2.0k

You can get that information using ensembldb:

Assuming you're working with human annotations and want to use Ensembl release 86:

library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Get the gene biotype for the gene SMC4
genes(edb, filter = ~ symbol == "SMC4", return.type = "DataFrame")
DataFrame with 1 row and 10 columns
          gene_id   gene_name   gene_biotype gene_seq_start gene_seq_end
      <character> <character>    <character>      <integer>    <integer>
1 ENSG00000113810        SMC4 protein_coding      160399274    160434962
     seq_name seq_strand seq_coord_system      symbol entrezid
  <character>  <integer>      <character> <character>   <list>
1           3          1       chromosome        SMC4    10051

With return.type you can specify the class of the returned object (data.frame, DataFrame or the default GRanges). You could also set columns = "gene_biotype" to return only the biotype, together with the gene ID and symbol:

genes(edb, filter = ~ symbol == "SMC4", return.type = "DataFrame", columns = "gene_biotype")
DataFrame with 1 row and 3 columns
    gene_biotype         gene_id      symbol
     <character>     <character> <character>
1 protein_coding ENSG00000113810        SMC4
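
If you need the biotype for several symbols at once, the same call can be given a filter with more than one value. The following is only a sketch: the second symbol (BCL2) and the variable names are arbitrary choices for illustration, and it assumes the same EnsDb.Hsapiens.v86 package as above. SymbolFilter comes from the AnnotationFilter package, which is attached together with ensembldb.

```r
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Example symbols; replace with your own (illustrative choice)
symbols <- c("SMC4", "BCL2")

## A filter with several values matches any of them
res <- genes(edb, filter = SymbolFilter(symbols),
             columns = c("symbol", "gene_biotype"),
             return.type = "data.frame")

## A symbol can map to more than one Ensembl gene ID, so collect
## the biotypes per symbol in a list rather than a plain vector
biotypes <- split(res$gene_biotype, res$symbol)
biotypes[symbols]
```

Symbols that are not present in the database are simply missing from res, so it is worth comparing names(biotypes) against the input.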

For other Ensembl releases and species you can get the respective EnsDb database from AnnotationHub (see e.g. the ensembldb vignette for details).
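
A rough sketch of the AnnotationHub route, assuming network access (the hub downloads and caches the database on first use). The query terms and the choice of the last listed record are assumptions for illustration; check the printed list and pick the release you actually need.

```r
library(AnnotationHub)
library(ensembldb)

ah <- AnnotationHub()

## List the EnsDb databases available for human
hits <- query(ah, c("EnsDb", "Homo sapiens"))
hits

## Assumption: take the last record in the list; inspect `hits`
## and select the entry for the release you want instead
edb_ah <- hits[[length(hits)]]

## Same query as with the EnsDb.Hsapiens.v86 package
genes(edb_ah, filter = ~ symbol == "SMC4",
      columns = "gene_biotype", return.type = "DataFrame")
```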
