Question: GenomicFeatures, makeTxDbFromGFF: select only certain types from a gtf file
francesca casalino (United States) wrote, 2.7 years ago:

Hi,

I am trying to extract the promoter regions from a GENCODE GTF file, but only for records whose "gene_type" attribute is "lincRNA" or "antisense" (the file was downloaded from the GENCODE website).

I think I need to start by using GenomicFeatures to first import the gtf file:

    library("GenomicFeatures")
    # Download the version 19 gencode gtf file from gencode website, and then load here
    txdb = makeTxDbFromGFF("file.gtf", format="gtf")

But I don't think the "gene_type" attribute from the GTF file is imported.

How can I import this attribute and select only the "lincRNA" and "antisense" records from the TxDb object?

The gtf file looks like this:

    #!genome-build GRCh37.p13
    #!genome-version GRCh37
    #!genome-date 2009-02
    #!genome-build-accession NCBI:GCA_000001405.14
    #!genebuild-last-updated 2013-09
    chr16    HAVANA    gene    53069602    53086785    .    -    .    gene_id "ENSG00000261550.1"; transcript_id "ENSG00000261550.1"; gene_type "antisense"; gene_status "NOVEL"; gene_name "RP11-467J12.4"; transcript_type "antisense"; transcript_status "NOVEL"; transcript_name "RP11-467J12.4"; level 2; havana_gene "OTTHUMG00000173186.1";
    chr5    HAVANA    transcript    10493639    10502840    .    -    .    gene_id "ENSG00000249396.1"; transcript_id "ENST00000515243.1"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "RP11-1C1.4"; transcript_type "lincRNA"; transcript_status "KNOWN"; transcript_name "RP11-1C1.4-001"; level 2; tag "basic"; havana_gene "OTTHUMG00000162051.1"; havana_transcript "OTTHUMT00000367039.1";

Thanks very much for your help.

Hervé Pagès (United States) wrote, 2.7 years ago:

Hi Francesca,

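Try using import() from the rtracklayer package instead of makeTxDbFromGFF(). The former imports everything from the file, including attributes such as "gene_type", and returns it as a GRanges object with one metadata column per attribute. The latter selectively imports only the information it needs to build a TxDb object.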
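As a minimal sketch of that suggestion (not a definitive recipe), the function below imports the GTF, keeps the transcript records whose gene_type matches, and takes a strand-aware window around each transcript start with promoters(). The name promoters_by_gene_type is a hypothetical helper made up for illustration, and the 2000/200 window sizes are assumed values, not something the question specifies.

```r
library(rtracklayer)    # import()
library(GenomicRanges)  # promoters() method for GRanges objects

# Hypothetical helper (not part of any package): promoter regions of the
# transcripts whose "gene_type" attribute is one of `types`.
promoters_by_gene_type <- function(gtf_path, types,
                                   upstream = 2000, downstream = 200) {
  # import() returns a GRanges; each GTF attribute becomes a metadata column
  gtf <- import(gtf_path, format = "gtf")

  # Keep only transcript records of the requested gene types
  keep <- gtf$type == "transcript" & gtf$gene_type %in% types
  tx <- gtf[keep]

  # Strand-aware window around each transcript start site
  prom <- promoters(tx, upstream = upstream, downstream = downstream)
  names(prom) <- tx$transcript_id
  prom
}
```

Called as promoters_by_gene_type("file.gtf", c("lincRNA", "antisense")), this should return a GRanges with one promoter range per matching transcript. Filtering on the "transcript" rows avoids counting the same locus once per gene, transcript, and exon record.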

H.
