Importing Gene Symbols with makeTxDbFromGFF
Dario Strbenac ★ 1.5k
@dario-strbenac-5916

I'd like to import the GENCODE genes GFF3 file together with its gene symbols. Calling columns on the TxDb object shows that only the gene_id field is imported, which has entries such as ENSG00000000003.14. How can I also import the gene_name column, which has values like TSPAN6?

GenomicFeatures GFF3 • 3.4k views
@valerie-obenchain-4275

The decision was made to not include a gene_name column in the TxDbs. This is explained on the ?transcripts man page:

    Finally, \code{use.names=TRUE} cannot be used when grouping
    by gene \code{by="gene"}. This is because, unlike for the
    other features, the gene ids are external ids (e.g. Entrez
    Gene or Ensembl ids) so the db doesn't have a \code{"gene_name"}
    column for storing alternate gene names.

You can convert from Entrez or Ensembl ids to gene name with an OrgDb package:

> columns(org.Hs.eg.db)
 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "MAP"          "OMIM"        
[16] "ONTOLOGY"     "ONTOLOGYALL"  "PATH"         "PFAM"         "PMID"        
[21] "PROSITE"      "REFSEQ"       "SYMBOL"       "UCSCKG"       "UNIGENE"     
[26] "UNIPROT" 
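
As a rough sketch of that conversion (not something spelled out in this thread), the lookup could be done with mapIds from AnnotationDbi. It assumes the gene IDs are versioned Ensembl IDs like the one in the question, and that the version suffix has to be stripped because the OrgDb stores unversioned IDs. The variable names are only illustrative.

```r
library(AnnotationDbi)   # provides mapIds()
library(org.Hs.eg.db)    # human OrgDb annotation package

# Example gene_id as stored in the TxDb (value taken from the question).
gene_ids <- c("ENSG00000000003.14")

# Assumption: everything from the first "." onward is a version suffix that
# the OrgDb does not know about, so it is removed before the lookup.
ensembl <- sub("\\..*$", "", gene_ids)

# Look up one symbol per key; keys without a match come back as NA.
symbols <- mapIds(org.Hs.eg.db,
                  keys = ensembl,
                  keytype = "ENSEMBL",
                  column = "SYMBOL",
                  multiVals = "first")
```

Note that multiVals = "first" silently picks one symbol when an Ensembl ID maps to several, and that genes missing from the OrgDb get NA, so the result will not necessarily match the gene_name values in the GENCODE file.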

Valerie


It's obviously unfortunate that the user starts with the gene names but then is forced to discard them and get them back. It would be nice if TxDb supported arbitrary meta columns, e.g., through a NoSQL or EAV approach.


Another solution is to read the file twice, once with makeTxDbFromGFF and a second time with import.gff3. Then, the matching of IDs is easy and doesn't miss those newly discovered genes which GENCODE has annotated with symbols.
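
A minimal sketch of this two-pass approach might look like the following. The file path is a placeholder, and the code assumes that the "gene" rows of the GFF3 carry both gene_id and gene_name attributes and that the TxDb gene IDs are identical to the gene_id values in the file. In recent Bioconductor releases makeTxDbFromGFF is provided by the txdbmaker package rather than GenomicFeatures.

```r
library(GenomicFeatures)  # makeTxDbFromGFF() (in txdbmaker on newer Bioconductor), genes()
library(rtracklayer)      # import.gff3()

# Hypothetical path to a locally downloaded GENCODE annotation file.
gff_file <- "gencode.annotation.gff3.gz"

# Pass 1: build the TxDb; only gene_id is kept for genes.
txdb <- makeTxDbFromGFF(gff_file, format = "gff3")

# Pass 2: read the same file as a GRanges with all attribute columns.
gff <- import.gff3(gff_file)

# Keep the gene rows and build a gene_id -> gene_name lookup vector.
gene_rows <- gff[gff$type == "gene"]
symbol_by_id <- setNames(gene_rows$gene_name, gene_rows$gene_id)

# Attach the symbols to the genes extracted from the TxDb.
# IDs absent from the lookup get NA.
gr <- genes(txdb)
gr$gene_name <- unname(symbol_by_id[gr$gene_id])
```

Because the symbols come from the same file as the TxDb, no version stripping is needed and genes that are missing from an OrgDb package still get their GENCODE symbol.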
