Making a tree from NCBI taxonomy data for ggtree
map2085 ▴ 40

I would like to use ggtree to annotate & visualize part of the taxonomy tree provided by NCBI.

The NCBI taxonomy information is provided in a simple 2-column format:  "<taxid>   <parent taxid>"

(see file "nodes.dmp"  within the TAR  ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz )

 

However, all of the data import options for ggtree seem geared toward the output of specialized software. How can I import this "custom" node-association information from the NCBI taxonomy data and make a suitable tree for ggtree?

Guangchuang Yu ★ 1.2k

ggtree works not only with software output but also with standard tree formats, including newick, nexus, nhx, phylip, and jplace.

It also works with S3/S4 tree objects generated by other packages, including phylo and multiPhylo (ape), obkData (OutbreakTools) and phyloseq (phyloseq).

NCBI taxonomy data is not a tree file, so we need to build a tree (newick or nexus) from the data before it can be visualized using ggtree.

To my knowledge, no R package can do this task. After searching GitHub, I found a Python script, https://github.com/bendmorris/taxiphy, that can read the NCBI taxonomy and generate a tree file in newick format. The output file can then be parsed in R and visualized using ggtree.
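
For readers who only need a small part of the taxonomy, the conversion itself can be sketched with the Python standard library. This is a minimal illustration, not taxiphy's implementation: the function names `read_parents` and `to_newick` are made up for this example, and it assumes the `nodes.dmp` layout in which fields are separated by `|` and the first two fields are the taxid and its parent taxid. The sample lines are hand-written in that layout rather than copied from the real file.

```python
import io
from collections import defaultdict


def read_parents(handle):
    """Map each taxid to its parent taxid from nodes.dmp-style lines."""
    parents = {}
    for line in handle:
        # Fields are '|'-separated and padded with tabs; only the
        # first two (taxid, parent taxid) are needed here.
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2 or not fields[0]:
            continue
        parents[fields[0]] = fields[1]
    return parents


def to_newick(parents, root):
    """Return a Newick string for the subtree rooted at `root`."""
    children = defaultdict(list)
    for taxid, parent in parents.items():
        if taxid != parent:  # the root node lists itself as its parent
            children[parent].append(taxid)

    def render(node):
        kids = sorted(children.get(node, []), key=int)
        if not kids:
            return node  # leaf: just the taxid label
        return "(" + ",".join(render(k) for k in kids) + ")" + node

    return render(root) + ";"


# Hand-written sample lines in the nodes.dmp layout (illustrative only).
sample = (
    "1\t|\t1\t|\tno rank\t|\n"
    "131567\t|\t1\t|\tno rank\t|\n"
    "2\t|\t131567\t|\tsuperkingdom\t|\n"
    "2759\t|\t131567\t|\tsuperkingdom\t|\n"
)

parents = read_parents(io.StringIO(sample))
print(to_newick(parents, "131567"))  # prints (2,2759)131567;
```

With the real file, `open("nodes.dmp")` would replace the `io.StringIO` handle, and the returned string can be written to a file for reading in R. Taxonomies contain many nodes with a single child, so it is worth checking whether the Newick reader you use accepts such nodes or needs them collapsed first.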

 

 

 

saladi • 0

For future visitors, consider using ETE's NCBI Taxonomy function to write the tree with your desired nodes and then later draw/manipulate it using ggtree.

http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html
