Interpreting the phylogenomic profiles plot
Alexandros • 0
@cc06bc01
Last seen 16 months ago
Belgium

Hello,

I am trying to interpret the figure plotted by syntenet (https://github.com/almeidasilvaf/syntenet) that visualizes the phylogenomic profiles for the different species and groups. What exactly do the different colors, which correspond to increasing numbers (e.g. red -> 3+), represent? In addition, what does the clustering tree on top of the figure represent? Are these groups of clusters that are shared between species? Some more information in the documentation would help.

Best, Alex

syntenet synteny Network • 2.5k views
@fabricio_almeidasilva-14890
Last seen 8 months ago
Ghent, Belgium

Hi, Alexandros,

The heatmap is based on a matrix whose element m_ij is the number of genes in synteny cluster j that are found in species i. That is what the colors represent. Thus, if a synteny cluster (column) is red for a particular species (row), that species contains 3 or more genes in this synteny cluster.
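As a rough illustration of how such a count matrix can be built and capped at "3 or more", here is a minimal base R sketch. The data frame `clusters`, its column names, and the toy gene IDs are all hypothetical, and this is not syntenet's internal code; it only mirrors the definition of the matrix given above.

```r
# Toy input (hypothetical): one row per gene, with its species and synteny cluster
clusters <- data.frame(
    gene    = c("a1", "a2", "a3", "a4", "a5", "b1", "b2", "c1"),
    species = c("spA", "spA", "spA", "spA", "spA", "spB", "spB", "spC"),
    cluster = c("cl1", "cl1", "cl1", "cl1", "cl2", "cl1", "cl2", "cl2")
)

# m[i, j] = number of genes of species i found in synteny cluster j
m <- unclass(table(clusters$species, clusters$cluster))

# Cap counts at 3 so that the highest color class means "3 or more"
binned <- pmin(m, 3)
print(binned)
```

In this toy example, species spA has four genes in cluster cl1, so it falls in the top class (3+), while spC has no gene in cl1 and gets a 0.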

The clustering on the columns is a simple hierarchical clustering (Ward's clustering on a matrix of Euclidean distances) to group similar synteny clusters together.
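For readers who want to see what this kind of column clustering looks like in practice, here is a minimal base R sketch on a toy matrix. The matrix values are made up, and the choice of "ward.D2" (rather than "ward.D") is an assumption on my side for illustration, not a statement about what the package uses internally.

```r
# Toy profile matrix (hypothetical): rows are species, columns are synteny clusters
m <- matrix(
    c(1, 1, 3, 3,
      1, 1, 2, 3,
      0, 1, 2, 2),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("spA", "spB", "spC"), c("cl1", "cl2", "cl3", "cl4"))
)

# The columns are the objects to cluster, so transpose before computing distances
d <- dist(t(m), method = "euclidean")

# Ward's hierarchical clustering; "ward.D2" is an assumption for this sketch
hc <- hclust(d, method = "ward.D2")

# Cut the tree into two groups of synteny clusters with similar profiles
groups <- cutree(hc, k = 2)
print(groups)
```

Here cl1 and cl2 (mostly single-copy) end up in one group and cl3 and cl4 (multi-copy in all species) in the other, which is the kind of grouping the tree on top of the heatmap displays.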

You might want to read the manuscript describing syntenet (https://www.biorxiv.org/content/10.1101/2022.08.16.504079v2) and previous papers on synteny networks (for instance, see figure 5 of https://www.pnas.org/doi/full/10.1073/pnas.1801757116, and https://www.nature.com/articles/s41467-021-23665-0).

Best, Fabricio


Thanks Fabricio,

It is then somewhat strange that my heatmap includes a lot of light blue areas (as does the vignette example), which according to your description correspond to only 1 gene per synteny cluster for that species. How can a region be part of a synteny cluster based on only one gene? Or maybe I did not fully understand your description.

Best Alex


Hi, Alex

I believe you have not fully understood the concept of synteny clusters. I think Figure 1 of https://www.nature.com/articles/s41467-021-23665-0 will be helpful.

Best, Fabricio


Hi Fabricio,

Many thanks, I think I get it now. Initially, I thought that a cluster includes all the genes of a single synteny block across the analyzed genomes, but as far as I understand it now, each cluster is composed of separate anchor gene pairs from the defined blocks, extended over several genomes. Am I correct?

What does it mean, then, if you have more than 1 gene per cluster? Could it be whole-genome duplication, or even uncollapsed haplotypes in the genome assembly? If tandem arrays are collapsed and are not part of the individual clusters, I assume they cannot account for the pattern of multiple genes per cluster.

Thanks Alex


Hi, Alex.

Your interpretation is correct. Multiple genes per cluster typically indicate polyploidization events (whole-genome duplication, triplication, etc.), but uncollapsed haplotypes are also a reasonable explanation.

As an example, in Figure 1B of the syntenet manuscript (https://www.biorxiv.org/content/10.1101/2022.08.16.504079v2), you can see some species with orange (2) and red (3+) for most of the synteny clusters, which is explained by the fact that these species are recent polyploids.
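If you want a quick number to go with the visual impression, one option is to compute, for each species, the fraction of its synteny clusters that contain 2 or more genes. The base R sketch below does this on a made-up matrix; the matrix, the species names, and the summary itself are only an illustration, not something syntenet reports, and a high value is suggestive of polyploidy (or uncollapsed haplotypes) rather than proof of it.

```r
# Toy profile matrix (hypothetical): rows are species, columns are synteny clusters
m <- matrix(
    c(1, 1, 1, 0,
      2, 3, 2, 1,
      1, 0, 1, 1),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("spA", "spB", "spC"), c("cl1", "cl2", "cl3", "cl4"))
)

# Clusters in which the species is present at all
n_present <- rowSums(m > 0)

# Clusters in which the species has 2 or more genes
n_multi <- rowSums(m >= 2)

# Fraction of a species' clusters that are multi-copy
frac_multi <- n_multi / n_present
print(frac_multi)
```

In this toy example, spB has 2 or more genes in three of its four clusters, the pattern you would expect from a recent polyploid, while spA and spC are single-copy throughout.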

Best, Fabricio

@64e84694
Last seen 10 months ago
Brazil

With my files, after running check_input() and creating pdata <- process_input(), the error below occurs when I run the following:

    seq_2 <- process_input(proteomes1, annotation1)$seq[1:4]
    if (diamond_is_installed()) {
        blast_list <- run_diamond(seq_2)
    }

The resulting blast_list comes from your example data, data(annotation) and data(proteomes), not from my files.

Error: The sequences are expected to be proteins but only contain DNA letters. Use the option --ignore-warnings to proceed.
Error: The sequences are expected to be proteins but only contain DNA letters. Use the option --ignore-warnings to proceed.
Error: The sequences are expected to be proteins but only contain DNA letters. Use the option --ignore-warnings to proceed.
Error: The sequences are expected to be proteins but only contain DNA letters. Use the option --ignore-warnings to proceed.
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Creinhardtii_281
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Creinhardtii_281
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Creinhardtii_281
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Creinhardtii_281
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Czofingiensis_461
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Czofingiensis_461
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Czofingiensis_461
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Czofingiensis_461
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Dsalina_325_v1
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Dsalina_325_v1
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Dsalina_325_v1
No such file or directory
Error: Error opening file C:\Users\lwand\AppData\Local\Temp\Rtmp0cCOSb/diamond/dbs/Dsalina_325_v1


Hi, lwanderson8c

I noticed that your question is not related to this post. When asking questions, please open a new post with your own question rather than posting it as a comment on another question. Also make sure to include all the steps you took to get to the error; otherwise, people won't be able to reproduce your problem and help you.

The error message you pasted here seems self-explanatory. syntenet works with protein sequences, and you seem to have passed DNA sequences as input. Please read the "Importing data to the R session" section of the vignette (https://bioconductor.org/packages/release/bioc/vignettes/syntenet/inst/doc/syntenet.html#4_Importing_data_to_the_R_session), where you can find all the details about what the input data should look like and how to import data from FASTA and GFF files.
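As a quick sanity check before running the pipeline, you could test whether your sequences look like nucleotides. The base R sketch below uses a hypothetical helper, `looks_like_dna()`, and two made-up sequences as plain character strings; it is not part of syntenet. Note that it is only a heuristic, since A, C, G, T and N are also valid amino acid letters.

```r
# Hypothetical helper: TRUE if a sequence contains only nucleotide letters
looks_like_dna <- function(x) {
    grepl("^[ACGTUN]+$", toupper(x))
}

# Toy sequences (made up): one protein, one coding DNA sequence
seqs <- c(
    prot = "MKTAYIAKQRQISFVKSHFSRQ",
    cds  = "ATGAAAACCGCGTATATTGCG"
)

# Flag every sequence that looks like DNA rather than protein
flags <- vapply(seqs, looks_like_dna, logical(1))
print(flags)
```

If most of your sequences are flagged, you probably loaded a CDS or transcript FASTA file instead of the protein FASTA file.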

Best,

Fabricio
