Hi,
I am using the IdTaxa function to classify some 16S sequences (ASVs that I got out of dada2). I am using the SILVA training set that is available on the DECIPHER website ("SILVA SSU r138.2 (modified) (299 MB)").
In the result I see things like "Pseudomonas_2" and "Acinetobacter_2", but if I search in SILVA these "_2" suffixes are not there. I am wondering why these suffixes are there? Is this documented somewhere?
I am considering trimming them because they are causing me some trouble in my pipeline.
Thanks!

This has to do with the fact that some taxonomies reuse names at the same rank level. Common examples in the SILVA taxonomy are "Incertae Sedis" and "uncultured". If only considering a single rank level (e.g., genus), these names would incorrectly collapse to the same taxon when they belong to different taxonomic lineages. Such taxa are appended with a unique number to avoid this issue.
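To make the idea concrete, here is a minimal Python sketch of this kind of disambiguation. It is only an illustration under my own assumptions, not the code used to build the DECIPHER training set: the function name `disambiguate` and the numbering order (first lineage seen keeps the plain name) are hypothetical.

```python
def disambiguate(lineages):
    """Give a name reused at the same rank under a different parent a numeric suffix.

    lineages: list of lists of taxon names, ordered from highest to lowest rank.
    Illustrative sketch only; not the DECIPHER implementation.
    """
    labels = {}  # (rank, name) -> {labelled parent path: label}
    result = []
    for lineage in lineages:
        path = []
        for rank, name in enumerate(lineage):
            by_parent = labels.setdefault((rank, name), {})
            parent = tuple(path)
            if parent not in by_parent:
                n = len(by_parent) + 1
                # The first parent keeps the plain name; later ones get _2, _3, ...
                by_parent[parent] = name if n == 1 else f"{name}_{n}"
            path.append(by_parent[parent])
        result.append(path)
    return result


lineages = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
    ["Bacteria", "Pseudomonadota", "Gammaproteobacteria"],
]
for labelled in disambiguate(lineages):
    print(";".join(labelled))
```

In this sketch a suffix on a parent also changes the parent path of everything beneath it, so a suffix can cascade down to lower ranks such as genus.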
The latest SILVA (v138.2) taxonomy contains many similarly named taxa (and likely sequences) belonging to different taxonomic lineages. I am guessing these are due to major taxonomic reassignments that left behind some sequences with the previous taxonomic name. For example, "Proteobacteria" and "Pseudomonadota" both contain "Gammaproteobacteria".
Some redundancies could also be due to alternative spellings (e.g., "Cyanobacteriia" or "Halobacterota") that result in bifurcating the same taxonomic lineage. SILVA has a lot of these, unfortunately.
You can trim the appended numbers as you suggested, but you will need to be careful with collapsing taxa from distinct taxonomic lineages.
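As a rough sketch of what careful trimming could look like, the Python below strips a trailing `_<number>` from each name but keeps the full lineage as the key when counting. The helper names (`trim_suffix`, `trim_lineage`, `count_by_lineage`) and the example lineages are made up for illustration, and the regular expression is an assumption about the suffix format: it would also strip a numeric suffix that is genuinely part of a taxon's name, if the taxonomy contains any.

```python
import re
from collections import Counter

# Assumed suffix format: an underscore followed by digits at the end of the name.
SUFFIX = re.compile(r"_\d+$")


def trim_suffix(name):
    """Remove a trailing _<number> from a single taxon name."""
    return SUFFIX.sub("", name)


def trim_lineage(lineage):
    """Trim every rank but keep the whole lineage, so distinct lineages stay distinct."""
    return tuple(trim_suffix(name) for name in lineage)


def count_by_lineage(assignments):
    """Count assignments keyed on the full trimmed lineage rather than on genus alone."""
    return Counter(trim_lineage(lineage) for lineage in assignments)


assignments = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Pseudomonas"],
    ["Bacteria", "Pseudomonadota", "Gammaproteobacteria_2", "Pseudomonas_2"],
]
for lineage, n in count_by_lineage(assignments).items():
    print(";".join(lineage), n)
```

Keyed on the full lineage, the two hypothetical assignments stay separate after trimming because they differ at the phylum level; keyed on the last name alone, both would collapse into a single "Pseudomonas" entry.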
Thanks for the quick reply!
If I understand correctly, I think if I keep track of the full lineage, I shouldn't have any risk of collapsing distinct taxa. So I will go for trimming the numbers.
I did some more investigation and although Proteobacteria is still listed in the SILVA taxonomy, it's not actually used in any of the lineages of the fasta file headers. So it sort of feels to me like you are "contaminating" a lot of your database with these suffixes for no reason.
You are correct. Thank you for noticing some of this 'pollution' with "_2" suffixes was unnecessary.
I posted an updated SILVA classifier on the DECIPHER website (here). Note that some "_2" suffixes are still required for the reasons mentioned above.
I hope that helps. Please let me know how it goes.