Dear MSstats Team / Community,
I am encountering an issue when using MSstatsPTM to process Arabidopsis thaliana phosphoproteomics data exported from Spectronaut.
Setup:
Search Database: TAIR10 (e.g., Protein IDs in format AT1G51370.1)
Software: Spectronaut (standard export)
Function: SpectronauttoMSstatsPTMFormat
The Problem: The SpectronauttoMSstatsPTMFormat function seems to be optimized for UniProt-style accessions. When using TAIR10 gene model identifiers (e.g., AT1G51370.1), I run into the following problem:
FASTA Parsing Error: When providing a TAIR10 FASTA file, I receive the following error message: "Error in [.data.table(data, , c(protein_name_col, unmod_pep_col, mod_pep_col, ... : column not found: [Start]"
This error appears to be a downstream effect of failed FASTA matching. My guess is that the ID parsing logic (regex) does not match the TAIR10 headers, so the site-location step cannot run and the required [Start] column is never generated.
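To illustrate the suspected mismatch, here is a minimal Python sketch. It is not MSstatsPTM code: the pattern below is only the general UniProt header convention (>db|ACCESSION|ENTRY_NAME), and both example headers are illustrative placeholders, not entries copied from a real FASTA file.

```python
import re

# Assumption: a UniProt-style header looks like ">sp|ACCESSION|ENTRY_NAME ...".
# This pattern is illustrative and is NOT taken from the MSstatsPTM source.
UNIPROT_HEADER = re.compile(r"^>(?:sp|tr)\|([^|\s]+)\|")


def uniprot_accession(header):
    """Return the accession if the header is UniProt-style, else None."""
    match = UNIPROT_HEADER.match(header)
    return match.group(1) if match else None


# Illustrative headers (placeholders).
uniprot_header = ">sp|P12345|EXAMPLE_ARATH illustrative UniProt-style header"
tair_header = ">AT1G51370.1 | Symbols: | illustrative TAIR10-style header"

print(uniprot_accession(uniprot_header))
print(uniprot_accession(tair_header))
```

If the package relies on a pattern of this kind, a TAIR10 header would yield no accession at all, which would be consistent with the missing [Start] column.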
Data Example:
PG.ProteinGroups / PG.ProteinAccessions: AT1G51370.1 AT1G01050;AT1G01050.2 AT1G01050.1;AT1G01050.2
EG.ModifiedSequence: LNLSTDHDDDNDDGDDGDDDQFAK
Question: Is there a robust way to make MSstatsPTM compatible with TAIR10/Ensembl-style FASTA files?
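One workaround I could try in the meantime is rewriting the FASTA headers into a UniProt-like form before passing the file to MSstatsPTM. The Python sketch below is only that, a sketch: the function names, the "tr" database tag and the "_ARATH" suffix are my own choices, and I have not confirmed that MSstatsPTM accepts headers rewritten this way.

```python
import re

# TAIR10 gene model IDs: AT + chromosome (1-5, C, M) + G + 5 digits + ".N"
TAIR_ID = re.compile(r"^>(AT[1-5CM]G\d{5}\.\d+)")


def to_uniprot_style(header, db="tr", suffix="ARATH"):
    """Rewrite a TAIR10 header as '>db|ID|ID_SUFFIX'.

    Hypothetical helper: the target layout is an assumption about what
    the downstream parser expects. Non-TAIR headers are returned unchanged.
    """
    match = TAIR_ID.match(header)
    if match is None:
        return header
    accession = match.group(1)
    return f">{db}|{accession}|{accession}_{suffix}"


def rewrite_fasta(lines):
    """Yield FASTA lines with header lines rewritten, sequences untouched."""
    for line in lines:
        line = line.rstrip("\n")
        yield to_uniprot_style(line) if line.startswith(">") else line


# Illustrative input; the sequence is a placeholder, not the real protein.
example = [">AT1G51370.1 | Symbols: | illustrative header\n", "MKT\n"]
for out_line in rewrite_fasta(example):
    print(out_line)
```

The accession in the rewritten header stays identical to the ID in the Spectronaut report (AT1G51370.1), so the protein names should still match. Protein groups with several IDs (e.g. AT1G01050.1;AT1G01050.2) are not handled here.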
Thank you for your time and assistance!
Best regards,
