Hello,
I ran MS-GF+ on the mzML file, using the .faa file as the reference database, and produced the .mzid file. All files are available at the following link: https://figshare.com/s/b65fc594da19f0f9347f . The raw data was acquired on a Bruker Impact II (Q-TOF MS/MS) and converted to mzML with ProteoWizard msconvert.
However, when I tried to use the function countIdentifications, an error popped up. I am not sure whether something went wrong in reducePSMs, joinSpectraData, or countIdentifications.
My code is below. MS-GF+ was run in the Mac terminal; the rest was run in RStudio.
java -version
java version "1.8.0_421"
java -Xmx3500M -jar MSGFPlus.jar -s "HH090441864_2024-09-09_2648.mzML" -d "protein.faa" -tda 0 -inst 2 -e 1 -maxMissedCleavages 2 -o HH090441864_2.mzid
library(Spectra)
library(PSMatch)
sp<-Spectra("HH090441864_2024-09-09_2648.mzML")
id<-readPSMs("HH090441864_2.mzid")
id_filtered<-filterPSMs(id)
id_f_r<-reducePSMs(id_filtered)
sp_ident <- joinSpectraData(sp, id_f_r,
by.x = "spectrumId",
by.y = "spectrumID")
sp_ident_count <- countIdentifications(sp_ident)
#Error: BiocParallel errors
# 1 remote errors, element index: 1
# 0 unevaluated and other errors
# first remote error:
#Error in as.vector(x, mode): coercing an AtomicList object to an atomic vector is supported only for
#objects with top-level elements of length <= 1
Thank you in advance for any thoughts!
Thank you very much for the detailed explanation!! Sorry for the delayed response. Indeed, there were multiple MS2 scans matching more than one peptide, and strangely all peptides were ranked 1.
In fact, all of the matches in my original data were ranked 1, and when I ran filterPSMs, no PSMs were removed for having rank > 1. I also didn't use a decoy database; could this be the reason there's a problem with the ranking?
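For anyone who wants to check for the same pattern, here is a minimal sketch of the kind of check described above. It uses a small made-up data frame instead of my real data, and it assumes the column names spectrumID, sequence and rank that I see in the readPSMs output; the scan IDs and peptide sequences are purely illustrative.

```r
# Toy stand-in for as.data.frame(id_filtered); all values are invented
# for illustration and are NOT taken from the real .mzid file.
psms <- data.frame(
  spectrumID = c("scan=29", "scan=30", "scan=30", "scan=31"),
  sequence   = c("PEPTIDEK", "AAAK", "AAIK", "LLGK"),
  rank       = c(1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Distribution of ranks: in this toy table every PSM is rank 1.
rank_table <- table(psms$rank)

# Number of distinct peptide sequences matched to each spectrum.
n_pep <- tapply(psms$sequence, psms$spectrumID,
                function(s) length(unique(s)))

# Spectra matched to more than one distinct peptide.
multi <- names(n_pep)[as.vector(n_pep > 1)]

print(rank_table)
print(multi)
```

On the real data, the same lines could be run on as.data.frame(id_filtered) instead of the toy table; any spectrum listed in multi has several rank-1 peptides.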
I used MSGFPlus to generate my mzID file and also posted the question there (https://github.com/MSGFPlus/msgfplus/issues/156#issuecomment-2380245837), but I haven't got a definite solution yet (I am currently trying to create a new column for redundant sequences).
I used scan 30 as an example: