Missing SCANS in writeMgfData
@plbaldoni
Melbourne, Australia

Hi there,

I am trying to filter some mgf files with MSnbase. After reading in an mgf file with readMgfData(), I noticed that the output of writeMgfData() has SCANS=NA for every spectrum, even though the input file has a SCANS value for each one.

I believe this is because the acquisition number of each spectrum is missing (see below).

Shouldn't the scan field be preserved in this case? What would be the best way to preserve the scan value from the input mgf in the output?

> library(MSnbase)
> 
> system('head ~/Downloads/example.mgf')
BEGIN IONS
TITLE=RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 107857 Feature#: 1601 RtApex: 132.65 Precursor: 16
INSTRUMENT=ESI-QUAD-TOF
PEPMASS=414.712204
CHARGE=2+
RTINSECONDS=132.65
SCANS=1601
204.1344 76 
209.1016 921 
209.1097 174 
> mgf <- readMgfData(filename = '~/Downloads/example.mgf')
> 
> mgf@featureData@data
                                                                                                                      TITLE   INSTRUMENT     PEPMASS
X1         RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 107857 Feature#: 1601 RtApex: 132.65 Precursor: 16 ESI-QUAD-TOF  414.712204
X2            RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 478 Feature#: 1602 RtApex: 121.43 Precursor: 16 ESI-QUAD-TOF  414.865694
X3   RawFile: YHE010_02_Slot1-1_1_2987 Charge: 2 FeatureIntensity: 59363 Feature#: 4574001 RtApex: 1956.81 Precursor: 45740 ESI-QUAD-TOF  813.431709
X4 RawFile: YHE010_02_Slot1-1_1_2987 Charge: 3 FeatureIntensity: 1022373 Feature#: 4976001 RtApex: 2009.84 Precursor: 49760 ESI-QUAD-TOF 1270.314164
   CHARGE RTINSECONDS   SCANS
X1     2+      132.65    1601
X2     2+      121.43    1602
X3     2+     1956.81 4574001
X4     3+     2009.84 4976001
> acquisitionNum(mgf)
X1 X2 X3 X4 
NA NA NA NA 
> 
> writeMgfData(object = mgf,con = '~/Downloads/output.mgf')
> system('head ~/Downloads/output.mgf')
COM=Experimentexported by MSnbase on Tue Nov  3 11:51:16 2020
BEGIN IONS
SCANS=NA
TITLE=msLevel 2; retentionTime 132.65; scanNum NA; scanIndex 1601; precMz 414.7122; precCharge 2
RTINSECONDS=132.65
PEPMASS=414.7122
CHARGE=2+
204.1344 76
209.1016 921
209.1097 174
> 
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] MSnbase_2.16.0      ProtGenerics_1.22.0 S4Vectors_0.28.0    mzR_2.24.0          Rcpp_1.0.5          Biobase_2.50.0      BiocGenerics_0.36.0

loaded via a namespace (and not attached):
 [1] BiocManager_1.30.10   pillar_1.4.6          compiler_4.0.2        plyr_1.8.6            iterators_1.0.13      zlibbioc_1.36.0      
 [7] tools_4.0.2           digest_0.6.27         ncdf4_1.17            MALDIquant_1.19.3     lifecycle_0.2.0       tibble_3.0.4         
[13] preprocessCore_1.52.0 gtable_0.3.0          lattice_0.20-41       pkgconfig_2.0.3       rlang_0.4.8           foreach_1.5.1        
[19] rstudioapi_0.11       dplyr_1.0.2           IRanges_2.24.0        generics_0.1.0        vctrs_0.3.4           grid_4.0.2           
[25] tidyselect_1.1.0      glue_1.4.2            impute_1.64.0         R6_2.5.0              XML_3.99-0.5          BiocParallel_1.24.0  
[31] limma_3.46.0          ggplot2_3.3.2         purrr_0.3.4           magrittr_1.5          scales_1.1.1          pcaMethods_1.82.0    
[37] codetools_0.2-16      ellipsis_0.3.1        MASS_7.3-53           mzID_1.28.0           colorspace_1.4-1      affy_1.68.0          
[43] doParallel_1.0.16     munsell_0.5.0         vsn_3.58.0            crayon_1.3.4          affyio_1.60.0
@laurent-gatto-5645
Belgium

I didn't get a notification for this message, so I answered your question in the GitHub issue first.

In a nutshell, SCANS and the acquisition number are considered to be different variables. Unfortunately, there is no direct way to replace the acquisition number with the SCANS feature variable. More details are in the GitHub issue.

As a side note, it's better to avoid mgf@featureData@data and use fData(mgf) instead.
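
Since there is no direct way to do this inside MSnbase, one possible workaround is to patch the written file afterwards. The sketch below is base R only and is not part of MSnbase: patch_mgf_scans() is a hypothetical helper name, and it assumes that writeMgfData() writes exactly one SCANS line per spectrum, in the same order as the rows of fData(mgf). Please check that assumption on your own files before relying on it.

```r
# Hypothetical helper (not an MSnbase function): overwrite every SCANS line
# of an MGF file, given as a character vector of lines, with the values in
# `scans`. Assumes one SCANS line per spectrum, in the same order as `scans`.
patch_mgf_scans <- function(lines, scans) {
  idx <- grep("^SCANS=", lines)
  stopifnot(length(idx) == length(scans))
  # Avoid scientific notation for numeric input (e.g. 1e+05);
  # as.character() also handles character and factor columns.
  scans <- if (is.numeric(scans)) {
    format(scans, scientific = FALSE, trim = TRUE)
  } else {
    as.character(scans)
  }
  lines[idx] <- paste0("SCANS=", scans)
  lines
}

# Small in-memory demonstration with two spectra.
demo <- c("BEGIN IONS", "SCANS=NA", "PEPMASS=414.7122", "204.1344 76", "END IONS",
          "BEGIN IONS", "SCANS=NA", "PEPMASS=414.8657", "209.1016 921", "END IONS")
patched <- patch_mgf_scans(demo, c(1601, 1602))
print(patched[grep("^SCANS=", patched)])
```

With the objects from the question, this would be applied as writeLines(patch_mgf_scans(readLines(out), fData(mgf)$SCANS), out), where out is the path passed to writeMgfData(). Because fData(mgf) is filtered together with the spectra, the SCANS values should stay aligned after subsetting, but that is worth verifying on a small example.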
