qcmetrics package and longitudinal qc data
Eralp Dogu • 0
@eralp-dogu-12835
Last seen 6.8 years ago
TURKEY/Mugla/Mugla Sitki Kocman Univers…

I am developing an R package (MSstatsQC) for longitudinal assessment of QC performance. I have been trying to convert MS files, via qcmetrics objects, into a csv file that is compatible with my input format, so that our package can be used by MSnbase users. My data format is "time, peptide sequence, annotations, metric1, metric2, ...". Is there a way to pull out all of this information? I was able to create a csv file with retention time and some other metrics, but had no success with the other variables.

@laurent-gatto-5645
Last seen 3 days ago
Belgium

Your question is not clear to me. Could you clarify what you are trying to do and/or answer the following questions?

  • Are you replicating the example for raw MS data shown in the qcmetrics vignette, or are you creating your own QcMetric objects?
  • What is your input format, and what operability with MSnbase are you referring to?
  • Are you trying to use the qcmetrics package to generate quality reports, or are you interested in creating a table with these variables to analyse independently?

 

Eralp Dogu • 0
@eralp-dogu-12835
Last seen 6.8 years ago
TURKEY/Mugla/Mugla Sitki Kocman Univers…

Hi Laurent,

Thanks a lot for your interest! We are developing an R package and a Shiny app for monitoring longitudinal QC data. We use control charts and other statistical methods to analyze instrument performance over time. So far, we have generated csv files through Skyline that include Acquired Time, Peptide seq, Annotations and any corresponding metrics of interest, such as retention time and peak area, for each peptide and time point. Details are available in our paper and the Shiny app. Trying the Shiny app with the sample data will give a better idea of the input data we need.

1. Initially, I tried to replicate the raw MS data example from the qcmetrics package. But the ultimate goal is to automatically create QcMetric objects to be used in a converter function.

2. We want to create a converter for users of the MSnbase and/or qcmetrics packages. The converter will generate a csv file from qcmetrics objects. I think compatibility with the qcmetrics package is the better approach, but please let me know if you have any better ideas. The converter will be quite similar to what the MSstats team did previously, but this time specifically for QC data: https://github.com/MeenaChoi/MSstats/blob/master/R/TransformMSnSet.R 

3. Rather than quality reports, I need a data table (csv) including each time point and peptide per metric. 

Here is my quick code chunk as the initial converter...

#' A function to convert MSnbase files to MSstatsQC format 
#'
#' @param msfile data file to be converted
#' @return A data frame that can be used with MSstatsQC
#' @keywords MSnbase, qcmetrics, input
#' @export
#' @import MSnbase
#' @import qcmetrics
#' @examples
#' \dontrun{MSstatsQCdata <- MSnbaseToMSstatsQC(msfile)}

MSnbaseToMSstatsQC  <-  function(msfile) {
  
  data <- readMSData(msfile, verbose = FALSE)
  
  if (!inherits(data, "MSnExp")) {
    stop("Only MSnExp objects can be converted to the input format for MSstatsQC.")
  }

  qc <- QcMetric(name = "NULL")
  
  #Examples of metrics that can be monitored ###############################
  RetentionTime <- rtime(data)
  PrecursorIntensity <- precursorIntensity(data)
  ##########################################################################
  qcdata(qc, "RetentionTime") <- RetentionTime
  qcdata(qc, "PrecursorIntensity") <- PrecursorIntensity
  
  MSstatsQCdata <- c()
  MSstatsQCdata <- data.frame(setNames(lapply(ls(qc@qcdata), get, envir=qc@qcdata), ls(qc@qcdata)))
  MSstatsQCdata <- data.frame(AcquiredTime=seq_along(RetentionTime), Precursor=NA, Annotations=NA, MSstatsQCdata)
  
  ## If any required variable name is missing, stop with an error
  check.name <- c("AcquiredTime", "Precursor", "Annotations", "RetentionTime", "PrecursorIntensity")
  
  diff.name <- setdiff(check.name, colnames(MSstatsQCdata))
  if (length(diff.name) > 0){
    stop(paste("Please check the variable name. The provided variable name", paste(diff.name, collapse=","), "is not present in the data set.", sep=" "))
  }
  return(MSstatsQCdata)
}

Many thanks in advance,

Eralp


Hi Eralp,

The function above looks sensible. I would suggest you use readMSData2 as it will be much faster (it won't read the raw data into memory), but you'll need to make sure you also define the MS level that you wish to read (by default, it reads all levels). Also, you can probably drop the c() initialisation.

If you have your code in a repo, it would probably be easier to discuss the code there, rather than here on the forum.
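As a rough illustration of the on-disk reading suggested above, here is a minimal sketch. It assumes MSnbase is installed and that, in more recent MSnbase releases, the readMSData2 behaviour is obtained through readMSData with mode = "onDisk" and the msLevel. argument. The helper name readQcSpectra is hypothetical, and AcquiredTime is just the spectrum index, as in the function above.

```r
## Sketch only: assumes MSnbase is installed and 'msfile' is an mzML/mzXML file.
## readQcSpectra() is a hypothetical helper name, not part of any package.
readQcSpectra <- function(msfile, msLevel = 2L) {
  stopifnot(requireNamespace("MSnbase", quietly = TRUE))
  ## On-disk mode keeps only the spectrum metadata in memory;
  ## 'msLevel.' restricts reading to the requested MS level.
  x <- MSnbase::readMSData(msfile, msLevel. = msLevel,
                           mode = "onDisk", verbose = FALSE)
  rt <- MSnbase::rtime(x)
  data.frame(AcquiredTime = seq_along(rt),   # placeholder: spectrum index
             Precursor = NA,
             Annotations = NA,
             RetentionTime = rt,
             PrecursorIntensity = MSnbase::precursorIntensity(x))
}
```

Calling readQcSpectra("your_file.mzML") (the file name is a placeholder) should return one row per MS2 spectrum, with the same columns as the converter above.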


Thanks Laurent! We can discuss the code here. 

https://github.com/eralpdogu/MSstatsQC/blob/master/R/MSnbaseToMSstatsQC.R 

Back to my original question, do you know a way to pull out peptide sequences?


Peptide sequences are not available when one only calls readMSData[2], as this function only accesses data from the raw files. But it is possible to add the identification data to the raw MSnExp objects using addIdentificationData. Then the identification results, including the peptide sequences, will be available in the feature data.
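A minimal sketch of that workflow follows, under several assumptions: MSnbase and Biobase are installed, idfile is an mzIdentML file matching msfile, and the feature data columns are named "sequence", "retentionTime" and "precursorIntensity" (these names may differ between MSnbase versions, so check fvarLabels on your own object). The helper names fdataToQcTable and readQcWithIds are hypothetical.

```r
## Sketch only; fdataToQcTable() and readQcWithIds() are hypothetical helpers.
## Column names below are assumptions and may differ between MSnbase versions.
fdataToQcTable <- function(fd, seqcol = "sequence",
                           metrics = c(RetentionTime = "retentionTime",
                                       PrecursorIntensity = "precursorIntensity")) {
  absent <- setdiff(c(seqcol, unname(metrics)), colnames(fd))
  if (length(absent) > 0)
    stop("Missing feature data column(s): ", paste(absent, collapse = ", "))
  ## Keep only spectra that were matched to a peptide sequence.
  identified <- !is.na(fd[[seqcol]])
  metricData <- fd[identified, unname(metrics), drop = FALSE]
  names(metricData) <- names(metrics)
  out <- data.frame(AcquiredTime = which(identified),  # placeholder: spectrum index
                    Precursor = as.character(fd[[seqcol]][identified]),
                    Annotations = rep(NA, sum(identified)),
                    metricData,
                    stringsAsFactors = FALSE)
  rownames(out) <- NULL
  out
}

readQcWithIds <- function(msfile, idfile, msLevel = 2L) {
  stopifnot(requireNamespace("MSnbase", quietly = TRUE))
  x <- MSnbase::readMSData(msfile, msLevel. = msLevel,
                           mode = "onDisk", verbose = FALSE)
  ## Merge the identification results into the feature data.
  x <- MSnbase::addIdentificationData(x, id = idfile)
  fdataToQcTable(Biobase::fData(x))
}
```

The resulting data frame could then be written out with write.csv(res, "qc.csv", row.names = FALSE). Splitting the table-building step from the file reading makes it possible to check the former on a small hand-made data frame.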


I will modify the converter and let you know. Is it possible to get info for other metrics such as peak area or FWHM?
