possible problem with mzR? cannot load Msnset for proteomics analysis
1
0
Entering edit mode
yaare • 0
@user-25000
Last seen 5 months ago
United States

Hi -- I am having trouble running a code that was working 6 months ago. I am getting an error message with mzR and wondering if this is the root of the problem. I tried updating mzR and bioconductor as stated in https://github.com/sneumann/mzR/wiki/mzR-Rcpp-compiler-linker-issue . I'm not sure why the code is not running correctly anymore. I'm seeing multiple errors downstream of loading the data file when using Limma that I was not seeing before. Any help is much appreciated. Thank you!

Loading required package: MSnbase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colMeans,
    colnames, colSums, dirname, do.call, duplicated, eval, evalq,
    Filter, Find, get, grep, grepl, intersect, is.unsorted, lapply,
    lengths, Map, mapply, match, mget, order, paste, pmax, pmax.int,
    pmin, pmin.int, Position, rank, rbind, Reduce, rowMeans, rownames,
    rowSums, sapply, setdiff, sort, table, tapply, union, unique,
    unsplit, which, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

Loading required package: ProtGenerics

This is MSnbase version 2.8.3 
  Visit https://lgatto.github.io/MSnbase/ to get started.


Attaching package: ‘MSnbase’

The following object is masked from ‘package:stats’:

    smooth

The following object is masked from ‘package:base’:

    trimws

> setwd("C:/Users/xxx/Desktop/other_data")
Warning message:
In fun(libname, pkgname) :
  mzR has been built against a different Rcpp version (1.0.0)
than is installed on your system (1.0.6). This might lead to errors
when loading mzR. If you encounter such issues, please send a report,
including the output of sessionInfo() to the Bioc support forum at 
https://support.bioconductor.org/. For details see also
https://github.com/sneumann/mzR/wiki/mzR-Rcpp-compiler-linker-issue.
> 
> # ___________________________  FUNCTIONS TO LOAD THE DATA _________________________________
> 
> # The following functions load the protein tables exported from ProteomeDiscoverer 
> # and transform them into MSnbase objects.
> 
> # Load plotting packages
> library(ggplot2)
> library(RColorBrewer)
> library(reshape2)
> library(cowplot)
> 
> org.options <- options()
> 
> # create a nicer ggplot theme
> my_theme_bw <- theme_bw() +
+   theme(strip.background = element_blank(),
+         panel.border = element_rect(colour = "black"))
> 
> my_theme <- function(base_size = 12, base_family = "", flip_axis = F) {
+   flip_axis_theme <- theme(axis.text.x = element_text(angle = 45, hjust = 1))
+   
+   res <- theme_bw(base_size=base_size, base_family=base_family) %+replace% 
+     theme(strip.background = element_blank(),
+           panel.grid.major = element_blank(),
+           panel.grid.minor = element_blank(),
+           panel.border = element_blank(), 
+           axis.line = element_line(colour = "#333333")
+     )
+   
+   if (flip_axis) {
+     res <- res + flip_axis_theme
+   }
+   
+   return(res)
+ }
> 
> # add some convenience functions
> setPlotSize <- function(height, width) {
+   options(repr.plot.height=height, repr.plot.width = width)
+ }
> 
> resetPlotSize <- function() {
+   options(org.options)
+ }
> 
> showPlot <- function(plot.obj, width, height, filename = NULL) {
+   if (!is.null(filename) && dir.exists("Figures")) {
+     svg(paste0("/Users/xxx/from R/", filename, ".svg"), 
+         width=width, height=height)
+     print(plot.obj)
+     dev.off()
+   }
+   
+   setPlotSize(height, width)
+   print(plot.obj)
+ }
> 
> 
> # _______________________________  LOAD DATA ______________________________________
> 
> library(MSnbase)
> 
> loadPdResult <- function(quan.data.file) {
+   complete.data <- read.table(quan.data.file, header = T, sep = "\t", stringsAsFactors = F)
+   uses.clustering <- length(grep("-clustering-", quan.data.file)) > 0
+   unique.name <- gsub(".*/(.*)_Proteins\\.txt", "\\1", quan.data.file)[[1]]
+   
+   # only keep reliable ids
+   org.ids <- nrow(complete.data)
+   if ("Protein.FDR.Confidence.MS.Amanda.20" %in% colnames(complete.data)) {
+     # complete.data <- complete.data[complete.data$Protein.FDR.Confidence.MS.Amanda.20 == "High", ]    
+   } else  {
+     # complete.data <- complete.data[complete.data$Protein.FDR.Confidence == "High", ]
+   }
+   message("Retained ", nrow(complete.data), " / ", org.ids)
+   
+  
+   # only master protein
+   complete.data <- complete.data[complete.data$Master == "IsMasterProtein", ]
+   
+   if (length(grep("Peakjuggler", colnames(complete.data)) > 0)) {
+     quan.data <- complete.data[, grep("Peakjuggler\\.Area", colnames(complete.data))]
+     rownames(quan.data) <- complete.data[, "Accession"]
+     colnames(quan.data) <- gsub("Peakjuggler\\.Area\\.site[^.]*", unique.name, colnames(quan.data))
+   } else {
+     quan.data <- complete.data[, grep("apQuant\\.Area", colnames(complete.data))]
+     rownames(quan.data) <- complete.data[, "Accession"]
+     colnames(quan.data) <- gsub("apQuant\\.Area\\.site[^.]*", unique.name, colnames(quan.data))
+   }
+   
+   # create the metadata
+   samples <- data.frame(
+     row.names = colnames(quan.data),
+     engine = "MSAmanda",
+     dataset = gsub(".*-(site[^-.]*).*", "\\1", colnames(quan.data)),
+     run = gsub(".*\\.(6[A-E])\\..*", "\\1", colnames(quan.data)),
+     original_search_result = !uses.clustering,
+     replicate = as.numeric(gsub(".*\\.rep([1-3])", "\\1", colnames(quan.data)))
+   )
+   
+   # create the MSNSet object
+   qual <- data.frame()
+   
+   featureDataDf <- data.frame(
+     row.names = rownames(quan.data),
+     peptides.amanda = complete.data[, "Number.of.Peptides.MS.Amanda.20"],
+     pj.confidence = complete.data[, "apQuant.Confidence"],
+     n.psms = complete.data[, "Number.of.PSMs"],
+     stringsAsFactors = F)
+   
+   if ("Number.of.Peptides.Spectral.Cluster.Search" %in% colnames(complete.data)) {
+     featureDataDf$peptides.clustering <- complete.data[, "Number.of.Peptides.Spectral.Cluster.Search"]
+   } else {
+     featureDataDf$peptides.clustering <- 0
+   }
+   
+   colnames(featureDataDf) <- paste0(unique.name, colnames(featureDataDf))
+   
+   featureData <- new("AnnotatedDataFrame", data = featureDataDf)
+   
+   phenoData <- new("AnnotatedDataFrame", data = samples)
+   
+   pd.data <- new("MSnSet",
+                  qual = qual,
+                  exprs = as.matrix(quan.data),
+                  featureData = featureData,
+                  annotation = "No annotation",
+                  phenoData = phenoData)
+   
+   return(pd.data)
+ }
> 
> mergeMsnsets <- function(data1, data2) {
+   exp.cmb <- merge(exprs(data1), exprs(data2), by.x = 0, by.y = 0, all.x = T, all.y = T)
+   rownames(exp.cmb) <- exp.cmb[, 1]
+   exp.cmb[, 1] <- NULL
+   
+   fData.cmb <- merge(fData(data1), fData(data2), by.x = 0, by.y = 0, all.x = T, all.y = T)
+   rownames(fData.cmb) <- fData.cmb[, 1]
+   fData.cmb[, 1] <- NULL
+   colnames(fData.cmb)[colnames(fData.cmb) == "uniprot.x"] <- "uniprot"
+   fData.cmb[, grep("uniprot\\.", colnames(fData.cmb))] <- NULL
+   
+   pData.cmb <- rbind(pData(data1), pData(data2))
+   
+   msnset.cmb <- new("MSnSet",
+                     qual = data.frame(),
+                     exprs = as.matrix(exp.cmb),
+                     featureData = new("AnnotatedDataFrame", data = fData.cmb),
+                     annotation = "No annotation",
+                     phenoData = new("AnnotatedDataFrame", data = pData.cmb))
+   
+   return(msnset.cmb)
+ }
> 
> 
>  filenames <- list.files("other_data/", pattern = ".*_Proteins-AV\\.txt", full.names = T)
>  alvarez.data <- NA
>  for (f in filenames) {
+   message("Processing ", f, "...")
+   if (f == filenames[1]) {
+     alvarez.data <- loadPdResult(f)
+   } else {
+     tmp <- loadPdResult(f)
+     alvarez.data <- mergeMsnsets(alvarez.data, tmp)
+   }
+  }
> 
> alvarez.data$dataset <- as.data.frame("Alvarez")
Warning message:
In alvarez.data$dataset <- as.data.frame("Alvarez") :
  Coercing LHS to a list
> 
> 
> 
> 
> 
> 
> if (!requireNamespace("BiocManager", quietly = TRUE))
+     install.packages("BiocManager")
> BiocManager::install("mzR")
Bioconductor version 3.8 (BiocManager 1.30.10), R 3.5.3 (2019-03-11)
Installing package(s) 'mzR'
Warning: package ‘mzR’ is in use and will not be installed
Installation path not writeable, unable to update packages: class, cluster, KernSmooth, nlme, nnet,
  spatial, survival
Old packages: 'bit', 'bit64', 'data.table', 'e1071', 'gbm', 'gmm', 'gower', 'hexbin', 'igraph',
  'ipred', 'matrixStats', 'mclust', 'mlbench', 'pROC', 'proxy', 'RcppArmadillo', 'robustbase',
  'RSQLite', 'sampling'
Update all/some/none? [a/s/n]: 
n
> 
> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] limma_3.38.3        cowplot_1.1.1       reshape2_1.4.4      RColorBrewer_1.1-2 
 [5] ggplot2_3.3.3       MSnbase_2.8.3       ProtGenerics_1.14.0 S4Vectors_0.20.1   
 [9] mzR_2.16.2          Rcpp_1.0.6          Biobase_2.42.0      BiocGenerics_0.28.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.1.0      purrr_0.3.4           lattice_0.20-41       colorspace_2.0-0     
 [5] vctrs_0.3.6           generics_0.1.0        vsn_3.50.0            utf8_1.1.4           
 [9] XML_3.99-0.3          rlang_0.4.10          pillar_1.5.1          glue_1.4.2           
[13] withr_2.4.1           DBI_1.1.1             BiocParallel_1.16.6   affy_1.60.0          
[17] affyio_1.52.0         foreach_1.5.1         lifecycle_1.0.0       plyr_1.8.6           
[21] stringr_1.4.0         mzID_1.20.1           zlibbioc_1.28.0       munsell_0.5.0        
[25] pcaMethods_1.74.0     gtable_0.3.0          codetools_0.2-18      IRanges_2.16.0       
[29] doParallel_1.0.16     fansi_0.4.2           preprocessCore_1.44.0 scales_1.1.1         
[33] BiocManager_1.30.10   impute_1.56.0         digest_0.6.27         stringi_1.5.3        
[37] dplyr_1.0.5           ncdf4_1.17            grid_3.5.3            cli_2.3.1            
[41] tools_3.5.3           magrittr_2.0.1        tibble_3.1.0          crayon_1.4.1         
[45] pkgconfig_2.0.3       MASS_7.3-53.1         ellipsis_0.3.1        assertthat_0.2.1     
[49] iterators_1.0.13      R6_2.5.0              MALDIquant_1.19.3     compiler_3.5.3
MSnbase mzR Bioconductor limma • 180 views
ADD COMMENT
0
Entering edit mode
@laurent-gatto-5645
Last seen 1 day ago
Belgium

The mzR message that you quote in your question isn't an error and can safely be ignored. As far as I can see, there is no other error in your script.

ADD COMMENT

Login before adding your answer.

Traffic: 252 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6