Dear developer,
I cannot find the mistake in my code for running normalyzer. If you could offer some insight, I would be more than grateful for your help.
I am looking forward to hearing from you.
Best regards, Santos
# The code:
# Save the design matrix:
write.table(x = design, file = "./data/dietas_design_matrix.tsv", quote = FALSE,
row.names = FALSE, sep = "\t")
# Also the clean data matrix has to be provided in the correct format, so save it for later use:
Protein_Name <- rownames(exp_sample_clean_order)
LFQ_dietas_normalizacion <- data.frame(Protein_Name, exp_sample_clean_order)
write.table(LFQ_dietas_normalizacion, file = "./data/LFQ_dietas_normalizacion.tsv", quote = FALSE,
row.names = FALSE, sep = "\t")
# Normalization
NormalyzerDE::normalyzer(jobName = "normalizacion_dietas",
designPath = "./data/dietas_design_matrix.tsv",
dataPath = "./data/LFQ_dietas_normalizacion.tsv", outputDir = ".")
# The output:
You are running version 1.12.0 of NormalyzerDE
[Step 1/5] Load data and verify input
Input data checked. All fields are valid.
Sample check: More than one sample group found
Sample replication check: All samples have replicates
No RT column found, skipping RT processing
[Step 1/5] Input verified, job directory prepared at:./normalizacion_dietas
[Step 2/5] Performing normalizations
No RT column specified (column named 'RT') or option not specified Skipping RT normalization.
[Step 2/5] Done!
[Step 3/5] Generating evaluation measures...
Error in validObject(.Object) :
invalid class “NormalyzerEvaluationResults” object: invalid object for slot "lowVarFeaturesCVs" in class "NormalyzerEvaluationResults": got class "NULL", should be or extend class "numeric"
In addition: Warning messages:
1: In min(utils::head(rev(sort(referenceFDRWoNA)), n = fivePercCount)) :
no non-missing arguments to min; returning Inf
2: In findLowlyVariableFeaturesCVs(log2AnovaFDR, methodList) :
Too few successful ANOVA calculations to generate lowly variable features, skipping
sessionInfo( )
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] NormalyzerDE_1.12.0
loaded via a namespace (and not attached):
[1] colorspace_2.0-2 ellipsis_0.3.2
[3] class_7.3-20 htmlTable_2.4.0
[5] corpcor_1.6.10 XVector_0.34.0
[7] GenomicRanges_1.46.1 base64enc_0.1-3
[9] rstudioapi_0.13 proxy_0.4-26
[11] affyio_1.64.0 ggrepel_0.9.1
[13] RSpectra_0.16-0 fansi_1.0.2
[15] codetools_0.2-18 splines_4.1.2
[17] knitr_1.37 mixOmics_6.18.1
[19] Formula_1.2-4 cluster_2.1.2
[21] vsn_3.62.0 png_0.1-7
[23] BiocManager_1.30.16 compiler_4.1.2
[25] backports_1.4.1 assertthat_0.2.1
[27] Matrix_1.4-0 fastmap_1.1.0
[29] limma_3.50.1 cli_3.1.1
[31] htmltools_0.5.2 tools_4.1.2
[33] igraph_1.2.11 gtable_0.3.0
[35] glue_1.6.1 GenomeInfoDbData_1.2.7
[37] affy_1.72.0 reshape2_1.4.4
[39] dplyr_1.0.8 Rcpp_1.0.8
[41] carData_3.0-5 Biobase_2.54.0
[43] cellranger_1.1.0 raster_3.5-15
[45] vctrs_0.3.8 preprocessCore_1.56.0
[47] xfun_0.29 stringr_1.4.0
[49] lifecycle_1.0.1 RcmdrMisc_2.7-2
[51] terra_1.5-21 MASS_7.3-55
[53] zlibbioc_1.40.0 zoo_1.8-9
[55] scales_1.1.1 hms_1.1.1
[57] MatrixGenerics_1.6.0 parallel_4.1.2
[59] SummarizedExperiment_1.24.0 sandwich_3.0-1
[61] RColorBrewer_1.1-2 yaml_2.3.5
[63] gridExtra_2.3 ggplot2_3.3.5
[65] rpart_4.1.16 latticeExtra_0.6-29
[67] stringi_1.7.6 S4Vectors_0.32.3
[69] nortest_1.0-4 e1071_1.7-9
[71] checkmate_2.0.0 BiocGenerics_0.40.0
[73] BiocParallel_1.28.3 GenomeInfoDb_1.30.1
[75] rlang_1.0.1 pkgconfig_2.0.3
[77] matrixStats_0.61.0 bitops_1.0-7
[79] evaluate_0.15 lattice_0.20-45
[81] purrr_0.3.4 htmlwidgets_1.5.4
[83] tidyselect_1.1.2 plyr_1.8.6
[85] magrittr_2.0.2 R6_2.5.1
[87] IRanges_2.28.0 generics_0.1.2
[89] Hmisc_4.6-0 DelayedArray_0.20.0
[91] DBI_1.1.2 pillar_1.7.0
[93] haven_2.4.3 foreign_0.8-82
[95] survival_3.2-13 abind_1.4-5
[97] RCurl_1.98-1.6 sp_1.4-6
[99] nnet_7.3-17 tibble_3.1.6
[101] crayon_1.5.0 car_3.0-12
[103] rARPACK_0.11-0 utf8_1.2.2
[105] ellipse_0.4.2 rmarkdown_2.13
[107] jpeg_0.1-9 grid_4.1.2
[109] readxl_1.3.1 data.table_1.14.2
[111] forcats_0.5.1 digest_0.6.29
[113] tidyr_1.2.0 stats4_4.1.2
[115] munsell_0.5.0
Hi, and thanks for using NormalyzerDE!
I am not sure what is going wrong just by looking at your error messages. Unexpected crashes can sometimes occur when, for instance, the data matrix contains very little valid data.
Are you able to provide an example of a crashing file? For instance a subset of the file you are using, or if the data is sensitive, a similarly formatted file. If so, then I could test further.
Hi, thanks for your quick response and help!
I have uploaded a subset of my own data to my GitHub so you can use it. It consists of different quantified proteins (columns) and a number of patients (rows). Here is the link:
I hope you can find where the problem is, and again, thank you a lot for your help.
Great, thanks!
Looking at your data, it seems to be log transformed already. By default, NormalyzerDE assumes non-log-transformed data. You can run data that has previously been log transformed by using the flag "noLogTransform=TRUE". Maybe that could solve your issue.
I.e. running it the following way:
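A sketch of such a call, assuming the file paths from the original post. The job name is a hypothetical new one, chosen in case the directory left behind by the crashed run gets in the way, and the guard simply skips the call when the package or the files are not available:

```r
# Paths reused from the original post; the job name is a hypothetical new one.
nolog_args <- list(
  jobName = "normalizacion_dietas_nolog",
  designPath = "./data/dietas_design_matrix.tsv",
  dataPath = "./data/LFQ_dietas_normalizacion.tsv",
  outputDir = ".",
  noLogTransform = TRUE  # the input is already log transformed
)

# Only run when NormalyzerDE is installed and both input files exist.
if (requireNamespace("NormalyzerDE", quietly = TRUE) &&
    all(file.exists(nolog_args$designPath, nolog_args$dataPath))) {
  do.call(NormalyzerDE::normalyzer, nolog_args)
}
```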
If this doesn't solve it, let me know, and I'll try debugging it further.
Hey Jakob,
it doesn't seem to solve it. Could it be due to a high proportion of missing values? Let me know if you find where the issue is.
Again, thank you a lot for your help!
OK, I see!
It looks like, while calculating the indices used to display certain plots in the report, it is not able to generate enough ANOVA values for certain groups of samples. NormalyzerDE should handle this by simply not showing that chart, but it seems to crash entirely instead.
What you could do is to change the groups of samples you have provided in the design matrix, for instance by using fewer larger groups including more samples per group. This will have no impact on the normalizations themselves, only on certain quality plots.
If it only is the normalizations you are after, you can also run NormalyzerDE without these analyses using the "skipAnalysis=TRUE" flag. This will not yield the full report, but will still give you the normalized matrices (it might give you a partial report, I don't remember for sure). To re-emphasize - this will have no impact on the actual normalizations, which is what you will use in the subsequent steps.
Either way, I think you should run with the 'noLogTransform' option, as your data looks log transformed.
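For illustration, a sketch that combines both flags, assuming the paths from the original post. The job name is a hypothetical new one, and the call is skipped when the package or the input files are missing:

```r
# Paths reused from the original post; the job name is a hypothetical new one.
skip_args <- list(
  jobName = "normalizacion_dietas_skip",
  designPath = "./data/dietas_design_matrix.tsv",
  dataPath = "./data/LFQ_dietas_normalizacion.tsv",
  outputDir = ".",
  noLogTransform = TRUE,  # the input is already log transformed
  skipAnalysis = TRUE     # write normalized matrices, skip the evaluation step
)

# Only run when NormalyzerDE is installed and both input files exist.
if (requireNamespace("NormalyzerDE", quietly = TRUE) &&
    all(file.exists(skip_args$designPath, skip_args$dataPath))) {
  do.call(NormalyzerDE::normalyzer, skip_args)
}
```

The normalized matrices should then appear in the job directory under the chosen output directory.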
If the above suggestions do not solve your problem, and if you also provide the design matrix (the file you run together with the data matrix), then I could explore further.
Sorry about the troubles! I'll write an issue for further fixes for this.
Hey Jakob,
as you mentioned, no report was generated, but the normalization ran successfully! I would love to get a report because I don't know which normalization method to choose for my data. The issue now is that I can't include more samples per group; what I have is all I've got. It is proteomic data with around 100 different proteins (features) and 3 samples per group, corresponding to 3 patients. What should I do?
About noLogTransform, I will use it for sure; I checked and the data are log transformed. I have uploaded the design matrix to my GitHub in case you can guide me further.
Lastly, I can't thank you enough for your quick and kind help. Have a nice day!
Hi! A quick response. The grouping in the design matrix is mostly for visuals: it does not impact the normalizations, and for many of the report plots it affects nothing more than the coloring.
It looks like you have eleven groups of three samples in the design matrix. First, you can check whether any of those groups has a very large number of missing values, and if so, omit it from the dataset before generating the evaluation.
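One way to spot such groups is to compute the fraction of missing values per sample and average it within each group. A minimal sketch on toy data follows; the sample names, group names and the 50% cutoff are all made up for illustration, and the real tables would be read from the design and data files instead:

```r
# Toy design matrix with the default column names "sample" and "group".
design_toy <- data.frame(
  sample = c("s1", "s2", "s3", "s4", "s5", "s6"),
  group  = c("A", "A", "A", "B", "B", "B"),
  stringsAsFactors = FALSE
)

# Toy data matrix: features as rows, one column per sample.
data_toy <- data.frame(
  Protein_Name = c("P1", "P2", "P3", "P4"),
  s1 = c(20.1, 18.3, NA, 22.0),
  s2 = c(19.8, 18.9, 21.2, 21.7),
  s3 = c(20.4, NA, 21.0, 22.3),
  s4 = c(NA, NA, NA, 21.9),
  s5 = c(NA, 18.1, NA, NA),
  s6 = c(NA, NA, 20.8, NA),
  stringsAsFactors = FALSE
)

# Fraction of missing values per sample, then averaged within each group.
na_frac_per_sample <- colMeans(is.na(data_toy[, design_toy$sample]))
na_frac_per_group <- tapply(na_frac_per_sample, design_toy$group, mean)
print(round(na_frac_per_group, 2))

# Groups above the (hypothetical) 50% cutoff are candidates to omit.
sparse_groups <- names(na_frac_per_group)[na_frac_per_group > 0.5]
keep_samples <- design_toy$sample[!design_toy$group %in% sparse_groups]

# Drop those samples from both tables so they stay consistent.
design_filtered <- design_toy[design_toy$sample %in% keep_samples, ]
data_filtered <- data_toy[, c("Protein_Name", keep_samples)]
```

The filtered tables can then be written out with write.table and passed to normalyzer in place of the originals.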
I also think you can get a partial report by changing the "tinyRunThres" to larger than your dataset. A bit hacky, I know, but I think that would give you at least some of the charts.
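A sketch of that workaround, assuming that "tinyRunThres" is compared against the number of features (rows) in the data matrix. The paths come from the original post, the job name is a hypothetical new one, and the call is skipped when the package or the files are not available:

```r
# Paths reused from the original post.
design_path_tiny <- "./data/dietas_design_matrix.tsv"
data_path_tiny <- "./data/LFQ_dietas_normalizacion.tsv"

# Only run when NormalyzerDE is installed and both input files exist.
if (requireNamespace("NormalyzerDE", quietly = TRUE) &&
    all(file.exists(design_path_tiny, data_path_tiny))) {
  # Number of features (rows) in the data matrix.
  n_features <- nrow(read.delim(data_path_tiny))

  NormalyzerDE::normalyzer(
    jobName = "normalizacion_dietas_tiny",  # hypothetical new job name
    designPath = design_path_tiny,
    dataPath = data_path_tiny,
    outputDir = ".",
    noLogTransform = TRUE,
    # Assumption: a threshold above the feature count triggers the limited run.
    tinyRunThres = n_features + 1
  )
}
```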
Looking closer at your design and data matrix, I am a bit confused. The entries in the sample column (i.e. "LFQ_C37BM" and so on) should exactly match column names in the data matrix. There I instead see "Actin" "cytoplasmid 1" and so on, so the design and the data do not match up.
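A quick way to check this is to compare the sample column of the design matrix against the column names of the data matrix. The sketch below uses toy data in which only "LFQ_C37BM" comes from this thread and every other name and value is made up. It assumes the mismatch arises because samples are stored as rows and proteins as columns, in which case transposing fixes it:

```r
# Hypothetical toy design matrix; only "LFQ_C37BM" appears in the thread.
design_chk <- data.frame(
  sample = c("LFQ_C37BM", "LFQ_C38BM", "LFQ_D01BM", "LFQ_D02BM"),
  group  = c("C", "C", "D", "D"),
  stringsAsFactors = FALSE
)

# Toy data in the wrong orientation: samples as rows, proteins as columns.
wrong_way <- data.frame(
  Actin   = c(25.1, 24.8, 25.3, 24.9),
  Albumin = c(30.2, 29.9, NA, 30.5),
  row.names = design_chk$sample
)

# Samples listed in the design matrix but absent from the data columns.
missing_before <- setdiff(design_chk$sample, colnames(wrong_way))
print(missing_before)

# Transpose so that features become rows and samples become columns.
right_way <- data.frame(
  Protein_Name = colnames(wrong_way),
  t(wrong_way),
  row.names = NULL,
  check.names = FALSE
)

# After transposing, every design sample should be found among the columns.
missing_after <- setdiff(design_chk$sample, colnames(right_way))
print(missing_after)
```

An empty result from the second check means the two files line up.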
You could take a look at the following page and make sure you have the corresponding format:
http://130.235.214.136/help
Feel free to ask for further input!