Question: Somaticsignature NMF problem
ns wrote (2.5 years ago, United States):

Hello, 

I am trying to run SomaticSignatures on mutation data.

When I run the commands:

snpvr_mm = motifMatrix(snpvr_motif, group = "study", normalize = TRUE)

gof_nmf = identifySignatures(snpvr_mm, 4, decomposition = nmfDecomposition)

I get the following error:

Error: NMF::nmf - Input matrix x contains at least one null or NA-filled row.

The session info for R is:

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_2.2.1              SomaticSignatures_2.8.4    VariantAnnotation_1.18.7   Rsamtools_1.24.0           Biostrings_2.40.2         
 [6] XVector_0.12.1             SummarizedExperiment_1.2.3 Biobase_2.32.0             GenomicRanges_1.24.3       GenomeInfoDb_1.8.7        
[11] IRanges_2.6.1              S4Vectors_0.10.3           BiocGenerics_0.18.0       

loaded via a namespace (and not attached):
 [1] httr_1.2.1                    foreach_1.4.3                 AnnotationHub_2.4.2           splines_3.3.1                 Formula_1.2-1                
 [6] shiny_1.0.0                   assertthat_0.1                interactiveDisplayBase_1.10.3 latticeExtra_0.6-28           RBGL_1.48.1                  
[11] BSgenome_1.40.1               RSQLite_1.1-2                 backports_1.0.5               lattice_0.20-34               biovizBase_1.20.0            
[16] digest_0.6.12                 RColorBrewer_1.1-2            checkmate_1.8.2               colorspace_1.3-2              ggbio_1.20.2                 
[21] htmltools_0.3.5               httpuv_1.3.3                  Matrix_1.2-8                  plyr_1.8.4                    OrganismDbi_1.14.1           
[26] XML_3.98-1.5                  biomaRt_2.28.0                zlibbioc_1.18.0               xtable_1.8-2                  scales_0.4.1                 
[31] BiocParallel_1.6.6            proxy_0.4-16                  htmlTable_1.9                 tibble_1.2                    pkgmaker_0.22                
[36] GenomicFeatures_1.24.5        nnet_7.3-12                   lazyeval_0.2.0                survival_2.40-1               magrittr_1.5                 
[41] mime_0.5                      memoise_1.0.0                 GGally_1.3.0                  doParallel_1.0.10             NMF_0.20.6                   
[46] foreign_0.8-67                graph_1.50.0                  BiocInstaller_1.22.3          registry_0.3                  tools_3.3.1                  
[51] data.table_1.10.4             gridBase_0.4-7                stringr_1.1.0                 munsell_0.4.3                 rngtools_1.2.4               
[56] cluster_2.0.5                 AnnotationDbi_1.34.4          ensembldb_1.4.7               pcaMethods_1.64.0             grid_3.3.1                   
[61] RCurl_1.95-4.8                iterators_1.0.8               dichromat_2.0-0               htmlwidgets_0.8               bitops_1.0-6                 
[66] base64enc_0.1-3               codetools_0.2-15              gtable_0.2.0                  DBI_0.5-1                     reshape_0.8.6                
[71] reshape2_1.4.2                R6_2.2.0                      GenomicAlignments_1.8.4       gridExtra_2.2.1               knitr_1.15.1                 
[76] rtracklayer_1.32.2            Hmisc_4.0-2                   stringi_1.1.2                 Rcpp_0.12.9                   rpart_4.1-10                 
[81] acepack_1.4.1                

 

Can the problem be identified?

Thanks!

Tags: somaticsignatures, nmf

Judging from "Error: NMF::nmf - Input matrix x contains at least one null or NA-filled row.", I suspect that your input 'snpvr_mm' contains a full row (i.e. a study) without any mutations. Can you check whether this is the case? Without knowing more about the data, it is not possible to narrow this down further from a distance.
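A quick way to run such a check can be sketched in base R. The matrix `mm` below is a made-up stand-in for 'snpvr_mm' (its names and values are assumptions for illustration only). Both rows and columns are inspected, since it depends on the orientation of the matrix whether studies sit in rows or columns:

```r
# Toy stand-in for a motif matrix; all values are invented for illustration.
mm <- matrix(
  c(0.2, 0.0, 0.1,
    0.0, 0.0, 0.0,
    0.8, 1.0, 0.9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("CA A.A", "CA A.C", "CA A.G"), c("S1", "S2", "S3"))
)

# Rows whose non-missing entries sum to zero (all zero, or all NA)
zero_rows <- rownames(mm)[rowSums(mm, na.rm = TRUE) == 0]

# Rows that consist of NA values only
na_rows <- rownames(mm)[rowSums(!is.na(mm)) == 0]

# Columns whose non-missing entries sum to zero
zero_cols <- colnames(mm)[colSums(mm, na.rm = TRUE) == 0]

print(zero_rows)
print(na_rows)
print(zero_cols)
```

Any name reported in `zero_rows` or `na_rows` would be a candidate for the row that NMF::nmf complains about.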

Also, please consider updating your packages to the current Bioconductor release. The version you are using is no longer supported (see the Bioc help pages), and the new version of SomaticSignatures might behave differently.

Comment by Julian Gehring, written 2.5 years ago

Hi Julian,

I checked my data table, and none of the studies lack mutation calls (studies are in the columns). I do, however, have rows (trinucleotide motifs) that contain 0 values. See the example below:

         DJFS_1    DJFS_10  DJFS_100  DJFS_102  DJFS_107
CA A.A   0         0        0         0         0.0625
CA A.C   0.181818  0        0         0         0
CA A.G   0         0        0         0         0
CA A.T   0         0        0         0         0
CA C.A   0         0        0         0         0
CA C.C   0.090909  0        0         0         0
CA C.G   0         0        0         0         0.0625
CA C.T   0         0        0         0         0
CA G.A   0         0        0         0         0
CA G.C   0         0        0         0         0
CA G.G   0         0        0         0         0.0625

 

If only a small number of mutations are present in a sample, some rows are likely to have a value of 0 (e.g. CA A.G).

How do I overcome this problem?

If I artificially add in values for the missing rows, I get this error:


Error in rowQ(imat, ncol(imat)) : cannot handle missing values.
In addition: Warning message:
In .local(x, rank, method, ...) :
  NMF residuals: final objective value is NA

I have now updated Bioconductor. Could you please help me figure out why this problem persists?

Thanks!

Natalie

 

Comment by ns, written 2.5 years ago

How many mutations do you have per sample/study? I.e. what does

rowSums(motifMatrix(snpvr_motif, group = "study", normalize = FALSE))

return?

Comment by Julian Gehring, written 2.5 years ago
Answer: Somaticsignature NMF problem
ns wrote (2.5 years ago, United States):

> rowSums(motifMatrix(snpvr_motif, group = "study", normalize = FALSE))
CA A.A CA A.C CA A.G CA A.T CA C.A CA C.C CA C.G CA C.T CA G.A CA G.C CA G.G CA G.T CA T.A CA T.C CA T.G CA T.T CG A.A CG A.C CG A.G CG A.T CG C.A CG C.C 
    15      4      8      5      6      9      7      6      5      1      4      3     16      4      6     14      8      4      7      4      4      0 
CG C.G CG C.T CG G.A CG G.C CG G.G CG G.T CG T.A CG T.C CG T.G CG T.T CT A.A CT A.C CT A.G CT A.T CT C.A CT C.C CT C.G CT C.T CT G.A CT G.C CT G.G CT G.T 
     0      6      6      1      3      1      6      3      0      3     12      6      6     13     17      3      9      4      5      1      5      6 
CT T.A CT T.C CT T.G CT T.T TA A.A TA A.C TA A.G TA A.T TA C.A TA C.C TA C.G TA C.T TA G.A TA G.C TA G.G TA G.T TA T.A TA T.C TA T.G TA T.T TC A.A TC A.C 
    11      3      3     12     14      6     13     11      3      5      5      7      6      4      4      7     10      6     10     16      7     13 
TC A.G TC A.T TC C.A TC C.C TC C.G TC C.T TC G.A TC G.C TC G.G TC G.T TC T.A TC T.C TC T.G TC T.T TG A.A TG A.C TG A.G TG A.T TG C.A TG C.C TG C.G TG C.T 
    15     10      4      3      2      2     10      4      7     12      9      7     13     20      2      4      7      2      1      1      6      1 
TG G.A TG G.C TG G.G TG G.T TG T.A TG T.C TG T.G TG T.T 
     2      1      7      2      2      0     10      2 
> colSums(motifMatrix(snpvr_motif, group = "study", normalize = FALSE))
DJFS_1 DJFS_2 DJFS_3 DJFS_4 DJFS_5 DJFS_6 DJFS_7 DJFS_8 DJFS_9 
    77     57     58     68    121     74     44     58     43 

Answer: Somaticsignature NMF problem
Julian Gehring wrote (2.5 years ago):

The error arises because some mutational motifs have not occurred in any of the samples, i.e. some rows of the motif matrix are entirely zero. This prevents the decomposition of the matrix with NMF. I'll try to capture this case explicitly in the package and provide a more informative error message.
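To illustrate what an entirely zero row looks like, and one generic workaround (dropping such rows before the decomposition), here is a minimal base R sketch. The matrix `counts` and its values are hypothetical, and removing rows is an assumption made for illustration, not necessarily how the package does or will handle this case:

```r
# Hypothetical count matrix: motifs in rows, samples in columns.
counts <- matrix(
  c(3, 0, 1,
    0, 0, 0,
    2, 5, 0,
    0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("CG A.A", "CG C.C", "CG C.T", "CG C.G"),
                  c("S1", "S2", "S3"))
)

# Motifs that were never observed in any sample
is_zero_row <- rowSums(counts) == 0
print(names(which(is_zero_row)))

# Generic workaround (an assumption, not the package's documented behaviour):
# keep only motifs observed at least once; drop = FALSE keeps a matrix.
counts_nonzero <- counts[!is_zero_row, , drop = FALSE]
print(dim(counts_nonzero))
```

Signatures derived from such a reduced matrix would not cover the dropped motifs, so whether this is acceptable depends on the analysis.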

There are several potential approaches to address such cases, but looking at the excerpt from your data set, I doubt that any of them would be helpful here: the number of mutations per sample, and hence the signal of any mutational process, is very low, and I would not be confident that an analysis of this data would yield reliable or meaningful signatures. I would therefore suggest considering other experimental designs for this analysis, such as gathering data from more samples or pooling samples into a few distinct groups with a higher number of variants.
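The pooling idea can be sketched on a plain count matrix in base R. Everything below (the matrix `counts`, the sample names, and the `group` assignment) is hypothetical and only meant to show the mechanics. With SomaticSignatures itself, the equivalent would presumably be to pass a coarser grouping column to the `group` argument of `motifMatrix`, assuming such a column exists in the data:

```r
# Hypothetical per-sample counts: motifs in rows, samples in columns.
counts <- matrix(
  c(3, 0, 1, 2,
    0, 1, 0, 0,
    2, 5, 0, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("CA A.A", "CA A.C", "CA A.G"),
                  c("S1", "S2", "S3", "S4"))
)

# Hypothetical assignment of samples to two pooled groups.
group <- c(S1 = "groupA", S2 = "groupA", S3 = "groupB", S4 = "groupB")

# rowsum() adds up rows by group, so transpose, pool, and transpose back.
pooled <- t(rowsum(t(counts), group[colnames(counts)]))

print(pooled)
print(colSums(pooled))
```

Each pooled column carries the combined mutation count of its samples, which raises the per-group totals at the cost of per-sample resolution.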

Answer: Somaticsignature NMF problem
grimmmmer wrote (22 months ago):

Are all of the other NMF-based tools similarly limited by this situation (e.g. WTSI Mutational Signature Framework, MutSpec, BayesNMF, signeR)?

SomaticSignatures would be a really great tool for unbiased classification of individual tumors into different groups, but given the discussion above, it seems this approach is limited to hindsight signatures of pre-defined groups of tumors. I am running into the same issue despite my large test dataset of ~4500 SNVs. 
