"Some normalization factors are zero" error on cn.mops
@stephen-piccolo-6761

I'm applying cn.mops to ~10 exome sequencing samples. getSegmentReadCountsFromBAM runs fine, but exomecn.mops fails with an error saying that "Some normalization factors are zero! Remove samples or chromosomes for which the average read count is zero, e.g. chromosome Y." I modified my GenomicRanges object so that it excludes chrY. I've also removed every region and sample in my read counts object that has zero reads on average, but I still get the error message. However, if I remove two of the samples that have a low overall read count, it works fine.

Am I missing something? Or are there specific criteria I could use to identify samples/regions that should be excluded?

sessionInfo()

R version 3.2.1 (2015-06-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] magrittr_1.5         dplyr_0.4.2          cn.mops_1.14.1      
[4] GenomicRanges_1.20.5 GenomeInfoDb_1.4.1   IRanges_2.2.5       
[7] S4Vectors_0.6.2      Biobase_2.28.0       BiocGenerics_0.14.0 

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.0       Rsamtools_1.20.4  Biostrings_2.36.1 assertthat_0.1   
 [5] bitops_1.0-6      R6_2.1.0          DBI_0.3.1         zlibbioc_1.14.0  
 [9] XVector_0.8.0     tools_3.2.1 

@gunter-klambauer-5426

Hello Stephen,

Thanks for using cn.mops! The default normalization method estimates the size factors from the median read count per sample. Please check the average read counts PER SAMPLE (columns of the read count matrix), not PER REGION (rows of the read count matrix).

If a sample has a median read count of 0 (i.e. more than 50% of its regions have zero reads), this may also indicate low sample quality. For exome sequencing in particular, this behaviour would be strange.
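As a rough illustration of the per-sample check described above, here is a minimal base R sketch. The matrix `counts` and its values are made up for the example; with real data you would first extract a regions-by-samples matrix from the read count object, which is assumed here rather than shown. It also shows why removing samples with a zero average is not enough: a sample can have a non-zero mean and still have a median of 0.

```r
# Toy read count matrix (made-up values): rows are regions, columns are samples.
# matrix() fills column by column, so each group of 5 values is one sample.
counts <- matrix(
  c(10, 12, 0, 9, 11,   # sampleA
     8,  0, 0, 7, 10,   # sampleB
     0,  0, 0, 1,  0),  # sampleC: mostly zeros, but not all zeros
  nrow = 5,
  dimnames = list(paste0("region", 1:5), c("sampleA", "sampleB", "sampleC"))
)

# Summaries PER SAMPLE, i.e. over the columns of the matrix.
sample_means   <- colMeans(counts)
sample_medians <- apply(counts, 2, median)

# Samples whose median read count is zero, even though their mean may not be.
zero_median <- names(which(sample_medians == 0))

print(rbind(mean = sample_means, median = sample_medians))
print(zero_median)
```

In this toy example sampleC has a mean above zero but a median of 0, so a filter on the mean alone would keep it.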

If the error persists, please send me the read count matrix via email.

Regards,

Günter


Thanks for your response. It still failed after I excluded samples with a median read count of 0. However, when I tweaked the filter to exclude samples with more than 40% zeroes, it worked. Some of these samples have very low coverage, so it seems reasonable to exclude them anyway. But you might consider adding a parameter that allows users to remove such samples automatically. Thanks again.
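For anyone wanting to try the same filter, here is a minimal base R sketch of it. The matrix `counts`, its values, and the name `max_zero_fraction` are made up for illustration; the 40% cutoff is simply what worked for this data set, not a recommended default.

```r
# Toy read count matrix (made-up values): rows are regions, columns are samples.
counts <- matrix(
  c(10, 12, 5, 9, 11, 8, 7,   # sampleA: no zeros
     0,  0, 0, 4,  6, 3, 5,   # sampleB: 3 of 7 regions are zero, median still > 0
     8,  0, 6, 7, 10, 9, 5),  # sampleC: 1 of 7 regions is zero
  nrow = 7,
  dimnames = list(paste0("region", 1:7), c("sampleA", "sampleB", "sampleC"))
)

# Fraction of regions with zero reads, computed per sample (column).
zero_fraction <- colMeans(counts == 0)

# Hypothetical cutoff: drop samples with more than 40% zero-count regions.
max_zero_fraction <- 0.4
keep <- zero_fraction <= max_zero_fraction

# drop = FALSE keeps the result a matrix even if only one sample remains.
filtered <- counts[, keep, drop = FALSE]

print(round(zero_fraction, 2))
print(colnames(filtered))
```

Here sampleB has a non-zero median, so a median-based filter would keep it, but it is dropped by the zero-fraction cutoff.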


Thanks for the feedback, Stephen! Let me know how you get along with the further analysis!

Regards,

Günter
