DNAcopy segment function not reproducible even when setting seed
BID • 0
@162ffc56

Hi,

I'm using the 'segment' function from DNAcopy on my data. The manual says that a seed should be set to make the results reproducible. However, even when I call set.seed(73684), the results of segment aren't reproducible on my data. I've also run the examples from the manual, and those do give reproducible results.

I've run the segment function 10 times on my data and I get 2 possible outcomes. In half of the runs chr7 is reported as a single segment, while in the other half it is split into 3 segments:

ID      chrom loc.start   loc.end num.mark seg.mean
sample      7         1 158100001     2899  -0.0293

versus

ID      chrom loc.start   loc.end num.mark seg.mean
sample      7         1  87650001     1579  -0.0334
sample      7  87700001  87950001        6   0.1839
sample      7  88050001 158100001     1314  -0.0254

These are the lines of my code where CNA and segment are called.

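# Build the copy-number object from the binned log2 ratios (input is already sorted).
segcna <- CNA(allreads$LOG2RATIO, allreads$CHR, as.numeric(allreads$BIN.START), presorted = TRUE, sampleid = tmp.sampleid)
# Detect and smooth single-point outliers before segmentation.
smoothcna <- smooth.CNA(segcna)
# Segment with the hybrid p-value method; splits are undone unless the segment means are at least 20 SDs apart.
myseg <- segment(smoothcna, verbose = 1, p.method = "hybrid", min.width = 5, undo.splits = "sdundo", undo.SD = 20)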

Is there any reason why, even when specifying a seed, the outcome of segment can be different when the same input is used?
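To pin down whether the variation comes from segment itself, one option is to reset the seed immediately before every call and compare the results programmatically. The following is only a minimal sketch: `run_seeded` and `all_identical` are hypothetical helper names (not part of DNAcopy), and the commented-out usage assumes that `smoothcna` from the code above exists and that the segment table is taken from the `$output` element of the returned object.

```r
# Hypothetical helper (not part of DNAcopy): call `fun` n times, resetting the
# RNG seed immediately before every call, and collect the results in a list.
run_seeded <- function(fun, seed, n = 10) {
  lapply(seq_len(n), function(i) {
    set.seed(seed)  # reset right before the call so nothing else consumes random numbers in between
    fun()
  })
}

# TRUE if every result in the list is identical to the first one.
all_identical <- function(results) {
  all(vapply(results, function(r) identical(r, results[[1]]), logical(1)))
}

# Base-R demonstration with a stand-in for a randomised routine.
runs <- run_seeded(function() rnorm(5), seed = 73684, n = 10)
all_identical(runs)

# Intended use with the objects from the post (assumes `smoothcna` exists):
# runs <- run_seeded(function() segment(smoothcna, verbose = 0, p.method = "hybrid",
#                                       min.width = 5, undo.splits = "sdundo",
#                                       undo.SD = 20)$output,
#                    seed = 73684)
# all_identical(runs)
```

If the seeded runs still differ within a single R session, that would suggest the variation is not explained by the state of R's random number generator alone. This check does not identify the cause.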

sessionInfo()
> print(sessionInfo())
R version 3.6.1 (2019-07-05)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 8 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_BE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_BE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_BE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_BE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DNAcopy_1.60.0 stringr_1.4.0  ggplot2_3.2.1  dplyr_0.8.3    optparse_1.6.4

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10      magrittr_2.0.3   tidyselect_0.2.5 munsell_0.5.0   
 [5] getopt_1.20.3    colorspace_1.4-1 R6_2.5.1         rlang_1.0.6     
 [9] tools_3.6.1      grid_3.6.1       gtable_0.3.0     cli_3.6.0       
[13] withr_2.5.0      lazyeval_0.2.2   assertthat_0.2.1 tibble_2.1.3    
[17] lifecycle_1.0.3  crayon_1.3.4     purrr_1.0.1      vctrs_0.5.2     
[21] glue_1.3.1       stringi_1.4.3    compiler_3.6.1   pillar_1.4.2    
[25] scales_1.0.0     pkgconfig_2.0.3