Hi, I'd like to apply CopywriteR to detect CNAs in my targeted capture sequencing data, which is based on a gene panel of about 150 kb. The documentation suggests starting with a bin size of 50 kb. What should I look for in the output to tell whether this bin size is the best fit for my data? Thanks!
With regard to your question: the MAD value is the median absolute deviation, calculated using madDiff from the R package matrixStats. Because madDiff works on the differences between neighbouring bins, it takes into account that there are real changes in the ‘signal’, for instance at a gain or a loss, so these do not inflate the estimate much. For us this is an important measure of the quality of the copy number plot at the current resolution. In our experience, MAD values below roughly 0.35 to 0.4 result in decent profiles. If the MAD value is higher, you can always decrease the resolution by increasing the bin size, which makes the data less noisy and therefore lowers the MAD value. In sum, we use the MAD value as a guide to find the best balance between a high resolution and a less noisy profile. I hope that helps you choose the right bin size (resolution) for your data.
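
To make the trade-off above concrete, here is a minimal Python sketch of the idea. It is not CopywriteR code and not the matrixStats implementation. The helper names `mad_diff` and `merge_bins` are made up for illustration, the profile is synthetic, and the rescaling by the square root of 2 reflects my understanding of what madDiff does by default. Averaging neighbouring bins is only a rough stand-in for rerunning CopywriteR with a larger bin size.

```python
import random
import statistics


def mad_diff(values, constant=1.4826):
    """Noise estimate: MAD of differences between adjacent bins.

    Assumption: mirrors the idea behind matrixStats::madDiff (lag-1
    differences, rescaled by sqrt(2) to the scale of the original data).
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    centre = statistics.median(diffs)
    deviations = [abs(d - centre) for d in diffs]
    return constant * statistics.median(deviations) / 2 ** 0.5


def merge_bins(values, factor):
    """Average groups of `factor` adjacent bins (hypothetical stand-in for
    a larger bin size); leftover bins at the end are dropped."""
    usable = len(values) - len(values) % factor
    return [statistics.fmean(values[i:i + factor])
            for i in range(0, usable, factor)]


random.seed(0)
# Synthetic log2 ratios: 400 bins, a gain of +0.6 in bins 150-249,
# Gaussian noise with standard deviation 0.5 (made-up numbers).
profile = [random.gauss(0.6 if 150 <= i < 250 else 0.0, 0.5)
           for i in range(400)]

for factor in (1, 2, 4, 8):
    merged = merge_bins(profile, factor)
    print(f"{factor}x bin size: {len(merged)} bins, "
          f"MAD = {mad_diff(merged):.3f}")
```

Running this should show the MAD value dropping as bins are merged, while the number of bins (the resolution) drops with it. On real data you would compare the MAD values that CopywriteR reports for each bin size you try against the 0.35 to 0.4 guideline.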