Question

CopywriteR - choosing optimal bin size for targeted sequencing panel

sshung • 0

@sshung-9774

Last seen 9.9 years ago

Hi, I'd like to apply CopywriteR to detect CNAs in my targeted capture sequencing data, which is based on a gene panel of about 150 kb. The documentation suggests starting with a bin size of 50 kb. What should I look for in the output to tell whether this bin size best fits my data? Thanks!

copywriter bin size targeted sequencing • 1.8k views

updated 9.9 years ago by t.kuilman ▴ 170 • written 9.9 years ago by sshung • 0

score 0 · Answer 1 · 2016-02-23

With regard to your question: the MAD value is the median absolute deviation, calculated using madDiff from the R package matrixStats. Because madDiff works on the differences between neighbouring bins rather than on the raw values, it takes into account that there are real changes in the ‘signal’, for instance at a gain or loss. For us this is an important measure of how good the copy number plot is at the current resolution. In our experience, MAD values below roughly 0.35 to 0.4 result in decent profiles. If the MAD value is higher, there is always the option to decrease the resolution by increasing the bin size, which reduces the noise in the data (and thus the MAD value). In sum, guided by the MAD value, we decide on the best balance between a high resolution and a less noisy profile. I hope that helps you choose the right bin size (resolution) for your data.
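
To make the trade-off concrete, here is a minimal sketch in plain Python, not CopywriteR or matrixStats code. It assumes a madDiff-style estimate is the scaled MAD of first-order differences divided by sqrt(2), and it mimics a larger bin size by averaging neighbouring bins, which only approximates real re-binning of read counts. The names `mad_diff` and `merge_bins` and the simulated profile are hypothetical and for illustration only.

```python
import random
import statistics


def mad_diff(values, constant=1.4826):
    """Noise estimate from differences between adjacent bins.

    Assumption: mirrors the idea behind a madDiff-style estimator
    (scaled MAD of first-order differences, divided by sqrt(2)).
    """
    diffs = [b - a for a, b in zip(values, values[1:])]
    med = statistics.median(diffs)
    mad = constant * statistics.median([abs(d - med) for d in diffs])
    return mad / 2 ** 0.5


def merge_bins(values, factor):
    """Average consecutive groups of `factor` bins to mimic a larger bin size."""
    usable = len(values) - len(values) % factor
    return [
        statistics.fmean(values[i:i + factor])
        for i in range(0, usable, factor)
    ]


# Simulated log2 ratios (made-up data): 2000 bins with noise sd 0.5
# and a single-copy-style gain of +1 across bins 800..1199.
rng = random.Random(0)
profile = [
    rng.gauss(0.0, 0.5) + (1.0 if 800 <= i < 1200 else 0.0)
    for i in range(2000)
]

# Report the noise estimate at increasing (simulated) bin sizes.
for factor in (1, 2, 4, 8):
    merged = merge_bins(profile, factor)
    print(f"bins merged x{factor}: n={len(merged)}, MAD={mad_diff(merged):.3f}")
```

In this simulation the estimate drops as bins are merged, and the gain in the middle barely affects it because only the two breakpoints produce large differences. With real data you would do the equivalent by rerunning at a few bin sizes and picking the smallest one whose MAD falls below roughly 0.35 to 0.4.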