Question: CopyhelpeR: Preprocessing of mappability
Felix wrote (22 months ago):

Hello!

I'm trying to understand how the preprocessing of the mappability is implemented.

In the CopyhelpeR package, mappability is defined as follows:

"The mappability data were obtained by aligning all possible 51 base pair
genomic fragments using BWA (http://bio-bwa.sourceforge.net/). The
mappability of every fragment was binarized, and the mappability of a specific region
is taken as the average mappability of all fragments that fall into this region."

Now I'm wondering what exactly the mappability of a fragment is, since
no such value is defined in the SAM format, and why you chose a fragment length of 51 bp.

Thanks,
Felix

t.kuilman (Netherlands) wrote (22 months ago):

Hi Felix,

Thank you very much for your interest in CopywriteR. As a measure of mappability at position x, we tested whether the 51 base pairs surrounding position x mapped uniquely (mappability = 1) or not (mappability = 0). Because we use a binned approach, we calculate the mappability of a specific region by averaging the individual mappabilities of all positions contained within that bin. The approach was initially designed for single-end reads, but it works well for paired-end reads too (to check this, you can open the .png files in the CNAprofiles/qc folder).
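
To make the definition concrete, here is a minimal Python sketch of the idea. It is not the actual CopywriteR/CopyhelpeR preprocessing, which used BWA alignments of 51 bp fragments. As simplifying assumptions, "uniquely mapped" is approximated by "the k-mer occurs exactly once in the sequence by exact matching" (no mismatches, no reverse complement), and positions too close to either end to have a full k-mer around them are scored 0. The function names `position_mappability` and `bin_mappability` are hypothetical.

```python
from collections import Counter
from typing import List


def position_mappability(genome: str, k: int) -> List[int]:
    """Binary mappability per position: 1 if the k-mer centred on the
    position occurs exactly once in `genome`, otherwise 0.

    Assumption: exact string matching stands in for a real aligner.
    Positions without a full k-mer around them (sequence ends) get 0.
    """
    half = k // 2
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    scores = [0] * len(genome)
    for start, kmer in enumerate(kmers):
        # The k-mer starting at `start` is centred on position start + half.
        scores[start + half] = 1 if counts[kmer] == 1 else 0
    return scores


def bin_mappability(scores: List[int], bin_size: int) -> List[float]:
    """Average the binary per-position scores within consecutive bins."""
    result = []
    for i in range(0, len(scores), bin_size):
        chunk = scores[i:i + bin_size]
        result.append(sum(chunk) / len(chunk))
    return result


# Toy example with k = 3 instead of 51 so the sequence stays readable.
toy_genome = "ACGTACGTTTGCA"
per_position = position_mappability(toy_genome, k=3)
per_bin = bin_mappability(per_position, bin_size=4)
print(per_position)
print(per_bin)
```

In this toy sequence the repeated "ACGT" makes the 3-mers "ACG" and "CGT" non-unique, so the bins overlapping those positions get a mappability below 1.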

With regard to your question about the 51 bp length: there is no particular reason why we chose this length, and we could have chosen a longer one as well. As far as I am aware, all mappability data depend (and should depend) on read length / k-mer size, though. Unfortunately, we cannot provide helper files for every commonly used read length / k-mer size due to space restrictions, so we have settled on 51 bp. I know this is an imperfect solution, so if you have a better alternative I would be happy to hear about it.
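
The dependence on k-mer size can be illustrated with a small Python sketch. Again this is only a toy under the assumption that exact-match uniqueness approximates unique alignment; it is not how the CopyhelpeR data were generated, and `unique_fraction` is a hypothetical helper name.

```python
from collections import Counter


def unique_fraction(genome: str, k: int) -> float:
    """Fraction of k-mers in `genome` that occur exactly once
    (exact matching only; an assumption standing in for alignment)."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return sum(1 for kmer in kmers if counts[kmer] == 1) / len(kmers)


toy_genome = "ACGTACGTTTGCA"
for k in (2, 3, 5):
    # Longer k-mers span more sequence context, so more of them are unique.
    print(k, unique_fraction(toy_genome, k))
```

On this toy sequence the unique fraction rises as k grows, which is why a mappability track computed for one fragment length is only an approximation for reads of a different length.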

I hope this answers your question.

Best,

Thomas
