Question: cn.mops CNV calling for multiple samples
0
gravatar for joonan
4.0 years ago by
joonan0
United States
joonan0 wrote:

Hi, I had the test run with WGS daat of four individuals (from one family) and got pretty interesting result. But also reckoned that some cnv calls looked bit biased due to the samples that are from the same family. I am tempted to run multiple samples - increasing the number of families. Do you have any idea or minimum number required for multiple sample CNV calling using cn.mops?

Joon

 

 

cn.mops • 1.1k views
ADD COMMENTlink modified 3.9 years ago by Günter Klambauer540 • written 4.0 years ago by joonan0
Answer: cn.mops CNV calling for multiple samples
0
gravatar for Günter Klambauer
4.0 years ago by
Austria
Günter Klambauer540 wrote:

Hello Joon,

Thanks for your interest in cn.mops. I would recommend at least 6 samples for cn.mops. The more the better, of course. Note that the samples that you are adding should have exactly the same processing (sequencing technique, QC, readmapper, etc) as the ones you already have!

 

Let me know if you need more help with your analysis!

 

Regards,

Günter

ADD COMMENTlink written 4.0 years ago by Günter Klambauer540
Answer: cn.mops CNV calling for multiple samples
0
gravatar for joonan
3.9 years ago by
joonan0
United States
joonan0 wrote:

Thanks Gunter,

More questions I have:

1. How can I know the WL after I ran getReadCountsFromBAM without specifying WL? I found that the value after this function has an interval and this interval might be a WL, in the case below was 1000. Please correct me if I am wrong, or is there any function to retrieve the WL value?

          seqnames               ranges strand   | sample1.chr22.bam sample2.chr22.bam 
             <Rle>            <IRanges>  <Rle>   |                <integer>                <integer>                
      [1]       22         [   1, 1000]      *   |                        0                        0                        
      [2]       22         [1001, 2000]      *   |    

 

2. Do you have any suggestion for optimal WL for 30-50X WGS human data?

3. As suggest previously, I'd like to run cn.mops with WL from the lowest coverage sample(s) and wondering cn.mops package provides a function for get coverage.

4. Is there any function for parallel processing in cn.mops? I'd like to speed up the process and if no parallel processing provided, I am thinking running cn.mops by chromosomes.

 

 

Many thanks again.

 

 

 

 

ADD COMMENTlink written 3.9 years ago by joonan0
Answer: cn.mops CNV calling for multiple samples
0
gravatar for Günter Klambauer
3.9 years ago by
Austria
Günter Klambauer540 wrote:

Hello,

1.

width(bamDataRanges)[1]

2. There is a simple relation between coverage (C), window length (WL), read length (RL) and average number of reads per bin (m): C=m*RL/WL. cn.MOPS has a good performance if the average number of reads in a window is 50-100. Assuming your reads have read length of 100 and a coverage of 50X, this suggests a WL of 100bp-200bp.

3. cn.mops does not provide a function for calculating the coverage but it is easy to determine if you have already calculated the read counts:

colSums(as.matrix(values(bamDataRanges)))/sum(width(bamDataRanges))*readLength

You could - for example - do an initial run with a large WL, then calculate the read counts and coverage. Based on these coverage values you can determine a more suitable WL and recalculate the read counts. Probably there is a function to calculate the coverage in Rsamtools.

4. Yes, of course. Almost all functions have an argument "parallel" that you can set to the desired number of processing cores: cn.mops(..., parallel=24) would run cn.mops on 24 cores.

 

Thanks for your questions!

Regards,

Günter

ADD COMMENTlink written 3.9 years ago by Günter Klambauer540
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 162 users visited in the last hour