Hi,
I'm using cn.mops to detect CNVs from Targeted DNA Sequencing data by "TruSight One Sequencing Panel from Illumina" investigated for one patient and one control samples, on an Illumina MiSeq sequencer with 150 bp paired-end run.
By running the algorithm with the referencecn.mops function, it results #GRanges object with 67 ranges.
Then, by verifying, the resulting CNVs with the coverage of the enriched regions in the raw data,
I found that several detected CNVs aren't included in the targeted regions enriched by the panel!!
Please, How can you explain that?
Find below my code, what am I doing wrong?
Thanks,
Amal
#######
BAMFiles <- list.files(pattern=".bam$")
refSeqName=c(paste0("chr",1:22),"chrX","chrY")
bamDataRanges<-getReadCountsFromBAM(BAMFiles,sampleNames=paste("Sample",2:3), refSeqName= refSeqName,mode="paired",WL=5000)
case.ctrl2<-referencecn.mops(cases=bamDataRanges[,3],controls=bamDataRanges[,2]) case.ctrl2.ICN <- calcIntegerCopyNumbers(case.ctrl2)
(cnvs(case.ctrl2.ICN ))
#####
Thank you Günter for your reply, but I do not have a list of targeted regions provided as BED file, how can I retrieve it??
Thanks,
Best Regards
Hello Amal,
There is a workaround: Go back to the initial "
getReadCountsFromBAM
" and use a small segment size, e.g. "WL=50
". Then you get a read count matrix with segments/windows along the whole genome. Remove rows with zero read counts across all samples (these are the DNA segments that were not targeted). Then run "exomecn.mops
" on the the read count matrix with the remaining rows.Regards,
Günter
Hi Günter,
Thanks for your suggestion!
I tried but I think that I make a mistake somewhere?!
#results: GRanges object with 61913560 ranges and 3 metadata columns:
matrix <- DataRanges.smallWL[DataRanges.smallWL[,1:3] != 0, ]
#results: GRanges object with 0 ranges and 3 metadata columns!
I think your code for filtering the DNA segments does not work. I give you code how to do this on the data that comes with cn.MOPS:
The code above removes DNA segments where all samples have less than 70 reads. You should use a smaller number, e.g. 5 or 10, instead of 70.
Regards,
Günter
Dear Günter
Thanks a lot for all your suggestions that helped me cope! Just to remember, I'm working with targeted sequencing on two samples, one case and one control. I tried with your two previous proposals: 1/The first one:To generate the read counts from BAM Files comparing to whole genome use a small segment size
## I tried also with referencecn.mops:
>ref.DataRanges.smallWL.case_ctrl.idx<referencecn.mops(cases=DataRanges.smallWL.case_ctrl.idx[,2],controls= DataRanges.smallWL.case_ctrl.idx[,1]) >CN.ref.DataRanges.smallWL.case_ctrl.idx<calcIntegerCopyNumbers(ref.DataRanges.smallWL.case_ctrl.idx) >cnvs_CN.ref.DataRanges.smallWL.case_ctrl.idx<(cnvs(CN.ref.DataRanges.smallWL.case_ctrl.idx)) >cnvs_CN.ref.DataRanges.smallWL.case_ctrl.idx #GRanges object with 45 ranges and 4 metadata columns: # =>Of a first view, overlapping the two resulted GRanges objects only 5 CNVrs have the CNVs (the same starts)!
2/The second one: To generate the read counts from BAM Files basing on the list of targeted regions in the panel provided as a BED file.
#I tried also with referencecn.mops insted of exomecn.mops
#=> When overlapping the two resulted GRanges objects (with SegmentReadCounts) only 1 CNV was deduced with the same region! And it’s also the only CNVr between the four previous resulted CNV regions GRanges objects.
I expected more overlapping CNVs?! Really, I lost between the different methods, on which one I have to trust? And how to choose the best approach for my analysis?
Looking for youy reply, Many thanks, most grateful!
Best, Amal
Please, there is someone who can answer me for my previous comment??!
Hello Amal,
Any results that do not come from "referencecn.mops" do not make sense, since you only have two samples. If there is a CNV in one of the two samples, it is randomly assigned to one of the two... The assumption of cn.mops (including exomecn.mops) is that the majority of samples has copy number 2, but here -- if there is a CNV -- the "majority" can only be 50%. I would not recommend to use cn.mops below 8 samples.
To sum it up: you should trust only the results of "referencecn.mops".
Regards,
Günter
Dear Günter,
Many thanks for your reply.
I will take that into account!
Best,
Amal