Detecting copy number variations with the cn.MOPS package
lpascual ▴ 50
@lpascual-4906
Dear group,

I'm trying to detect copy number variations with the cn.MOPS package. I have eight samples (from different individuals, re-sequenced with Solexa). The reads were mapped against the reference genome with BWA; the genome coverage of my samples ranges from 6x to 13x. I ran cn.mops with the package's default parameters. However, the algorithm detects some CNV regions for which the copy number is 2 in all individuals. Does anyone have an explanation for this result?

Below are the code I used and one example of a detected CNV where the copy number is 2.

Thanks in advance,
Laura

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsamtools_1.8.6      Biostrings_2.24.1    cn.mops_1.2.6
[4] GenomicRanges_1.8.12 IRanges_1.14.4       Biobase_2.16.0
[7] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
[1] bitops_1.0-4.1 stats4_2.15.1  zlibbioc_1.2.0

> Countsreads_SL2.40ch02<-getReadCountsFromBAM(BAMFiles,
    c("Cervil","Criollo_new","Ferum_new","LA0147","LA1420","Levovil","Plovdiv","Stupike"),
    refSeqName=c("SL2.40ch02"),mode="unpaired")

> resCNMOPS_SL2.40ch02<-cn.mops(Countsreads_SL2.40ch02)

> head(params(resCNMOPS_SL2.40ch02),n=11L)
$method
[1] "cn.mops"

$folds
[1] 0.025 0.500 1.000 1.500 2.000 2.500 3.000 3.500 4.000

$classes
[1] "CN0" "CN1" "CN2" "CN3" "CN4" "CN5" "CN6" "CN7" "CN8"

$priorimpact
[1] 1

$cyc
[1] 20

$normType
[1] "poisson"

$normQu
[1] 0.25

$upperThreshold
[1] 0.5

$lowerThreshold
[1] -0.9

$minWidth
[1] 3

$SegmentationParams
character(0)

> Countsreads_SL2.40ch02[1512]
GRanges with 1 range and 8 elementMetadata cols:
        seqnames               ranges strand |   Levovil Criollo_new Ferum_new   Stupike   Plovdiv    LA1420    Cervil    LA0147
           <rle>            <iranges>  <rle> | <integer>   <integer> <integer> <integer> <integer> <integer> <integer> <integer>
  [1] SL2.40ch02 [43819001, 43848000]      * |       112         403       164      1306      1192      1181       907       617
  seqlengths:
   SL2.40ch02
           NA

> cnvs(resCNMOPS_SL2.40ch02)[6]
GRanges with 1 range and 4 elementMetadata cols:
        seqnames               ranges strand |  sampleName    median      mean          CN
           <rle>            <iranges>  <rle> |    <factor> <numeric> <numeric> <character>
  [1] SL2.40ch02 [43819001, 43906000]      * | Criollo_new 0.5849618   0.63563         CN2
  seqlengths:
   SL2.40ch02
           NA

> cnvr(resCNMOPS_SL2.40ch02)[7]
GRanges with 1 range and 8 elementMetadata cols:
        seqnames               ranges strand | CN.Levovil CN.Criollo_new CN.Ferum_new CN.Stupike CN.Plovdiv CN.LA1420 CN.Cervil CN.LA0147
           <rle>            <iranges>  <rle> |   <factor>       <factor>     <factor>   <factor>   <factor>  <factor>  <factor>  <factor>
  [1] SL2.40ch02 [43819001, 43906000]      * |        CN2            CN2          CN2        CN2        CN2       CN2       CN2       CN2
  seqlengths:
   SL2.40ch02
           NA
@gunter-klambauer-5426
Austria
Hello Laura,

For the "CNV regions" I join the individual CNVs when they cluster at some location, so a CNV region can be longer than the individual CNVs it contains. To obtain the copy number for the region, I summarize the counts from the initial segments. It can therefore happen that CN2 has the largest posterior probability for all samples, although the posterior probabilities of other copy numbers should also be high.

I suggest you look in the slots "cnvs", "segmentation" or "integerCopyNumber"; there you should find copy numbers different from 2.

Another possibility is to lower the parameter "priorImpact". This parameter pushes the algorithm to explain the data by copy number 2, so lowering it leads to more detections. It should be adjusted to each data set.

I hope this helps.

Regards,
Günter

On 12/07/2012 04:35 PM, lpascual wrote:
> Dear group,
>
> I'm trying to detect copy number variations with the cn.MOPS package. I
> have eight different samples (coming from different individuals
> re-sequenced by Solexa). Sequences have been mapped against the
> reference genome with BWA, the genome coverage of my samples ranges
> from 13x to 6x. I have run the cn.mops using the package default parameters.
> However, when I run the algorithm it detects some CNV regions for which
> the copy number is 2 for all the individuals. Does anyone have an
> explanation for this result?
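To make the point about merged CNV regions concrete, here is a toy sketch in base R. It is not cn.mops's actual code: it assumes a simple Poisson model per copy-number class, a made-up prior that strongly favours CN2, and that the region's count is just the sum of its segments' counts. The function name callCopyNumber and all read counts are hypothetical; only the fold values are taken from the params() output in the question.

```r
# Fold changes relative to copy number 2, as listed in the params() output.
folds   <- c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4)
classes <- paste0("CN", 0:8)

# Hypothetical prior that strongly favours CN2 (NOT cn.mops's fitted values).
prior <- c(0.0125, 0.0125, 0.9, rep(0.0125, 6))

# Copy-number class with the largest posterior for one sample, given its
# read count and the read count expected at copy number 2.
callCopyNumber <- function(count, expectedCN2) {
  logPost <- log(prior) + dpois(count, lambda = folds * expectedCN2, log = TRUE)
  classes[which.max(logPost)]
}

# Three adjacent segments, each expecting 100 reads at CN2 (made-up numbers).
expected <- c(100, 100, 100)
observed <- c(50, 100, 100)   # only the first segment looks like a loss

# Call each segment on its own, then call the merged region on summed counts.
perSegment <- mapply(callCopyNumber, observed, expected)
merged     <- callCopyNumber(sum(observed), sum(expected))

print(perSegment)
print(merged)
```

Under these assumptions the first segment alone is called as a loss, while the merged region comes out as CN2 because the two normal segments dilute the signal. In cn.mops itself, the per-segment results are reached through accessor functions such as cnvs(), segmentation() and integerCopyNumber() on the result object, and priorImpact is an argument of cn.mops().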