Detecting copy number variations with the cn.MOPS package


lpascual ▴ 50

@lpascual-4906

Last seen 9.6 years ago

Dear group,

I'm trying to detect copy number variations with the cn.MOPS package. I have eight samples from different individuals, re-sequenced with Solexa. The reads were mapped against the reference genome with BWA, and genome coverage ranges from 6x to 13x across samples. I ran cn.mops with the package's default parameters. However, the algorithm reports some CNV regions in which the copy number is 2 for all individuals. Does anyone have an explanation for this result?

Below are the code I used and one example of a detected CNV where the copy number is 2.

Thanks in advance,
Laura

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsamtools_1.8.6      Biostrings_2.24.1    cn.mops_1.2.6
[4] GenomicRanges_1.8.12 IRanges_1.14.4       Biobase_2.16.0
[7] BiocGenerics_0.2.0

loaded via a namespace (and not attached):
[1] bitops_1.0-4.1 stats4_2.15.1  zlibbioc_1.2.0

> Countsreads_SL2.40ch02 <- getReadCountsFromBAM(BAMFiles,
+     c("Cervil","Criollo_new","Ferum_new","LA0147","LA1420","Levovil","Plovdiv","Stupike"),
+     refSeqName=c("SL2.40ch02"), mode="unpaired")

> resCNMOPS_SL2.40ch02 <- cn.mops(Countsreads_SL2.40ch02)

> head(params(resCNMOPS_SL2.40ch02), n=11L)
$method
[1] "cn.mops"
$folds
[1] 0.025 0.500 1.000 1.500 2.000 2.500 3.000 3.500 4.000
$classes
[1] "CN0" "CN1" "CN2" "CN3" "CN4" "CN5" "CN6" "CN7" "CN8"
$priorimpact
[1] 1
$cyc
[1] 20
$normType
[1] "poisson"
$normQu
[1] 0.25
$upperThreshold
[1] 0.5
$lowerThreshold
[1] -0.9
$minWidth
[1] 3
$SegmentationParams
character(0)

> Countsreads_SL2.40ch02[1512]
GRanges with 1 range and 8 elementMetadata cols:
        seqnames               ranges strand |   Levovil Criollo_new Ferum_new   Stupike   Plovdiv    LA1420    Cervil    LA0147
           <Rle>            <IRanges>  <Rle> | <integer>   <integer> <integer> <integer> <integer> <integer> <integer> <integer>
  [1] SL2.40ch02 [43819001, 43848000]      * |       112         403       164      1306      1192      1181       907       617

  seqlengths:
   SL2.40ch02
           NA

> cnvs(resCNMOPS_SL2.40ch02)[6]
GRanges with 1 range and 4 elementMetadata cols:
        seqnames               ranges strand |  sampleName    median      mean          CN
           <Rle>            <IRanges>  <Rle> |    <factor> <numeric> <numeric> <character>
  [1] SL2.40ch02 [43819001, 43906000]      * | Criollo_new 0.5849618   0.63563         CN2

  seqlengths:
   SL2.40ch02
           NA

> cnvr(resCNMOPS_SL2.40ch02)[7]
GRanges with 1 range and 8 elementMetadata cols:
        seqnames               ranges strand | CN.Levovil CN.Criollo_new CN.Ferum_new CN.Stupike CN.Plovdiv CN.LA1420 CN.Cervil CN.LA0147
           <Rle>            <IRanges>  <Rle> |   <factor>       <factor>     <factor>   <factor>   <factor>  <factor>  <factor>  <factor>
  [1] SL2.40ch02 [43819001, 43906000]      * |        CN2            CN2          CN2        CN2        CN2       CN2       CN2       CN2

  seqlengths:
   SL2.40ch02
           NA

Coverage cn.mops • 1.2k views

updated 11.4 years ago by Günter Klambauer ▴ 540 • written 11.4 years ago by lpascual ▴ 50


Günter Klambauer ▴ 540

@gunter-klambauer-5426

Last seen 3.3 years ago

Austria

Hello Laura,

For the "CNV regions" I join the individual CNVs when they cluster at some location, so a CNV region can be longer than the individual CNVs it contains. To obtain the copy number for the region, I summarize the counts from the initial segments. It can therefore happen that CN2 has the largest posterior probability for all samples, although the posterior probabilities of other copy numbers should also be high. I suggest you look in the slots "cnvs", "segmentation" or "integerCopyNumber"; there you should find copy numbers different from 2.

Another possibility is to lower the parameter "priorImpact". This parameter pushes the algorithm toward explaining the data with copy number 2, so lowering it leads to more detections. It should be adjusted for each data set.

I hope this helps.

Regards,
Günter

On 12/07/2012 04:35 PM, lpascual wrote:
> Dear group,
> [...]
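
The point about region-level summaries can be illustrated with a toy sketch in base R. This is not the cn.mops implementation: it assumes a single-sample Poisson model with a flat prior over the classes, and the helper `map_class`, the expected count `lambda`, and the window counts are all made up for illustration. Only the `folds` and `classes` values are taken from the `params()` output in the question.

```r
# Toy illustration only: NOT how cn.mops computes its posteriors.
folds   <- c(0.025, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, 4)  # $folds from params()
classes <- paste0("CN", 0:8)                         # $classes from params()

# Hypothetical helper: most likely copy-number class for an observed
# read count, given the expected count at CN2 (lambda) and a flat prior.
map_class <- function(count, lambda) {
  loglik <- dpois(count, lambda * folds, log = TRUE)
  classes[which.max(loglik)]
}

lambda  <- 100                       # assumed expected reads per window at CN2
windows <- c(150, 100, 95, 105, 98)  # made-up counts; only window 1 is elevated

# Class per window, then class for the merged region (summed counts).
per_window <- vapply(windows, map_class, character(1), lambda = lambda)
region     <- map_class(sum(windows), lambda * length(windows))

print(per_window)
print(region)
```

In this toy setup the first window on its own is best explained by CN3, but once the counts of all five windows are summed the merged region is best explained by CN2. That is the same dilution effect described in the answer: a region that is longer than the individual CNV can end up with CN2 as its most probable class.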

11.4 years ago by Günter Klambauer ▴ 540
