Question

cn.mops segmentation error

0

Entering edit mode

jacorvar ▴ 40

@jacorvar-8972

Last seen 2.2 years ago

European Union

Hi,

I'm trying to perform a CNV analysis on 12 BAM files from WES data using cn.mops package. BAM files were aligned to GChr38, markduplicated and recalibrated following the GATK best practices.

BAMFiles <- grep('84|93|117|134|350|454', list.files(path = 'results/Preprocessing/', pattern = '*.recal.bam$', recursive = T, full.names = T), value = T)
segments <- read.table("data/exons_plus50.bed",sep="\t",as.is=TRUE)
gr <- GRanges(segments[,1],IRanges(segments[,2],segments[,3]))
gr2 <- tidyChromosomes(
    gr,
    keep.X = T,
    keep.Y = T,
    keep.M = FALSE,
    keep.nonstandard = F,
    genome = NULL
)
X <- getSegmentReadCountsFromBAM(BAMFiles,GR=gr2, parallel = 16)

The targets bed file (data/exons_plus50.bed) contains all the exons plus their flanking 50 bp for the corresponding human genome version. I had to exclude the M and the non-standard (haplotypes) chromosomes because the getSegmentReadCountsFromBAM function raises an error otherwise.

So far it ran as expected, without errors.

However, when running exomecn.mops function, it prompts the following:

resCNMOPS <- exomecn.mops(X, parallel = 16)
Normalizing...
Starting local modeling, please be patient...  
Reference sequence:  chr1
Reference sequence:  chr2
Reference sequence:  chr3
Reference sequence:  chr4
Reference sequence:  chr5
Reference sequence:  chr6
Reference sequence:  chr7
Reference sequence:  chr8
Reference sequence:  chr9
Reference sequence:  chrX
Reference sequence:  chrY
Reference sequence:  chr10
Reference sequence:  chr11
Reference sequence:  chr12
Reference sequence:  chr13
Reference sequence:  chr14
Reference sequence:  chr15
Reference sequence:  chr16
Reference sequence:  chr17
Reference sequence:  chr18
Reference sequence:  chr19
Reference sequence:  chr20
Reference sequence:  chr21
Reference sequence:  chr22
Starting segmentation algorithm...
Using "fastseg" for segmentation.
Error in .width_as_unnamed_integer(width, msg = "an end that is greater or equal to its start minus one") : 
  each range must have an end that is greater or equal to its start minus one

Setting the algorithm to DNACopy gives the same error.

I've tried with the lifted over (GRCh37 -> GRCh38) version of the baits BED file provided by the vendor (the original one has the hg19 version) for the corresponding kit (here) instead of the targets file (containing all exons), but the error persists.

I appreciate any clue about this issue.

sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] BRGenomics_1.6.0     rtracklayer_1.54.0   cn.mops_1.40.0       GenomicRanges_1.46.1
[5] GenomeInfoDb_1.30.1  IRanges_2.28.0       S4Vectors_0.32.4     BiocGenerics_0.40.0

cn.mops • 1.9k views

ADD COMMENT • link updated 3.4 years ago by gpovysil • 0 • written 3.5 years ago by jacorvar ▴ 40

0

Entering edit mode

Sorry for the delay. Based on the error message, it looks like at least one of your end coordinates is lower than the corresponding start position. The Truseq bed file I downloaded from the link you provided, does not have this issue, but maybe it is introduced in the liftover step. Can you share the actual bed file you used?

ADD REPLY • link 3.4 years ago gpovysil • 0

0

Entering edit mode

Sure, you can download for the next week on this link: https://we.tl/t-xFqjQLfzKl

ADD REPLY • link 3.4 years ago jacorvar ▴ 40

0

Entering edit mode

While there is interval in your bed file with higher start compared to end, some of the intervals are really small (5-10 bp). I would suggest you remove those and try again since read counting will not work well anyway for such small intervals. For a first test I would filter out all with a width of less than 100bp, you can try adding some of them back later if it works.

ADD REPLY • link 3.4 years ago gpovysil • 0