Search
Question: GOTHiC hg19 error
0
gravatar for Nathan Sheffield
21 months ago by
University of Virginia
Nathan Sheffield10 wrote:

Hi, I think I may have found a bug in GOTHiC. It does not seem to work with hg19, while it works fine with hg18, independent of the input, restriction site, etc.

You can reproduce this error by just using the vignette sample, but switching the genome from hg18 to hg19. The input files doesn't really matter I believe

dirPath = system.file("extdata", package="HiCDataLymphoblast")
fileName1 = list.files(dirPath, full.names=TRUE)[1]
fileName2 = list.files(dirPath, full.names=TRUE)[2]
binom=GOTHiC(fileName1,fileName2, sampleName="lymphoid_chr20",
res=1000000, BSgenomeName="BSgenome.Hsapiens.UCSC.hg19",
genome=BSgenome.Hsapiens.UCSC.hg19,
restrictionSite="A^AGCTT", enzyme="HindIII" ,cistrans="all", filterdist=10000,
DUPLICATETHRESHOLD=1, fileType="Bowtie", parallel=FALSE, cores=NULL)

 

Output:

Loading paired reads file ...
Loading mapped reads file ...
Computing binomial ...
Error in .Call2("solve_user_SEW0", start, end, width, PACKAGE = "IRanges") :
  solving row 837856: negative widths are not allowed

 

I traced the problem to:

GOTHiC:::.getRestrictionSitesFromBSgenome

When it tries to create this GRanges object, there are regions with ends > starts and so it can't create it

resGR <- GRanges(seqnames = chrs, ranges = IRanges(start = starts,
        end = ends))

If you look at:

which(ends-starts < 0)
                       chr1_gl000191_random  chr1_gl000192_random        chr4_ctg9_hap1  chr4_gl000193_random  chr4_gl000194_random         chr6_apd_hap1
              7127610               7128943               7135413               7148952               7160965               7172430               7183213
        chr6_dbb_hap3  chr8_gl000196_random  chr8_gl000197_random  chr9_gl000198_random  chr9_gl000201_random chr11_gl000202_random chr17_gl000203_random
              7207137               7213533               7214204               7214510               7215687               7216069               7217083
chr17_gl000204_random chr17_gl000205_random                                             chr18_gl000207_random chr19_gl000208_random                       
              7217456               7219252               7219253               7219266               7219558               7219938               7219998
chr21_gl000210_random                                                                                                                                     
              7220097               7220178               7220280               7220400               7220507               7220594               7220706
                                                                                                                                                          
              7220853               7220953               7221068               7221187               7221289               7221419               7221525
                                                   chrUn_gl000226                                                                          chrUn_gl000229
              7221604               7221696               7221792               7221793               7221894               7221999               7222104

 

Something wrong, I think, with the way it processes the genome, specific to hg19.

sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_AT.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_AT.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_AT.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.0 GOTHiC_1.2.2                     
 [3] data.table_1.9.4                  BSgenome_1.34.1                  
 [5] rtracklayer_1.26.3                Biostrings_2.34.1                
 [7] XVector_0.6.0                     GenomicRanges_1.18.4             
 [9] GenomeInfoDb_1.2.5                IRanges_2.0.1                    
[11] S4Vectors_0.4.0                   BiocGenerics_0.12.1              
[13] project.init_0.0.1                BiocInstaller_1.16.5             

loaded via a namespace (and not attached):
 [1] base64enc_0.1-2         BatchJobs_1.6           BBmisc_1.9             
 [4] Biobase_2.26.0          BiocParallel_1.0.3      bitops_1.0-6           
 [7] brew_1.0-6              checkmate_1.6.1         chron_2.3-47           
[10] codetools_0.2-14        colorspace_1.2-6        crayon_1.3.1           
[13] DBI_0.3.1               digest_0.6.8            fail_1.2               
[16] foreach_1.4.2           GenomicAlignments_1.2.2 ggplot2_1.0.1          
[19] grid_3.1.2              gtable_0.1.2            hwriter_1.3.2          
[22] iterators_1.0.7         lattice_0.20-33         latticeExtra_0.6-26    
[25] magrittr_1.5            MASS_7.3-43             memoise_0.2.1          
[28] munsell_0.4.2           plyr_1.8.3              proto_0.3-10           
[31] RColorBrewer_1.1-2      Rcpp_0.12.0             RCurl_1.95-4.7         
[34] reshape2_1.4.1          Rsamtools_1.18.3        RSQLite_1.0.0          
[37] scales_0.2.5            sendmailR_1.2-1         ShortRead_1.24.0       
[40] stringi_0.5-5           stringr_1.0.0           testthat_0.10.0        
[43] tools_3.1.2             XML_3.98-1.3            zlibbioc_1.12.0        
>

ADD COMMENTlink modified 21 months ago • written 21 months ago by Nathan Sheffield10

you can fix the bug, maybe, with something like this:

w = which(ends-starts > 0)
resGR <- GRanges(seqnames = chrs[w], ranges = IRanges(start = starts[w],
        end = ends[w]))
ADD REPLYlink written 21 months ago by Nathan Sheffield10
0
gravatar for Nathan Sheffield
21 months ago by
University of Virginia
Nathan Sheffield10 wrote:

Upgrading to the latest version solves the problem.

ADD COMMENTlink written 21 months ago by Nathan Sheffield10
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 166 users visited in the last hour