diffHic: Error while counting in predefined regions
2
0
Entering edit mode
@manjulathimma-11250
Last seen 8.2 years ago

Hi Aaron, I am trying to count interacting fragments in pre-defined regions.

But there seems to an error.

redata <- connectCounts(input, hs.param, regions=gene.body)
Error in connectCounts(input, hs.param, regions = gene.body) :
  chromosome present in regions and not in fragments
In addition: Warning message:
In connectCounts(input, hs.param, regions = gene.body) :
  stranded region ranges have no interpretation, coercing unstrandedness

> input
[1] "/Users/thimmamp/temp_tobe_deleted/HiCMapped/HepG2_trimmed.h5" "/Users/thimmamp/temp_tobe_deleted/HiCMapped/Ago1_trimmed.h5"

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 GenomicFeatures_1.24.5                  AnnotationDbi_1.34.4                    BiocInstaller_1.22.3                    diffHic_1.4.3                          
 [6] InteractionSet_1.0.4                    SummarizedExperiment_1.2.3              Biobase_2.32.0                          GenomicRanges_1.24.2                    GenomeInfoDb_1.8.3                     
[11] IRanges_2.6.1                           S4Vectors_0.10.2                        BiocGenerics_0.18.0                    

loaded via a namespace (and not attached):
 [1] edgeR_3.14.0            XVector_0.12.1          zlibbioc_1.18.0         GenomicAlignments_1.8.4 BiocParallel_1.6.3      BSgenome_1.40.1         lattice_0.20-33         tools_3.3.1            
 [9] grid_3.3.1              rhdf5_2.16.0            csaw_1.6.1              DBI_0.4-1               Rhtslib_1.4.3           Matrix_1.2-6            rtracklayer_1.32.2      bitops_1.0-6           
[17] biomaRt_2.28.0          RCurl_1.95-4.8          RSQLite_1.0.0           limma_3.28.17           Rsamtools_1.24.0        Biostrings_2.40.2       locfit_1.5-9.1          XML_3.98-1.4           
 

Would you please help me resolve this?

error • 1.5k views
ADD COMMENT
0
Entering edit mode

If you don't tag this with diffHic, I won't get notified of your post.

ADD REPLY
1
Entering edit mode
Aaron Lun ★ 28k
@alun
Last seen 7 hours ago
The city by the bay

See C: diffHic:Fails to load bam file during preparePairs. Don't double-post in two locations, it's annoying.

ADD COMMENT
0
Entering edit mode
@manjulathimma-11250
Last seen 8.2 years ago

Hi Aaron,

Removing unwanted chromosome was already done.

 

frags <- frags[seqnames(frags) %in% paste0("chr",c(1:22,"X","Y"))]
hs.param <- reform(hs.param,fragments=frags)
#Ago1 <- preparePairs("/Users/thimmamp/temp_tobe_deleted/HiCMapped/Ago1_presplitmap_out_fixmate_nochrM.bam",hs.param,file="/Users/thimmamp/temp_tobe_deleted/HiCMapped/Ago1_diffhic.h5",dedup=TRUE,minq=10)
Ago1 <- preparePairs("/Users/thimmamp/temp_tobe_deleted/HiCMapped/Ago1_presplit_out_fixmate_nochrM_new.bam",hs.param,file="/Users/thimmamp/temp_tobe_deleted/HiCMapped/Ago1_diffhic.h5",dedup=TRUE,minq=10)
HepG2 <- preparePairs("/Users/thimmamp/temp_tobe_deleted/HiCMapped/HepG2_presplit_out_fixmate_nochrM_new.bam",hs.param,file="/Users/thimmamp/temp_tobe_deleted/HiCMapped/HepG2_diffhic.h5",dedup=TRUE,minq=10)


I  do not understand the need to do it again....

If I should redo this, how would I do it for 'input' object before invoking connectCounts?

 

ADD COMMENT
0
Entering edit mode

BTW sorry for double posting!

ADD REPLY
0
Entering edit mode

No, the error in connectCounts is caused by the fact that you've got chromosomes in regions that aren't in fragments. This requires removing the excess chromosomes from regions. In the code above, you're only removing entries in fragments with chromosomes that aren't in the BAM file, which is a completely different problem.

P.S. Post replies to answers as comments rather than as answers, unless you're actually answering the original post.

ADD REPLY
0
Entering edit mode

So How to identify and remove chromosomes in regions that arent in fragments?

ADD REPLY
0
Entering edit mode

Well, it's pretty simple. Try:

regions[seqnames(regions) %in% seqlevels(frags)]

... or something like that.

ADD REPLY
0
Entering edit mode

Sorry for being naive, since I am new to this HiC world!

When I run the command,

regions[seqnames(regions) %in% seqlevels(frags)]
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function ‘seqnames’ for signature ‘"standardGeneric"’

I am following your user guide and at sections 3.2.2 and 3.3

Could you please tell which object should I modify, in order to get rid of unwanted chromosome(s) in the regions file?

 

ADD REPLY
0
Entering edit mode

regions is a placeholder for your actual GRanges object. In your case, it is gene.body.

ADD REPLY
0
Entering edit mode

I did try to filter unwanted chromosomes,

> gene.body <- gene.body[seqnames(gene.body) %in% seqlevels(frags)]
> redata <- connectCounts(input, hs.param, regions=gene.body)
Error in connectCounts(input, hs.param, regions = gene.body) :
  chromosome present in regions and not in fragments
In addition: Warning message:
In connectCounts(input, hs.param, regions = gene.body) :
  stranded region ranges have no interpretation, coercing unstrandedness

Why this error is thrown again? Have I done it the right way?

 

ADD REPLY
0
Entering edit mode

frags should be the same as whatever hs.param$fragments is. You might also have to try:

gene.body <- gene.body[seqnames(gene.body) %in% 
    seqlevelsInUse(hs.param$fragments)]

This removes the entries in gene.body located on chromosomes that do not exist in hs.param$fragments. If it still doesn't work, help me out and try to diagnose the problem on your end; what chromosomes are present in seqnames(gene.body) that are not present in seqnames(hs.param$fragments)?

ADD REPLY
0
Entering edit mode

Hi Aaron,

Thanks for your reply. It is working but with a warning message.

>     gene.body <- gene.body[seqnames(gene.body) %in%
+     seqlevelsInUse(hs.param$fragments)]
> redata <- connectCounts(input, hs.param, regions=gene.body)
Warning message:
In connectCounts(input, hs.param, regions = gene.body) :
  stranded region ranges have no interpretation, coercing unstrandedness

Can I ignore this warning?

 

ADD REPLY
0
Entering edit mode

Yes. It just means that the strandedness of the entries in gene.body are ignored. This is sensible given that Hi-C works on the DNA and doesn't care about the direction of transcription.

ADD REPLY
0
Entering edit mode

Thanks very much!

ADD REPLY

Login before adding your answer.

Traffic: 452 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6