I have 3 annotated feature for my GI object and when I annotated them with annotateInteractions() and checked the interactions with isInteractionType() I got duplicate interactions in the data.frame. For example if a connection between 2 annotated regions are in anchor1 and anchor2, respectively, I'm also seeing the same interaction in anchor2 and anchor1. How can I have only the unique interactions?
I'm calling specific interactions between two objects like this:
Please use the GenomicInteractions tag for questions about GenomicInteractions, so that the maintainers get notified about them. It's only by chance that I was on the site to check something related to my own work today and saw this question.
I'm afraid that I'm still not really sure what you're trying to achieve, from this question and our email conversation. Nevertheless, I've put together some examples of how I would use linkOverlaps to make a data frame that contains information about all unique pairs of enhancers and promoters (for example) that are linked by interactions in a GenomicInteractions object. I was going to send this to you by email but I thought I'd put it here instead so that future questioners might also be able to find it.
These examples use the example data that's included in the package and used in the Hi-C analysis vignette.
As you can see, the numbers of interactions of each type returned by annotateInteractions and linkOverlaps are very different. This is because annotateInteractions is intended to assign a single class to each interacting anchor region (as explained in the vignette), not to identify all pairwise interactions between regions of interest.
More directly related to the question you're asking here, is it possible that your GenomicInteractions object includes interactions between regionA-regionB and interactions between regionB-regionA? (see example below, where there is an interaction between chr1:1-10--chr1:11-20 that is also present as the reversed interaction between chr1:11-20--chr1:1-10). If so, you can use the swapAnchors and unique functions to remove these before using annotateInteractions. Again, see example below for how to do this.
library(GenomicInteractions)
gi <- GInteractions(anchor1 = GRanges(seqnames = "chr1",
ranges = IRanges(start = c(1, 1, 11),
end = c(10, 10, 20))),
anchor2 = GRanges(seqnames = "chr1",
ranges = IRanges(start = c(11, 21, 1),
end = c(20, 30, 10))))
gi
#> GInteractions object with 3 interactions and 0 metadata columns:
#> seqnames1 ranges1 seqnames2 ranges2
#> <Rle> <IRanges> <Rle> <IRanges>
#> [1] chr1 1-10 --- chr1 11-20
#> [2] chr1 1-10 --- chr1 21-30
#> [3] chr1 11-20 --- chr1 1-10
#> -------
#> regions: 3 ranges and 0 metadata columns
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
swapAnchors(gi, mode = "order")
#> GInteractions object with 3 interactions and 0 metadata columns:
#> seqnames1 ranges1 seqnames2 ranges2
#> <Rle> <IRanges> <Rle> <IRanges>
#> [1] chr1 1-10 --- chr1 11-20
#> [2] chr1 1-10 --- chr1 21-30
#> [3] chr1 1-10 --- chr1 11-20
#> -------
#> regions: 3 ranges and 0 metadata columns
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
unique(swapAnchors(gi))
#> GInteractions object with 2 interactions and 0 metadata columns:
#> seqnames1 ranges1 seqnames2 ranges2
#> <Rle> <IRanges> <Rle> <IRanges>
#> [1] chr1 1-10 --- chr1 11-20
#> [2] chr1 1-10 --- chr1 21-30
#> -------
#> regions: 3 ranges and 0 metadata columns
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
If these examples don't help, perhaps you can create a reproducible example using the example data included in the package to demonstrate the problem. Otherwise it is very difficult to understand what you want to achieve and what the problem might be, without seeing your code or data.
Created on 2020-05-19 by the reprex package (v0.3.0)
As you can see, the numbers of interactions of each type returned by
annotateInteractions
andlinkOverlaps
are very different. This is becauseannotateInteractions
is intended to assign a single class to each interacting anchor region (as explained in the vignette), not to identify all pairwise interactions between regions of interest.More directly related to the question you're asking here, is it possible that your GenomicInteractions object includes interactions between regionA-regionB and interactions between regionB-regionA? (see example below, where there is an interaction between chr1:1-10--chr1:11-20 that is also present as the reversed interaction between chr1:11-20--chr1:1-10). If so, you can use the
swapAnchors
andunique
functions to remove these before usingannotateInteractions
. Again, see example below for how to do this.Created on 2020-05-19 by the reprex package (v0.3.0)
If these examples don't help, perhaps you can create a reproducible example using the example data included in the package to demonstrate the problem. Otherwise it is very difficult to understand what you want to achieve and what the problem might be, without seeing your code or data.
Thank you. That would work for me.
Great! Could you please mark the answer as accepted if it solves your problem? Thanks!