Is there a smart way of finding unique elements of a GPos while taking mcols into account? unique() only uses pos, and not the mcols:

    library(GenomicRanges)

    # Four unique positions when taking the score column into account:
    gp1 <- GPos(c("chr1:10", "chr1:10", "chr1:11", "chr1:11", "chr1:11"))
    score(gp1) <- c("A", "G", "A", "T", "T")

    # unique() doesn't see the score column
    unique(gp1)

After a bit of experimentation, first coercing to a DataFrame seems to work, but it feels a bit hacky. It also gives me a warning that I was unable to find any documentation for:

    In .local(x, row.names, optional, ...) : 'optional' argument was ignored

    # Coerce to DataFrame
    DF <- as(gp1, "DataFrame")

    # Find unique rows
    unique(DF)

    # Extract the GPos
    unique(DF)$X

My use case is a very long GPos (> 30 million) with Ref/Alt alleles.
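
One possible way to sidestep the coercion is to build a plain data.frame key from the position and the column of interest, then subset the GPos with duplicated(). This is only a minimal sketch: the names key and gp1_unique are made up for illustration, strand is ignored (it would have to be added to the key if it matters), and it has not been benchmarked on a GPos of this size.

```r
library(GenomicRanges)

# Example data from the question
gp1 <- GPos(c("chr1:10", "chr1:10", "chr1:11", "chr1:11", "chr1:11"))
score(gp1) <- c("A", "G", "A", "T", "T")

# Hypothetical helper object: one row per position, holding the fields
# that should define uniqueness (strand is deliberately left out here)
key <- data.frame(
  seqnames = as.character(seqnames(gp1)),
  pos = pos(gp1),
  score = score(gp1),
  stringsAsFactors = FALSE
)

# Keep the first occurrence of each (seqnames, pos, score) combination
gp1_unique <- gp1[!duplicated(key)]
gp1_unique
```

Because the result is obtained by subsetting the original object, it stays a GPos and keeps its mcols, so no extraction step is needed afterwards.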