Visualizing CopywriteR's segmentation output in IGV
1
0
Entering edit mode
jcarter • 0
@jcarter-9812
Last seen 8.9 years ago

Excellent tool! I'm interested in methods to convert the data within CopywriteR's segmentation output from DNAcopy, segment.CNA.object, to something that could be saved and visualized as an *.igv file (or any other IGV compatible file format)? (Similar to the *.igv file for the raw log2 transformed read counts) 

copywriter dnacopy • 2.7k views
ADD COMMENT
1
Entering edit mode
t.kuilman ▴ 170
@tkuilman-6868
Last seen 2.6 years ago
Netherlands

Thanks! I wrote a small script for that (see below); I hope this is helpful. I have also written another function (OverviewPlot) to make heatmaps straight from a DNAcopy object. You can find it on our GitHub page. Please let us know if there is anything else we can help you with.

## Load segmented data
load("/PATH/TO/segment.Rdata")

## Replace chromosome names (23, 24) from DNAcopy object with "X" and "Y"
segment.CNA.object$output$chrom[segment.CNA.object$output$chrom == 23] <- "X"
segment.CNA.object$output$chrom[segment.CNA.object$output$chrom == 24] <- "Y"

## Split by factor
segment.CNA.object$output$ID <- as.factor(segment.CNA.object$output$ID)
segments <- split(segment.CNA.object$output, segment.CNA.object$output$ID)

library(GenomicRanges)
## Use the same GC/mappability file that you used in the CopywriteR analysis
load("PATH/TO/hg19_20kb/GC_mappability.rda")

seg.values <- lapply(segments, function(x, bins) {
    ## Create GRanges object
    x <- x[, c("chrom", "loc.start", "loc.end", "seg.mean")]
    colnames(x) <- c("Seqnames", "Start", "End", "Segmean")
    x[, "Start"] <- ceiling(x[, "Start"])
    x <- makeGRangesFromDataFrame(x, keep.extra.columns = TRUE)

    ## Find overlaps with original bins used for analysis
    overlaps <- findOverlaps(x, bins)

    ## Retrieve and return segmentation values for bins
    segment.values <- vector(length = length(bins))
    segment.values[subjectHits(overlaps)] <- unname(unlist(mcols(x[queryHits(overlaps)])))
    return(segment.values)
}, GC.mappa.grange)

## Reformatting
seg.values <- Reduce(function(x, y) {cbind(x, y)}, seg.values)
colnames(seg.values) <- levels(segment.CNA.object$output$ID)
bins <- as(GC.mappa.grange, "data.frame")[, c("seqnames", "start", "end")]
bins <- cbind(bins, paste0(bins$seqnames, ":", paste0(bins$start, "-", bins$end)))
colnames(bins) <- c("Chromosome", "Start", "End", "Feature")
seg.values <- cbind(bins, seg.values)

## Create tracking line for viewing options in IGV
igv.track.line <- "#track viewLimits=-3:3 graphType=heatmap color=255,0,0"

## Write output
write.table(igv.track.line, file = "PATH/TO/segment_values.igv", sep = "\t", row.names = FALSE, col.names = FALSE, quote = FALSE)
suppressWarnings(write.table(seg.values, file = "PATH/TO/segment_values.igv", append = TRUE, sep = "\t", row.names = FALSE, quote = FALSE))
ADD COMMENT
0
Entering edit mode

Thank you! I seem to be running into trouble at the seg.values part. 

> seg.values <- lapply(segments, function(x, bins) {
+   ## Create GRanges object
+   x <- x[, c("chrom", "loc.start", "loc.end", "seg.mean")]
+   colnames(x) <- c("Seqnames", "Start", "End", "Segmean")
+   x[, "Start"] <- ceiling(x[, "Start"])
+   x <- makeGRangesFromDataFrame(x, keep.extra.columns = TRUE)
+   
+   ## Find overlaps with original bins used for analysis
+   overlaps <- findOverlaps(x, bins)
+   
+   ## Retrieve and return segmentation values for bins
+   segment.values <- vector(length = length(bins))
+   segment.values[subjectHits(overlaps)] <- unname(unlist(mcols(x[queryHits(overlaps)])))
+   return(segment.values)
+ }, GC.mappa.grange)
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

 

All 50 warnings are the same (as above). And all of the elements in seg.values are vectors consisting of all 0s ("segments" object has data though). Not sure where I went wrong... Any thoughts? 

ADD REPLY
1
Entering edit mode

Seqlevels are the names of the chromosomes, and none of these are in common. Could it be that you have different chromosome notations (i.e., "chr1" vs "1") in your segment.Rdata file and your GC_mappability.rda file? You can check this by typing segment.CNA.object$output$chrom after loading segment.Rdata, and GC.mappa.grange after loading the GC_mappability.rda file.

ADD REPLY
0
Entering edit mode

PERFECT! That is exactly what happened! Fixed it with a 

segment.CNA.object$output$chrom <- paste("chr",segment.CNA.object$output$chrom,sep="")

Thank you so much!

ADD REPLY

Login before adding your answer.

Traffic: 971 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6