Search
Question: Visualizing CopywriteR's segmentation output in IGV
0
gravatar for jcarter
21 months ago by
jcarter0
jcarter0 wrote:

Excellent tool! I'm interested in methods to convert the data within CopywriteR's segmentation output from DNAcopy, segment.CNA.object, to something that could be saved and visualized as an *.igv file (or any other IGV compatible file format)? (Similar to the *.igv file for the raw log2 transformed read counts) 

ADD COMMENTlink modified 21 months ago by t.kuilman100 • written 21 months ago by jcarter0
1
gravatar for t.kuilman
21 months ago by
t.kuilman100
Netherlands
t.kuilman100 wrote:

Thanks! I wrote a small script for that (see below); I hope this is helpful. I have also written another function (OverviewPlot) to make heatmaps straight from a DNAcopy object. You can find it on our GitHub page. Please let us know if there is anything else we can help you with.

## Load segmented data
load("/PATH/TO/segment.Rdata")

## Replace chromosome names (23, 24) from DNAcopy object with "X" and "Y"
segment.CNA.object$output$chrom[segment.CNA.object$output$chrom == 23] <- "X"
segment.CNA.object$output$chrom[segment.CNA.object$output$chrom == 24] <- "Y"

## Split by factor
segment.CNA.object$output$ID <- as.factor(segment.CNA.object$output$ID)
segments <- split(segment.CNA.object$output, segment.CNA.object$output$ID)

library(GenomicRanges)
## Use the same GC/mappability file that you used in the CopywriteR analysis
load("PATH/TO/hg19_20kb/GC_mappability.rda")

seg.values <- lapply(segments, function(x, bins) {
    ## Create GRanges object
    x <- x[, c("chrom", "loc.start", "loc.end", "seg.mean")]
    colnames(x) <- c("Seqnames", "Start", "End", "Segmean")
    x[, "Start"] <- ceiling(x[, "Start"])
    x <- makeGRangesFromDataFrame(x, keep.extra.columns = TRUE)

    ## Find overlaps with original bins used for analysis
    overlaps <- findOverlaps(x, bins)

    ## Retrieve and return segmentation values for bins
    segment.values <- vector(length = length(bins))
    segment.values[subjectHits(overlaps)] <- unname(unlist(mcols(x[queryHits(overlaps)])))
    return(segment.values)
}, GC.mappa.grange)

## Reformatting
seg.values <- Reduce(function(x, y) {cbind(x, y)}, seg.values)
colnames(seg.values) <- levels(segment.CNA.object$output$ID)
bins <- as(GC.mappa.grange, "data.frame")[, c("seqnames", "start", "end")]
bins <- cbind(bins, paste0(bins$seqnames, ":", paste0(bins$start, "-", bins$end)))
colnames(bins) <- c("Chromosome", "Start", "End", "Feature")
seg.values <- cbind(bins, seg.values)

## Create tracking line for viewing options in IGV
igv.track.line <- "#track viewLimits=-3:3 graphType=heatmap color=255,0,0"

## Write output
write.table(igv.track.line, file = "PATH/TO/segment_values.igv", sep = "\t", row.names = FALSE, col.names = FALSE, quote = FALSE)
suppressWarnings(write.table(seg.values, file = "PATH/TO/segment_values.igv", append = TRUE, sep = "\t", row.names = FALSE, quote = FALSE))
ADD COMMENTlink modified 21 months ago • written 21 months ago by t.kuilman100

Thank you! I seem to be running into trouble at the seg.values part. 

> seg.values <- lapply(segments, function(x, bins) {
+   ## Create GRanges object
+   x <- x[, c("chrom", "loc.start", "loc.end", "seg.mean")]
+   colnames(x) <- c("Seqnames", "Start", "End", "Segmean")
+   x[, "Start"] <- ceiling(x[, "Start"])
+   x <- makeGRangesFromDataFrame(x, keep.extra.columns = TRUE)
+   
+   ## Find overlaps with original bins used for analysis
+   overlaps <- findOverlaps(x, bins)
+   
+   ## Retrieve and return segmentation values for bins
+   segment.values <- vector(length = length(bins))
+   segment.values[subjectHits(overlaps)] <- unname(unlist(mcols(x[queryHits(overlaps)])))
+   return(segment.values)
+ }, GC.mappa.grange)
There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In .Seqinfo.mergexy(x, y) :
  The 2 combined objects have no sequence levels in common. (Use
  suppressWarnings() to suppress this warning.)

 

All 50 warnings are the same (as above). And all of the elements in seg.values are vectors consisting of all 0s ("segments" object has data though). Not sure where I went wrong... Any thoughts? 

ADD REPLYlink modified 21 months ago • written 21 months ago by jcarter0
1

Seqlevels are the names of the chromosomes, and none of these are in common. Could it be that you have different chromosome notations (i.e., "chr1" vs "1") in your segment.Rdata file and your GC_mappability.rda file? You can check this by typing segment.CNA.object$output$chrom after loading segment.Rdata, and GC.mappa.grange after loading the GC_mappability.rda file.

ADD REPLYlink written 21 months ago by t.kuilman100

PERFECT! That is exactly what happened! Fixed it with a 

segment.CNA.object$output$chrom <- paste("chr",segment.CNA.object$output$chrom,sep="")

Thank you so much!

ADD REPLYlink written 21 months ago by jcarter0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 401 users visited in the last hour