Gviz: plotting mean values of biological replicates
3
0
Entering edit mode
@toomassilla-8371
Last seen 9.5 years ago

Hi!

Fist of all, Gviz i a great tool and even a 'wet lab guy' like myself can handle it. Thanks! However, I have a question.

I have been plotting some stranded RNA-seq data and it seems to work fine. I get nice graphs (histogram) where I see reads mapped to + or - strand. I have three biological replicates of 'control' and 'treated' samples and instead of plotting all of them separately, I would like to plot mean values of three replicates . How I do that? I thought that grouping could be solution, but unfortunately data grouping is not so well explained in the tutorial. Probably this is something very intuitive for R gurus :). Please advise.

Thanks!

Toomas       

gviz • 2.5k views
ADD COMMENT
0
Entering edit mode
@florianhahnenovartiscom-3784
Last seen 6.3 years ago
Switzerland

Hi Toomas,

you should give us a little more detail of what you are trying to do. Did you read in your RNA-seq data into an AlignmentsTrack object? Or did you already start from a coverage representation and use a DataTrack object to plot these histograms? 

Florian

ADD COMMENT
0
Entering edit mode

Hi!

Thanks for the reply. I read my data as DataTrack object that has importFunction to count strand information

EGFP_C= DataTrack('/Users/RNA_seq/C/C_aligned.sorted.bam', genome="hg19", chromosome=myChr, importFunction=strandedBamImport, stream=TRUE, legend=TRUE, col=c("cornflowerblue","purple"), groups=c("Forward","Reverse"),ylim=c(-200,50),name='EGFP_C')

ADD REPLY
0
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

Have you read pages 20-23 of the Gviz user's guide?

ADD COMMENT
0
Entering edit mode
@aliaksei-holik-4992
Last seen 8.9 years ago
Spain/Barcelona/Centre for Genomic Regu…

Hi Toomas.

I've had the same problem with plotting ChIP-seq data. I got around it by using GenomicAlignments tools, which IMO gives you a bit more flexibility than purely relying on Gviz. The general workflow is:

Read the region of interest from all the bam files -> Calculate the coverage over that region -> Divide by the library size to convert your coverage into cpm -> Calculate the average coverage -> Make a DataTrack object and plot in Gviz.

Here's an example code based on two replicates over two conditions:

library(Gviz)
library(GenomicRanges)
library(GenomicAlignments)
# Get BAM file names and respective library sizes
bams <- list.files(".", ".bam$")
lib.size <- c(10e6, 11e6, 12e6, 13e6)
names(lib.size) <- bams
# Set target range
target.range <- c(5e6, 6e6)

# Set the chromosome and genome
chr <- "chr1"
gen <- "mm10"
## Make a GRanges object to cover the region of interest
z <- GRanges(chr, IRanges(target.range[1], target.range[2]))

## Read in the region of interest from all the bam files
param <- ScanBamParam(which=z, flag=scanBamFlag(isDuplicate=F))
x <- lapply(bams, function(x) readGAlignments(x, param=param))
names(x) <- bams

## Calculate the read coverage over the region of interest
xcov <- lapply(x, coverage)
 
# Library size normalisation to convert into cpm
xcov <- lapply(bams, function(x) xcov[[x]]/lib.size[x]*1e6)

## Calculate average coverage for two control and two experimental samples
xcov[[1]] <- (xcov[[1]] + xcov[[2]])/2
xcov[[2]] <- (xcov[[3]] + xcov[[4]])/2
## Subset to average coverages only.

xcov <- xcov[1:2]
grps <- names(xcov) <- c("Control", "Experiment")
 
## Convert the coverage object into GRanges object
xgr <- lapply(xcov, function(x) as(x, "GRanges"))
## Set the colours for Control and KO panels respectively
clr <- c("darkred", "darkblue")
names(clr) <- grps

## Generate DataTrack objects for reads
tracks <- lapply(grps, function(x) DataTrack(xgr[[x]][xgr[[x]] %over% z],
                         background.title=clr[x], genome=gen, name=x, chromosome = chr))
names(tracks) <- grps
track.list <- unlist(lapply(grps, function(x) c(tracks[[x]])))
## Plot the tracks along with the genome axis and annotation track - ideally would want to collapse
## the annotation track
plotTracks(track.list, from = target.range[1], to = target.range[2],
           background.panel = "#fffff6",
           lwd=2, col.line="darkblue", ylim=c(0, 1), type=c("hist"),
           col.histogram = "#f5a122", fill.histogram = "#f5a122", window = -1, windowSize = 1000)

Disclaimer: Reading in of bam files appears to have issues with R3.2.0. Apparently it's been fixed in R3.2.1, but I can't verify this. The code was written for and works with R3.1.1

Good luck!

ADD COMMENT

Login before adding your answer.

Traffic: 344 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6