Hello,
I have been trying to setup a HTML report of DESEQ2 analysis using ReportingTools and have encountered some snags.
My DESEQ2 analysis has 5 conditions and I want to create a report for each contrast. After reading some posts on this site I was able to get that to happen. Basically I can get logfold change of contrast I want but the graphs created by ReportingTools include all conditions.
If possible I would like to show only box plots of the two conditions that are in my contrast and in order C(my control sample), then CE (treatment sample):
## annotate using biomaRt add.anns <- function(df, mart, ...) { nm <- rownames(df) anns <- getBM( attributes = c("ensembl_gene_id", "mgi_symbol", "description"), filters = "ensembl_gene_id", values = nm, mart = mart) anns <- anns[match(nm, anns[, 1]), ] colnames(anns) <- c("Ensembl", "Symbol", "Gene Description") df <- cbind(anns, df[, 1:ncol(df)]) rownames(df) <- nm df } # to add links to ensembl ensemblLinks <- function(df, ...){ naind <- is.na(df$Ensembl) df$Ensembl <- hwrite(as.character(df$Ensembl), link = paste0("http://www.ensembl.org/Mus_musculus/Gene/Summary?g=", as.character(df$Ensembl)), table = FALSE) df$Ensembl[naind] <- "" return(df) } # to add links to Wikigenes wikiLinks <- function(df, ...){ naind <- is.na(df$Symbol) df$Symbol <- hwrite(as.character(df$Symbol), link = paste0("https://www.wikigenes.org/?search=", as.character(df$Symbol)), table = FALSE) df$Symbol[naind] <- "" return(df) } #More colorful graphs theme <- reporting.theme.alternate() lattice.options(default.theme = theme) #Publish report des2Report <- HTMLReport(shortName = "C vs CE RNASeq Analysis",title = "C vs CE RNA-Ssq Analysis of differential expression using DESeq2", reportDirectory = "./reports") publish(dds.dse, des2Report, .modifyDF=list(add.anns, modifyReportDF, ensemblLinks, wikiLinks), mart = m, pvalueCutoff=0.05, contrast = c("condition","CE","C"), resultsName = "CvsCE", factor = colData(dds.dse)$condition, reportDir="./reports") finish(des2Report)
I think it has something to do with:
factor = colData(dds.dse)$condition
But nothing I have tried works.
Any help/hints would be greatly appreciated.
Also is it possible to alter the plots to be made with box plot of ggplot2 instead?
I want to make something like this (Without the other conditions as stated above):
I think ReportingTools calls the following function from "plotting_functions.R", found in ReportingTools source, to make a graph:
.make.gene.plots <- function(df, expression.dat, factor, figure.directory, ylab.type = "Expression Value", scales = list(), par.settings = list(), xlab = NULL, ...){ scales <- c(scales, list(x = list(rot = 45))) if(is(expression.dat, "CountDataSet")){ ## Get the normalized counts, but add a pseudocount to all of the ## entries, because we're going to plot on a log-scale expression.dat <- counts(expression.dat, normalized=TRUE)+1 scales <- c(scales, list(y = list(log = 10))) } else if(inherits(expression.dat, "eSet")){ expression.dat <- exprs(expression.dat) } else if(is(expression.dat, "DGEList")){ ## If we get a DGEList, get the normalized cpm value. Again, we add a ## pseudocount to avoid problems with zero count data expression.dat <- cpm(expression.dat$counts) + 1 } else if(is(expression.dat, "data.frame")){ ## If it's a data.frame, try to coerce it to a matrix, ## but this might not work. expression.dat <- as.matrix(expression.dat) } else if(is(expression.dat, "DESeqDataSet")){ expression.dat <- counts(expression.dat, normalized = TRUE) + 1 scales <- c(scales, list(y = list(log = 10))) } else if(is.null(expression.dat)) { stop("No expression data was provided. Nothing to plot.") } if(any(!rownames(df) %in% rownames(expression.dat))){ stop("Can't find expression data for some features\n") } for(probe in rownames(df)){ if("Symbol" %in% colnames(df)){ ylab <- paste(df[probe, 'Symbol'], ylab.type) } else { ylab <- paste(probe, ylab.type) } bigplot <- stripplot(expression.dat[ probe, ] ~ factor, panel = panel.boxandstrip, groups = factor, ylab = ylab, scales = scales, par.settings = par.settings, xlab = xlab) miniplot <- .remove.axis.and.padding(bigplot) minipng.filename <- paste("mini", probe ,"png", sep='.') minipng.file <- file.path(figure.directory, minipng.filename) png(minipng.file, height=40, width=200) grid.newpage() pushViewport(viewport(angle = 270, height = unit(220, 'points'), width = unit(44, 'points'), name = "VP")) print(miniplot, newpage = FALSE) upViewport() dev.off() pdf.filename <- paste("boxplot", probe, "pdf", sep=".") pdf.file <- file.path(figure.directory, pdf.filename) pdf(pdf.file, height=4.5, width=4.5) print(bigplot) dev.off() } }
I tried changing
bigplot <- stripplot(expression.dat[ probe, ] ~ factor, panel = panel.boxandstrip, groups = factor, ylab = ylab, scales = scales, par.settings = par.settings, xlab = xlab)
to:
geneCounts <- plotCounts(expression.dat, gene=probe, intgroup=c("condition"), returnData=TRUE) geneCounts$count<- geneCounts$count+1 bigplot<-ggplot(geneCounts, aes(x=condition, y=count, fill=condition)) + scale_y_log10() + stat_boxplot(geom ='errorbar') + geom_boxplot()+ geom_point( position=position_dodge(width=0.75), size=3)+ ggtitle(gene) + theme(text = element_text(color="#000000", size=25))+ theme(plot.title = element_text(color="#000000", face="bold", size=32, hjust = 0.5)) + theme(axis.title = element_text(color="#000000", face="bold", size=28))
This did not work for me. If anyone has any ideas on how to do this or if it's impossible/not worth it please let me know...
Thank you for your time!