edgeR: How to display/plot individual miRNAs found to be differentially expressed?
Sarah • 0
@2357cabb
Germany

I have miRNA data and I am analyzing it for differential expression using edgeR. The data consists of four different groups (A,B,C,D) that each consist of five replicates.

I have a few miRNAs that turned out to be differentially expressed, and I would like to plot each one individually to show its expression.

This is the plot that I am currently generating:

Each point stands for one sample (replicate). The y axis displays the count of the miRNA in the sample and the x axis displays the group that the replicate belongs to. Currently, I am just plotting the counts from the dataDGE object that I have used throughout the analysis (filterByExpr(), calcNormFactors() and estimateDisp() have been applied).

Do I need to apply cpm() to the counts? Is there a better plot type or option to display my results?

mirna = "mirna"
levels = c("A", "B", "C", "D")
plot_data <- data.frame(counts = strtoi(dataDGE$counts[mirna, ]),
                        groups = factor(dataDGE$samples$group, levels = levels))
# png() writes a PNG file, so the file name gets a .png extension
png("name.png")
p <- ggplot(plot_data) +
  geom_point(aes(x = groups, y = counts))
print(p)
dev.off()

Disclaimer: This is not a plot of real data, I randomly inserted some points.

miRNAData miRNA DifferentialExpression edgeR

@james-w-macdonald-5106
United States

Yes, you should convert to logCPM before plotting. You might consider using the Glimma package, which is quite nice for that sort of thing.

Thank you. I have changed my code to the following:

logCPM <- cpm(dataDGE, prior.count=2, log=TRUE)
rownames(logCPM) <- rownames(dataDGE)
colnames(logCPM) <- colnames(dataDGE)

mirna = "mirna"
levels = c("A", "B", "C", "D")
plot_data <- data.frame(counts = logCPM[mirna, ],
                        groups = factor(dataDGE$samples$group, levels = levels))
# png() writes a PNG file, so the file name gets a .png extension
png("name.png")
p <- ggplot(plot_data) +
  geom_point(aes(x = groups, y = counts))
print(p)
dev.off()


I took those first three lines from this paper: https://f1000research.com/articles/5-1438/v2

They calculated the logCPM values for displaying a heatmap.

Chen Y, Lun ATL and Smyth GK. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline [version 2; peer review: 5 approved]. F1000Research 2016, 5:1438 (https://doi.org/10.12688/f1000research.8987.2)
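
For anyone wondering why logCPM is preferable to raw counts in this kind of plot, here is a small base-R sketch on simulated data. Everything in it is an assumption made for illustration: the counts, the library sizes, the output file name, and the log_cpm() helper, which only approximates what cpm(..., log = TRUE) is documented to do (a prior count scaled in proportion to library size). It ignores normalization factors, so it is not a substitute for edgeR's own function.

```r
# Simulated data for one hypothetical miRNA: 4 groups x 5 replicates.
set.seed(1)
group <- factor(rep(c("A", "B", "C", "D"), each = 5),
                levels = c("A", "B", "C", "D"))
lib_size <- round(runif(20, 0.5e6, 2e6))   # made-up library sizes
true_cpm <- c(A = 50, B = 50, C = 200, D = 400)[as.character(group)]
counts <- rpois(20, lambda = true_cpm * lib_size / 1e6)

# Log2 counts per million with a prior count scaled by library size.
# Assumption: all normalization factors are 1, so the effective
# library size equals the raw library size.
log_cpm <- function(counts, lib_size, prior_count = 2) {
  prior_scaled <- prior_count * lib_size / mean(lib_size)
  log2((counts + prior_scaled) / (lib_size + 2 * prior_scaled) * 1e6)
}

plot_data <- data.frame(group = group,
                        counts = counts,
                        logCPM = log_cpm(counts, lib_size))

# One point per replicate, jittered horizontally within each group.
out_file <- file.path(tempdir(), "mirna_logcpm.pdf")
pdf(out_file)
stripchart(logCPM ~ group, data = plot_data, vertical = TRUE,
           method = "jitter", pch = 16,
           xlab = "Group", ylab = "log2 CPM")
dev.off()
```

Raw counts scale with sequencing depth, so two replicates with the same underlying expression can sit far apart on a raw-count axis. Dividing by library size removes that effect, and the log scale keeps a few highly expressed samples from dominating the plot.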