Tutorial: Read a data file containing both counts and annotation into edgeR
WEHI, Melbourne, Australia

This post is in response to a number of emails and posts asking how to read data into edgeR and compute RPKM values.

Reading a data file containing both counts and annotation

Suppose we start with a tab-delimited file counts.txt like this:

[image: data file]

The file contains the counts together with gene IDs and an annotation column, Length. To read this into edgeR:

library(edgeR)
Data <- read.delim("counts.txt", sep="\t", row.names=1)
y <- DGEList(Data, annotation.columns="Length")
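For readers without a file at hand, the following sketch writes a small made-up counts.txt to a temporary directory and reads it back with the same read.delim call. The gene IDs, sample names and numbers are invented for illustration; only the layout (gene IDs in the first column, a Length column, then one count column per sample) follows the description above. The last two lines show, in base R only, roughly how the annotation column is separated from the counts.

```r
# Toy data: hypothetical gene IDs, lengths and counts, for illustration only
toy <- data.frame(
  GeneID  = c("GeneA", "GeneB", "GeneC"),
  Length  = c(1000, 2500, 500),
  Sample1 = c(10, 200, 0),
  Sample2 = c(15, 180, 3)
)
path <- file.path(tempdir(), "counts.txt")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Same call as in the tutorial: column 1 becomes the row names
Data <- read.delim(path, sep = "\t", row.names = 1)

# Base-R view of the split between annotation and counts:
# the Length column is annotation, the remaining columns are counts
genes  <- Data[, "Length", drop = FALSE]
counts <- as.matrix(Data[, setdiff(colnames(Data), "Length")])
```

With a real file, only the read.delim and DGEList lines are needed; the manual split is shown purely to make the structure visible.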

To normalize the library sizes and compute a matrix of RPKM values:

y <- normLibSizes(y)
RPKM <- rpkm(y)
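As a rough check on what these values mean, RPKM can be computed by hand in base R: counts are scaled to counts per million and then divided by gene length in kilobases. The sketch below uses invented counts and lengths and assumes all normalization factors are 1, whereas rpkm() on a normalized DGEList uses the effective library sizes (library size times normalization factor).

```r
# Hypothetical counts for 3 genes in 2 samples
counts <- matrix(c(10, 200, 0,
                   15, 180, 3),
                 ncol = 2,
                 dimnames = list(c("GeneA", "GeneB", "GeneC"),
                                 c("Sample1", "Sample2")))
gene.length <- c(GeneA = 1000, GeneB = 2500, GeneC = 500)  # in bases

# Library sizes; normalization factors are assumed to be 1 here
lib.size <- colSums(counts)

# Counts per million, then divide by gene length in kilobases
cpm.by.hand  <- t(t(counts) / lib.size) * 1e6
rpkm.by.hand <- cpm.by.hand / (gene.length / 1000)
```

Note that the log=TRUE option of rpkm() adds a small prior count before taking logs, so log-RPKM values are not simply log2 of the matrix above.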

To make a PCA plot of the log-RPKM values:

logRPKM <- rpkm(y, log=TRUE)
plotMDS(logRPKM, gene.selection="common")

To make an MDS plot from the log-RPKM values:

plotMDS(logRPKM, gene.selection="pairwise")
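The difference between the two gene.selection settings can be illustrated in base R. The sketch below is a simplified illustration on random data, not limma's actual code: with "pairwise", the distance between each pair of samples is the root-mean-square difference over the genes that differ most between those two samples, whereas with "common" a single set of the most variable genes is used for all samples, which amounts to a PCA on those genes. The matrix x, the helper pairwise.dist and the value of top are all hypothetical.

```r
set.seed(1)
# Hypothetical log-expression matrix: 50 genes x 4 samples
x <- matrix(rnorm(200), nrow = 50,
            dimnames = list(paste0("Gene", 1:50), paste0("S", 1:4)))
top <- 10  # number of genes to use (plotMDS defaults to 500)

# "pairwise": for each pair of samples, use the `top` genes that differ most
# between those two samples, and take the root-mean-square difference
pairwise.dist <- function(a, b, top) {
  d2 <- sort((a - b)^2, decreasing = TRUE)[seq_len(top)]
  sqrt(mean(d2))
}
n <- ncol(x)
D <- matrix(0, n, n, dimnames = list(colnames(x), colnames(x)))
for (i in seq_len(n)) for (j in seq_len(n)) {
  D[i, j] <- pairwise.dist(x[, i], x[, j], top)
}

# "common": one shared gene set, the `top` genes with the largest variance
# across all samples, followed by an ordinary PCA
keep <- order(apply(x, 1, var), decreasing = TRUE)[seq_len(top)]
pca  <- prcomp(t(x[keep, ]))
```

Here D could be passed to cmdscale() to obtain MDS coordinates, and pca$x holds the principal component scores for each sample.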

Creating a DGEList from featureCounts

If the count matrix is created using Rsubread::featureCounts then the output can be transformed to a DGEList directly, without any need for intermediate data files:

library(Rsubread)
fc <- featureCounts( ... )
y <- featureCounts2DGEList(fc)

The resulting DGEList object automatically includes annotation columns such as chromosome and gene length.
