Tutorial: Read data file containing both counts and annotation into edgeR
@gordon-smyth
WEHI, Melbourne, Australia

This post is in response to a number of emails and posts asking about reading data into edgeR and producing RPKM.

Reading a data file containing both counts and annotation

Suppose we start with a tab-delimited file counts.txt like this:

[Image: example data file]

The file contains counts but also gene IDs and an annotation column. To read this into edgeR:

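library(edgeR)
# Use the first column (gene IDs) as row names; read.delim assumes tab-delimited input
Data <- read.delim("counts.txt", row.names=1)
# Tell DGEList which column holds annotation rather than counts
y <- DGEList(Data, annotation.columns="Length")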
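
To make the separation of counts and annotation concrete, here is a minimal base-R sketch that writes a toy file, reads it back, and splits the Length column from the count columns by hand. The gene IDs, sample names, and values are invented for illustration, and the manual split only mimics the idea of keeping annotation apart from counts; it is not edgeR's own code.

```r
# Write a small tab-delimited file to a temporary location.
# Gene IDs, sample names and values are hypothetical.
toy <- data.frame(
  GeneID  = c("geneA", "geneB", "geneC"),
  Length  = c(1000, 2500, 500),
  Sample1 = c(10, 0, 35),
  Sample2 = c(12, 3, 40)
)
path <- tempfile(fileext = ".txt")
write.table(toy, path, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back, using the first column (gene IDs) as row names.
Data <- read.delim(path, row.names = 1)

# Split the annotation column from the count columns by hand.
is_annot <- colnames(Data) == "Length"
genes  <- Data[, is_annot, drop = FALSE]              # annotation data.frame
counts <- as.matrix(Data[, !is_annot, drop = FALSE])  # numeric count matrix

dim(counts)      # genes by samples
colnames(genes)  # the annotation column name
```

Checking dim(counts) and colnames(genes) before building the DGEList is a quick way to confirm that no annotation column has been mistaken for a sample.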

To normalize the library sizes and compute a matrix of RPKM values:

y <- normLibSizes(y)
RPKM <- rpkm(y)
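
For readers who want to see the arithmetic behind an RPKM value, the following base-R sketch computes it by hand on a toy matrix. It assumes the usual definition: counts divided by the effective library size in millions (library size times normalization factor) and by gene length in kilobases. The counts, lengths, and normalization factors of 1 are hypothetical placeholders, so this is an illustration of the formula rather than a replacement for rpkm().

```r
# Toy counts: 3 genes by 2 samples (values are made up).
counts <- matrix(c(10, 0, 35, 12, 3, 40), nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("Sample1", "Sample2")))
gene_length  <- c(geneA = 1000, geneB = 2500, geneC = 500)  # in bases
norm_factors <- c(Sample1 = 1, Sample2 = 1)  # hypothetical; normalization would estimate these

# Effective library size = total counts per sample times its normalization factor.
eff_lib_size <- colSums(counts) * norm_factors

# Counts per million: divide each column by its effective library size in millions.
cpm_manual <- t(t(counts) / eff_lib_size) * 1e6

# Reads per kilobase per million: divide each row by its gene length in kilobases.
rpkm_manual <- cpm_manual / (gene_length / 1000)
rpkm_manual
```

Because the normalization factors enter through the effective library size, normalizing before computing RPKM changes the values, which is why the two steps are run in that order.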

To make a PCA plot of log-RPKM values:

logRPKM <- rpkm(y, log=TRUE)
plotMDS(logRPKM, gene.selection="common")

To make an MDS plot from the log-RPKM values:

plotMDS(logRPKM, gene.selection="pairwise")
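
The difference between the two gene.selection settings can be sketched in base R. The code below is a simplified illustration on a made-up matrix, not limma's implementation: with "common" the same high-variance genes are used for every pair of samples, whereas with "pairwise" the genes with the largest differences are chosen separately for each pair. In both cases the distance is taken as the root-mean-square difference over the selected genes. The matrix values, the value of top, and the helper function names are all assumptions made for the example.

```r
# Toy log-expression matrix: 6 genes (rows) by 3 samples (columns); values are made up.
logx <- matrix(c(1.0, 1.2, 4.0,
                 2.0, 2.1, 2.0,
                 5.0, 1.0, 5.2,
                 3.0, 3.0, 3.1,
                 0.5, 0.4, 0.6,
                 2.5, 4.5, 2.4),
               nrow = 6, byrow = TRUE,
               dimnames = list(paste0("gene", 1:6), c("S1", "S2", "S3")))
top <- 2  # number of genes used per distance (hypothetical choice)

# "common": pick the genes with the largest variance across all samples once.
vars <- apply(logx, 1, var)
common_genes <- order(vars, decreasing = TRUE)[1:top]
dist_common <- function(i, j) {
  sqrt(mean((logx[common_genes, i] - logx[common_genes, j])^2))
}

# "pairwise": pick the genes with the largest squared difference for this pair only.
dist_pairwise <- function(i, j) {
  d2 <- (logx[, i] - logx[, j])^2
  sqrt(mean(sort(d2, decreasing = TRUE)[1:top]))
}

# Compare the two distances for every pair of samples.
pairs <- combn(ncol(logx), 2)
data.frame(
  pair     = paste(colnames(logx)[pairs[1, ]], colnames(logx)[pairs[2, ]], sep = "-"),
  common   = mapply(dist_common,   pairs[1, ], pairs[2, ]),
  pairwise = mapply(dist_pairwise, pairs[1, ], pairs[2, ])
)
```

In this sketch the pairwise distance can never be smaller than the common one for the same pair, because it averages the largest squared differences available for that pair.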

Creating a DGEList from featureCounts

If the count matrix is created using Rsubread::featureCounts, then the output can be converted to a DGEList directly, without any need for intermediate data files:

fc <- featureCounts( ... )
y <- featureCounts2DGEList(fc)

The resulting DGEList object will automatically contain gene annotation columns, including chromosome and gene length.
