Question

edgeR and lack of counts ID on CPM matrix

abrantes.patricia ▴ 10

@abrantespatricia-13177

Last seen 6.8 years ago

Dear all,

I'm using edgeR and I need a final CPM matrix with annotated IDs. Is there a way to avoid losing the ID information after running

y <- calcNormFactors(y) and logCPM <- cpm(y, log=TRUE, prior.count=2)?

I really need to cross-reference logCPM with other data...

Thanks in advance,

Patrícia

edgeR • 1.4k views

ADD COMMENT • link 6.8 years ago abrantes.patricia ▴ 10

score 0 · Answer 1 · 2017-07-20

Aaron Lun ★ 28k

@alun

Last seen 15 hours ago

The city by the bay

If you're talking about Ensembl or Entrez IDs (or anything involving a single string), just store them as the rownames of y. These will be preserved in the output of cpm.
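As a rough sketch of what that looks like, assuming the annotated table has one ID column followed by the count columns (the `tab` data frame, its `GeneID` column and the sample names are made up for illustration, not taken from the question):

```r
library(edgeR)

# Hypothetical annotated count table: one ID column plus per-sample counts.
set.seed(1)
tab <- data.frame(
    GeneID = paste0("ENSG", sprintf("%011d", 1:100)),
    S1 = rnbinom(100, mu=50, size=2),
    S2 = rnbinom(100, mu=50, size=2),
    S3 = rnbinom(100, mu=50, size=2),
    S4 = rnbinom(100, mu=50, size=2),
    stringsAsFactors = FALSE
)

# Keep only the numeric columns as the count matrix
# and move the IDs into the row names.
counts <- as.matrix(tab[, -1])
rownames(counts) <- tab$GeneID

y <- DGEList(counts=counts)
y <- calcNormFactors(y)
logCPM <- cpm(y, log=TRUE, prior.count=2)

# The IDs travel with the matrix as its row names.
head(rownames(logCPM))
```

If the ID column is left inside the table passed as counts, it is not a row name and there is nothing for cpm to carry through.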

ADD COMMENT • link 6.8 years ago Aaron Lun ★ 28k

score 0 · Answer 2 · 2017-07-20

abrantes.patricia ▴ 10

@abrantespatricia-13177

Last seen 6.8 years ago

Thanks Aaron for your quick reply.

Unfortunately, that's not what I get...

My input file has that information (an annotated counts matrix), but after running y <- calcNormFactors(y) and logCPM <- cpm(y, ...), those columns disappear and only the logCPM columns for each sample remain.

ADD COMMENT • link 6.8 years ago abrantes.patricia ▴ 10

Reply to posts with "add comment", not "add your answer", unless you're answering your own question.

You need to add the row names to the matrix (and thus the DGEList). Doing the following:

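library(edgeR)
# Simulated counts: 2500 genes x 4 samples
y <- matrix(rnbinom(10000, mu=5, size=2), ncol=4)
# Attach the gene IDs as row names before building the DGEList
rownames(y) <- paste0("X", seq_len(nrow(y)))
d <- DGEList(counts=y)
d <- calcNormFactors(d)
# log=TRUE so that the result really is log-CPM, as in the question
logCPM <- cpm(d, log=TRUE, prior.count=2)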

... gives me logCPM with row and column names. So you must be doing something different.
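If the annotation has more than one column per gene, one possible approach is to pass it through the genes argument of DGEList and bind it back onto the logCPM values afterwards. This is only a sketch: the `anno` and `other` tables, their column names and the IDs in them are invented for illustration.

```r
library(edgeR)

set.seed(2)
counts <- matrix(rnbinom(400, mu=50, size=2), ncol=4,
                 dimnames=list(paste0("X", 1:100), paste0("S", 1:4)))

# Hypothetical annotation with more than one column per gene.
anno <- data.frame(ID=rownames(counts),
                   Symbol=paste0("gene", 1:100),
                   stringsAsFactors=FALSE)

d <- DGEList(counts=counts, genes=anno)
d <- calcNormFactors(d)
logCPM <- cpm(d, log=TRUE, prior.count=2)

# Rows of d$genes stay aligned with the rows of the counts,
# so the annotation can be bound back onto the logCPM values.
annotated <- data.frame(d$genes, logCPM, check.names=FALSE)

# Cross with another (made-up) table by ID;
# match() gives NA for IDs that are not in logCPM.
other <- data.frame(ID=c("X5", "X2", "X999"), score=c(0.1, 0.7, 0.3),
                    stringsAsFactors=FALSE)
other$S1.logCPM <- unname(logCPM[match(other$ID, rownames(logCPM)), "S1"])
```

Matching on the row names rather than on row position means the other table does not need to be in the same order or contain the same set of genes.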

ADD REPLY • link 6.8 years ago Aaron Lun ★ 28k