Question: edgeR subsetting DGEList by column/sample
2.2 years ago by
United States
mnaymik10 wrote:

I saw this post from a while ago regarding a similar issue: edgeR: Problem with subsetting a DGEList in latest package version


sample                          lib.size  norm.factor  type  time
preExercise_TAGGCTGACTTGAG.1         856    1.1020236  B     pre
preExercise_TCCATCCTCGTTAG.1        1033    1.2198739  B     pre
pbmc001_TTGAGGACTTTCAC.1             703    1.2050717  B     pre
pbmc001_AGTCGCCTGCTTAG.1            1230    1.0304974  B     post
pbmc001_TACTACACAGCACT.1            1053    0.9790636  C     post
pbmc001_TAAACAACCCTTAT.1             895    1.1032946  D     pre


I am trying to run a differential expression analysis on only the samples of type 'B', using time as the group ('post' vs 'pre'). I thought the easiest way would be to just subset d via:

d.B = d[,grep('B',d$samples$type)]

But I get the error:

Error in `$<`(`*tmp*`, "group", value = integer(0)) : 
  replacement has 0 rows, data has 226

Is there a proper way of doing differential expression on just a subset of the DGEList? 

I got around this by employing the method from the post I linked:


Then replacing test$samples with its proper subset from d:


This just seems sort of hacky...

ADD COMMENTlink modified 2.2 years ago by Gordon Smyth35k • written 2.2 years ago by mnaymik10

Something seems strange.

Can you start a new R session, call library(edgeR), and then update your question with the output of sessionInfo()?

ADD REPLYlink written 2.2 years ago by Steve Lianoglou12k

> library(edgeR)
Loading required package: limma
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] edgeR_3.14.0  limma_3.28.14

ADD REPLYlink written 2.2 years ago by mnaymik10

Can you post a minimal working example of this behaviour?

ADD REPLYlink written 2.2 years ago by Aaron Lun21k
2.2 years ago by
Steve Lianoglou12k wrote:

Now that you've verified you're running the latest version of edgeR, I've looked a bit more closely at your example and error.

It seems that you have somehow constructed a DGEList (d) with a $samples data.frame that doesn't have a group column -- what were the commands you used to construct d?

In any event, try adding a group column, like so:

d$samples <- transform(d$samples, group=paste(type, time, sep="_"))

Then try subsetting by columns again ...
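As a rough illustration of the suggestion above, here is a minimal sketch on simulated counts. The count matrix, sample names and annotation values are made up for the example (they only mirror the layout of the table in the question), and it assumes a reasonably recent edgeR.

```r
library(edgeR)

# Toy data: 20 genes x 6 samples of simulated counts (hypothetical, not the poster's data)
set.seed(1)
counts <- matrix(rnbinom(120, mu = 10, size = 5), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("s", 1:6)))
d <- DGEList(counts = counts)

# Sample annotation laid out like the table in the question
d$samples$type <- c("B", "B", "B", "B", "C", "D")
d$samples$time <- c("pre", "pre", "pre", "post", "post", "pre")

# Make sure the compulsory 'group' column is present before subsetting
d$samples$group <- factor(paste(d$samples$type, d$samples$time, sep = "_"))

# Keep only the type 'B' samples; an exact comparison avoids grep's substring matching
d.B <- d[, d$samples$type == "B"]
d.B$samples
```

The logical index is a design choice rather than a requirement: grep('B', ...) would also pick up any type label that merely contains a "B".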

Also, adding such a group column can be useful in your downstream analysis, since you can now analyze your experiment as a one-way layout:

design <- model.matrix(~ 0 + group, d$samples)

You can then construct contrasts with makeContrasts that are easy-to-interpret arithmetic over the columns of design.
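A small sketch of what such a contrast could look like for the 'post vs pre within type B' comparison. The sample sheet is hypothetical and only copies the type/time layout from the question; renaming the design columns to the bare group levels is one possible convention, not something the thread prescribes.

```r
library(limma)  # provides makeContrasts; attached automatically by library(edgeR)

# Hypothetical sample sheet with the combined type_time grouping
samples <- data.frame(
  type = c("B", "B", "B", "B", "C", "D"),
  time = c("pre", "pre", "pre", "post", "post", "pre")
)
samples$group <- factor(paste(samples$type, samples$time, sep = "_"))

# One-way layout: one column per group, no intercept
design <- model.matrix(~ 0 + group, samples)
colnames(design) <- levels(samples$group)

# post vs pre within type B, written as arithmetic over the design columns
contr <- makeContrasts(B_post - B_pre, levels = design)
contr
```

The resulting contrast matrix could then be passed to a test function such as glmQLFTest or glmLRT via its contrast argument.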


ADD COMMENTlink modified 2.2 years ago • written 2.2 years ago by Steve Lianoglou12k

Since I was using the time column as my group, I had set samples$group = NULL. Later I was setting group = time; if I do that before subsetting, it works just fine. I did not realize group was that sensitive. Thanks!
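For readers who want to see that order of operations end to end, the following is a sketch only: the counts are simulated, the type/time assignment is invented so that type B has two 'pre' and two 'post' samples, and the classic exactTest pipeline is just one of several ways to run the comparison.

```r
library(edgeR)

# Simulated counts: 250 genes x 6 samples (hypothetical data)
set.seed(42)
counts <- matrix(rnbinom(1500, mu = 50, size = 10), nrow = 250,
                 dimnames = list(paste0("gene", 1:250), paste0("s", 1:6)))

# Invented annotation: four type 'B' samples, two pre and two post
type <- c("B", "B", "B", "B", "C", "D")
time <- c("pre", "pre", "post", "post", "post", "pre")

# Set group = time when building the object, i.e. before any subsetting
d <- DGEList(counts = counts, group = factor(time, levels = c("pre", "post")))
d$samples$type <- type

# Subset to type 'B', then normalise and estimate dispersions on the subset
d.B <- d[, d$samples$type == "B"]
d.B <- calcNormFactors(d.B)
d.B <- estimateDisp(d.B)

# Classic two-group comparison: post relative to pre
et <- exactTest(d.B, pair = c("pre", "post"))
topTags(et)
```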

ADD REPLYlink written 2.2 years ago by mnaymik10

Or just

d$samples$group <- paste(d$samples$type, d$samples$time, sep=".")

would also do the job.

ADD REPLYlink modified 2.2 years ago • written 2.2 years ago by Gordon Smyth35k
2.2 years ago by
Gordon Smyth35k
Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia
Gordon Smyth35k wrote:

A DGEList object needs to satisfy some minimum conditions to be a valid object. If you change a DGEList object so that it no longer satisfies these minimum conditions, then operations such as subsetting can no longer be guaranteed to work.

help("DGEList-class") explains what a DGEList object is assumed to contain. It explains that 'group', 'lib.size' and 'norm.factors' are compulsory columns for the d$samples data.frame, so you cannot remove them.
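A quick way to check for this situation before subsetting might look like the sketch below. The helper missing_sample_cols is hypothetical (it is not part of edgeR), and the tiny count matrix exists only to have an object to test.

```r
library(edgeR)

# Hypothetical helper (not part of edgeR): list compulsory columns missing from d$samples
missing_sample_cols <- function(d) {
  required <- c("group", "lib.size", "norm.factors")
  setdiff(required, colnames(d$samples))
}

# Tiny made-up count matrix: 3 genes x 4 samples
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("g", 1:3), paste0("s", 1:4)))
d <- DGEList(counts = counts, group = c("pre", "pre", "post", "post"))
missing_sample_cols(d)    # nothing missing on a freshly built object

d$samples$group <- NULL   # what the original poster had done
missing_sample_cols(d)    # now reports that 'group' is missing
```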

ADD COMMENTlink written 2.2 years ago by Gordon Smyth35k