Question: edgeR subsetting DGEList by column/sample
16 months ago, mnaymik (United States) wrote:

I saw this post from a while ago regarding a similar issue: edgeR: Problem with subsetting a DGEList in latest package version

> d$samples[1:6,]

sample                        lib.size  norm.factor  type  time
preExercise_TAGGCTGACTTGAG.1       856    1.1020236     B  pre
preExercise_TCCATCCTCGTTAG.1      1033    1.2198739     B  pre
pbmc001_TTGAGGACTTTCAC.1           703    1.2050717     B  pre
pbmc001_AGTCGCCTGCTTAG.1          1230    1.0304974     B  post
pbmc001_TACTACACAGCACT.1          1053    0.9790636     C  post
pbmc001_TAAACAACCCTTAT.1           895    1.1032946     D  pre

...

I am trying to test for differential expression using only the samples of type 'B', with time as the group (post vs pre). I thought the easiest way would be to just subset d via:

d.B = d[,grep('B',d$samples$type)]

But I get the error:

Error in `$<-.data.frame`(`*tmp*`, "group", value = integer(0)) : 
  replacement has 0 rows, data has 226

Is there a proper way of doing differential expression on just a subset of the DGEList? 

I got around this by using the method from the post I linked:

B=grep('B',d$samples$type)
test=DGEList(d$counts)
test=test[,B]

Then replacing test$samples with its proper subset from d:

test$samples=d$samples[B,]

This just seems sort of hacky...

modified 16 months ago by Gordon Smyth • written 16 months ago by mnaymik

Something seems strange.

Can you start a new R session, call library(edgeR), and then update your question with the output of sessionInfo()?

Reply written 16 months ago by Steve Lianoglou

> library(edgeR)
Loading required package: limma
> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] edgeR_3.14.0  limma_3.28.14

Reply written 16 months ago by mnaymik

Can you post a minimal working example of this behaviour?

Reply written 16 months ago by Aaron Lun

16 months ago, Steve Lianoglou (Genentech) wrote:

Now that you've verified you're running the latest version of edgeR, I've looked a bit more closely at your example and error.

It seems that you have somehow constructed a DGEList (d) with a $samples data.frame that doesn't have a group column -- what were the commands you used to construct d?

In any event, try adding a group column, like so:

d$samples <- transform(d$samples, group=paste(type, time, sep="_"))

Then try subsetting by columns again ...

Also, adding such a group column can be useful in your downstream analysis, since you can now analyze your experiment as a one-way layout:

design <- model.matrix(~ 0 + group, d$samples)

You can then construct contrasts with makeContrasts that are easy-to-interpret arithmetic over the columns of design.
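
As a rough illustration of the one-way layout described above, here is a minimal sketch. The sample table is made up for the example (it is not the poster's data), the contrast name B_post_vs_pre is a hypothetical label, and the column names of design are reset to the group levels so they can be used directly in makeContrasts. It assumes the limma package is installed.

```r
library(limma)  # provides makeContrasts()

# Toy sample annotation (made-up values, for illustration only)
samples <- data.frame(
  type = c("B", "B", "B", "B", "C", "D"),
  time = c("pre", "pre", "post", "post", "post", "pre")
)

# Combine type and time into a single grouping factor
samples$group <- factor(paste(samples$type, samples$time, sep = "_"))

# One-way layout: one column per group, no intercept
design <- model.matrix(~ 0 + group, data = samples)
colnames(design) <- levels(samples$group)

# Contrast for post vs pre within type B (the name is a hypothetical label)
contr <- makeContrasts(B_post_vs_pre = B_post - B_pre, levels = design)
print(contr)
```

With this layout, no subsetting is needed to compare post vs pre within type B: the contrast simply ignores the columns for the other types.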

 

modified 16 months ago • written 16 months ago by Steve Lianoglou

Since I was using the time column as my group, I had set samples$group=NULL and was only setting group = time later. If I set group = time before subsetting, it works just fine. I did not realize group was that sensitive. Thanks!
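
For anyone landing here later, the working order of operations might look like the sketch below. The counts are simulated and the cell names are invented, so this is only an assumption-laden toy version of the real object, and it assumes edgeR is installed. The point is that the group column is present in d$samples before the column subset is taken.

```r
library(edgeR)

# Simulated counts: 10 genes x 6 cells (made-up data, not the original experiment)
set.seed(1)
counts <- matrix(rpois(60, lambda = 10), nrow = 10,
                 dimnames = list(paste0("gene", 1:10), paste0("cell", 1:6)))

d <- DGEList(counts)  # creates a default group column in d$samples
d$samples$type <- c("B", "B", "B", "B", "C", "D")
d$samples$time <- c("pre", "pre", "pre", "post", "post", "pre")

# Set group BEFORE subsetting, and never remove the column
d$samples$group <- factor(d$samples$time)

# Keep only the type 'B' samples
d.B <- d[, d$samples$type == "B"]
d.B$samples
```

The sketch uses an exact match (type == "B") rather than grep('B', ...), since grep would also pick up any type label that merely contains the letter B.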

Reply written 16 months ago by mnaymik

Or just

d$samples$group <- with(d$samples, paste(type, time, sep="."))

would also do the job.

Reply modified 16 months ago • written 16 months ago by Gordon Smyth

16 months ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

A DGEList object needs to satisfy some minimum conditions to be a valid object. If you change a DGEList object so that it no longer satisfies these minimum conditions, then operations such as subsetting can no longer be guaranteed to work.

help("DGEList-class") explains what a DGEList object is assumed to contain. It explains that 'group', 'lib.size' and 'norm.factors' are compulsory columns for the d$samples data.frame, so you cannot remove them.
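
A quick way to check an object against the compulsory columns listed above is sketched here. The function missing_sample_columns is a hypothetical helper written for this example and is not part of edgeR; it uses base R only and works on any data.frame, so with a real object you would pass d$samples.

```r
# Hypothetical helper (not an edgeR function): report which compulsory
# columns are absent from a samples data.frame
missing_sample_columns <- function(samples) {
  required <- c("group", "lib.size", "norm.factors")
  setdiff(required, colnames(samples))
}

# Toy samples table with made-up values
samples <- data.frame(
  group        = factor(c("pre", "post")),
  lib.size     = c(856, 1033),
  norm.factors = c(1.10, 1.22)
)

ok <- missing_sample_columns(samples)       # character(0): nothing missing

samples$group <- NULL                       # what happened in the question
broken <- missing_sample_columns(samples)   # "group"
```

Running such a check before subsetting would flag the missing group column before the less obvious `$<-.data.frame` error appears.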

written 16 months ago by Gordon Smyth