Question: collapseReplicates in DESeq2
Asked 4.4 years ago by Catalina Aguilar Hurtado (United States)

Hi All,

I am trying to collapse technical replicates in DESeq2. I have already looked at the manual, but it is still not clear to me how to do it. I tried running it but got some errors, so I need to understand how it works properly.

I want to collapse A1 with A1.1, B1 with B1.1, C1 with C1.1, and D2 with D2.1.

dds <- DESeqDataSetFromMatrix(
  countData = countdata,
  colData = coldata,
  design = ~ Subject + Treatment)
dds

> coldata
     Subject Treatment Time
A1         1        35    1
A1.1       1        35    1
A2         2        35    1
A3         3        35    1
A4         4        35    1
A5         5        35    1
B1         1        25    1
B1.1       1        25    1
B2         2        25    1
B4         4        25    1
B5         5        25    1
C1         1        35   24
C1.1       1        35   24
C2         2        35   24
C3         3        35   24
C4         4        35   24
C5         5        35   24
D2         2        25   24
D2.1       2        25   24
D4         4        25   24
D5         5        25   24
>

dds$Subject <- factor(sample(paste0("Subject",rep(1:22, c(1,1,2,3,4,5,1,1,2,3,4,5,1,1,2,3,4,5,2,2,4,5)))))??

dds$run <- paste0("run",1:??)

ddsColl <- collapseReplicates(dds, dds$Subject, dds$run)

In the example from the manual, does paste0("run",1:12) mean that there are now 12 rows in the colData?

## Collapse replicates in manual

dds <- makeExampleDESeqDataSet(m=12)

# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))
dds$run <- paste0("run",1:12)

ddsColl <- collapseReplicates(dds, dds$sample, dds$run)

##

I would also like to know whether, after I collapse the replicates, I need to modify my target file and run DESeqDataSetFromMatrix again.

Thanks,

Catalina

> R.Version()
$platform
[1] "x86_64-apple-darwin10.8.0"

$arch
[1] "x86_64"

$os
[1] "darwin10.8.0"

$system
[1] "x86_64, darwin10.8.0"

$status
[1] ""

$major
[1] "3"

$minor
[1] "1.0"

$year
[1] "2014"

$month
[1] "04"

$day
[1] "10"

$`svn rev`
[1] "65387"

$language
[1] "R"

$version.string
[1] "R version 3.1.0 (2014-04-10)"

$nickname
[1] "Spring Dance"

Answer: collapseReplicates in DESeq2
Michael Love (United States) wrote, 4.4 years ago:

If we look up the help:

?collapseReplicates

There is information about these arguments:

groupby:     a grouping factor, as long as the columns of object

run:     optional, the names of each unique column in object. if provided, a new column runsCollapsed will be added to the colData which pastes together the names of run

And also information about the result:

Value:     the object with as many columns as levels in groupby.

So, you should make a new column which uniquely identifies the libraries which were sequenced more than once (this is what we refer to as a technical replicate). It looks like this would be:

dds$id <- factor(paste(dds$Subject, dds$Treatment, dds$Time, sep="_"))

Then provide dds$id to the 'groupby' argument.

You should not run a constructor function (like DESeqDataSetFrom*) after you've run collapseReplicates().
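
To make the grouping concrete, here is a minimal sketch that rebuilds the sample sheet from the question and checks which samples end up sharing an ID. The `coldata` values are retyped from the printout above, the `id` column name follows the suggestion in this answer, and the count matrix in the optional part is simulated purely for illustration (it is not the asker's data). The DESeq2 part only runs if the package is installed.

```r
# Sample sheet rebuilt from the coldata printed in the question.
samples <- c("A1", "A1.1", "A2", "A3", "A4", "A5",
             "B1", "B1.1", "B2", "B4", "B5",
             "C1", "C1.1", "C2", "C3", "C4", "C5",
             "D2", "D2.1", "D4", "D5")
coldata <- data.frame(
  Subject   = factor(c(1, 1, 2, 3, 4, 5,  1, 1, 2, 4, 5,
                       1, 1, 2, 3, 4, 5,  2, 2, 4, 5)),
  Treatment = factor(rep(c(35, 25, 35, 25), times = c(6, 5, 6, 4))),
  Time      = factor(rep(c(1, 24), times = c(11, 10))),
  row.names = samples
)

# One ID per library: technical replicates share Subject, Treatment and Time.
coldata$id <- factor(paste(coldata$Subject, coldata$Treatment,
                           coldata$Time, sep = "_"))

# Which samples fall into the same group?
groups <- split(rownames(coldata), coldata$id)
replicated <- groups[lengths(groups) > 1]
print(replicated)

# Optional: run collapseReplicates() on simulated counts (assumes DESeq2 is installed).
if (requireNamespace("DESeq2", quietly = TRUE)) {
  set.seed(1)
  counts <- matrix(rpois(100 * nrow(coldata), lambda = 50), nrow = 100,
                   dimnames = list(paste0("gene", 1:100), rownames(coldata)))
  dds <- DESeq2::DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                                        design = ~ Subject + Treatment)
  dds$run <- colnames(dds)
  ddsColl <- DESeq2::collapseReplicates(dds, groupby = dds$id, run = dds$run)
  print(ncol(ddsColl))          # one column per level of id
  print(ddsColl$runsCollapsed)  # which runs were summed into each column
}
```

Only the four pairs named in the question share an ID here, so those are the only columns that get summed; every other sample forms a group of one and passes through unchanged.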


Hi Michael,

When you define 'groupby' with dds$id, I don't understand where you specify which samples to collapse. In my case the technical replicates are A1 with A1.1, B1 with B1.1, C1 with C1.1, and D2 with D2.1. Would I need to specify that?

Thanks

Reply written 4.4 years ago by Catalina Aguilar Hurtado

It collapses by the levels in the factor variable 'groupby'.

That is why the output has as many columns as levels in 'groupby'. 

For example, if the original counts matrix has 5 columns, and groupby is A, A, A, B, C, then it adds the counts from columns 1-3 to produce a column "A", and the final count table will have columns A, B, C.

Reply written 4.4 years ago by Michael Love
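
The five-column example can be mimicked in base R. This is only a sketch of the described behaviour, not a claim about how DESeq2 implements it; the count values and the `gene`/`run` names are made up for illustration.

```r
# Toy count matrix: 3 genes x 5 runs (values are made up for illustration).
counts <- matrix(1:15, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("run", 1:5)))
groupby <- factor(c("A", "A", "A", "B", "C"))

# Sum the columns that share a level of groupby.
collapsed <- sapply(levels(groupby), function(lvl) {
  rowSums(counts[, groupby == lvl, drop = FALSE])
})
print(collapsed)
```

The result has one column per level of `groupby` (A, B, C): column A holds the sums of runs 1 to 3, while B and C are just runs 4 and 5 carried over.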

Thanks Michael, now I understand that I don't need to specify which columns to collapse; I just need to give my replicates the same ID.

In the example in ?collapseReplicates I couldn't tell which were the three samples, and that was confusing me.

# make data with two technical replicates for three samples
dds$sample <- factor(sample(paste0("sample",rep(1:9, c(2,1,1,2,1,1,2,1,1)))))

Reply written 4.4 years ago by Catalina Aguilar Hurtado
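
For anyone else puzzled by that line of the help example, it may help to take it apart without the random shuffle. The sketch below uses only base R; the variable names `labels`, `tab` and `shuffled` are introduced here for illustration.

```r
# Same construction as in ?collapseReplicates, without the random shuffle.
labels <- paste0("sample", rep(1:9, c(2, 1, 1, 2, 1, 1, 2, 1, 1)))
print(length(labels))        # one label per run

# Count how often each sample label occurs.
tab <- table(labels)
print(names(tab)[tab == 2])  # the samples that appear twice

# sample() only permutes the labels, so the group sizes are unchanged.
set.seed(42)
shuffled <- factor(sample(labels))
print(table(shuffled))
```

The `rep()` call repeats samples 1, 4 and 7 twice and the other six once, which gives 12 labels for the 12 runs. Those three are the samples with two technical replicates, and collapsing by this factor leaves 9 columns.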