Question

DESeq2: design - 1 ctl, 2 different treated

charlesh • 0

@charlesh-13279

Hi,
I'm a novice at creating a design for DESeq2.

We have 3 conditions each with replicates:

CTL(untreated – 4 replicates)
Ce (treated w/ cerium – 4 replicates)
nCe (treated w/ modified cerium – 4 replicates)

The samples were sequenced on 2 separate machines and on different lanes.

We’d like to compare the conditions to each other, but the main goal is to identify genes that are DE in the nCe samples compared to all others. We’d also like to control for variation due to sequencers / lanes if possible.

We’ve created the SummarizedExperiment object with summarizeOverlaps():

se <- summarizeOverlaps(features=ebg, reads=bamfiles, mode="Union", singleEnd=FALSE, ignore.strand=TRUE, fragments=TRUE)

We are contemplating how to set up the design / contrasts.

We’ve read posts about LRT / ANOVA-style tests but are still a bit unsure.

One idea is an LRT analysis, controlling for sequencer:

"condition" is defined in the sampleTable (ctl, ce, nce), as is "flowcell" for each sample

dds <- DESeqDataSet(se, design = ~ flowcell + condition)
dds <- DESeq(dds, test = "LRT", reduced = ~ flowcell)

Would this be an appropriate analysis that would identify genes DE in nCe vs others?

thanks

Charles

deseq2 • 1.5k views

ADD COMMENT • link 6.9 years ago charlesh • 0

Thanks Michael - we'll give that a try.

Charles

ADD REPLY • link 6.9 years ago charlesh • 0

score 0 · Answer 1 · 2017-06-17

Michael Love 41k

@mikelove

United States

That works. It will find differences if C is DE relative to A and B, B to A and C, A to B and C, or if they are all distinct. In each of these cases, if you look at one group in particular, say, C, it is DE from at least one other group.
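As a rough illustration of what this LRT tests, here is a minimal sketch on simulated counts. It assumes DESeq2 is installed; the counts come from makeExampleDESeqDataSet(), and the condition and flowcell assignments are invented for the example, not taken from the real experiment.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 1000 genes x 12 samples (not the real data)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical sample annotation: 3 conditions x 4 replicates, 2 flowcells
dds$flowcell  <- factor(rep(c("fc1", "fc2"), times = 6))
dds$condition <- factor(rep(c("ctl", "ce", "nce"), each = 4),
                        levels = c("ctl", "ce", "nce"))

# Full model: flowcell + condition; reduced model drops condition
design(dds) <- ~ flowcell + condition
dds <- DESeq(dds, test = "LRT", reduced = ~ flowcell)

# One p-value per gene: does condition explain anything beyond flowcell?
res <- results(dds)
head(res[order(res$pvalue), ])
```

A small p-value here says that at least one group differs from another, not which one. The log2FoldChange column of an LRT results table refers to a single coefficient only, so pairwise questions still need explicit contrasts.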

ADD COMMENT • link 6.9 years ago Michael Love 41k

Michael

I received an update regarding the details of the experiment, and the design now incorporates the fact that some samples had phosphate, others not.

Sample Name	Source	condition	phosphate	pH	sequencer	flowcell	lane
CTL1	McGill	untr	absent	7	D00279	C9VUBANXX	1
CTL2	McGill	untr	absent	7	D00279	C9VUBANXX	2
CTL3	McGill	untr	absent	7	D00279	C9VUBANXX	1
CTL4	McGill	untr	absent	7	D00279	C9VUBANXX	2
CTL1-1	IRIC	untr	present	7	HWI-ST942	C3DWVACXX	7_8
CTL1-3	IRIC	untr	present	7	HWI-ST942	C3DWVACXX	7_8
CTL2-3	IRIC	untr	present	7	HWI-ST942	C3DWVACXX	7_8
Ce-1	IRIC	ce	present	7	HWI-ST942	C3DWVACXX	7_8
Ce-3	IRIC	ce	present	7	HWI-ST942	C3DWVACXX	7_8
Ce2	McGill	ce	absent	7	D00279	C9VUBANXX	2
Ce3	McGill	ce	absent	7	D00279	C9VUBANXX	2
Ce4	McGill	ce	absent	7	D00279	C9VUBANXX	1
nCe-1	IRIC	nce	present	7	HWI-ST942	C3DWVACXX	7_8
nCe-3	IRIC	nce	present	7	HWI-ST943	C3DWVACXX	7_8

We had merged the fastq files for biological samples that were sequenced across two lanes (shown as 7_8).

Should we split these apart to yield separate lane 7 and lane 8 sequences for the same sample?

What we would like to do is:

identify DE genes in ctl vs Ce
identify DE genes in ctl vs nCe
identify DE genes in Ce vs nCe
control for phosphate, sequencer variation

Our initial plan was to run:

dds <- DESeqDataSet(se, design = ~ flowcell + condition)
dds <- DESeq(dds, test = "LRT", reduced = ~ flowcell)

To add a control for phosphate, would we use:

dds <- DESeqDataSet(se, design = ~ flowcell + phosphate + condition)
dds <- DESeq(dds, test = "LRT", reduced = ~ flowcell + phosphate)

thanks

Charles

ADD REPLY • link 6.8 years ago charlesh • 0

So I would recommend a different setup if you want to make these comparisons.

First, you should add together the lanes which represent additional sequencing of the same library. We call these technical replicates. You can use the collapseReplicates() function in DESeq2.
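A minimal sketch of that collapsing step, assuming DESeq2 is installed. The counts are simulated, and the lib_id and lane columns are hypothetical names standing in for whatever identifies a library and a sequencing run in your own colData.

```r
library(DESeq2)

set.seed(1)
# Simulated counts: 12 columns standing in for 6 libraries run on 2 lanes each
dds <- makeExampleDESeqDataSet(n = 200, m = 12)
dds$lib_id <- factor(rep(paste0("lib", 1:6), each = 2))  # hypothetical library IDs
dds$lane   <- paste0("lane", rep(7:8, times = 6))        # hypothetical lane labels

# Sum the counts of all columns that share a library ID
dds_coll <- collapseReplicates(dds, groupby = dds$lib_id, run = dds$lane)

# Sanity check: the collapsed column equals the sum of its technical replicates
lib1_sum <- rowSums(counts(dds)[, dds$lib_id == "lib1"])
all(lib1_sum == counts(dds_coll)[, "lib1"])
```

Collapsing is meant for repeated sequencing of one library. Separate libraries from the same condition are biological replicates and should stay as separate columns.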

You can use a design of ~phosphate + condition, and then use standard contrasts with the results() function. For your first comparison it would look like:

dds <- DESeq(dds)
res <- results(dds, contrast=c("condition","Ce","ctl"))

Then for additional comparisons, you don't rerun DESeq(), just build a new results table:

res2 <- results(dds, contrast=c("condition","nCe","ctl"))
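The names in the contrast argument have to match the levels of the condition factor exactly. The sample table posted above uses untr, ce and nce, so a sketch of all three pairwise comparisons with those level names might look like the following. It assumes DESeq2 is installed and uses simulated counts with an invented phosphate assignment, so only the pattern carries over to the real data.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (not the real data): 1000 genes x 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical annotation using the level names from the sample table
dds$phosphate <- factor(rep(c("absent", "present"), times = 6))
dds$condition <- factor(rep(c("untr", "ce", "nce"), each = 4),
                        levels = c("untr", "ce", "nce"))
design(dds) <- ~ phosphate + condition

# Fit once, then extract as many contrasts as needed
dds <- DESeq(dds)

# contrast = c(variable, numerator level, denominator level)
res_ce_vs_untr  <- results(dds, contrast = c("condition", "ce",  "untr"))
res_nce_vs_untr <- results(dds, contrast = c("condition", "nce", "untr"))
res_nce_vs_ce   <- results(dds, contrast = c("condition", "nce", "ce"))

summary(res_nce_vs_ce)
```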

By the way, if "ctl" stands for control, note that the standard way to represent a fold change is to put control in the denominator, not the numerator. That is, put control at the end of the contrast argument, so you get fold changes of Ce / control.
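One thing visible in the sample table posted above is that flowcell and phosphate follow the same pattern: every McGill sample is phosphate-absent on C9VUBANXX, and every IRIC sample is phosphate-present on C3DWVACXX. The base R sketch below transcribes those three columns and checks whether each candidate design gives a full-rank model matrix. The helper is_full_rank() is a name made up for this example.

```r
# Columns transcribed from the sample table in this thread (14 samples)
samples <- data.frame(
  condition = c(rep("untr", 7), rep("ce", 5), rep("nce", 2)),
  phosphate = c(rep("absent", 4), rep("present", 3),   # CTL1-4, then CTL1-1/1-3/2-3
                rep("present", 2), rep("absent", 3),   # Ce-1/Ce-3, then Ce2/Ce3/Ce4
                rep("present", 2)),                    # nCe-1/nCe-3
  flowcell  = c(rep("C9VUBANXX", 4), rep("C3DWVACXX", 3),
                rep("C3DWVACXX", 2), rep("C9VUBANXX", 3),
                rep("C3DWVACXX", 2)),
  stringsAsFactors = TRUE
)
samples$condition <- relevel(samples$condition, ref = "untr")

# Each flowcell maps to exactly one phosphate level
table(samples$flowcell, samples$phosphate)

# Hypothetical helper: TRUE if all design columns are linearly independent
is_full_rank <- function(design, data) {
  mm <- model.matrix(design, data)
  qr(mm)$rank == ncol(mm)
}

is_full_rank(~ flowcell + phosphate + condition, samples)  # FALSE
is_full_rank(~ phosphate + condition, samples)             # TRUE
```

With these samples the two variables cannot be separated, so a design containing both is not estimable and DESeq2 would report that the model matrix is not full rank. A single term such as phosphate then stands in for the whole site / sequencer / flowcell difference.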

ADD REPLY • link 6.8 years ago Michael Love 41k

Thanks again for your advice Michael!

Nomenclature: yes, 'CTL' does stand for control. Thanks for pointing out that it needs to be the last argument (the denominator).

re: technical replicates

Good to know about collapseReplicates(). All the samples listed are biological reps, i.e. separate libraries. For example, the library for CTL1-1 was sequenced on 2 lanes, and the fastq files from both lanes were merged. Is this what you were suggesting, i.e. merging technical reps?

So, would it be correct to create the DESeq2 object using the data as is, i.e. with no further merging?

Charles

ADD REPLY • link 6.8 years ago charlesh • 0

Yes, no need to merge if you already did so.

ADD REPLY • link 6.8 years ago Michael Love 41k

The analysis ran OK using a design of ~phosphate + condition:

dds <- DESeq(dds)

The design was meant to control for phosphate. However, looking at how the samples cluster (MDS plot) shows that not all groups (untr, ce, nce) cluster as desired, i.e. not all untr samples cluster together.

Biological variability at its best/worst I suspect.

Are there downstream techniques to deal with this?

Is there a legitimate way to evaluate samples for removal from analyses?

Charles

ADD REPLY • link 6.8 years ago charlesh • 0

better plot image

ADD REPLY • link 6.8 years ago charlesh • 0

I have a recent answer here, from this week or last, on when to consider an outlier sample worthy of removal. Basically, only if it really stands out from the entire dataset, and I usually also look for FastQC-type quality indicators.
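For looking at whether one sample stands apart from the entire dataset, one option is to compute sample-to-sample distances on variance-stabilized counts and view them with MDS. This is only a sketch on simulated data, assuming DESeq2 is installed. The mean-distance summary at the end is an illustrative heuristic added here, not a removal criterion from the answer above.

```r
library(DESeq2)

set.seed(1)
# Simulated counts (not the real data): 1000 genes x 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Variance-stabilize without using the design (blind = TRUE) for QC purposes
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)

# Euclidean distances between samples, then classical MDS in 2 dimensions
sample_dists <- dist(t(assay(vsd)))
mds <- cmdscale(sample_dists, k = 2)

plot(mds, col = as.integer(dds$condition), pch = 19,
     xlab = "MDS 1", ylab = "MDS 2")
text(mds, labels = colnames(dds), pos = 3, cex = 0.7)

# Illustrative heuristic: average distance from each sample to all others
mean_dist <- rowSums(as.matrix(sample_dists)) / (ncol(dds) - 1)
sort(mean_dist, decreasing = TRUE)
```

A sample that tops this list by a wide margin and also shows poor sequencing quality metrics is a candidate for a closer look. Samples that merely fail to group by condition are not outliers in this sense.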

ADD REPLY • link 6.8 years ago Michael Love 41k

Michael

I found the post - thanks!

A: Sample not clustering as expected in DESeq2

Charles

ADD REPLY • link 6.8 years ago charlesh • 0