Differential Expression Analysis on a Subset Derived from a Previous Contrast in edgeR or DESeq2
@corinne_hutfilz-22548

Starting context: I'm pretty new to bioinformatics. I'm working with ATAC-seq datasets from 4 groups (multiple replicates per group), defined as follows:

Group 1 = ConditionA:TreatmentX, Group 2 = ConditionA:TreatmentY, Group 3 = ConditionB:TreatmentX, Group 4 = ConditionB:TreatmentY

First, I want to obtain the set of genomic regions that do NOT change in accessibility between groups 1 and 2 (call this {1⋂2}). Is this possible? The vignettes for DESeq2 and edgeR seem to only describe how to obtain regions that differ between groups.

Second, assuming I now have {1⋂2}, I next need to obtain the genomic regions that differ between {1⋂2} and the combination of genomic regions in groups 3 and 4 (call this {3⋃4}). I think I could do this in a similar fashion to edgeR's example, ((drug.2hr - drug.0hr) - (placebo.2hr - placebo.0hr)). Except in my case, I need the opposite function of (drug.2hr - drug.0hr), and a new function for (placebo.2hr - placebo.0hr) that effectively combines the regions defined in both placebo.2hr and placebo.0hr. Something like ((drug.2hr + drug.0hr) - (placebo.2hr OR placebo.0hr)).

Third, I don't know how to obtain {3⋃4}. Should I simply consider all replicates of group 3 as though they were more replicates of group 4, then generate a consensus peakset from that?

One option I considered was to find a way to combine the resulting datasets of {{1⋂2} - 3} and {{1⋂2} - 4}, but while this skirts my third problem, I still don't know how to describe to edgeR or DESeq2 the logical relationship of ⋂.

@mikelove
United States

The DESeq2 vignette does have a section on how to specify an altHypothesis stating that the change is small in absolute value (altHypothesis="lessAbs", used together with lfcThreshold). It is also mentioned in the paper and in ?results.
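
As a rough sketch of what such a test can look like: the code below uses simulated counts from makeExampleDESeqDataSet in place of real ATAC-seq counts, and the group labels, the threshold lfcThreshold = 1 and the 0.1 FDR cutoff are all assumptions chosen for illustration, not recommendations.

```r
library(DESeq2)

# Simulated counts stand in for a real region-by-sample count matrix (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Hypothetical group labels mirroring the four groups in the question.
dds$group <- factor(rep(c("A.X", "A.Y", "B.X", "B.Y"), each = 3))
design(dds) <- ~ group
dds <- DESeq(dds)

# Null: |log2 fold change| >= 1; alternative: |log2 fold change| < 1.
# The threshold of 1 is an arbitrary choice for this sketch.
res_unchanged <- results(dds,
                         contrast = c("group", "A.Y", "A.X"),
                         lfcThreshold = 1,
                         altHypothesis = "lessAbs")

# Rows with evidence that the A.Y vs A.X change is small in absolute value.
unchanged <- which(res_unchanged$padj < 0.1)
```

Here a small adjusted p-value is evidence that the change is smaller than the threshold, which is a stronger statement than a region merely failing to reach significance in the usual test.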

For simplicity, when working with DESeq2 I'd recommend taking the intersection of FDR sets: find the set that satisfies (1), then intersect it with the sets that satisfy your other criteria. If you want to contrast the average of two groups in DESeq2, you can use contrast and listValues; see ?results. I think this answers (2) and (3).
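
A minimal sketch of that recipe, under several assumptions: simulated counts replace the real data, the group labels are hypothetical, a no-intercept design is used so that each group has its own coefficient, and "the average of two groups" is read here as the average of the two B groups versus the average of the two A groups. The thresholds are placeholders.

```r
library(DESeq2)

# Simulated counts stand in for a real region-by-sample count matrix (assumption).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)
dds$group <- factor(rep(c("A.X", "A.Y", "B.X", "B.Y"), each = 3))

# No-intercept design: one coefficient per group.
design(dds) <- ~ 0 + group
dds <- DESeq(dds)
resultsNames(dds)  # coefficient names used in the list-style contrasts below

# (1) Regions whose A.Y vs A.X change is small in absolute value.
res1 <- results(dds,
                contrast = list("groupA.Y", "groupA.X"),
                lfcThreshold = 1,
                altHypothesis = "lessAbs")

# Average of the B groups minus average of the A groups.
res2 <- results(dds,
                contrast = list(c("groupB.X", "groupB.Y"),
                                c("groupA.X", "groupA.Y")),
                listValues = c(1/2, -1/2))

# Intersection of the two FDR sets (0.1 is an arbitrary cutoff).
sel <- which(res1$padj < 0.1 & res2$padj < 0.1)
```

The rows indexed by sel satisfy both criteria at the chosen cutoffs; other criteria can be intersected in the same way.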

@gordon-smyth
WEHI, Melbourne, Australia

Starting context

Let me see if I have this correct. You seem to be making comparisons between four groups, which correspond to all possible combinations of two conditions (A/B) and two treatments (X/Y). The ":" notation in your question is confusing, because R uses ":" to represent interactions, but I am assuming that your Group 1 simply means treatment X applied to condition A.

In edgeR, it would be usual to set it up like this:

Group <- factor(paste(Condition,Treatment,sep="."))
design <- model.matrix(~0+Group)
colnames(design) <- levels(Group)


Then the four groups are called A.X, A.Y, B.X and B.Y respectively.

You can easily make any comparison between the groups, for example:

Cont <- makeContrasts(
  A.YvsA.X = A.Y - A.X,
  B.YvsB.X = B.Y - B.X,
  levels = design)


defines comparisons between Y and X for conditions A and B separately.

First

You want genomic regions that don't change between A.X and A.Y. In edgeR you can use

fit <- glmQLFit(y, design)
test1 <- glmQLFTest(fit, contrast=Cont[,"A.YvsA.X"])
result1 <- decideTests(test1)


Then the regions that show no significant change are those for which result1==0.

Second

I'm going to assume that you want to find regions that change between Y vs X for condition B but not for condition A. You can get those by:

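test2 <- glmQLFTest(fit, contrast=Cont[,"B.YvsB.X"])
result2 <- decideTests(test2)
# Keep the original row order so that tab lines up with result1 and result2
tab <- topTags(test2, n=Inf, sort.by="none")
# Not significant for condition A (0) but significant for condition B (-1 or 1)
keep <- as.vector(result1 == 0 & result2 != 0)
MyRegions <- tab[keep, ]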


I don't really follow the rest of your question. In particular, I have no idea what you mean by {3⋃4}. But you surely can apply the above principles to get any combination of regions that you need.
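
As one illustration of applying the same principles to another combination, here is a self-contained sketch. It assumes simulated counts in place of real data, and it guesses at one possible reading of the question: regions that show no significant change between A.X and A.Y but do differ between the average of the two B groups and the average of the two A groups. The contrast name BvsA is made up for this example.

```r
library(edgeR)

# Simulated counts stand in for a real region-by-sample count matrix (assumption).
set.seed(1)
counts <- matrix(rnbinom(1000 * 12, mu = 50, size = 10), nrow = 1000)
Group <- factor(rep(c("A.X", "A.Y", "B.X", "B.Y"), each = 3))

y <- DGEList(counts = counts, group = Group)
y <- calcNormFactors(y)
design <- model.matrix(~0 + Group)
colnames(design) <- levels(Group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

Cont <- makeContrasts(
  A.YvsA.X = A.Y - A.X,
  BvsA     = (B.X + B.Y)/2 - (A.X + A.Y)/2,  # hypothetical averaged contrast
  levels = design)

res1 <- decideTests(glmQLFTest(fit, contrast = Cont[, "A.YvsA.X"]))
res2 <- decideTests(glmQLFTest(fit, contrast = Cont[, "BvsA"]))

# TRUE for regions not significant within A but significant for B vs A.
keep <- as.vector(res1 == 0 & res2 != 0)
```

The logical vector keep is in the same row order as the count matrix, so it can be used to subset either y or an unsorted topTags table.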