Question: DESeq2 2x2 factorial design: find genes unique to factor A?
0
gravatar for Keith Hughitt
25 days ago by
Keith Hughitt120
United States
Keith Hughitt120 wrote:

Greetings!

I have a dataset comparing the effects of two different treatments on cell line line, including both control and combination treatment samples:

  • Vehicle
  • Treatment A
  • Treatment B
  • Treatment A + B

One of my goals is to determine which genes uniquely respond to each of the treatments, e.g. A & !B & !AB.

A simple approach to this is to perform the individual contrasts and then exclude genes that were found to be significant in the "other" treatment groups/

For example, two find to genes that respond uniquely to treatment A, using a treatment coding, one could do:

dds <- DESeqDataSetFromFeatureCounts(..., design=~treatment)
dds <- DESeq(dds)

res_A  <- lfcShrink(dds, contrast = c('treatment', 'A', 'Vehicle'))
res_B  <- lfcShrink(dds, contrast = c('treatment', 'B', 'Vehicle'))
res_AB <- lfcShrink(dds, contrast = c('treatment', 'AB', 'Vehicle'))

ind <- (res_A$padj < 0.01) & (res_B$padj > 0.01) & (res_AB$padj > 0.01)

For the purposes of determine which genes respond uniquely to treatment A, this works fine.

However, suppose I then want to perform something like GSEA on the results to determine functions uniquely associated with treatment A. The P-values for the contrast res_A above relate only to the question of which genes respond to treatment A, not which genes uniquely respond to treatment A.

Is there a better way to frame this contrast to determine P-values that correspond to the unique response of the samples to treatment A?

-- update 2019/11/18 --

Rather than attempting to answer the question using a single contrast, another approach I've considered is using the altHypothesis parameter for DESeq's results() function to separately look for genes not differentially expressed under each of the treatment conditions, e.g.:

not_A <- results(dds, name = 'A_vs_Vehicle', lfcThreshold = 0.5, altHypothesis = "lessAbs")
not_B <- results(dds, name = 'B_vs_Vehicle', lfcThreshold = 0.5, altHypothesis = "lessAbs")
not_AB <- results(dds, name = 'AB_vs_Vehicle', lfcThreshold = 0.5, altHypothesis = "lessAbs")

The genes of interest (those affected uniquely by treatment A) could then be found by taking the intersection of significant genes in res_A & not_B & not_AB.

The benefit here us that the later two contrasts also return P-values corresponding to the genes that do not appear to be differentially expressed under each of the alternative treatment conditions.

One could then do something like taking the mean of the three P-values or use Fisher's Method to combine them, in order to rank the genes by their ~ "combined evidence for unique expression under treatment A".

The derived P-values would not have any precise meaning, and an equal weighting like this might give to much priority to the (2/3) "not" contrasts. One could perhaps perform a weighted average and with 0.5 / 0.25 / 0.25 to emphasize the role of the first contrast, but this too seems to be moving even further any from any sort of rigorous treatment, but perhaps this would be sufficient to get to the original goal of ranking genes?

--

Apologies if this has already been answered elsewhere on this forum or others -- I did find quite a few discussions relating to some of the concepts in this thread (combining contrasts, etc.), however, I could not find any guidance for this particular problem.

Any advice would be greatly appreciated.

Thanks! Keith

ADD COMMENTlink modified 24 days ago by Michael Love26k • written 25 days ago by Keith Hughitt120
Answer: DESeq2 2x2 factorial design: find genes unique to factor A?
2
gravatar for Michael Love
24 days ago by
Michael Love26k
United States
Michael Love26k wrote:

There’s not a single contrast approach to define that set of requirements, and yes you can flip the alternate hypothesis to define not DE. The final pvalue as constructed with Fishers method will tend to reject if any of the individual hypotheses are not true. You could also consider taking the max pvalue to require that all must reject. That max is not uniform under the null but depending on application you may not need it to be. You can also transform it to uniform if needed.

ADD COMMENTlink written 24 days ago by Michael Love26k

Great! Thanks for the clarification and useful suggestions, and, more generally, for taking to time to regularly assist DESeq users - it is very much appreciated.

ADD REPLYlink modified 24 days ago • written 24 days ago by Keith Hughitt120

Thanks for the comment :)

ADD REPLYlink written 24 days ago by Michael Love26k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 139 users visited in the last hour