Using the contrast argument of DESeq2 to estimate linear combination coefficients

Dear Community, 

I'm characterizing transcriptional responses to transcription factor (TF) overexpression using RNA-seq.

In brief, I have 5 groups of samples (x, y, z, xyz, Ctrl), where I provide the TFs x, y or z either individually or simultaneously using a multicistronic vector (xyz); Ctrl represents a control vector infection. Each group has 5 replicates. Genes with zero counts have been filtered out.

So far, I've used the contrast argument as follows:



DEcomposite <- results(dds, contrast = LC, listValues = c(1, -1/3), alpha = 0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = TRUE)

This allowed me to identify the differentially expressed (DE) genes between the average of the individual transcriptional effects of x, y and z and their simultaneous effect (xyz).

Currently, I'm trying to estimate the "dose" of x, y and z that best resembles the transcriptional profile of simultaneous xyz overexpression. This basically translates to identifying the coefficients for the listValues argument of the results function that minimize the number of differentially expressed genes.

Does anybody know how to approach this problem?

Thanks in advance!


I don't see how to do this within the DESeq2 context. Let me see if I can turn it into a mathematical formulation:

Suppose you collapse the 5 replicates of each group into average vectors X, Y, Z, and XYZ (these are vectors over genes).

You want to find alpha, beta, gamma > 0 to minimize:

d(alpha X + beta Y + gamma Z, XYZ)

where d(., .) is some distance.
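
To make the formulation concrete, here is a minimal sketch in base R. It assumes squared Euclidean distance for d and uses simulated vectors in place of real group averages; the "true" weights and the noise level are made up purely for illustration. The box constraint enforces coefficients >= 0, which is the closest a numerical optimizer gets to the strict > 0 above.

```r
set.seed(42)

# Simulated stand-ins for the per-gene average vectors (assumption: real data
# would come from replicate-averaged, transformed expression values).
n_genes <- 1000
X <- rnorm(n_genes)
Y <- rnorm(n_genes)
Z <- rnorm(n_genes)

# Hypothetical mixing weights, used only to generate a test XYZ vector.
true_w <- c(alpha = 0.6, beta = 0.3, gamma = 0.1)
XYZ <- true_w["alpha"] * X + true_w["beta"] * Y + true_w["gamma"] * Z +
  rnorm(n_genes, sd = 0.1)

# d(alpha X + beta Y + gamma Z, XYZ) as squared Euclidean distance.
d <- function(w) sum((w[1] * X + w[2] * Y + w[3] * Z - XYZ)^2)

# Minimize under the constraint that all three coefficients are non-negative.
fit <- optim(par = rep(1 / 3, 3), fn = d, method = "L-BFGS-B", lower = 0)

setNames(round(fit$par, 3), c("alpha", "beta", "gamma"))
```

Other choices of d (for example one minus a correlation) drop into the same optim call by swapping the function d.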


Thank you Michael.

Your formulation roughly describes what I'm looking for. The problem is that the "distance" I want to use is the number of DE transcripts identified with DESeq2.

I thought of writing a function that repeats the results extraction n times while modifying the contrast definition, such that the numerator stays constant while the members of the denominator are multiplied by a set of alpha, beta, gamma coefficients chosen randomly between 0 and 1. This would populate a list of n elements containing the number of DE transcripts and the coefficients used at each iteration. However, I've realized that this isn't easily implemented, because listValues only accepts a single coefficient each for the numerator and the denominator.

Do you envision any strategy to get around this? Or do you think that using DESeq2 in this way is just hopeless? 
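
On the listValues limitation specifically: results() also accepts contrast as a numeric vector with one element per entry of resultsNames(dds), which allows a separate weight on each coefficient. The sketch below illustrates the iteration described above under several assumptions: the data are simulated with no true group effects, the design is ~ group with Ctrl as the reference level, and the coefficient names (group_x_vs_Ctrl and so on) follow from that hypothetical setup. The helper names make_contrast and count_de are invented for this example.

```r
library(DESeq2)

set.seed(1)
groups <- c("Ctrl", "x", "y", "z", "xyz")
coldata <- data.frame(group = factor(rep(groups, each = 5), levels = groups))
rownames(coldata) <- paste0("s", seq_len(nrow(coldata)))

# Simulated negative binomial counts (assumption: stands in for real data).
n_genes <- 500
mu <- 2^runif(n_genes, 4, 10)
disp <- 0.1 + 4 / mu
counts <- matrix(rnbinom(n_genes * nrow(coldata), mu = mu, size = 1 / disp),
                 nrow = n_genes,
                 dimnames = list(paste0("g", seq_len(n_genes)), rownames(coldata)))

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ group)
dds <- DESeq(dds, quiet = TRUE)

# Build a numeric contrast: xyz effect minus a weighted sum of single effects.
make_contrast <- function(dds, w) {
  v <- setNames(numeric(length(resultsNames(dds))), resultsNames(dds))
  v["group_xyz_vs_Ctrl"] <- 1
  v[c("group_x_vs_Ctrl", "group_y_vs_Ctrl", "group_z_vs_Ctrl")] <- -w
  unname(v)
}

# Number of genes called DE for a given weight triple (alpha, beta, gamma).
count_de <- function(dds, w, fdr = 0.05) {
  res <- results(dds, contrast = make_contrast(dds, w), alpha = fdr)
  sum(res$padj < fdr, na.rm = TRUE)
}

# A few random weight triples between 0 and 1, as proposed in the post.
trials <- replicate(3, runif(3), simplify = FALSE)
n_de <- vapply(trials, function(w) count_de(dds, w), numeric(1))
data.frame(do.call(rbind, trials), n_de)
```

Keep in mind that the number of DE genes is a step function of the weights and depends on power and on independent filtering, so it is a noisy objective to minimize.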

I personally would prefer to use a distance rather than a testing framework for this. You might find the DESeq2 transformations useful though.
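
As a rough illustration of how the transformations could feed a distance-based approach, the sketch below applies the variance stabilizing transformation and collapses replicates into per-group average vectors. The data are simulated, the column name group and its levels are assumptions, and subtracting the Ctrl average to express each vector as an effect relative to control is one possible choice, not something prescribed in this thread.

```r
library(DESeq2)

set.seed(2)
groups <- c("Ctrl", "x", "y", "z", "xyz")
coldata <- data.frame(group = factor(rep(groups, each = 5), levels = groups))
rownames(coldata) <- paste0("s", seq_len(nrow(coldata)))

# Simulated counts (assumption: placeholder for the real count matrix).
n_genes <- 500
mu <- 2^runif(n_genes, 4, 10)
disp <- 0.1 + 4 / mu
counts <- matrix(rnbinom(n_genes * nrow(coldata), mu = mu, size = 1 / disp),
                 nrow = n_genes,
                 dimnames = list(paste0("g", seq_len(n_genes)), rownames(coldata)))

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ group)

# Variance stabilizing transformation; blind = FALSE uses the design when
# estimating dispersions.
vsd <- varianceStabilizingTransformation(dds, blind = FALSE)
mat <- assay(vsd)

# Collapse the replicates of each group into one average vector over genes.
group_means <- sapply(groups, function(g) {
  rowMeans(mat[, vsd$group == g, drop = FALSE])
})

# Express each treatment as a difference from control (an assumption).
effects <- group_means[, c("x", "y", "z", "xyz")] - group_means[, "Ctrl"]
head(effects)
```

The columns of effects can then play the role of X, Y, Z and XYZ in whichever distance is chosen.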
