Characterization of simultaneous transcription factor overexpression
@sebastianocurreli-9937

Dear community,

I’m pretty new to computational biology and to transcriptomics. I’m currently trying to characterize how the overexpression of three transcription factors (x, y, z) acts on the transcriptome of a cell line.

The design involves 5 groups of samples (x, y, z, xyz, Ctrl), with 5 replicates per group. Genes with zero counts have been filtered out.

As far as I can tell, each of the TFs induces or represses some genes, as shown by differential gene expression analysis with DESeq2 (using the Ctrl condition as reference). I’ve run a PCA using FactoMineR on all the expressed genes, with counts transformed by the regularized log (rlog) as suggested in the DESeq2 workflow. Plotting the samples on the first three PCs, I noticed something pretty curious.

Basically, it appears that each TF (x, y, z) is “pulling” the Ctrl state along its own “dragging” path, while the path for the simultaneous overexpression (xyz) seems to be the sum of the three individual actions (x+y+z). Can any of you suggest a method to test this hypothesis?

Thanks in advance for any suggestion.

Sebastiano

deseq2 factominer PCA • 649 views
@ryan-c-thompson-5618
Scripps Research, La Jolla, CA

Unfortunately, it's hard to tell what's going on in a 2D screenshot of a 3D PCA plot. But do you perhaps mean that the xyz group seems to be the average of the x, y, and z groups, rather than the sum? In any case, it's not exactly clear what hypothesis you want to test. Do you want to identify genes for which the xyz effect is significantly different from the sum/average of the individual x, y, and z effects? If so, then you can test the contrast xyz - (x+y+z)/3 (if you want the sum rather than the average, simply omit the division by 3). Do you just want to know whether the combined xyz effect is correlated with the sum/average of the individual effects? Then you can fit a limma-voom model to the counts and use genas. If neither of these is what you are looking for, can you be more specific about your hypothesis?
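Both suggestions can be sketched in limma-voom roughly as follows. This is only an illustration under stated assumptions, not the poster's analysis: the counts are simulated (with the xyz effect additive by construction), and the names `lfc`, `base_mean`, `combined`, `summed`, and `nonadditive` are invented for the example. Effects are written relative to ctrl, so the sum version is a proper contrast.

```r
library(edgeR)  # also attaches limma

set.seed(1)
n_genes <- 500
groups <- c("ctrl", "x", "y", "z", "xyz")
condition <- factor(rep(groups, each = 5), levels = groups)

# Simulated log2 effects vs ctrl (assumption: xyz is exactly additive)
lfc <- cbind(ctrl = 0, x = rnorm(n_genes), y = rnorm(n_genes), z = rnorm(n_genes))
lfc <- cbind(lfc, xyz = rowSums(lfc))
base_mean <- 2^runif(n_genes, 5, 10)
mu <- base_mean * 2^lfc[, as.character(condition)]  # genes x samples
counts <- matrix(rnbinom(length(mu), mu = mu, size = 10), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("s", seq_along(condition))))

# Cell-means design: one coefficient per group
design <- model.matrix(~ 0 + condition)
colnames(design) <- levels(condition)

dge <- calcNormFactors(DGEList(counts))
v <- voom(dge, design)
fit <- lmFit(v, design)

cm <- makeContrasts(
  combined    = xyz - ctrl,
  summed      = (x - ctrl) + (y - ctrl) + (z - ctrl),
  nonadditive = (xyz - ctrl) - ((x - ctrl) + (y - ctrl) + (z - ctrl)),
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cm))

# Genes whose combined effect departs from the sum of individual effects
top_nonadditive <- topTable(fit2, coef = "nonadditive")

# Correlation between the combined effect and the summed individual effects
gen_cor <- genas(fit2, coef = c(1, 2))
```

For the average rather than the sum, the second contrast would be divided by 3, which is equivalent to xyz - (x+y+z)/3 on the group means.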

0
Entering edit mode

Dear Ryan,

Thank you for your reply. You are right: I was not precise in describing my hypothesis, and I apologize for that. Your suggestions hit the mark exactly.

I'm interested in both: first, identifying genes that differ significantly between xyz expression and the average of the individual conditions ((x+y+z)/3), and second, testing how the combined overexpression of x, y, and z relates to their individual effects.

Following your suggestion, to test whether the average of the individual overexpressions of x, y, and z ((x+y+z)/3) differs from the simultaneous delivery of xyz, I should use the contrast argument of the results function of DESeq2. Unfortunately, I don't know how to do that. So far I've been using the contrast argument only for pairwise comparisons (see code below):

library(DESeq2)

# one condition label per sample file, in the same order as sampleFiles
condition <- c("xyz", "xyz", "xyz", "xyz", "xyz", "x", "x", "x", "x", "x",
               "ctrl", "ctrl", "ctrl", "ctrl", "ctrl", "y", "y", "y", "y", "y",
               "z", "z", "z", "z", "z")

sampleFiles <- grep("Sample", list.files(directory), value = TRUE)
sampleTable <- data.frame(sampleName = sampleFiles,
                          fileName = sampleFiles,
                          condition = condition)

ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,
                                       directory = directory, design = ~ condition)
ddsHTSeqfilt <- ddsHTSeq[rowSums(counts(ddsHTSeq)) > 1, ] # prefilter: keep genes with more than 1 read in total
ddsHTSeqfilt$condition <- relevel(ddsHTSeqfilt$condition, ref = "ctrl") # set ctrl as the reference level

dds <- DESeq(ddsHTSeqfilt) # run the differential expression analysis
rld <- rlog(dds, blind = FALSE) # perform the regularized log transform

DExyz <- results(dds, contrast = c("condition", "xyz", "ctrl"), alpha = 0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = TRUE) # example 1 of pairwise comparison

DEx <- results(dds, contrast = c("condition", "x", "ctrl"), alpha = 0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = TRUE) # example 2 of pairwise comparison

Can you please post an example of code?

Regarding limma-voom, I'll study it and then write back.

Thanks a lot.

Sebastiano

1
Entering edit mode

I'm more familiar with edgeR and limma, which use a different method of specifying contrasts that lets you simply specify the arithmetic expressions as I have written them above. I think with DESeq2 you need to construct a numeric vector with -1/3 for x, y, and z and +1 for xyz (and 0 for control, since it's not involved in the contrast). See the DESeq2 help page for results.
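A minimal sketch of that numeric-vector form is shown below. It makes two assumptions: the counts are simulated rather than taken from the poster's data, and a recent DESeq2 version is used, in which the coefficients of a ~ condition design with ctrl as reference are named like "condition_x_vs_ctrl". The helper vector `w` is introduced only for this example; the weights must follow the order of resultsNames(dds).

```r
library(DESeq2)

set.seed(1)
n_genes <- 500
groups <- c("ctrl", "x", "y", "z", "xyz")
condition <- factor(rep(groups, each = 5), levels = groups)  # ctrl is the reference

# Simulated counts with gene-specific means (hypothetical data)
gene_mean <- 2^runif(n_genes, 5, 10)
counts <- matrix(rnbinom(n_genes * 25, mu = gene_mean, size = 10), nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("s", seq_len(25))))
coldata <- data.frame(condition = condition, row.names = colnames(counts))

dds <- DESeqDataSetFromMatrix(counts, colData = coldata, design = ~ condition)
dds <- DESeq(dds)

# One weight per coefficient, in the order given by resultsNames(dds)
coefs <- resultsNames(dds)
w <- setNames(numeric(length(coefs)), coefs)   # 0 everywhere, including Intercept
w["condition_xyz_vs_ctrl"] <- 1
w[c("condition_x_vs_ctrl", "condition_y_vs_ctrl", "condition_z_vs_ctrl")] <- -1/3

# Tests xyz - (x + y + z)/3 on the log2 scale
res <- results(dds, contrast = unname(w))
```

Replacing -1/3 with -1 would give the sum version, xyz - (x+y+z), in terms of effects relative to ctrl.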

1
Entering edit mode

In DESeq2 you can specify a list where the first character vector is the numerator terms and the second character vector is the denominator terms. Then you specify listValues, e.g. c(1, -1/3)

0
Entering edit mode

Thank you Michael,

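# coefficient names must match resultsNames(dds); recent DESeq2 versions
# name them like "condition_xyz_vs_ctrl" instead of "conditionxyz"
numerator <- c("conditionxyz")
denominator <- c("conditionx", "conditiony", "conditionz")

LC <- list(numerator, denominator)

DEcomposite <- results(dds, contrast = LC, listValues = c(1, -1/3), alpha = 0.05, lfcThreshold = 1, altHypothesis = "greaterAbs", independentFiltering = TRUE)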

I think this solves my linear combination problem. If I understand correctly, it tests for genes that are significantly DE between the simultaneous expression of xyz and the "average" of the individual deliveries, i.e. the linear combination xyz - (x+y+z)/3.


1
Entering edit mode

Yes, that's correct.
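To spell out what this contrast estimates, here is a short worked equation. It assumes each DESeq2 coefficient \beta_g is the log2 fold change of group g relative to ctrl; the symbols \mu_g (normalized mean of a gene in group g) are notation introduced here, not taken from the thread.

```latex
\[
\begin{aligned}
\mathrm{LFC}_{\text{avg}}
  &= 1 \cdot \beta_{xyz} - \tfrac{1}{3}\,(\beta_x + \beta_y + \beta_z) \\
  &= \log_2\frac{\mu_{xyz}}{\mu_{ctrl}}
     - \tfrac{1}{3}\left(\log_2\frac{\mu_x}{\mu_{ctrl}}
     + \log_2\frac{\mu_y}{\mu_{ctrl}}
     + \log_2\frac{\mu_z}{\mu_{ctrl}}\right) \\
  &= \log_2\frac{\mu_{xyz}}{(\mu_x\,\mu_y\,\mu_z)^{1/3}}
\end{aligned}
\]
\[
\mathrm{LFC}_{\text{sum}}
  = \beta_{xyz} - (\beta_x + \beta_y + \beta_z)
  = \log_2\frac{\mu_{xyz}\,\mu_{ctrl}^{2}}{\mu_x\,\mu_y\,\mu_z}
\]
```

So the average version compares xyz with the geometric mean of the three single-TF conditions and the control cancels out, whereas the sum version (listValues = c(1, -1)) still depends on the control level. With lfcThreshold = 1 and altHypothesis = "greaterAbs", the reported genes are those for which the absolute value of this quantity is significantly greater than 1.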