Interaction categorical/continuous variable DESeq2
2
1
Entering edit mode
Hugo Varet ▴ 80
@hugo-varet-6301
Last seen 4.9 years ago
France
Dear list, dear Mike Love, I am using DESeq2 to model counts from an unusual type of experiment and I have a question about the strategy I employed. The experiment consisted in sequencing 33 samples for which we have the following information: - group (16 samples from group A and 17 from group B) - a continuous variable X almost uniform (variable of interest) I have to add the group to the design formula because I know it has a strong effect on the counts. Then, as my goal is to detect genes which vary with the continous variable X in the same way within both groups A and B, I want to exclude genes for which there is an interaction between group and X. The design is thus ~ group + X + group:X and I used the following lines to test the interaction: dds <- DESeqDataSetFromMatrix(countData=counts, colData=target, design = ~ group + X + group:X) dds <- estimateSizeFactors(dds) dds <- estimateDispersions(dds) dds <- nbinomWaldTest(dds) res <- results(dds, name="groupB.X") sum(res$padj<=0.05, na.rm=TRUE) hist(res$padj) As I found no significant interaction (the minimum adjusted p-value is about 0.6), I decided to remove the interaction term from the design and to use ~ group + X. I can then test for the coefficients of X. If I do not detect any significant interaction, I think it is due to a lack of power. So, can I use the additive model ~ group + X even if it will not be correct for genes which actually have an interaction? Many thanks in advance, Hugo PS: I am using R 3.1.1 and DESeq2 1.4.5 _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
Sequencing DESeq2 • 2.6k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 6 hours ago
United States
hi Hugo, On Mon, Sep 15, 2014 at 9:41 AM, Hugo Varet <hugo.varet@pasteur.fr> wrote: > Dear list, dear Mike Love, > > I am using DESeq2 to model counts from an unusual type of experiment and I > have a question about the strategy I employed. The experiment consisted in > sequencing 33 samples for which we have the following information: > - group (16 samples from group A and 17 from group B) > - a continuous variable X almost uniform (variable of interest) > > I have to add the group to the design formula because I know it has a > strong effect on the counts. Then, as my goal is to detect genes which vary > with the continous variable X in the same way within both groups A and B, I > want to exclude genes for which there is an interaction between group and > X. The design is thus ~ group + X + group:X and I used the following lines > to test the interaction: > > dds <- DESeqDataSetFromMatrix(countData=counts, colData=target, design = > ~ group + X + group:X) > dds <- estimateSizeFactors(dds) > dds <- estimateDispersions(dds) > dds <- nbinomWaldTest(dds) > res <- results(dds, name="groupB.X") > sum(res$padj<=0.05, na.rm=TRUE) > hist(res$padj) > > As I found no significant interaction (the minimum adjusted p-value is > about 0.6), I decided to remove the interaction term from the design and to > use ~ group + X. I can then test for the coefficients of X. > > If I do not detect any significant interaction, I think it is due to a > lack of power. So, can I use the additive model ~ group + X even if it will > not be correct for genes which actually have an interaction? > ​I'd recommend examining the size of the estimated interaction terms in the model​ including the interaction: hist(res$log2FoldChange) if these are all small, then you might prefer the simpler model, ~ group + X. ​Just a reminder about using a continuous covariate with log link GLM: you are looking for genes which closely follow the rule counts = 2^(a + bX). Such a gene would follow the rule: if going from X=0 to X=0.5 produces a fold change of 1.5, going from X=0.5 to X=1 also produces a fold change of 1.5, and X=0 to X=1 produces a fold change of 1.5^2. ​Mike​ > > Many thanks in advance, > > Hugo > > PS: I am using R 3.1.1 and DESeq2 1.4.5 > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane. > science.biology.informatics.conductor > [[alternative HTML version deleted]] _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT
0
Entering edit mode
@wolfgang-huber-3550
Last seen 11 days ago
EMBL European Molecular Biology Laborat…
Cher Hugo sorry if I missed something, but why not fit the model with interactions and test for the coefficient of the ‘X' main effect? (see arguments ‘name’, ‘contrast’ of the ‘results’ function). What you propose below seems not wrong, but perhaps unnecessarily complicated. Best wishes Wolfgang Il giorno 15 Sep 2014, alle ore 15:41, Hugo Varet <hugo.varet@pasteur.fr> ha scritto: > Dear list, dear Mike Love, > > I am using DESeq2 to model counts from an unusual type of experiment and I have a question about the strategy I employed. The experiment consisted in sequencing 33 samples for which we have the following information: > - group (16 samples from group A and 17 from group B) > - a continuous variable X almost uniform (variable of interest) > > I have to add the group to the design formula because I know it has a strong effect on the counts. Then, as my goal is to detect genes which vary with the continous variable X in the same way within both groups A and B, I want to exclude genes for which there is an interaction between group and X. The design is thus ~ group + X + group:X and I used the following lines to test the interaction: > > dds <- DESeqDataSetFromMatrix(countData=counts, colData=target, design = ~ group + X + group:X) > dds <- estimateSizeFactors(dds) > dds <- estimateDispersions(dds) > dds <- nbinomWaldTest(dds) > res <- results(dds, name="groupB.X") > sum(res$padj<=0.05, na.rm=TRUE) > hist(res$padj) > > As I found no significant interaction (the minimum adjusted p-value is about 0.6), I decided to remove the interaction term from the design and to use ~ group + X. I can then test for the coefficients of X. > > If I do not detect any significant interaction, I think it is due to a lack of power. So, can I use the additive model ~ group + X even if it will not be correct for genes which actually have an interaction? > > Many thanks in advance, > > Hugo > > PS: I am using R 3.1.1 and DESeq2 1.4.5 > > _______________________________________________ > Bioconductor mailing list > Bioconductor@r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor _______________________________________________ Bioconductor mailing list Bioconductor@r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
ADD COMMENT

Login before adding your answer.

Traffic: 279 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6