Problems making contrasts
@ingrid-h-g-stensen-1971
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20080221/287f6262/attachment.pl
@james-w-macdonald-5106
Hi Ingrid,

I haven't used makeContrasts() for a while now, so I'm not sure I can help with that. However, it isn't difficult to construct your contrast matrix by hand.

    # 'design' is your design matrix (designMa in your code)
    nam <- colnames(design)
    contrast <- matrix(c(1, -1, 0, 0,
                         0, 0, 1, -1,
                         0.5, -0.5, 0.5, -0.5),
                       ncol = 3,
                       dimnames = list(nam,
                                       c(paste(nam[c(1,3)], nam[c(2,4)], sep = "-"),
                                         "Stimulated-Unstimulated")))

You might get the same result by dividing by two in your call to makeContrasts() rather than four.

Best,

Jim

Ingrid H. G. Østensen wrote:
> Hi
>
> I have some problems making my contrast matrix.
>
> I have the following design matrix:
>
>      P_s  P_us  D_s  D_us
> S1     1     0    0     0
> S2     1     0    0     0
> S3     0     1    0     0
> S4     0     1    0     0
> S5     0     0    1     0
> S6     0     0    1     0
> S7     0     0    0     1
> S8     0     0    0     1
>
> where P = patient and D = donor, s = stimulated and us = unstimulated.
>
> What I want is to find the following differences: the difference between stimulated and unstimulated in the patient group, and the difference between stimulated and unstimulated in the donor group. I am able to make these first two contrasts.
>
> But I also want to see the difference between the two treatments independent of sample group: stimulated vs unstimulated. In other words: (P_s and D_s) vs (P_us and D_us). Is my last contrast correct, or should I do something else?
>
>     contrast.matrix <-
>       makeContrasts(P_s-P_us, D_s-D_us, (P_s-P_us + D_s-D_us)/4, levels = designMa)
>
> Regards,
> Ingrid

--
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623
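For anyone who wants to try the hand-built contrast matrix from the reply above, here is a minimal, self-contained sketch in base R. It rebuilds the 8-sample design matrix from the question (the object names `groups`, `group`, and `avg_of_first_two` are only for illustration) and checks that the third contrast is the average of the first two. The final block is an optional cross-check against makeContrasts(); it assumes limma is installed and is skipped otherwise.

```r
# Design matrix from the question: 8 samples, 2 per group, one column per group.
groups <- c("P_s", "P_us", "D_s", "D_us")
group  <- factor(rep(groups, each = 2), levels = groups)
design <- model.matrix(~ 0 + group)
dimnames(design) <- list(paste0("S", 1:8), groups)

# Contrast matrix built by hand, one column per contrast.
nam <- colnames(design)
contrast <- matrix(c(1, -1, 0, 0,            # P_s - P_us
                     0, 0, 1, -1,            # D_s - D_us
                     0.5, -0.5, 0.5, -0.5),  # average stimulated effect
                   ncol = 3,
                   dimnames = list(nam,
                                   c(paste(nam[c(1, 3)], nam[c(2, 4)], sep = "-"),
                                     "Stimulated-Unstimulated")))
print(contrast)

# The third column is the average of the two within-group contrasts.
avg_of_first_two <- (contrast[, 1] + contrast[, 2]) / 2

# Optional cross-check against makeContrasts(), only if limma is available.
if (requireNamespace("limma", quietly = TRUE)) {
  cm <- limma::makeContrasts(P_s - P_us, D_s - D_us,
                             (P_s - P_us + D_s - D_us) / 2,
                             levels = design)
  print(all.equal(unname(cm), unname(contrast)))
}
```

Each column sums to zero, as a contrast between group means should, and the 0.5 weights in the last column correspond to dividing by two rather than four.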
@ingrid-h-g-stensen-1971
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: https://stat.ethz.ch/pipermail/bioconductor/attachments/20080222/9b36f598/attachment.pl
Hi Ingrid,

It would help here if you showed the contrast matrices you have produced. You should be getting different numbers of differentially expressed probesets if the contrast matrices are different.

As for the differences you see with the different design matrices, this is because you are testing two different hypotheses. In the first case you are testing whether the average expression of the stimulated samples is different from the unstimulated, and the yardstick you are using to determine if there is a difference is based on the variability within each of the four groups. With the second design matrix you are again testing whether the average expression of the stimulated samples is different from the unstimulated, but this time the yardstick is based on the variability within the stimulated and unstimulated samples, where you are pooling the patients and donors.

So in the first case you are saying that you have four groups, but want to see if the average expression for two of the groups is different from the other two. In the second case you are saying you only have two groups and you want to know if the expression is different between them.

Although the first approach is statistically valid and a common thing to do, it does suffer from the fact that the mean is not robust to outliers. For instance, let's say the group averages for a particular probeset are like this:

     H_s  HC_s  T_s  TC_s
    10.1   4.3  4.1   3.7

Suppose also that the SSE from your model is relatively small (this value being based on the 'average' variability of the samples within each of the four groups, so a small SSE indicates that the replicates within each group are very similar). In this case you might get a significant t-statistic, because the numerator of your statistic will be (10.1 - 4.3 + 4.1 - 3.7)/2 = 3.1, and if the SSE is sufficiently small you will get a large t-stat.
However, if you pool the H_s and T_s samples (and the HC_s and TC_s samples), the variability within the pooled stimulated group will be really high (because you have three values around 10 and three around 4). Because of this, the denominator of the t-stat will be much larger and you will likely no longer achieve significance.

So it depends on what you are looking for. The average expression of the stimulated and unstimulated groups is certainly different, but in this case the difference is driven solely by the H_s group. This may well be why you get far fewer probesets in the second model than in the first.

Instead of fitting either of these models, you might consider using decideTests() with method="nestedF". This will help you to capture those probesets that are significant in both H_s vs HC_s and T_s vs TC_s, but may not have similar expression levels.

You might also be interested in the interaction, which would pick up the case that I outlined above, where one sample type is affected differently from the other when subjected to treatment.

Best,

Jim

Ingrid H. G. Østensen wrote:
> Hi
>
> Now I have tried my formula (dividing by 2, by 4, and not dividing at all) and what James suggested, and I have also made a new design matrix.
>
> When I divided by 2, by 4, or not at all, or used James' suggestion, I got the same results:
>
> > designMa
>     H_s  HC_s  T_s  TC_s
> H     1     0    0     0
> H     1     0    0     0
> H     1     0    0     0
> HC    0     1    0     0
> HC    0     1    0     0
> HC    0     1    0     0
> T     0     0    1     0
> T     0     0    1     0
> T     0     0    1     0
> TC    0     0    0     1
> TC    0     0    0     1
> TC    0     0    0     1
>
> > oppsum
>     H_s - HC_s  T_s - TC_s  (H_s - HC_s + T_s - TC_s)
> -1         733         874                       1077
> 0        47292       47065                      46631
> 1          676         762                        993
>
> > oppsum
>     H_s - HC_s  T_s - TC_s  (H_s - HC_s + T_s - TC_s)/2
> -1         733         874                         1077
> 0        47292       47065                        46631
> 1          676         762                          993
>
> > oppsum1
>     H_s - HC_s  T_s - TC_s  (H_s - HC_s + T_s - TC_s)/4
> -1         733         874                         1077
> 0        47292       47065                        46631
> 1          676         762                          993
>
> But when I made a new matrix:
>
> > designMa
>     s  us
> H   1   0
> H   1   0
> H   1   0
> HC  0   1
> HC  0   1
> HC  0   1
> T   1   0
> T   1   0
> T   1   0
> TC  0   1
> TC  0   1
> TC  0   1
>
>     contrast.matrix <- makeContrasts(s-us, levels = designMa)
>
> I got a different answer:
>
> > oppsum2
>     s - us
> -1       8
> 0    48657
> 1       36
>
> My question now is: why do the results differ, and which is the right solution? And why divide by 2 or 4 (I read this in the limma User's Guide, section 8.7)?
>
> Regards,
> Ingrid
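The points in this reply can be checked on a toy probeset. The sketch below is base R only and makes several assumptions that are not in the thread: the twelve data values are invented (three replicates per group, placed at -0.1, 0 and +0.1 around the group means from Jim's example), the helper `contrast_t` is hypothetical, and it computes an ordinary t-statistic from lm() rather than limma's moderated t. It illustrates three things: rescaling a contrast (dividing by 2, by 4, or not at all) changes the estimate but not the t-statistic, since numerator and standard error scale together, which is consistent with the identical counts in the oppsum tables; pooling to two groups keeps the same numerator but inflates the denominator; and the interaction contrast picks up the case where only one sample type responds.

```r
# Invented toy data for one probeset: 3 replicates per group, centred on
# the group means from the example (10.1, 4.3, 4.1, 3.7).
groups <- c("H_s", "HC_s", "T_s", "TC_s")
means  <- c(10.1, 4.3, 4.1, 3.7)
group  <- factor(rep(groups, each = 3), levels = groups)
y      <- rep(means, each = 3) + rep(c(-0.1, 0, 0.1), times = 4)

# Hypothetical helper: ordinary (unmoderated) t-statistic for a contrast
# of the coefficients of an lm fit. limma would use a moderated t instead.
contrast_t <- function(fit, cvec) {
  est <- sum(cvec * coef(fit))
  se  <- sqrt(drop(t(cvec) %*% vcov(fit) %*% cvec))
  c(estimate = est, t = est / se)
}

# Model 1: four groups. The yardstick is the within-group variability.
fit4 <- lm(y ~ 0 + group)
base <- c(1, -1, 1, -1)                  # H_s - HC_s + T_s - TC_s
res_none    <- contrast_t(fit4, base)
res_half    <- contrast_t(fit4, base / 2)
res_quarter <- contrast_t(fit4, base / 4)

# Model 2: two pooled groups (s = H_s and T_s, us = HC_s and TC_s).
treat <- factor(ifelse(group %in% c("H_s", "T_s"), "s", "us"),
                levels = c("s", "us"))
fit2 <- lm(y ~ 0 + treat)
res_pooled <- contrast_t(fit2, c(1, -1))  # s - us

# Interaction: is the stimulation effect different in H than in T?
res_inter <- contrast_t(fit4, c(1, -1, -1, 1))

print(rbind(none = res_none, half = res_half, quarter = res_quarter,
            pooled = res_pooled, interaction = res_inter))
```

With these invented values the divide-by-two contrast and the pooled s - us contrast both have an estimate of 3.1, but the pooled model's t-statistic is far smaller because its residual variance includes the gap between H_s and T_s.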