Multiple comparisons: inquiries on level factors and wording used in resultsNames
Guest User ★ 13k
@guest-user-4897
Hi Mike,

Sorry to bother you about multiple comparisons again.

I'd like to inquire about certain wording/phrases used in resultsNames(dds) to better understand the types of comparisons made.

My questions are based on the output of resultsNames(dds4): please see the output. I am not entirely clear about the codes/wording used in the description. For example, the word 'versus' used in genotypeC_vs_GenotypeB. Is it implying a comparison of tempHigh versus tempLow between genotypeC versus genotype B for all time points?

Another confusing character is the period. What does genotypeC.TempHi mean? Is it implying a comparison of genotype C at tempHi over all time points? How should I interpret all these comparisons?

Also, when I called results(dds), you said it would compare the effect of tempHi versus tempLow over GenotypeB, over all time points. If I'd like to make multiple comparisons of tempHi versus tempLow for more than 2 genotypes simultaneously against each other, why is it still important to set a base level? If I set a base level, all the comparisons will be made against the base level, not against each other.

To illustrate, I am interested in comparing genotypeA, genotypeB, and genotypeC. If I set genotypeA as my base level, I will be comparing genotypeB and genotypeC against genotypeA. I won't be making any comparison of genotypeB to genotypeC.

Please explain more about the levels of factors. I am a little bit confused about them in terms of multiple comparisons.

To remind you again, here's my experimental design:
Genotypes: 4 different genotypes
Timepoint: 3 different timepoints (6h, 12h, and 24h)
Temperature: Low and high temperatures
3 biological replicates for each condition.

Thank you for your quick responses!

Regards,
Yoong

-- output of sessionInfo():

    dds3 = DESeqDataSetFromMatrix(countData = allData, colData = colData,
                                  design = ~ genotype + time + temp + genotype:temp + time:temp)

    dds4 = DESeq(dds3, betaPrior=FALSE)

    resultsNames(dds4)

    "Intercept" "genotypeC_vs_GenotypeB" "genotypeA_vs_GenotypeB"
    "genotype_GenotypeD_vs_GenotypeB" "time_24h_vs_12h" "time_6h_vs_12h"
    "temp_TempHi_vs_TempLow" "genotypeC.TempHi" "genotypeA.TempHi"
    "genotypeD.TempHi" "time24h.TempHi" "time6h.TempHi"

    results(dds4, name="temp_TempHi_vs_TempLow")

--
Sent via the guest posting facility at bioconductor.org.
@mikelove
hi Yoong,

I think you would benefit from speaking to a statistician at your local institution. These are not questions specific to our software, but questions more generally about the meaning of terms in linear models and generalized linear models. These can be difficult concepts and it is not so easy to answer these questions over email. I try to answer some questions inline below:

On Sun, Jan 26, 2014 at 9:06 PM, Yoong [guest] <guest@bioconductor.org> wrote:

> Hi Mike,
>
> Sorry to bother you about multiple comparisons again.
>
> I’d like to inquire about certain wording/phrases used in
> resultsNames(dds) to better understand the types of comparisons made.
>
> My questions will be based on the output from the resultsNames(dds4):
> please see the output. I am not entirely clear about the codes/wording used
> in the description. For example, the word ‘versus’ used in
> genotypeC_vs_GenotypeB.

The "_vs_baselevel" characters are added to the label by DESeq2 to remind users who are perhaps less familiar with linear modeling that the coefficient is specified relative to a base level. The model.matrix() function in R would just label this coefficient "genotypeC".

> Is it implying a comparison of tempHigh versus tempLow between genotypeC
> versus genotype B for all time points?

No. It is the effect of genotype C relative to genotype B, controlling for whatever other variables you have in your design formula.

> Another confusing word is period. What does it mean genotypeC.TempHi? Is
> it implying a comparison of genotype C at tempHi over all time points? How
> should I interpret all these comparisons?

The period marks an interaction term. model.matrix() joins the levels of an interaction term with a colon, and that colon shows up as a period in the labels you see here.
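To see where these labels come from, here is a minimal base R sketch that builds the design matrix for the same formula. The sample table, the level names ("A" to "D", "Low"/"Hi") and the variable names `samples` and `X` are assumptions made up for illustration, not the poster's actual colData. The `make.names()` step is only meant to show how a colon can turn into a period; it is not a claim about exactly how the labels are produced internally.

```r
# Hypothetical sample table: 4 genotypes x 3 time points x 2 temperatures
# x 3 replicates (level names are assumed, not taken from the real data).
samples <- expand.grid(
  genotype  = c("A", "B", "C", "D"),
  time      = c("6h", "12h", "24h"),
  temp      = c("Low", "Hi"),
  replicate = 1:3,
  stringsAsFactors = FALSE
)

# Set the base (reference) level of each factor explicitly.
samples$genotype <- relevel(factor(samples$genotype), ref = "B")
samples$time     <- relevel(factor(samples$time),     ref = "12h")
samples$temp     <- relevel(factor(samples$temp),     ref = "Low")

# Design matrix for the formula used in the question.
X <- model.matrix(~ genotype + time + temp + genotype:temp + time:temp,
                  data = samples)

# Main effects are named <factor><level>; interactions are joined with ":".
print(colnames(X))

# Making the names syntactically valid replaces ":" with ".".
print(make.names(colnames(X)[-1]))
```

Each non-intercept column corresponds to one non-base level (or one pair of non-base levels), which is why nothing named after genotype B, 12h or Low appears on its own.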
The genotypeC.TempHi example is the interaction term of genotype C and temp Hi. An interaction term is an additional term in the generalized linear model, which can be used to test whether the combined effect of genotype C and temp Hi is only multiplicative:

fold change of genotype C and temp Hi ?=? fold change of genotype C * fold change of temp Hi

> Also, when I called results(dds), you said it would compare effect of
> tempHi versus tempLow over GenotypeB, over all time points.

Sorry, looking at your previous email, I think I was confused about which model you were referring to. Calling results(dds) will compare the effect of temp Hi over Low, controlling for all other variables in your design formula, so the effect of tempHi over Low over *all* genotypes and *all* time points.

> If I’d like to make multiple comparisons of tempHi versus tempLow for more
> than 2 genotypes simultaneously against each other, why is it still
> important to set a base level? If I set a base level, all the comparisons
> will be made against the base level, not against each other.

No, the above point should clarify this.

> To illustrate, I am interested in comparing genotypeA, genotypeB, and
> genotypeC. If I set genotypeA as my base level, I will be comparing
> genotypeB and genotypeC against genotypeA. I won’t be making any comparison
> genotypeB to genotypeC.

To compare genotype B against genotype C, use the contrast argument to results:

results(dds, contrast=c("genotype","B","C"))

> Please explain more on the level of factors. I am a little bit confused
> with that in terms of multiple comparisons.

The conceptual basis of linear models, factors and interaction terms is a bit difficult to explain over email. Again, I think it would be best for you to find a statistician at your local institution who might be able to better explain these concepts in person.

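The two ideas above (a contrast between two non-base levels, and an interaction as a departure from a purely multiplicative effect) can be sketched with plain arithmetic on log2 fold changes. Everything in this block is hypothetical: the coefficient names assume genotype A and low temperature as base levels, as in the illustration from the question, and the numbers are invented rather than taken from any fit.

```r
# Hypothetical log2 fold changes, all relative to base levels "A" and "Low".
# The values are made up for illustration only.
beta <- c(
  genotypeB        = 1.0,
  genotypeC        = 3.0,
  tempHi           = 2.0,
  genotypeC.tempHi = 0.5
)

# B versus C is not a coefficient of its own, but both coefficients share
# the same base level, so the comparison is their difference:
# (B - A) - (C - A) = B - C.
lfc_B_vs_C <- beta[["genotypeB"]] - beta[["genotypeC"]]

# If genotype C and high temperature combined purely multiplicatively,
# the joint fold change would be the product of the two fold changes.
fc_if_multiplicative <- 2^beta[["genotypeC"]] * 2^beta[["tempHi"]]

# The interaction term scales the joint fold change by an extra factor.
fc_with_interaction <- 2^(beta[["genotypeC"]] + beta[["tempHi"]] +
                            beta[["genotypeC.tempHi"]])

print(lfc_B_vs_C)
print(fc_if_multiplicative)
print(fc_with_interaction / fc_if_multiplicative)
```

An interaction coefficient of zero would make the last ratio exactly 1, which is the "only multiplicative" case. The contrast argument takes the factor name, then the numerator level, then the denominator level, so c("genotype","B","C") asks for B relative to C, the same direction as `lfc_B_vs_C` here.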
Mike