Guest User
@guest-user-4897
The dataset has 1000 genes and 24 samples: two mouse strains (129 and
B6) crossed with six brain regions, with two replicates for each
strain-region combination.
The ANOVA was performed as follows:
sdata <- read.table("http://www.chibi.ubc.ca/wp-content/uploads/2013/02/sandberg-sampledata.txt",
                    header = TRUE, row.names = 1)
strain <- gl(2,12,24, label=c("129","bl6"))
region <- gl(6,2,24, label=c("ag", "cb", "cx", "ec", "hp", "mb"))
# define ANOVA function: two-way model with interaction, fitted per gene
# (strain * region expands to strain + region + strain:region)
aof <- function(x) {
  m <- data.frame(strain, region, x)
  anova(aov(x ~ strain * region, m))
}
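# apply the analysis to each gene (row) and get the p-values;
# rows 1 to 3 of pvalues are strain, region and strain:region
anovaresults <- apply(sdata, 1, aof)
# check.names = FALSE keeps the gene identifiers unchanged as column names
pvalues <- data.frame(lapply(anovaresults, function(x) {
  x["Pr(>F)"][1:3, ] }), check.names = FALSE)
# Get the genes with good region effect p-values (p < 0.0001) and no
# evidence of a strain:region interaction (p > 0.1).
# drop = FALSE keeps the gene name even if only one gene is selected.
reg.hi.p <- t(pvalues[2, pvalues[2, ] < 0.0001 & pvalues[3, ] > 0.1,
                      drop = FALSE])
reg.hi.pdata <- sdata[row.names(reg.hi.p), ]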
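To make the row indices used above concrete, here is a minimal sketch for a single gene. The expression values are simulated (they are not taken from the Sandberg data), and the names design, tab and pv are only illustrative. It checks that the design has two replicates per strain-region cell and shows which row of the ANOVA table holds which p-value.

```r
# Hypothetical single-gene example; the expression values are simulated,
# not taken from the Sandberg data set.
set.seed(1)
strain <- gl(2, 12, 24, labels = c("129", "bl6"))
region <- gl(6, 2, 24, labels = c("ag", "cb", "cx", "ec", "hp", "mb"))

# Each strain-by-region cell should hold exactly two replicates.
design <- table(strain, region)

# Simulated expression for one gene, shifted upwards in "cb" only.
x <- rnorm(24, mean = 8, sd = 0.2) + ifelse(region == "cb", 2, 0)

tab <- anova(aov(x ~ strain * region))

# Rows 1 to 3 are strain, region and strain:region, in that order;
# row 4 holds the residuals and has no p-value (NA).
pv <- setNames(tab[["Pr(>F)"]][1:3], rownames(tab)[1:3])

print(design)
print(pv)
```

This is why pvalues[2, ] is used as the region effect and pvalues[3, ] as the interaction in the filtering step.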
A significant p-value from an ANOVA F-test indicates that a gene's
mean expression differs in at least one of the groups analyzed.
Because more than two groups are being compared, however, the F-test
does not indicate which pairs of groups differ. I know that post hoc
tests can be applied in this situation to determine which pairs of
regions differ (irrespective of strain). I would like to know how to
apply Tukey's HSD in R to find out, for each of the genes with good
region effect p-values, in which brain regions ("ag", "cb", "cx" and
so on) the gene is differentially expressed.
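For reference, this is roughly what I have in mind for a single gene, as a sketch only. The data are simulated, the names fit, tuk and sig.pairs are illustrative, and the 0.05 cutoff and the additive model (dropping the interaction, since the selected genes show no evidence of one) are assumptions rather than a settled choice. It uses TukeyHSD() from the stats package on an aov fit.

```r
# Hypothetical single-gene example with simulated expression values.
set.seed(2)
strain <- gl(2, 12, 24, labels = c("129", "bl6"))
region <- gl(6, 2, 24, labels = c("ag", "cb", "cx", "ec", "hp", "mb"))
x <- rnorm(24, mean = 8, sd = 0.2) + ifelse(region == "cb", 2, 0)

# Additive model (assumption): region effect adjusted for strain,
# without the strain:region interaction.
fit <- aov(x ~ strain + region)

# Tukey's HSD for all pairwise region comparisons. The result is a
# matrix with one row per pair (for example "cb-ag") and the columns
# diff, lwr, upr and "p adj".
tuk <- TukeyHSD(fit, which = "region")$region

# Region pairs that differ at an adjusted p-value below 0.05.
sig.pairs <- rownames(tuk)[tuk[, "p adj"] < 0.05]
print(sig.pairs)
```

Presumably the same steps could be wrapped in a function and applied to every row of reg.hi.pdata, but I am not sure whether this is the right way to do it.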
-- output of sessionInfo():
R version 2.15.2 (2012-10-26)
Platform: i686-redhat-linux-gnu (32-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C
LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] GOstats_2.24.0 RSQLite_0.10.0 DBI_0.2-5
graph_1.36.2 Category_2.22.0 AnnotationDbi_1.20.5
affy_1.36.1
[8] Biobase_2.16.0 BiocGenerics_0.4.0 R.utils_1.23.2
R.oo_1.13.0 R.methodsS3_1.4.2
loaded via a namespace (and not attached):
[1] affyio_1.22.0 annotate_1.36.0 AnnotationForge_1.0.3
BiocInstaller_1.8.3 genefilter_1.40.0 GO.db_2.8.0
[7] GSEABase_1.18.0 IRanges_1.16.6 parallel_2.15.2
preprocessCore_1.18.0 RBGL_1.34.0 splines_2.15.2
[13] stats4_2.15.2 survival_2.36-14 tools_2.15.2
XML_3.9-4 xtable_1.6-0 zlibbioc_1.4.0
--
Sent via the guest posting facility at bioconductor.org.