**10**wrote:

I have Illumina HT-12 microarray data from a crossover study. There are samples at study start, after treatment period 1, and after treatment period 2. All subjects had samples at each time point. All subjects had treatment A and treatment B (half of the subjects had Aduring treatment period 1 followed by B during treatment period 2; half had B first followed by A). We are interested in the contrast of Treatment A relative to baseline (study enrollment timepoint, before any treatment) and Treatment B relative to baseline.

I noticed that the p-values reported by `topTable()`

differ for treatment A depending on whether or not the treatment B samples are present in the `ExpressionSet`

. For example:

```
# Function for fitting the model
make_fit <- function(data) {
trt_ <- factor(pData(data)$TreatmentCode)
ptid <- factor(pData(data)$Subject)
design <- model.matrix(~ 0 + trt_ + Subject)
fit <- lmFit(data, design)
contrasts <- makeContrasts(
A = trt_A - trt_Baseline,
levels = design)
fit <- eBayes(contrasts.fit(contrasts, fit = fit))
topTable(fit, coef = "A")
}
complete <- data # all of the data (baseline, treatment A and treatment B)
# just the baseline and treatment A samples
baseline_and_A <- data[ , pData(data)$TreatmentCode %in% c("Baseline", "A")]
# apply the make_fit function defined above to the completed and subsetted data
make_fit(complete)
make_fit(baseline_and_A)
```

The p-values are greater for treatment A when I fit the model on the `ExpressionSet`

containing only the baseline and treatment A samples (`baseline_and_A`

) than the p-values when I fit the model on the `ExpressionSet`

including baseline, treatment A, and treatment B samples (`complete`

). The fold changes are identical.

Why does the presence of the treatment B samples change the p-values for treatment A, given that the treatment B samples are not involved in the contrast? I am trying to determine which method is more appropriate to report.

Thank you in advance.

**52k**• written 7 months ago by smhughes •

**10**