design matrix and contrast for paired experiment using NanoStringDiff for nCounter data
c.kohler
Dear Bioconductors,

I have a NanoString nCounter dataset consisting of paired data (conditions A and B) with some additional measurements for samples in condition B.

My current setup consists of 5 patients and 2 conditions, where patients p2 and p4 each have two measurements in condition B:

df<-data.frame(pair=c("p1","p1","p2","p2","p2","p3","p3","p4","p4","p4","p5","p5"), condition=c("A","B","A","B","B","A","B","A","B","B","A","B"))

df

pair condition
p1         A
p1         B
p2         A
p2         B
p2         B
p3         A
p3         B
p4         A
p4         B
p4         B
p5         A
p5         B
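
To make the imbalance explicit, here is a small cross-tabulation of samples per patient and condition. It uses only base R and the data frame above; the name `tab` is just for illustration.

```r
# Sample annotation, copied from the data frame above
df <- data.frame(
  pair = c("p1","p1","p2","p2","p2","p3","p3","p4","p4","p4","p5","p5"),
  condition = c("A","B","A","B","B","A","B","A","B","B","A","B")
)

# Number of samples per patient (rows) and condition (columns)
tab <- table(df$pair, df$condition)
tab
```

Every patient has exactly one A sample, while p2 and p4 have two B samples each.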

I'd like to assess differential expression between conditions A and B by using the Bioconductor package <NanoStringDiff>.
My design matrix is defined as follows:

design<-model.matrix(~0+factor(condition)+factor(pair),data=df)

colnames(design)<-c("A","B","p2","p3","p4","p5")
design
A   B   p2   p3   p4   p5
1   0    0    0    0    0
0   1    0    0    0    0
1   0    1    0    0    0
0   1    1    0    0    0
0   1    1    0    0    0
1   0    0    1    0    0
0   1    0    1    0    0
1   0    0    0    1    0
0   1    0    0    1    0
0   1    0    0    1    0
1   0    0    0    0    1
0   1    0    0    0    1
attr(,"assign")
[1] 1 1 2 2 2 2
attr(,"contrasts")
attr(,"contrasts")$factor(condition)
[1] "contr.treatment"
attr(,"contrasts")$factor(pair)
[1] "contr.treatment"
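
As a sanity check, I would expect this design to have full column rank, so that all six coefficients are estimable. A minimal base R sketch of that check follows. The names `rank_design` and `resid_df` are made up for illustration, and the check says nothing about how NanoStringDiff handles the matrix internally.

```r
# Rebuild the design exactly as above
df <- data.frame(
  pair = c("p1","p1","p2","p2","p2","p3","p3","p4","p4","p4","p5","p5"),
  condition = c("A","B","A","B","B","A","B","A","B","B","A","B")
)
design <- model.matrix(~ 0 + factor(condition) + factor(pair), data = df)
colnames(design) <- c("A", "B", "p2", "p3", "p4", "p5")

# Column rank via QR decomposition; full rank means no redundant columns
rank_design <- qr(design)$rank
rank_design == ncol(design)

# Residual degrees of freedom left after fitting all coefficients
resid_df <- nrow(design) - rank_design
resid_df
```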

In addition, I defined the contrast to be

contrast<-c(1,-1,0,0,0,0) # intended to encode A - B, adjusted for pair

Are the definitions of the design matrix and the contrast correct?
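
To convince myself of what the contrast picks out, here is a small sketch on made-up numbers. It uses ordinary least squares via `lm()` purely to illustrate the parameterisation, not the negative binomial model that NanoStringDiff fits, and the response `y` is random noise rather than real data. If my understanding is right, the contrast c(1,-1,0,0,0,0) should equal minus the condition B coefficient in the more common `~ pair + condition` coding.

```r
set.seed(1)
df <- data.frame(
  pair = c("p1","p1","p2","p2","p2","p3","p3","p4","p4","p4","p5","p5"),
  condition = c("A","B","A","B","B","A","B","A","B","B","A","B")
)
y <- rnorm(nrow(df))  # made-up response, for illustration only

# Parameterisation used in this post: one column each for A and B, no intercept
fit1 <- lm(y ~ 0 + factor(condition) + factor(pair), data = df)
contrast <- c(1, -1, 0, 0, 0, 0)
a_minus_b <- sum(contrast * coef(fit1))

# Alternative coding: intercept + pair effects + a single B-vs-A coefficient
fit2 <- lm(y ~ factor(pair) + factor(condition), data = df)
b_minus_a <- coef(fit2)[["factor(condition)B"]]

# The two codings describe the same model, so these should agree up to sign
all.equal(a_minus_b, -b_minus_a)
```

If this carries over to the likelihood ratio test, a positive log fold change under my contrast would mean higher expression in A than in B.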

I have the following workflow in mind (following the <NanoStringDiff> vignette):

# [1] create NanoStringSet

# [2] estimate normalisation parameters
NanoStringData1=estNormalizationFactors(NanoStringData1)

# [3] run the generalized linear model likelihood ratio test
result=glm.LRT(NanoStringData1,design,contrast=contrast)

Does this workflow answer my question, namely which features differ significantly between conditions A and B, taking the paired nature of the data into account?

I suspect I have made a mistake somewhere, but I am not sure, so I would like to check with you whether this is correct.

As the lab will base follow-up experiments on my resulting gene list, I want to be sure that my definitions of the design matrix and the contrast are correct.

Thank you very much for any help / suggestions

Christian