Hello
I am new to the analysis of bisulfite sequencing data with edgeR. I read the user guide and wrote code to perform the analysis. With this post, I want to check if my design matrix and contrasts are the correct ones to assess the question of interest.
target
   sample subject condition
1 SampleA animal1    normal
2 SampleB animal1   treated
3 SampleC animal2    normal
4 SampleD animal2   treated
5 SampleE animal3    normal
6 SampleF animal3   treated
I want to discover which promoters are differentially methylated between treated and normal, while correcting for animal.
I generated the design matrix as described in the F1000 paper "Differential methylation analysis of reduced representation bisulfite sequencing experiments using edgeR". I first created the design matrix that I would use for an RNA-seq differential expression analysis and then expanded it with modelMatrixMeth.
d <- model.matrix(~ 0 + subject + condition, data = target)
d <- modelMatrixMeth(d)
d
   Sample1 Sample2 Sample3 Sample4 Sample5 Sample6 subjectanimal1 subjectanimal2 subjectanimal3 conditiontreated
1        1       0       0       0       0       0              1              0              0                0
2        1       0       0       0       0       0              0              0              0                0
3        0       1       0       0       0       0              1              0              0                1
4        0       1       0       0       0       0              0              0              0                0
5        0       0       1       0       0       0              0              1              0                0
6        0       0       1       0       0       0              0              0              0                0
7        0       0       0       1       0       0              0              1              0                1
8        0       0       0       1       0       0              0              0              0                0
9        0       0       0       0       1       0              0              0              1                0
10       0       0       0       0       1       0              0              0              0                0
11       0       0       0       0       0       1              0              0              1                1
12       0       0       0       0       0       1              0              0              0                0
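To check that I read this matrix correctly, here is a minimal base R sketch that rebuilds the same layout by hand. It is only my reconstruction of the pattern printed above, not edgeR's actual implementation of modelMatrixMeth, and the names sample_part, meth_part and expanded are made up for illustration. It assumes each library contributes two rows (methylated first, then unmethylated), with the subject and condition effects applying to the methylated rows only.

```r
# Sample sheet as in the post
target <- data.frame(
  sample    = paste0("Sample", LETTERS[1:6]),
  subject   = rep(paste0("animal", 1:3), each = 2),
  condition = factor(rep(c("normal", "treated"), times = 3),
                     levels = c("normal", "treated"))
)

# Ordinary (RNA-seq style) design: one row per library
d <- model.matrix(~ 0 + subject + condition, data = target)
n <- nrow(d)

# Sample indicators: both rows of a library share the same sample column
sample_part <- diag(n)[rep(seq_len(n), each = 2), ]
colnames(sample_part) <- paste0("Sample", seq_len(n))

# Subject/condition effects: duplicate each row, then zero the
# unmethylated rows (assumed here to be the even-numbered ones)
meth_part <- d[rep(seq_len(n), each = 2), ]
meth_part[seq(2, 2 * n, by = 2), ] <- 0

expanded <- cbind(sample_part, meth_part)
rownames(expanded) <- seq_len(2 * n)
expanded
```

Read this way, conditiontreated is nonzero only in the methylated rows of the treated libraries (rows 3, 7 and 11), which is what I would expect for a coefficient describing the treated versus normal difference in methylation.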
Afterwards, I proceeded as described in the F1000 paper. I used glmLRT(fit) to test the relevant coefficient. I thought that "conditiontreated" was the coefficient that describes the difference between treated and normal while correcting for animal, but I am not sure whether this is correct.
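As a sanity check on which coefficient gets tested, here is a small base R sketch. It uses only the unexpanded design (not the modelMatrixMeth version) and does not fit any model. The variable names design and last_coef are illustrative. As far as I understand the documentation, glmLRT tests the last column of the design when coef is not given, so the sketch just confirms that the design has full column rank and that "conditiontreated" is the last column.

```r
# Sample sheet as in the post
target <- data.frame(
  subject   = rep(paste0("animal", 1:3), each = 2),
  condition = factor(rep(c("normal", "treated"), times = 3),
                     levels = c("normal", "treated"))
)

# Paired design: one baseline per animal plus a common treatment effect
design <- model.matrix(~ 0 + subject + condition, data = target)

# Full column rank means every coefficient, including the
# treatment effect, is estimable
qr(design)$rank == ncol(design)

# The last column is the one a default-coef test would target
last_coef <- tail(colnames(design), 1)
last_coef

# With a fitted edgeR object `fit` (not created in this sketch),
# naming the coefficient avoids relying on column order:
# lrt <- glmLRT(fit, coef = "conditiontreated")
```

Naming the coefficient explicitly seems safer than relying on the default, since the call then still tests the intended effect if the column order of the design changes.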
Many thanks in advance.