Hi All,
I am writing because I need help building the design matrix for the analysis of my RNA-seq experiment.
I have sequenced 57 samples. Each sample is described by 6 factors (Timepoint, Sex, Treatment, Family, sequencing Lane and Run of extraction).
I am struggling to find an effective way to write a design matrix for such a large number of factors. My first idea was to create a design matrix as follows:
f <- paste(data$TimePoint, data$Sex, data$Treatment, data$Sibgroup, data$Run, data$Lane, sep=".")
f <- as.factor(f)
# no intercept, so that each column corresponds to one level of f
design <- model.matrix(~0 + f)
colnames(design) <- levels(f)
I wanted to create it this way because it would have made the contrast matrix simple to build (e.g. comparing males vs females would only have required summing the factor combinations that include "male" and subtracting that from the sum of the combinations that include "female").
This doesn't work, because there are 57 observed combinations of the 6 factors, which means that no two samples share the same combination, and voom needs at least one combination to be replicated in order to run.
Can you suggest a way to write the design and build the contrast matrix (e.g. to compare males vs females)?
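To illustrate the problem, here is a minimal sketch on made-up toy data (6 hypothetical samples and 3 factors, not my real sample sheet). Every sample has a unique combination, so the pasted factor has as many levels as there are samples and nothing is left over to estimate variability:

```r
# Hypothetical toy data: 6 samples, each with a unique factor combination
toy <- data.frame(
  Sex       = c("male", "female", "male", "female", "male", "female"),
  Treatment = c("ctrl", "ctrl", "trt", "trt", "ctrl", "trt"),
  Lane      = c("L1", "L1", "L1", "L1", "L2", "L2")
)

# Same approach as above: one combined factor, one column per level
f <- factor(paste(toy$Sex, toy$Treatment, toy$Lane, sep = "."))
design <- model.matrix(~0 + f)
colnames(design) <- levels(f)

# Residual degrees of freedom = samples minus rank of the design
resid_df <- nrow(design) - qr(design)$rank
print(nlevels(f))
print(resid_df)
```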
Thank you in advance,
let me know if you need more information.
Best regards,
Oliver
You should be able to do something like this:
... and feed that into the contrast argument of glmLRT. The reported log-fold changes will be those of female over male.
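For readers following along, one common way to set this up is an additive design with one term per factor, rather than one group per combination. The block below is only a sketch under stated assumptions, not necessarily the code originally posted here: the toy sample sheet, its column values and the `counts` object in the commented-out edgeR calls are all hypothetical. With "male" as the reference level of Sex, the coefficient named Sexfemale represents female over male.

```r
# Hypothetical toy sample sheet (a balanced 2 x 2 x 3 x 2 layout);
# replace with the real sample table.
n <- 24
samples <- data.frame(
  Sex       = factor(rep(c("male", "female"), times = n / 2),
                     levels = c("male", "female")),  # male is the reference
  Treatment = factor(rep(c("ctrl", "trt"), each = 2, length.out = n)),
  TimePoint = factor(rep(c("t1", "t2", "t3"), each = 4, length.out = n)),
  Lane      = factor(rep(c("L1", "L2"), each = 12))
)

# Additive design: one term per factor, Sex adjusted for the others
design <- model.matrix(~ Treatment + TimePoint + Lane + Sex, data = samples)

# Contrast vector selecting only the Sexfemale coefficient (female - male)
contrast <- numeric(ncol(design))
names(contrast) <- colnames(design)
contrast["Sexfemale"] <- 1
print(contrast)

# With edgeR installed and a count matrix `counts` (hypothetical name),
# the vector could then be used along these lines:
# library(edgeR)
# y   <- DGEList(counts = counts)
# y   <- calcNormFactors(y)
# y   <- estimateDisp(y, design)
# fit <- glmFit(y, design)
# res <- glmLRT(fit, contrast = contrast)
```

Because the factors enter additively, samples no longer need to share an identical combination of all factors; the residual degrees of freedom come from having more samples than coefficients. Whether all six factors can be estimated in the real data depends on how they are confounded with each other, which this toy layout does not address.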