I have a general question on running limma-voom on an RNA-Seq study with repeated measures.
I have a repeated measures design, where 19 subjects were measured over multiple timepoints in two different conditions. The PCA/MDS plots indicate a strong impact of Subject, so initially I ran the study in edgeR including "subject" as a separate factor in the design matrix. However I am interesting in also applying the limma approach to allow the "subject" impact to be modeled as a random contributor; the impact of "subject" is not really of interest. I started following the standard commands, creating a DGEList object and removing low expressed genes. I kept genes with a rowSum cpm of at least 1 in 19 or more samples. (Increasing stringency to only keep genes with a rowSum cpm of at least 5 in 19 or more samples still results in the same negative counts problem noted below). Then I used calcNormFactors to normalize, set up my design matrix, and ran voom a first time. Next I estimated the correlation with duplicateCorrelation. My cor$consensus value is 0.3549613.
Following the example in 18.1.9 of the limma user guide (revision 14 Nov 2021), I then tried to run voom a second time in case the correlation could modify voom weights. However, this resulted in an error message that "negative counts are not allowed."
Digging into the issue, I found that my original filtered/normalized DGEList did NOT have negative counts; the minimum value of any single gene in any single sample was 0. However after running voom the first time, negative values were generated, so I am inhibited from running voom a second time since the dataset contains negative values.
I tried to find explanations in bioconductor, and I found one posting (Voom on TCGA data shifts count distributions towards negative values ) suggesting that it is OK to have negative values generated by a run of voom. That was a relief! However I am also seeing a posting (using duplicateCorrelation with limma+voom for RNA-seq data) suggesting that running voom twice is recommended.
What should I do if the first voom run creates negative values? Is it sufficient to run voom once? Is it better to stick to an edgeR approach and just encode subjects as a factor in the design matrix/model?
NB: I am running R version 4.2.1, edgeR version 3.38.4, limma version 3.52.2.
Thank you in advance for your help!
Thank you; I was not aware of this function.