I have data generated from mouse CSF samples. CSF volume was very low, in some cases too low for analysis. In those cases, two CSF samples from the same treatment group were pooled. Where the volume allowed, samples were run unpooled, so the dataset is a mix of pooled and single-source samples, and I know which is which. The data are not actually gene expression data: they come from a proteomics panel, with semi-quantitative protein expression values (log2 scaled). I'd like to use limma for this analysis, but I'm not sure how to deal with the pooling.
I'm struggling to think how to handle this. Obviously individual-level covariates can't be used for the pooled samples, which should be fine as it's a well-controlled study (all mice are the same strain, sex, and age, etc.). But is the between-sample variance going to reflect the true population variance, given that each pool averages two animals? Do I need to account for this somehow? Is there anything I can do, or should I just analyze the data as usual and highlight the pooling when explaining the results?
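
To make the variance concern concrete, here is a minimal simulation sketch in Python (not limma, and not my real data). Everything in it is an assumption for illustration: the function name `simulate_log2_variances`, the biological and technical standard deviations, the mean log2 level of 10, and the idea that a pool behaves like an equal-volume average of two mice on the raw scale before the log2 transform.

```python
import math
import random
import statistics


def simulate_log2_variances(n=20000, bio_sd=0.5, tech_sd=0.2, seed=1):
    """Compare the log2-scale variance of single-mouse and two-mouse pooled samples.

    All parameter values are made up for illustration only.
    """
    rng = random.Random(seed)

    def mouse_raw():
        # Raw (unlogged) abundance for one mouse; biological variation
        # is assumed to be normal on the log2 scale.
        return 2.0 ** rng.gauss(10.0, bio_sd)

    def measure(raw):
        # The assay reports log2 abundance plus technical noise.
        return math.log2(raw) + rng.gauss(0.0, tech_sd)

    # Single-source samples: one mouse per measurement.
    singles = [measure(mouse_raw()) for _ in range(n)]
    # Pooled samples: equal-volume mix of two mice, so raw abundances average.
    pools = [measure((mouse_raw() + mouse_raw()) / 2.0) for _ in range(n)]

    return statistics.variance(singles), statistics.variance(pools)


var_single, var_pool = simulate_log2_variances()
ratio = var_pool / var_single
print(f"single-sample variance: {var_single:.3f}")
print(f"pooled-sample variance: {var_pool:.3f}")
print(f"ratio (pooled / single): {ratio:.2f}")
```

Under these assumptions the pooled samples come out noticeably less variable than the single-source ones, because pooling averages away part of the biological variation but none of the technical noise. If my real data behave similarly, the two kinds of sample would have unequal variances within the same treatment group, which is what I'm unsure how to handle.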