I am after some advice! I have a set of proteomic data from healthy individuals; there is therefore no "treatment" or "status" outcome, and the data will be tested against several different continuous outcomes.
Although the data initially looked fine, a closer look showed that the levels of some proteins in some individuals were unnaturally low. An even closer look showed that this decrease was specific to the location of the lab and also to the year of extraction. However, these two are not independent of each other in the design, i.e. blood extraction in labs 1 and 2 took place mainly in the first two years, and in labs 3 and 4 mainly in the next two years. Linear regression analyses nevertheless indicate that the effects of lab and year are independent, i.e. even within the same lab there was a difference by year of extraction.
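To make the design concrete, here is a minimal sketch of the kind of check described above, run on simulated data rather than my real data. Everything in it is an assumption for illustration: the sample size, the 85% overlap between lab and year, the effect sizes, and the variable names. It cross-tabulates lab against year, confirms that the two are only partially confounded (the design matrix keeps full rank), and computes a partial F statistic for year after adjusting for lab.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: labs 1-2 sample mostly in years 1-2,
# labs 3-4 mostly in years 3-4, with some overlap.
n = 400
lab = rng.integers(1, 5, size=n)             # lab labels 1..4
early_lab = lab <= 2
in_main_years = rng.random(n) < 0.85         # assumed 85% overlap
year_early = rng.integers(1, 3, size=n)      # year 1 or 2
year_late = rng.integers(3, 5, size=n)       # year 3 or 4
year = np.where(early_lab == in_main_years, year_early, year_late)

# Lab-by-year contingency table (rows: labs 1..4, columns: years 1..4).
levels = (1, 2, 3, 4)
table = np.array([[np.sum((lab == i) & (year == j)) for j in levels]
                  for i in levels])

def dummies(x, keep=(2, 3, 4)):
    """Treatment-coded indicator columns, level 1 as reference."""
    return np.column_stack([(x == k).astype(float) for k in keep])

# Simulated protein level with assumed lab and year effects.
lab_eff = np.array([0.0, 0.0, -1.0, -1.0])
year_eff = np.array([0.0, -0.5, 0.0, -0.5])
y = 10 + lab_eff[lab - 1] + year_eff[year - 1] + rng.normal(0, 0.3, n)

# Full model: intercept + lab + year. Reduced model: intercept + lab.
X_full = np.column_stack([np.ones(n), dummies(lab), dummies(year)])
X_red = np.column_stack([np.ones(n), dummies(lab)])
rank_full = np.linalg.matrix_rank(X_full)    # 7 means lab and year are separable

def rss(X, y):
    """Residual sum of squares of the least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rss_full, rss_red = rss(X_full, y), rss(X_red, y)
df_extra = X_full.shape[1] - X_red.shape[1]  # 3 year dummies
f_year = ((rss_red - rss_full) / df_extra) / (rss_full / (n - X_full.shape[1]))

print(table)
print("design rank:", rank_full, "of", X_full.shape[1])
print("partial F for year given lab:", round(f_year, 1))
```

In a design like this, the off-diagonal cells of the table are what allow lab and year effects to be estimated separately; if they were empty, the two factors would be completely confounded.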
I am a bit confused about how to proceed. My problems/questions are:
a) I have no variable of interest, only covariates such as sex. Is this OK for ComBat?
b) Can I run ComBat with lab as the batch, including year and sex as covariates, and then run a second round of ComBat with year as the batch? Is it advisable to control for year in the first step?
c) When I tried (b), the results were slightly corrected, but a difference in protein levels still remains. Should I use sva as well?
Thanks so much in advance, it is really appreciated.