I have been working on a meta-analysis of methylation data and have hit a snag. Before the analysis, I wanted to normalize the data with the removeBatchEffect() function, adjusting for array type and cohort of origin. These two variables were highly collinear (canonical correlation of 0.74), and the call produced the "coefficients not estimable" message. I therefore decided to normalize by array type only and leave the cohort term out of the formula. When I clustered the normalized data, one cluster was dominated by samples from a single cohort, which suggests that cohort-related variance is still present in the data. Would it be reasonable to combine the array type and cohort variables into a single variable (i.e., cohort_array) and regress against that instead? Thank you for any advice.
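
To make the question concrete, here is a minimal NumPy sketch of the estimability problem and of the combined-variable idea. It is not limma code, and everything in it is an assumption for illustration: the toy layout (four hypothetical cohorts A to D, each run on a single array type), the simulated shifts, and the helper names `one_hot` and `design`. The real data may be less cleanly nested than this.

```python
import numpy as np

def one_hot(labels):
    """Indicator matrix with one column per distinct label (sorted order)."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels]
                     for lab in labels])

def design(*factors):
    """Intercept plus treatment-coded dummies (first level dropped) per factor."""
    n = len(factors[0])
    cols = [np.ones((n, 1))]
    for f in factors:
        cols.append(one_hot(f)[:, 1:])
    return np.hstack(cols)

# Hypothetical layout: every cohort was measured on exactly one array type.
array_type = ["450K"] * 4 + ["EPIC"] * 4
cohort = ["A", "A", "B", "B", "C", "C", "D", "D"]

# Array type and cohort as separate terms: the EPIC column equals the sum of
# the C and D columns, so the design is rank deficient.
X_both = design(array_type, cohort)
rank_both = np.linalg.matrix_rank(X_both)

# Single combined factor: one level per observed cohort/array combination.
cohort_array = [f"{c}_{a}" for c, a in zip(cohort, array_type)]
X_comb = design(cohort_array)
rank_comb = np.linalg.matrix_rank(X_comb)

# Simulated values for one probe with made-up array and cohort shifts.
rng = np.random.default_rng(0)
array_shift = {"450K": 0.0, "EPIC": 1.0}
cohort_shift = {"A": 0.0, "B": 0.5, "C": -0.3, "D": 0.8}
y = np.array([array_shift[a] + cohort_shift[c]
              for a, c in zip(array_type, cohort)])
y = y + rng.normal(0.0, 0.1, size=len(cohort))

# Regress out the combined factor and add the grand mean back.
beta, *_ = np.linalg.lstsq(X_comb, y, rcond=None)
y_adj = y - X_comb @ beta + y.mean()

cohort_arr = np.array(cohort)
cohort_means = {c: y_adj[cohort_arr == c].mean() for c in sorted(set(cohort))}

print("separate terms: rank", rank_both, "of", X_both.shape[1], "columns")
print("combined term:  rank", rank_comb, "of", X_comb.shape[1], "columns")
print("per-cohort means after adjustment:", cohort_means)
```

In this toy layout the combined factor has the same levels as cohort alone, so adjusting for it removes the array-type and cohort shifts together. It would also remove any biological difference that happens to be confounded with cohort, which is worth keeping in mind before applying the same idea to real data.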