ComBat: Error non-conformable arguments
Dear all,

I have a large metabolomics data set. About 6000 samples were run over a few months in 5 batches, or 61 batches (depending on definition).

At the moment for each sample I have the intensity for 21 peaks (metabolites).

> head(df$dat)
                                          peak1    peak2    peak3    peak4    peak5     peak6    peak7
PA14_EM_14-4_E-3_P1-E-3_01_11213.mzXML 5.440897 6.505249 5.251598 7.206793 7.467628 10.759801 9.294075
PA14_EM_1-2_D-4_P1-D-4_01_2002.mzXML   5.108385 6.556920 5.543050 7.652522 6.748898  9.606819 9.019394
PA14_EM_10-4_H-5_P1-H-5_01_8890.mzXML  5.591401 6.766761 5.381610 7.471881 8.100680 10.481429 9.689601
PA14_EM_2-3_A-12_P1-A-12_01_2500.mzXML 4.618323 6.485310 4.498478 7.309714 8.813708  9.658948 9.379349
PA14_EM_9-2_C-5_P1-C-5_01_6835.mzXML   5.836406 7.378964 6.446740 8.505912 7.779362 10.045803 9.704689
PA14_EM_2_B-6_P1-B-6_01_11723.mzXML    5.231878 6.639438 5.473027 7.712421 7.425328 10.343695 9.246132
                                          peak8    peak9    peak10   peak11   peak12   peak13   peak14
PA14_EM_14-4_E-3_P1-E-3_01_11213.mzXML 9.130252 9.879932 10.441853 8.277511 7.258236 8.837902 4.522068
PA14_EM_1-2_D-4_P1-D-4_01_2002.mzXML   9.058104 8.606485  9.272817 8.047970 6.825918 7.924373 4.738949
PA14_EM_10-4_H-5_P1-H-5_01_8890.mzXML  9.744228 9.416476  9.936105 8.577914 7.534848 8.511881 4.592875
PA14_EM_2-3_A-12_P1-A-12_01_2500.mzXML 9.455950 8.490708  8.859265 8.305719 7.257221 7.529841 4.244724
PA14_EM_9-2_C-5_P1-C-5_01_6835.mzXML   9.779392 9.128307  9.420374 8.487148 7.413307 7.872341 4.345545
PA14_EM_2_B-6_P1-B-6_01_11723.mzXML    9.358493 9.539392 10.017200 8.228972 7.089368 8.362185 4.132186
                                         peak15   peak16   peak17   peak18   peak19   peak20   peak21
PA14_EM_14-4_E-3_P1-E-3_01_11213.mzXML 7.503102 8.519748 7.118348 6.301519 4.083066 5.801221 9.971810
PA14_EM_1-2_D-4_P1-D-4_01_2002.mzXML   7.843904 8.123712 6.916606 6.550114 4.741928 6.003363 9.010882
PA14_EM_10-4_H-5_P1-H-5_01_8890.mzXML  7.618536 8.226453 6.932789 6.565171 4.615487 5.906193 8.728420
PA14_EM_2-3_A-12_P1-A-12_01_2500.mzXML 7.341069 8.136234 6.191456 6.195770 4.499564 5.737833 8.022057
PA14_EM_9-2_C-5_P1-C-5_01_6835.mzXML   7.287684 8.216923 6.457609 6.364861 4.857282 5.839109 8.712801
PA14_EM_2_B-6_P1-B-6_01_11723.mzXML    7.123842 8.229959 6.656468 6.620781 4.688067 5.586544 9.881702

Here is my command:
df_cmB<-ComBat(dat=as.matrix(df$dat),batch=df$MSBa,mod=NULL)

And here is the error message:

Error in solve(t(design) %*% design) %*% t(design) %*% t(as.matrix(dat)) : non-conformable arguments

I read a similar post on Stack Overflow that was solved by removing variables with near-zero variance, but I don't have any such variables.

Any help is appreciated.

@james-w-macdonald-5106

Most software intended for the analysis of high-dimensional data (microarray, RNA-Seq, metabolomics, etc.) expects the data to be in a format with samples in columns and observations in rows. Your data are just the opposite, so try

df_cmB<-ComBat(dat=t(as.matrix(df$dat)),batch=df$MSBa,mod=NULL)

where the only difference is that I wrapped your data matrix in t() to transpose it.

Thanks, I now have a new error. And it thinks I have 10 batches when I only have 5.

> df_cmB<-ComBat(dat=t(as.matrix(df$dat)),batch=df$MSBa,mod=NULL)
Found 10 batches
Found 0 categorical covariate(s)
Standardizing Data across genes
Fitting L/S model and finding priors
Error in apply(s.data[, i], 1, var, na.rm = T) : dim(X) must have a positive length

Is your df$MSBa a factor with 5 levels?
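One way to answer that question is to tabulate the batch vector and count its levels before calling ComBat. The sketch below uses toy stand-ins for df$dat and df$MSBa, and the stray labels in it (a lower-case variant and a trailing space) are only an assumed illustration of how extra levels can creep in; the thread does not say what the extra entries actually were.

```r
# Toy stand-ins for df$dat and df$MSBa (hypothetical values, not the poster's data).
set.seed(1)
dat <- data.frame(peak1 = rnorm(6), peak2 = rnorm(6), peak3 = rnorm(6))
batch <- c("MSB1", "MSB1", "MSB2", "msb2", "MSB3 ", "MSB3")  # case and trailing-space variants

# Samples are in rows here, so transpose: features in rows, samples in columns.
mat <- t(as.matrix(dat))
dim(mat)  # 3 features x 6 samples

# Count the levels the batch vector yields once it is coerced to a factor.
n_raw <- nlevels(factor(batch))
table(batch, useNA = "ifany")  # shows which labels are unexpected

# Normalising the labels collapses the accidental extra levels.
batch_clean <- factor(toupper(trimws(batch)))
n_clean <- nlevels(batch_clean)
```

Here n_raw comes out larger than the three intended batches, and n_clean matches them. On real data, the output of table() is usually enough to spot which labels do not belong.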
Okay, oops: I had a few extra entries in my df$MSBa. Now I'm getting a different error about non-conformable arguments:

> df_cmB<-ComBat(dat=t(as.matrix(df$dat)),batch=as.factor(df$MSBa),mod=NULL)
Found 5 batches
Found 0 categorical covariate(s)
Standardizing Data across genes
Error in ((dat - t(design %*% B.hat))^2) %*% rep(1/n.array, n.array) : non-conformable arguments

> str(df$dat)
'data.frame':    5864 obs. of  21 variables:
 $ peak1 : num  5.44 5.11 5.59 4.62 5.84 ...
 $ peak2 : num  6.51 6.56 6.77 6.49 7.38 ...
 $ peak3 : num  5.25 5.54 5.38 4.5 6.45 ...
 $ peak4 : num  7.21 7.65 7.47 7.31 8.51 ...
 $ peak5 : num  7.47 6.75 8.1 8.81 7.78 ...
 $ peak6 : num  10.76 9.61 10.48 9.66 10.05 ...
 $ peak7 : num  9.29 9.02 9.69 9.38 9.7 ...
 $ peak8 : num  9.13 9.06 9.74 9.46 9.78 ...
 $ peak9 : num  9.88 8.61 9.42 8.49 9.13 ...
 $ peak10: num  10.44 9.27 9.94 8.86 9.42 ...
 $ peak11: num  8.28 8.05 8.58 8.31 8.49 ...
 $ peak12: num  7.26 6.83 7.53 7.26 7.41 ...
 $ peak13: num  8.84 7.92 8.51 7.53 7.87 ...
 $ peak14: num  4.52 4.74 4.59 4.24 4.35 ...
 $ peak15: num  7.5 7.84 7.62 7.34 7.29 ...
 $ peak16: num  8.52 8.12 8.23 8.14 8.22 ...
 $ peak17: num  7.12 6.92 6.93 6.19 6.46 ...
 $ peak18: num  6.3 6.55 6.57 6.2 6.36 ...
 $ peak19: num  4.08 4.74 4.62 4.5 4.86 ...
 $ peak20: num  5.8 6 5.91 5.74 5.84 ...
 $ peak21: num  9.97 9.01 8.73 8.02 8.71 ...

> str(as.factor(df$MSBa))
Factor w/ 5 levels "MSB1","MSB2",..: 5 1 4 1 3 5 3 5 2 1 ...
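
The output above shows the shape of df$dat and the levels of df$MSBa, but not whether the two line up sample for sample. The sketch below runs a few sanity checks that may be worth doing before calling ComBat. It is not a diagnosis of the error in this thread. check_batch_alignment is a hypothetical helper written for this example, not part of sva, and the toy matrix and batch vectors are made up.

```r
# Hypothetical helper (not part of sva): reports ways in which a batch vector
# fails to line up with a features-by-samples matrix.
check_batch_alignment <- function(mat, batch) {
  problems <- character(0)
  if (length(batch) != ncol(mat)) {
    problems <- c(problems, sprintf(
      "batch has %d entries but the matrix has %d samples (columns)",
      length(batch), ncol(mat)))
  }
  if (anyNA(batch)) {
    problems <- c(problems, "batch contains NA values")
  }
  if (is.factor(batch) && any(table(batch) == 0)) {
    problems <- c(problems, "batch has unused factor levels (see droplevels())")
  }
  if (anyNA(mat)) {
    problems <- c(problems, "the data matrix contains NA values")
  }
  problems  # character(0) means none of these checks failed
}

# Toy data: 3 peaks (rows) by 4 samples (columns).
set.seed(1)
mat <- matrix(rnorm(12), nrow = 3,
              dimnames = list(paste0("peak", 1:3), paste0("s", 1:4)))
good <- factor(c("MSB1", "MSB1", "MSB2", "MSB2"))
bad  <- factor(c("MSB1", "MSB1", "MSB2", "MSB2", "MSB2"))  # one entry too many

ok_result  <- check_batch_alignment(mat, good)
bad_result <- check_batch_alignment(mat, bad)
print(bad_result)
```

With the real objects, the call would be check_batch_alignment(t(as.matrix(df$dat)), as.factor(df$MSBa)). An empty result only rules out these particular mismatches; it does not guarantee that ComBat will run.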