ComBat: Error non-conformable arguments
@alisonsarawaller-7103
Germany

Dear all,

I have a large metabolomics data set. About 6000 samples were run over a few months in 5 batches, or 61 batches (depending on definition).

At the moment for each sample I have the intensity for 21 peaks (metabolites).

    > head(df$dat)
                                          peak1    peak2    peak3    peak4    peak5     peak6    peak7
PA14_EM_14-4_E-3_P1-E-3_01_11213.mzXML 5.440897 6.505249 5.251598 7.206793 7.467628 10.759801 9.294075
PA14_EM_1-2_D-4_P1-D-4_01_2002.mzXML   5.108385 6.556920 5.543050 7.652522 6.748898  9.606819 9.019394
PA14_EM_10-4_H-5_P1-H-5_01_8890.mzXML  5.591401 6.766761 5.381610 7.471881 8.100680 10.481429 9.689601
PA14_EM_2-3_A-12_P1-A-12_01_2500.mzXML 4.618323 6.485310 4.498478 7.309714 8.813708  9.658948 9.379349
PA14_EM_9-2_C-5_P1-C-5_01_6835.mzXML   5.836406 7.378964 6.446740 8.505912 7.779362 10.045803 9.704689
PA14_EM_2_B-6_P1-B-6_01_11723.mzXML    5.231878 6.639438 5.473027 7.712421 7.425328 10.343695 9.246132
                                          peak8    peak9    peak10   peak11   peak12   peak13   peak14
PA14_EM_14-4_E-3_P1-E-3_01_11213.mzXML 9.130252 9.879932 10.441853 8.277511 7.258236 8.837902 4.522068
PA14_EM_1-2_D-4_P1-D-4_01_2002.mzXML   9.058104 8.606485  9.272817 8.047970 6.825918 7.924373 4.738949
PA14_EM_10-4_H-5_P1-H-5_01_8890.mzXML  9.744228 9.416476  9.936105 8.577914 7.534848 8.511881 4.592875
PA14_EM_2-3_A-12_P1-A-12_01_2500.mzXML 9.455950 8.490708  8.859265 8.305719 7.257221 7.529841 4.244724
PA14_EM_9-2_C-5_P1-C-5_01_6835.mzXML   9.779392 9.128307  9.420374 8.487148 7.413307 7.872341 4.345545
PA14_EM_2_B-6_P1-B-6_01_11723.mzXML    9.358493 9.539392 10.017200 8.228972 7.089368 8.362185 4.132186
                                         peak15   peak16   peak17   peak18   peak19   peak20   peak21
PA14_EM_14-4_E-3_P1-E-3_01_11213.mzXML 7.503102 8.519748 7.118348 6.301519 4.083066 5.801221 9.971810
PA14_EM_1-2_D-4_P1-D-4_01_2002.mzXML   7.843904 8.123712 6.916606 6.550114 4.741928 6.003363 9.010882
PA14_EM_10-4_H-5_P1-H-5_01_8890.mzXML  7.618536 8.226453 6.932789 6.565171 4.615487 5.906193 8.728420
PA14_EM_2-3_A-12_P1-A-12_01_2500.mzXML 7.341069 8.136234 6.191456 6.195770 4.499564 5.737833 8.022057
PA14_EM_9-2_C-5_P1-C-5_01_6835.mzXML   7.287684 8.216923 6.457609 6.364861 4.857282 5.839109 8.712801
PA14_EM_2_B-6_P1-B-6_01_11723.mzXML    7.123842 8.229959 6.656468 6.620781 4.688067 5.586544 9.881702

Here is my command:

df_cmB<-ComBat(dat=as.matrix(df$dat),batch=df$MSBa,mod=NULL)

And here is the error message:

Error in solve(t(design) %*% design) %*% t(design) %*% t(as.matrix(dat)) :
  non-conformable arguments

I read a similar post on Stack Overflow where the problem was solved by removing variables with near-zero variance, but I don't have any such variables.

 

Any help is appreciated

@james-w-macdonald-5106
United States

Most software intended for the analysis of high-dimensional data (microarray, RNA-Seq, metabolomics, etc.) expects the data to have samples in columns and the measured features (genes, peaks, etc.) in rows. Your data are the other way around, with samples in rows and peaks in columns, so try

df_cmB<-ComBat(dat=t(as.matrix(df$dat)),batch=df$MSBa,mod=NULL)

where the only difference is that I wrapped your data matrix in t() to transpose it.
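A quick way to confirm the orientation before calling ComBat is to compare the matrix dimensions with the length of the batch vector. The sketch below uses made-up toy data (`dat_toy` and `batch_toy` are hypothetical stand-ins for `df$dat` and `df$MSBa`) and base R only, so it does not call ComBat itself:

```r
# Toy stand-in for df$dat: 6 samples (rows) x 3 peaks (columns); values are invented
set.seed(1)
dat_toy <- data.frame(peak1 = rnorm(6), peak2 = rnorm(6), peak3 = rnorm(6))

# Hypothetical batch labels, one per sample (stand-in for df$MSBa)
batch_toy <- factor(rep(c("MSB1", "MSB2"), each = 3))

# Transpose so that peaks are in rows and samples are in columns
m <- t(as.matrix(dat_toy))
dim(m)  # 3 6: 3 peaks, 6 samples

# The number of columns should equal the number of batch labels
stopifnot(ncol(m) == length(batch_toy))
```

If the final check fails on the real data, the matrix is probably still in the wrong orientation or the batch vector does not line up with the samples.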


Thanks, I now have a new error.  And it thinks I have 10 batches when I only have 5.

> df_cmB<-ComBat(dat=t(as.matrix(df$dat)),batch=df$MSBa,mod=NULL)
Found 10 batches
Found 0  categorical covariate(s)
Standardizing Data across genes
Fitting L/S model and finding priors
Error in apply(s.data[, i], 1, var, na.rm = T) :
  dim(X) must have a positive length

Is your df$MSBa a factor with 5 levels?
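One way to check this is to tabulate the batch labels and compare their count with the number of samples. The helper below is only a sketch: `check_batch` is a hypothetical name, not part of any package, and the example labels are invented. It also counts batches that contain a single sample, since a one-sample batch is a plausible (not confirmed) cause of the `dim(X) must have a positive length` error.

```r
# Hypothetical helper: summarise a batch vector against the expected sample count
check_batch <- function(batch, n_samples) {
  f <- factor(batch)            # NA entries are not turned into a level
  counts <- table(f)            # samples per batch level
  list(
    length_ok   = length(batch) == n_samples,  # one label per sample?
    n_missing   = sum(is.na(batch)),           # unlabelled samples
    n_levels    = nlevels(f),                  # number of distinct batches
    n_singleton = sum(counts == 1),            # batches with only one sample
    counts      = counts
  )
}

# Invented example: a stray "MSB 2" label creates an unintended extra batch
res <- check_batch(c("MSB1", "MSB1", "MSB2", "MSB2", "MSB 2", NA), n_samples = 6)
res$n_levels     # 3, although only 2 batches were intended
res$n_singleton  # 1
res$n_missing    # 1
```

On the real data the call would be something like `check_batch(df$MSBa, nrow(df$dat))`, assuming `df$dat` still has samples in rows.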

Okay - oops I had a few extra entries in my  df$MSBa.

Now I'm getting a different error, again about non-conformable arguments:

> df_cmB<-ComBat(dat=t(as.matrix(df$dat)),batch=as.factor(df$MSBa),mod=NULL)
Found 5 batches
Found 0  categorical covariate(s)
Standardizing Data across genes
Error in ((dat - t(design %*% B.hat))^2) %*% rep(1/n.array, n.array) :
  non-conformable arguments

> str(df$dat)
'data.frame':    5864 obs. of  21 variables:
 $ peak1 : num  5.44 5.11 5.59 4.62 5.84 ...
 $ peak2 : num  6.51 6.56 6.77 6.49 7.38 ...
 $ peak3 : num  5.25 5.54 5.38 4.5 6.45 ...
 $ peak4 : num  7.21 7.65 7.47 7.31 8.51 ...
 $ peak5 : num  7.47 6.75 8.1 8.81 7.78 ...
 $ peak6 : num  10.76 9.61 10.48 9.66 10.05 ...
 $ peak7 : num  9.29 9.02 9.69 9.38 9.7 ...
 $ peak8 : num  9.13 9.06 9.74 9.46 9.78 ...
 $ peak9 : num  9.88 8.61 9.42 8.49 9.13 ...
 $ peak10: num  10.44 9.27 9.94 8.86 9.42 ...
 $ peak11: num  8.28 8.05 8.58 8.31 8.49 ...
 $ peak12: num  7.26 6.83 7.53 7.26 7.41 ...
 $ peak13: num  8.84 7.92 8.51 7.53 7.87 ...
 $ peak14: num  4.52 4.74 4.59 4.24 4.35 ...
 $ peak15: num  7.5 7.84 7.62 7.34 7.29 ...
 $ peak16: num  8.52 8.12 8.23 8.14 8.22 ...
 $ peak17: num  7.12 6.92 6.93 6.19 6.46 ...
 $ peak18: num  6.3 6.55 6.57 6.2 6.36 ...
 $ peak19: num  4.08 4.74 4.62 4.5 4.86 ...
 $ peak20: num  5.8 6 5.91 5.74 5.84 ...
 $ peak21: num  9.97 9.01 8.73 8.02 8.71 ...

> str(as.factor(df$MSBa))
 Factor w/ 5 levels "MSB1","MSB2",..: 5 1 4 1 3 5 3 5 2 1 ...
