Hi,
We have a spreadsheet containing all our microarray data, with genes in rows and samples in columns. We have 6 batches in total, each containing 5-8 samples (53 samples in total), that were run on 5 different microarrays. I also created a SIF.csv file containing 3 columns: Array, Sample, Batch (see head and tail below). Array goes from 1 to 6, Batch goes from 1 to 6, and Sample contains the name of each sample (same as in the spreadsheet). When I run ComBat, it constantly generates errors. The tutorial and manual are written for one specific type of data and didn't help me understand how to apply ComBat to more general data. What am I doing wrong with the script below?
    > head(sif)
       Array  Sample  Batch
    1 Array1 Sample1 Batch1
    2 Array1 Sample2 Batch1
    3 Array1 Sample3 Batch1
    4 Array1 Sample4 Batch1
    5 Array1 Sample5 Batch1
    6 Array1 Sample6 Batch1
    > tail(sif)
        Array   Sample  Batch
    48 Array6 Sample48 Batch6
    49 Array6 Sample49 Batch6
    50 Array6 Sample50 Batch6
    51 Array6 Sample51 Batch6
    52 Array6 Sample52 Batch6
    53 Array6 Sample53 Batch6
    > library(sva)
    > dat = read.csv("Combat_matrix_input.csv");
    > sif = read.csv("sif.csv");
    > modcombat = model.matrix(~1, data=dat)
    # Here I run combat as per the tutorial
    > combat_edata = ComBat(dat, batch=batch, mod=modcombat, par.prior=TRUE, prior.plots=FALSE)
    Found 6 batches
    Error in cbind(batchmod, mod) : number of rows of matrices must match (see arg 2)
    # I was suspecting that the error was coming from mod, so I set it to NULL
    > combat_edata = ComBat(dat, batch=batch, mod=NULL, par.prior=TRUE, prior.plots=FALSE)
    Found 6 batches
    Adjusting for 0 covariate(s) or covariate level(s)
    Standardizing Data across genes
    Error in solve(t(design) %*% design) %*% t(design) %*% t(as.matrix(dat)) : requires numeric/complex matrix/vector arguments
I should also say that using model.matrix(~1,...) is equivalent to the default mod = NULL. Unless you have actual covariates, you can leave mod at the default value.
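
For reference, here is a minimal sketch of how the inputs could be prepared so that both errors in the question go away. It rests on assumptions that are not confirmed in the thread: that the first column of the spreadsheet holds gene identifiers (which would explain the "requires numeric/complex matrix" error), and that `batch` is meant to be `sif$Batch`. The first error most likely comes from `model.matrix(~1, data=dat)`, which produces one row per gene, whereas ComBat expects one row per sample. The toy data, `csv_path`, and the 100 x 12 dimensions are made up so the snippet runs on its own; replace them with the real files.

```r
# Toy stand-ins for the real files (hypothetical data): 100 genes x 12 samples, 3 batches.
set.seed(1)
sif <- data.frame(
  Array  = rep(paste0("Array", 1:3), each = 4),
  Sample = paste0("Sample", 1:12),
  Batch  = rep(paste0("Batch", 1:3), each = 4)
)
expr <- matrix(rnorm(100 * 12), nrow = 100,
               dimnames = list(paste0("Gene", 1:100), sif$Sample))
csv_path <- tempfile(fileext = ".csv")
write.csv(expr, csv_path)

# row.names = 1 moves the gene identifiers out of the data, so only numeric
# columns remain; as.matrix() then yields a numeric matrix, not a data frame.
dat <- as.matrix(read.csv(csv_path, row.names = 1, check.names = FALSE))
stopifnot(is.numeric(dat))

# Put the expression columns in the same order as the rows of the sample sheet.
stopifnot(all(sif$Sample %in% colnames(dat)))
dat <- dat[, as.character(sif$Sample), drop = FALSE]

# Assumption: the batch labels are the Batch column of the sample sheet.
batch <- sif$Batch

# The model matrix needs one row per sample, so build it from sif, not dat.
modcombat <- model.matrix(~1, data = sif)
stopifnot(nrow(modcombat) == ncol(dat), length(batch) == ncol(dat))

# Only call ComBat if the sva package is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  combat_edata <- sva::ComBat(dat = dat, batch = batch, mod = modcombat,
                              par.prior = TRUE, prior.plots = FALSE)
}
```

With the real data, the two `stopifnot()` checks before the ComBat call are the useful part: if either fails, the sample names in the spreadsheet and in sif.csv do not line up, and ComBat would fail or silently pair samples with the wrong batch.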