I'm trying to use the "intgroup" parameter of the arrayQualityMetrics function to specify the column in my dataset where sample groupings are stored. This is the aptly named "group" column.
When I execute the following:
arrayQualityMetrics(expressionset = sym.eset,
                    outdir = paste(studyName, "Quality1", sep = "_"),
                    intgroup = "group",
                    force = TRUE)
I get an error message:
Error in prepdata(expressionset, intgroup = intgroup, do.logtransform = do.logtransform) : all elements of 'intgroup' should match column names of 'pData(expressionset)'.
I have verified that the column named "group" is present in my original dataset:
colnames(pData(sym.eset))
 [1] "geo_accession" "Patient_ID" "geo_accn_hg.u133b" "geo_accn_hg.u133plus2" "series"
 [6] "age" "grade" "size" "ER_STATUS" "pgr"
[11] "node" "DFS_TIME" "EVENT_DFS" "DMFS_TIME" "EVENT_DMFS"
[16] "treatment" "group" "supplementary_file"
There is an internal function, "prepdata", that is called by "arrayQualityMetrics" and performs some preprocessing steps. I opened it for editing with trace(prepdata, edit = T) and added a print statement at the point where prepdata checks "intgroup" against the column names of the expressionset's pData object:
function (expressionset, intgroup, do.logtransform)
{
    conversions = c(RGList = "NChannelSet")
    for (i in seq_along(conversions)) {
        if (is(expressionset, names(conversions)[i])) {
            expressionset = try(as(expressionset, conversions[i]))
            if (is(expressionset, "try-error")) {
                stop(sprintf("The argument 'expressionset' is of class '%s', and its automatic conversion into '%s' failed. Please try to convert it manually, or contact the creator of that object.\n",
                  names(conversions)[i], conversions[i]))
            }
            else {
                break
            }
        }
    }
    x = platformspecific(expressionset, intgroup, do.logtransform)
    if (!all(intgroup %in% colnames(x$pData))) {
        # Debugging aid: show which pData columns survived preprocessing.
        print(colnames(x$pData))
        stop("all elements of 'intgroup' should match column names of 'pData(expressionset)'.")
    }
    x = append(x, list(numArrays = ncol(x$M), intgroup = intgroup,
        do.logtransform = do.logtransform))
    x = append(x, intgroupColors(x))
    return(x)
}
It looks like only 10 of the 18 pData columns survive preprocessing, and "group" is not among them:
[1] "geo_accession" "geo_accn_hg.u133plus2" "series" "age" "grade"
[6] "size" "node" "DFS_TIME" "EVENT_DFS" "DMFS_TIME"
Error in prepdata(expressionset, intgroup = intgroup, do.logtransform = do.logtransform) : all elements of 'intgroup' should match column names of 'pData(expressionset)'.
Re-ordering the columns such that my grouping variable comes first fixes this error,
as was previously suggested by others:
https://stat.ethz.ch/pipermail/bioconductor/2012-June/046295.html
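
For anyone who lands here with the same problem, the reordering workaround can be sketched roughly as below. This is only an illustration in base R: the data frame is a made-up stand-in for pData(sym.eset), and move_to_front is a hypothetical helper name, not part of arrayQualityMetrics or Biobase. The commented line at the end assumes that Biobase's pData<- replacement accepts the reordered data frame.

```r
# Toy stand-in for pData(sym.eset); the values are invented for illustration.
pd <- data.frame(
  geo_accession = c("GSM1", "GSM2", "GSM3", "GSM4"),
  age           = c(51, 63, 47, 58),
  grade         = c(2, 3, 1, 2),
  group         = c("A", "B", "A", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: move the named column(s) to the front and keep the
# remaining columns in their original order.
move_to_front <- function(df, cols) {
  missing_cols <- setdiff(cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  df[, c(cols, setdiff(colnames(df), cols)), drop = FALSE]
}

pd_reordered <- move_to_front(pd, "group")
print(colnames(pd_reordered))

# Applied to the real object (assumption: Biobase is loaded and sym.eset is
# an ExpressionSet), the same idea would be:
#   pData(sym.eset) <- move_to_front(pData(sym.eset), "group")
```

With the grouping column first, it should still be present after preprocessing drops the later columns, so intgroup = "group" can be matched.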
