Difficulty extending ExpressionSet

Hello. This question was asked here about 7 years ago (On extending class ExpressionSet); however, either the answer has changed since then or the problem I'm facing is not the same.

I am trying to write a class that extends ExpressionSet. I keep running into the following error:

Error in checkSlotAssignment(object, name, value) : 
  assignment of an object of class “matrix” is not valid for slot ‘assayData’ in an object of class “ExpressionSet”; is(value, "AssayData") is not TRUE

My class is simply a thin wrapper around ExpressionSet that adds a couple of slots and lets me write some specific methods that would not directly apply to an ExpressionSet instance.

By way of the simplest example, the following seems to trigger the error:

setClass(
    "MyExpSet",
    contains = "ExpressionSet"
) -> MyExpressionSet

mat <- matrix(runif(100), nrow = 20, ncol = 5)

# Fails:
my_instance <- MyExpressionSet(mat)

# Works:
normal_instance <- Biobase::ExpressionSet(mat)

The error you get for the above is:

Error: MyExpSet 'assayData' is class 'matrix' but should be or extend 'AssayData'

I've looked at the developer documentation on how to extend eSet, but its recommendations, particularly regarding the initialize method, did not resolve my issue. For example, naming all arguments and writing a new initialize method that passes only the relevant arguments to the ExpressionSet initializer does not help.

Based on the answer to the question at the above link, I understand that there are peculiarities with how the initialize method was written for ExpressionSet and eSet that make it difficult to extend. I have a feeling the issue may be with the validObject function, but I'm not sure.

Has anyone on this forum successfully extended eSet or ExpressionSet without having to include modified versions of the original class definitions in their package? I've taken a look at MSnbase (https://github.com/lgatto/MSnbase), which has several classes that extend eSet; however, they use deprecated S4 syntax and have essentially copied large portions of the eSet initialize method in order to make their classes work.

Any advice on this would be very much appreciated.



Session info:

R version 3.5.0 (2018-04-23)
Platform: x86_64-apple-darwin17.5.0 (64-bit)
Running under: macOS High Sierra 10.13.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.5.0      parallel_3.5.0      tools_3.5.0         yaml_2.1.19         Biobase_2.40.0     
[6] BiocGenerics_0.26.0

As Aaron says, new development should focus on SummarizedExperiment.

Since 'mat' is the exprs, try

    MyExpressionSet(exprs = mat)

Normally one would not expose the 'raw' constructor returned by setClass(), but would instead wrap it in a constructor that performs initial argument checking, coercion, etc.
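
To illustrate the idea of wrapping the generator in a 'real' constructor, here is a minimal sketch using a toy class. The class `Track`, its slots, and the function names are hypothetical and have nothing to do with Biobase; only the base `methods` package is assumed.

```r
library(methods)

# Hypothetical toy class (not Biobase code); '.Track' is the raw generator
# returned by setClass() and is meant to stay internal.
.Track <- setClass("Track", slots = c(values = "numeric", label = "character"))

# User-facing constructor: check and coerce the arguments first,
# then hand clean, named values to the raw generator.
Track <- function(values, label = "unnamed") {
    if (!is.numeric(values))
        stop("'values' must be numeric")
    .Track(values = as.numeric(values), label = as.character(label))
}

trk <- Track(1:3, label = "demo")

# A bad argument is caught by the constructor, before new() is ever called.
bad <- tryCatch(Track("a"), error = function(e) conditionMessage(e))
```

The user only ever sees `Track()`, so the slot layout can change later without breaking calling code.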

If ExpressionSet were implemented following standard S4 conventions, then one would expect

    MyExpressionSet(ExpressionSet(mat))

to work -- the unnamed argument(s) to new() are meant to initialize the base class. So even under normal circumstances one would not expect MyExpressionSet(mat) to work -- MyExpressionSet does not extend matrix.
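
A small sketch of that standard S4 behaviour, using toy classes that have no custom initialize methods. The names `Base` and `Derived` are made up for illustration, and this deliberately avoids Biobase, whose initialize methods change the picture.

```r
library(methods)

# Hypothetical toy classes with no custom initialize methods.
setClass("Base", slots = c(values = "numeric"))
Derived <- setClass("Derived", contains = "Base", slots = c(note = "character"))

base_obj <- new("Base", values = c(1, 2, 3))

# The unnamed argument is an instance of the superclass; its slots are
# copied into the new object, and named arguments fill the remaining slots.
derived_obj <- Derived(base_obj, note = "extra slot")

# An unnamed numeric vector is rejected: Derived does not extend numeric.
bad <- tryCatch(Derived(c(1, 2, 3)), error = function(e) conditionMessage(e))
```

Here `derived_obj` carries the `values` of `base_obj`, while `bad` holds the error message from the failed call.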


Thanks! I guess I had assumed that since no initialize method was defined for the subclass it would pass all its arguments unchanged to the parent class initialize method, like an implicit callNextMethod. I need to look more into SummarizedExperiment and how difficult it would be to interoperate with existing code designed with eSet derivatives in mind. A lot of limma code can be executed just with the exprs matrix, so it might not be much of a problem in my use case.


It does do as you say, modulo first constructing a prototype 'MyExpressionSet' object .Object. So your matrix is passed as the 'assayData' argument in the initialize signature, which is not actually what you want (assayData has been transformed by ExpressionSet() to be an environment containing the matrix). This generally illustrates the problem of initialize -- it exposes the internal structure of the class, which is really none of the user's business. The reason naming the argument exprs = mat works is because it matches the correct argument for your use case. The only way you'd know this is by closely examining the code, which is again generally not best practice.
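
The mechanism can be reproduced without Biobase. The sketch below uses two made-up classes, `Holder` and `NamedHolder`, whose initialize signature only loosely mimics the assayData / exprs arrangement; it is an illustration of the argument matching, not a copy of the ExpressionSet code.

```r
library(methods)

# Hypothetical toy classes that mimic the pattern; this is not Biobase code.
setClass("Holder", slots = c(store = "environment"))

# The parent's initialize takes the internal 'store' first and the
# user-level 'values' second, loosely like assayData / exprs.
setMethod("initialize", "Holder",
    function(.Object, store, values = numeric(), ...) {
        if (missing(store)) {
            store <- new.env()
            store$values <- values
        }
        callNextMethod(.Object, store = store, ...)
    })

# No initialize method is defined here, so the one for Holder is inherited.
NamedHolder <- setClass("NamedHolder", contains = "Holder",
    slots = c(note = "character"))

# Positional call: the vector is matched to 'store' and then rejected,
# because a numeric vector is not an environment.
res <- tryCatch(NamedHolder(c(1, 2, 3)), error = function(e) conditionMessage(e))

# Named call: 'values' is wrapped in an environment by the inherited method.
ok <- NamedHolder(values = c(1, 2, 3), note = "works")
```

The positional call fails for the same kind of reason as MyExpressionSet(mat): the first formal after .Object is the internal representation, not the user-level data.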

Bioconductor has learned from at least some of its mistakes, and the SummarizedExperiment class is much better behaved in the way it works, in particular offering a 'real' constructor SummarizedExperiment() that is more than just the renamed call to new() returned by setClass().

These discussions belong on the bioc-devel mailing list.

> selectMethod("initialize", "MyExpressionSet")
Method Definition:

function (.Object, ...) 
    .local <- function (.Object, assayData, phenoData, featureData, 
        exprs = new("matrix"), ...) 
        if (missing(assayData)) {
            if (missing(phenoData)) 
Aaron Lun ★ 28k

This seems like a package development question, which belongs on the Bioc-devel mailing list.

While I'm here, I'll note that SummarizedExperiment is really the way to go if you're creating a new class for some expression-like data structure (i.e., samples in columns, features in rows). There are some "canonical" instructions on how to extend the SE class; see the SummarizedExperiment vignettes for more details.


The RccSet class in the NanoStringQCPro package extends ExpressionSet, and I don't really see any wholesale copying of any code in there at all.


I see what you're saying. After looking over the code for the RccSet class definition a bit, it looks like they had to create a new assayData environment instance out of a matrix before calling the ExpressionSet constructor and made a bunch of methods to cover their bases depending on the input. However, I still don't understand why this isn't "automatic" when you're inheriting from ExpressionSet. The code is definitely helpful, although it feels like a lot of work just to add a single slot to the class...

