Question: designing an eSet derived object
0
8.6 years ago by
Wolfgang RAFFELSBERGER130 wrote:
Dear list,

I'm trying to design an object to hold the following microarray data:

1) "gxIndData": microarray data normalized in parallel by n methods (n depends on the array type), plus the corresponding expression calls (again, <= n methods),
2) "gxAvData": derived values (replicate averages, SEMs, etc.),
3) gene/spot annotation,
4) sample description,
5) various supplementary information (parameters, notes, versions, etc.)

Overall, this is a modified/extended version of the Biobase eSet concept, and I'm trying to figure out whether I can build on the Biobase eSet itself. That way I hope to keep a decent level of compatibility with other Bioconductor methods and allow code reuse.

I'd like to store the various sections of 1) and 2) as separate lists, each holding n matrices of values, to keep things organized.

Following section 5 ("Extending eSet") of the vignette "Biobase development and the new eSet", I defined a new class that extends 'eSet'. But as soon as I put anything other than matrices into 'AssayData', I get an error message (see code below), whether the elements are plain lists or custom objects. I suppose this means I would have to store all matrices (up to 10 * 6 methods = 60 matrices) in 'AssayData' without any further organization. However, I'd like to keep at least one (in my case preferably two) additional levels of hierarchy to keep the data organized.

So, in the end, I would like to use two new classes for 1) and 2) inside the assayData slot of my modified eSet.

Does this mean that it is not possible and that I cannot use 'eSet' for my purposes? Do I have to create a new class that is roughly equivalent to 'eSet' but ultimately incompatible with it?

Any suggestions/hints?

Thanks in advance,
wolfgang

##
    require(Biobase)
    setClass("gxSet", contains = "eSet")
    setMethod("initialize", "gxSet",
        function(.Object, A=new("list"), B=new("list"), ...) {
            callNextMethod(.Object, A=A, B=B, ...)
        })
    new("gxSet")
## produces :
    Error in function (storage.mode = c("lockedEnvironment", "environment", :
      'AssayData' elements with invalid dimensions: 'A' 'B'

## ideally I'd like to use
    setClass("gxIndData", representation(SIdata="list", SIcall="list"))
    setClass("gxAvData", representation(avSI="list", expressed="list", SEM="list",
        conCall="list", FC="list", FiltFin="list", FiltSI="list", FiltOther="list"))
    setClass("gxSet", contains = "eSet")
    setMethod("initialize", "gxSet",
        function(.Object,
                 assayData=assayDataNew(IndData=IndData, AvData=AvData),
                 IndData=new("gxIndData"), AvData=new("gxAvData"), ...) {
            if(!missing(assayData) && any(!missing(IndData), !missing(AvData))) {
                warning("using 'assayData'; ignoring 'IndData', 'AvData'")
            }
            callNextMethod(.Object, assayData = assayData, ...)
        })
    new("gxSet")
## produces :
    Error in assayDataNew(IndData = IndData, AvData = AvData) :
      'AssayData' elements with invalid dimensions: 'AvData' 'IndData'

## the alternative : an eSet-like but independent and incompatible object
    setClass("gxSet", representation(IndData="gxIndData", AvData="gxAvData",
        phenoData="AnnotatedDataFrame", featureData="AnnotatedDataFrame",
        experimentData="MIAME", annotation="character",
        protocolData="AnnotatedDataFrame", notes="list"))

## for completeness:
    sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252  LC_MONETARY=French_France.1252
    [4] LC_NUMERIC=C  LC_TIME=French_France.1252

    attached base packages:
    [1] grDevices datasets splines graphics stats tcltk utils methods base

    other attached packages:
    [1] affy_1.28.0 Biobase_2.10.0 svSocket_0.9-50 TinnR_1.0.3 R2HTML_2.2 Hmisc_3.8-3 survival_2.35-8

    loaded via a namespace (and not attached):
    [1] affyio_1.18.0 cluster_1.13.1 grid_2.12.0 lattice_0.19-13 preprocessCore_1.12.0
    [6] svMisc_0.9-60 tools_2.12.0

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300  Fax (+33) 388 65 3276
wolfgang.raffelsberger @ igbmc.fr
annotation • 725 views
modified 8.6 years ago by Martin Morgan ♦♦ 23k • written 8.6 years ago by Wolfgang RAFFELSBERGER130
Answer: designing an eSet derived object
0
8.6 years ago by
Martin Morgan ♦♦ 23k
United States
Martin Morgan ♦♦ 23k wrote:
On 11/05/2010 05:02 AM, Wolfgang RAFFELSBERGER wrote:
> [...]
> According to the Vignette "Biobase development and the new eSet"
> section 5 ("Extending eSet"), I defined new a new class 'eSet'. But
> as soon as I integrate something different than matrixes at the level
> of 'AssayData', I get an error-message (see code below) - no matter
> if these are simply lists or custom-objects. I suppose this means
> that I would have to store all matrixes (up to 10*6methods =60
> matrixes) without further organization at the level of 'AssayData'.

eSet requires that all AssayData elements are two-dimensional with identical dimensions, so a list of matrices would not work as an AssayData element.

> [...]
> Does this mean this is not possible and that I cannot use the 'eSet'
> for my purposes ? Do I have to create a novel class somehow
> equivalent but finally incompatible to the 'eSet' ?
>
> Any suggestions/hints ?

One possibility, if this is for your own use and not the foundation for a package, is to use NChannelSet, where each method is a 'channel'.

Another possibility is to create a class that extends eSet with a slot containing, e.g., an AnnotatedDataFrame whose columns describe the AssayData elements, together with a method that queries the slot and selects the appropriate assayData elements.

And perhaps what you really have is more a list (of lists) of ExpressionSets, with additional information attached to each element of the list. An approach here would use the IRanges 'SimpleList' infrastructure, e.g.,

    > lst = SimpleList(a=new("ExpressionSet"), b=new("ExpressionSet"))
    > elementMetadata(lst) = DataFrame(method=c("A", "B"))
    > lst[elementMetadata(lst)$method == "A"]
    SimpleList of length 1
    names(1): a
    > lst[elementMetadata(lst)$method == "A"][[1]]
    ExpressionSet (storageMode: lockedEnvironment)
    assayData: 0 features, 0 samples
      element names: exprs
    protocolData: none
    phenoData: none
    featureData: none
    experimentData: use 'experimentData(object)'
    Annotation:

Martin

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
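The second possibility in the reply above (a flat set of equally sized matrices plus a table describing them) can be sketched in base R without any Bioconductor classes. This is only an illustration of the idea, not Biobase code: the names `mats`, `desc`, `same_dims` and `select_by_method` are hypothetical, and a plain data.frame stands in for the AnnotatedDataFrame.

```r
# Toy data (assumption): 2 normalization methods x 2 quantities,
# every matrix 10 features x 4 samples, stored in one flat list.
mats <- list(
    rma.SI    = matrix(1, 10, 4),
    rma.call  = matrix(TRUE, 10, 4),
    mas5.SI   = matrix(3, 10, 4),
    mas5.call = matrix(FALSE, 10, 4)
)

# One row per matrix, describing which method and quantity it holds.
desc <- data.frame(
    method   = c("rma", "rma", "mas5", "mas5"),
    quantity = c("SI", "call", "SI", "call"),
    row.names = names(mats),
    stringsAsFactors = FALSE
)

# Mimics the constraint described above: every element must be
# two-dimensional and all elements must share the same dimensions.
same_dims <- function(mats) {
    dims <- lapply(mats, dim)
    all(vapply(dims,
               function(d) length(d) == 2L && identical(d, dims[[1]]),
               logical(1)))
}

# Query the description table and return the matching matrices.
select_by_method <- function(mats, desc, method) {
    mats[rownames(desc)[desc$method == method]]
}

same_dims(mats)
names(select_by_method(mats, desc, "rma"))
```

The organization lives in the description table rather than in nested lists, so the matrices themselves stay flat, which is the layout that the dimension check on AssayData can accept.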
Dear Martin,

thank you very much for your helpful input. I'm sorry I have to bug you again. I was almost there, but at the recent Bioconductor Developer Meeting I got another interesting suggestion, which I haven't succeeded in implementing.

Briefly (if I understood correctly), the idea was to make a modified SimpleList class in which I could check that each element is an ExpressionSet (instead of using the SimpleList class as is). From there one might even go one step further and check that all dimensions are identical, too ...

To make the modified SimpleList I returned to the help provided in the Bioconductor pdf "Biobase development and the new eSet". But it seems I'm not getting the initialization right. My 'problem' is that I don't want to fix in advance how many ExpressionSets will be put in the list (SimpleList), nor what their names will be. This way I hope the object will be sufficiently general to hold results from normalization methods that might become available in the future. This is quite different from the example provided in "Biobase development and the new eSet".

To link to my previous post: this (modified) SimpleList will then be used as a slot (allowing me to store data normalized by multiple methods) of another new class (the "GxSet"), which has further slots for data-derived values (averages, etc.) and more documentation/notes ...

Thanks in advance for any hints,
Wolfgang

    > require(Biobase); require(IRanges); require(affy)
    > # the toy data
    > eset1 <- new("ExpressionSet", exprs=matrix(1,10,4))
    > pData(eset1) <- data.frame("class"=c(1,2,2,2))
    >
    > eset2 <- new("ExpressionSet", exprs=matrix(3,10,4))
    > pData(eset2) <- data.frame("class"=c(1,2,2,2))
    >
    > # making the modified class
    > setClass("GxSimpleList",contains="SimpleList")
    [1] "GxSimpleList"
    > getClass("GxSimpleList")
    Class "GxSimpleList" [in ".GlobalEnv"]

    Slots:

    Name:         listData elementMetadata     elementType        metadata
    Class:            list             ANY       character            list

    Extends:
    Class "SimpleList", directly
    Class "Sequence", by class "SimpleList", distance 2
    Class "Annotated", by class "SimpleList", distance 3
    >
    > # for the "initialize" I didn't understand how to formulate it in my case
    > # (as I know neither the number of elements nor their names)
    > setMethod("initialize","GxSimpleList", function(.object,...) listData = listDataNew(lapply(list(.object,...) == "ExpressionSet") ))
    Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
      in method for 'initialize' with signature '.Object="GxSimpleList"': formal arguments (.Object = "GxSimpleList", ... = "GxSimpleList") omitted in the method definition cannot be in the signature
    >
    > setMethod("initialize","GxSimpleList", function(.object,...) {.object <- callNextMethod(.object,...)})
    Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
      in method for 'initialize' with signature '.Object="GxSimpleList"': formal arguments (.Object = "GxSimpleList", ... = "GxSimpleList") omitted in the method definition cannot be in the signature
    >
    > # I guess the check for ExpressionSets should go into validity
    > setValidity("GxSimpleList", function(object) {   # experimental
    +   if(sum(!(unlist(lapply(object,function(x) class(x))) %in% "ExpressionSet")) >0) "A 'GxSimpleList' object should contain elements of class 'ExpressionSet' only !"
    +   #same as ?# assayDataValidMembers(class(object), rep("ExpressionSet",length(object)))
    + })
    Class "GxSimpleList" [in ".GlobalEnv"]

    Slots:

    Name:         listData elementMetadata     elementType        metadata
    Class:            list             ANY       character            list

    Extends:
    Class "SimpleList", directly
    Class "Sequence", by class "SimpleList", distance 2
    Class "Annotated", by class "SimpleList", distance 3
    >
    > # what happens ..
    > lst1 = SimpleList(a=eset1, b=eset2)           # OK
    >
    > lst2 = new("GxSimpleList",a=eset1, b=eset2)   # error (due to missing "initialize" ?)
    Error in initialize(value, ...) :
      invalid names for slots of class "GxSimpleList": a, b
    > lst3 = GxSimpleList(a=eset1, b=eset2)         # error (due to missing "initialize" ?)
    Error: could not find function "GxSimpleList"
    >
    > # for completeness ...
    > sessionInfo()
    R version 2.12.0 (2010-10-15)
    Platform: i386-pc-mingw32/i386 (32-bit)

    locale:
    [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252  LC_MONETARY=French_France.1252  LC_NUMERIC=C
    [5] LC_TIME=French_France.1252

    attached base packages:
    [1] grDevices datasets splines graphics stats tcltk utils methods base

    other attached packages:
    [1] affy_1.28.0 IRanges_1.8.0 Biobase_2.10.0 svSocket_0.9-50 TinnR_1.0.3 R2HTML_2.2 Hmisc_3.8-3 survival_2.35-8

    loaded via a namespace (and not attached):
    [1] affyio_1.18.0 cluster_1.13.1 grid_2.12.0 lattice_0.19-13 preprocessCore_1.12.0 svMisc_0.9-60
    [7] tools_2.12.0
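A note on the two conformMethod errors in the transcript above: the first formal argument of the `initialize` generic is spelled `.Object` (capital O), while both method definitions use `.object`, which would explain the message. The sketch below shows one way to accept any number of named elements in `new()`. It uses only the base `methods` package, the class name `MatrixList` is hypothetical, and it holds matrices instead of ExpressionSets so that it runs without Bioconductor. It is not a drop-in replacement for a SimpleList subclass.

```r
library(methods)

# Hypothetical container: one slot holding an arbitrary named list.
setClass("MatrixList", representation(listData = "list"))

# Validity: every element must be a matrix (an ExpressionSet check
# would go here in the real class).
setValidity("MatrixList", function(object) {
    ok <- vapply(object@listData, is.matrix, logical(1))
    if (all(ok)) TRUE
    else "all elements of 'listData' must be matrices"
})

# The first formal must be '.Object', matching the generic.
# Everything passed through '...' is collected into 'listData',
# so neither the number of elements nor their names are fixed.
setMethod("initialize", "MatrixList", function(.Object, ...) {
    .Object@listData <- list(...)
    validObject(.Object)
    .Object
})

ml <- new("MatrixList", a = matrix(1, 10, 4), b = matrix(3, 10, 4))
names(ml@listData)
```

One caveat of this design: because the method swallows every argument into `listData`, slots can no longer be set by name through `new()`. A plain constructor function that calls `new("MatrixList", listData = list(...))` on a class without a custom `initialize` method avoids that trade-off.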
Hi Wolfgang -- On 11/22/2010 03:44 AM, Wolfgang RAFFELSBERGER wrote: > Dear Martin, > > thank you very much for your helpful input. I'm sorry I have to bug > you again. > I was about there, but at the recent Bioconductor Developer Meeting I > got another intersting suggestion, which I haven't succeded > implementing. > Briefly, (if I understood right) the idea was rather to make a > modified SimpleList class where I could check that each elment is an > expression set (instead of using the SimpleList class as is). From > there one might even go one step further and check if all dimensions > are identical, too ... > > For the making the modified SimpleList I returned to the help > provided in the Bioconductor pdf "Biobase development and the new > eSet". But it seems I'm not getting the inizialization right. > My 'problem' is, that I don't want to fix in advance how many > ExperssionSets will be put in the list (SimpleList), neither what > their names will be. This way I hope the object will be > sufficienltly general to hold results from normalization-methods that > might become available in the future. Now, this is now quite > different to the example provided in "Biobase development and the > new eSet". > > To link to my previous post: This (modified) SimpleList will then be > used as a slot (allowing to store data normalized by multiple > methods) of another new class (the "GxSet"), plus in other slots for > data-derived values (averages, etc) and more documentation/notes)... 
> > Thank's in advance fro any hints, Wolfgang > > >> >> require(Biobase); require(IRanges); require(affy) # the toy data >> eset1 <- new("ExpressionSet", exprs=matrix(1,10,4)) pData(eset1) <- >> data.frame("class"=c(1,2,2,2)) >> >> eset2 <- new("ExpressionSet", exprs=matrix(3,10,4)) pData(eset2) <- >> data.frame("class"=c(1,2,2,2)) >> >> # making the modified class >> setClass("GxSimpleList",contains="SimpleList") I think the idea is setClass("SimpleExpressionSetList", contains="SimpleList", prototype=prototype(elementType="ExpressionSet")) and then you're done... > listData1 <- list(A=new("ExpressionSet"), B=new("ExpressionSet")) > listData2 <- list(A=new("ExpressionSet"), B=matrix()) > new("SimpleExpressionSetList", listData=listData1) SimpleExpressionSetList of length 2 names(2): A B > new("SimpleExpressionSetList", listData=listData2) Error in validObject(.Object) : invalid class "SimpleExpressionSetList" object: the 'listData' slot must be a list containing ExpressionSet objects > > [1] "GxSimpleList" >> getClass("GxSimpleList") > Class "GxSimpleList" [in ".GlobalEnv"] > > Slots: > > Name: listData elementMetadata elementType > metadata Class: list ANY character > list > > Extends: Class "SimpleList", directly Class "Sequence", by class > "SimpleList", distance 2 Class "Annotated", by class "SimpleList", > distance 3 >> >> # for the "initialize" I didn't understand how to formulate it in >> my case (as I don't know how many elements, neither their names) >> setMethod("initialize","GxSimpleList", function(.object,...) >> listData = listDataNew(lapply(list(.object,...) == "ExpressionSet") >> )) > Error in conformMethod(signature, mnames, fnames, f, fdef, > definition) : in method for ?initialize? with signature > ?.Object="GxSimpleList"?: formal arguments (.Object = "GxSimpleList", > ... = "GxSimpleList") omitted in the method definition cannot be in > the signature >> >> setMethod("initialize","GxSimpleList", function(.object,...) 
>> {.object <- callNextMethod(.object,...)}) > Error in conformMethod(signature, mnames, fnames, f, fdef, > definition) : in method for ?initialize? with signature > ?.Object="GxSimpleList"?: formal arguments (.Object = "GxSimpleList", > ... = "GxSimpleList") omitted in the method definition cannot be in > the signature >> >> # I guess the check for experssionSets should go into validity >> setValidity("GxSimpleList", function(object) { # experimetal > + if(sum(!(unlist(lapply(object,function(x) class(x))) %in% > "ExpressionSet")) >0) "A 'GxSimpleList' object should contain > elements of class 'ExpressionSet' only !" + #same as ?# > assayDataValidMembers(class(object), > rep("ExpressionSet",length(object))) + }) Class "GxSimpleList" [in > ".GlobalEnv"] > > Slots: > > Name: listData elementMetadata elementType > metadata Class: list ANY character > list > > Extends: Class "SimpleList", directly Class "Sequence", by class > "SimpleList", distance 2 Class "Annotated", by class "SimpleList", > distance 3 >> >> # what happens .. lst1 = SimpleList(a=eset1, b=eset2) # OK >> >> lst2 = new("GxSimpleList",a=eset1, b=eset2) # error (due to >> missing "initialize" ?) > Error in initialize(value, ...) : invalid names for slots of class > "GxSimpleList": a, b >> lst3 = GxSimpleList(a=eset1, b=eset2) # error (due to >> missing "initialize" ?) > Error: could not find function "GxSimpleList" >> >> # for completeness ... 
sessionInfo() > R version 2.12.0 (2010-10-15) Platform: i386-pc-mingw32/i386 > (32-bit) > > locale: [1] LC_COLLATE=French_France.1252 > LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 > LC_NUMERIC=C [5] LC_TIME=French_France.1252 > > attached base packages: [1] grDevices datasets splines graphics > stats tcltk utils methods base > > other attached packages: [1] affy_1.28.0 IRanges_1.8.0 > Biobase_2.10.0 svSocket_0.9-50 TinnR_1.0.3 R2HTML_2.2 > Hmisc_3.8-3 survival_2.35-8 > > loaded via a namespace (and not attached): [1] affyio_1.18.0 > cluster_1.13.1 grid_2.12.0 lattice_0.19-13 > preprocessCore_1.12.0 svMisc_0.9-60 [7] tools_2.12.0 >> > > > > . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . > . Wolfgang Raffelsberger, PhD Laboratoire de BioInformatique et > G?nomique Int?gratives IGBMC, 1 rue Laurent Fries, 67404 Illkirch > Strasbourg, France Tel (+33) 388 65 3300 Fax (+33) 388 65 > 3276 wolfgang.raffelsberger (at ) igbmc.fr > > ________________________________________ De : Martin Morgan > [mtmorgan at fhcrc.org] Date d'envoi : vendredi 5 novembre 2010 18:33 ? > : Wolfgang RAFFELSBERGER Cc : bioconductor at stat.math.ethz.ch Objet : > Re: [BioC] designing an eSet derived object > > On 11/05/2010 05:02 AM, Wolfgang RAFFELSBERGER wrote: >> Dear list, >> > >> basically I'm trying to design an object to contain the following >> microarray-data 1) "gxIndData": microarray-data normalized in >> parallel by (an array-dependent) number of n methods plus the >> corresponding expression-calls (again, <= n methods), 2) >> "gxAvData": derived values (replicate-averages, SEMs, etc), 3) >> gene/spot annotation, 4) sample-description, 5) various supl >> informations (parameters, notes, versions, etc) >> >> In overall, this is a somehow modified/extended concept to the >> Biobase eSet and I'm trying to figure out if there is a way to use >> the Biobase eSet. 
This way I hope to maintain a decent level of compatibility with other Bioconductor methods and allow code reuse.
>>
>> Now I'd like to store the various sections of 1) and 2) as separate lists with n matrices of values to keep things organized.
>>
>> Following section 5 ("Extending eSet") of the vignette "Biobase development and the new eSet", I defined a new class extending 'eSet'. But as soon as I put anything other than matrices at the level of 'AssayData', I get an error message (see code below), no matter whether these are plain lists or custom objects. I suppose this means that I would have to store all matrices (up to 10 * 6 methods = 60 matrices) without further organization at the level of 'AssayData'.
>
> eSet requires that all AssayData elements are two-dimensional with identical dimensions, so a list-of-matrices would not work.
>
>> However, I'd like to keep at least one (in my case preferably two) additional levels of hierarchy to keep the data organized.
>>
>> So, finally, I would like to integrate two new classes for 1) and 2) at the level of the assayData slot of my modified/new eSet.
>>
>> Does this mean that this is not possible and that I cannot use the 'eSet' for my purposes? Do I have to create a new class that is roughly equivalent to, but ultimately incompatible with, the 'eSet'?
>>
>> Any suggestions/hints?
>
> One possibility, if this is for your own use and not as the foundation for a package, is to use NChannelSet, where each method is a 'channel'.
>
> Another possibility is to create a class that extends eSet with a slot containing, e.g., an AnnotatedDataFrame with columns describing the AssayData, and a method to query the slot / select the appropriate assayData elements.
>
> And perhaps what you really have is more a list (or list of lists) of ExpressionSets, each element of the list with additional information. An approach here would use the IRanges 'SimpleList' infrastructure, e.g.,
>
> > lst = SimpleList(a=new("ExpressionSet"), b=new("ExpressionSet"))
> > elementMetadata(lst) = DataFrame(method=c("A", "B"))
> > lst[elementMetadata(lst)$method == "A"]
> SimpleList of length 1
> names(1): a
> > lst[elementMetadata(lst)$method == "A"][[1]]
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 0 features, 0 samples
>   element names: exprs
> protocolData: none
> phenoData: none
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation:
>
> Martin
>
>> Thanks in advance, wolfgang
>>
>> ##
>> require(Biobase)
>> setClass("gxSet", contains = "eSet")
>> setMethod("initialize", "gxSet",
>>     function(.Object, A=new("list"), B=new("list"), ...) {
>>         callNextMethod(.Object, A=A, B=B, ...)
>>     })
>> new("gxSet")
>> ## produces : Error in function (storage.mode = c("lockedEnvironment", "environment", :
>> ##   'AssayData' elements with invalid dimensions: 'A' 'B'
>>
>> ## ideally I'd like to use
>> setClass("gxIndData", representation(SIdata="list", SIcall="list"))
>> setClass("gxAvData", representation(avSI="list", expressed="list", SEM="list",
>>     conCall="list", FC="list", FiltFin="list", FiltSI="list", FiltOther="list"))
>> setClass("gxSet", contains = "eSet")
>>
>> setMethod("initialize", "gxSet",
>>     function(.Object,
>>              assayData=assayDataNew(IndData=IndData, AvData=AvData),
>>              IndData=new("gxIndData"), AvData=new("gxAvData"), ...) {
>>         if (!missing(assayData) && any(!missing(IndData), !missing(AvData))) {
>>             warning("using 'assayData'; ignoring 'IndData', 'AvData'")
>>         }
>>         callNextMethod(.Object, assayData = assayData, ...)
>>     })
>>
>> new("gxSet")
>> ## produces : Error in assayDataNew(IndData = IndData, AvData = AvData) :
>> ##   'AssayData' elements with invalid dimensions: 'AvData' 'IndData'
>>
>> ## the alternative : an eSet 'like' but independent and incompatible object ..
>> setClass("gxSet", representation(IndData="gxIndData", AvData="gxAvData",
>>     phenoData="AnnotatedDataFrame", featureData="AnnotatedDataFrame",
>>     experimentData="MIAME", annotation="character",
>>     protocolData="AnnotatedDataFrame", notes="list"))
>>
>> ## for completeness:
>> sessionInfo()
>> R version 2.12.0 (2010-10-15)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252
>> [4] LC_NUMERIC=C LC_TIME=French_France.1252
>>
>> attached base packages:
>> [1] grDevices datasets splines graphics stats tcltk utils methods base
>>
>> other attached packages:
>> [1] affy_1.28.0 Biobase_2.10.0 svSocket_0.9-50 TinnR_1.0.3 R2HTML_2.2 Hmisc_3.8-3 survival_2.35-8
>>
>> loaded via a namespace (and not attached):
>> [1] affyio_1.18.0 cluster_1.13.1 grid_2.12.0 lattice_0.19-13 preprocessCore_1.12.0
>> [6] svMisc_0.9-60 tools_2.12.0
>>
>> Wolfgang Raffelsberger, PhD
>> Laboratoire de BioInformatique et Génomique Intégratives
>> IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
>> Tel (+33) 388 65 3300  Fax (+33) 388 65 3276
>> wolfgang.raffelsberger @ igbmc.fr
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
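A minimal sketch of the NChannelSet suggestion quoted above, with one 'channel' per normalization method. The method names `rma` and `mas5` and the toy values are made up for illustration, and the sketch assumes a Biobase version that provides `channelNames()` and `channel()`.

```r
library(Biobase)

# Toy data: 10 features x 4 samples; all values are made up.
m <- matrix(seq_len(40), nrow = 10, ncol = 4,
            dimnames = list(paste0("gene", 1:10), paste0("sample", 1:4)))

# One assayData element ("channel") per hypothetical normalization method.
# Every element has to be a matrix, and all must share the same dimensions.
gx <- new("NChannelSet", rma = m, mas5 = m + 1)

channelNames(gx)             # names of the stored channels
rma <- channel(gx, "rma")    # a single channel, returned as an ExpressionSet
dim(exprs(rma))              # features x samples of that channel
```

This keeps everything inside one eSet-derived object, at the price of a flat namespace: any grouping of the channels has to be encoded in their names or in separate metadata.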
Dear Martin,

thanks again - I've got things working as you explained.

Just to make sure I understood completely: everything is now streamlined for storing the multiple ExpressionSets for the various methods employed (the first slot in my GxSet). The next step is to review how I'm storing the "derived" data (e.g. averages, SEM, ... for each of the methods above). Here I've tried a few things, but as far as I understand, there is no existing class close enough to my case (ideally a "SimpleListList" = list of SimpleLists). So I made a new class containing multiple SimpleList objects (code below):

setClass("GxAvData", representation(avSI="SimpleList", expressed="SimpleList", SEM="SimpleList",
    FC="SimpleList", FiltFin="SimpleList", FiltSI="SimpleList", FiltOther="SimpleList"))

I've also tried to use the SimpleMatrixList object, since all my (final) data are nothing but matrices, but I didn't get this working. Does this matter much? Or should I rather define a general "SimpleListList" (list of SimpleLists) first, and derive my specific class ("GxAvData") from it?

Thanks for all your helpful comments,

Wolfgang

PS: Hope you had a good trip back to the US.

Wolfgang Raffelsberger, PhD
Laboratoire de BioInformatique et Génomique Intégratives
IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg, France
Tel (+33) 388 65 3300  Fax (+33) 388 65 3276
wolfgang.raffelsberger (at ) igbmc.fr

________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] on behalf of Martin Morgan [mtmorgan at fhcrc.org]
Sent: Monday, 22 November 2010 19:42
To: Wolfgang RAFFELSBERGER
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] RE : designing an eSet derived object

Hi Wolfgang --

On 11/22/2010 03:44 AM, Wolfgang RAFFELSBERGER wrote:
> Dear Martin,
>
> thank you very much for your helpful input. I'm sorry I have to bug you again.
>
> I was about there, but at the recent Bioconductor Developer Meeting I got another interesting suggestion, which I haven't succeeded in implementing.
>
> Briefly (if I understood right), the idea was rather to make a modified SimpleList class where I could check that each element is an ExpressionSet (instead of using the SimpleList class as is). From there one might even go one step further and check whether all dimensions are identical, too ...
>
> To make the modified SimpleList I returned to the help provided in the Bioconductor pdf "Biobase development and the new eSet". But it seems I'm not getting the initialization right.
>
> My 'problem' is that I don't want to fix in advance how many ExpressionSets will be put in the list (SimpleList), nor what their names will be. This way I hope the object will be sufficiently general to hold results from normalization methods that might become available in the future. This is quite different from the example provided in "Biobase development and the new eSet".
>
> To link to my previous post: this (modified) SimpleList will then be used as a slot (allowing to store data normalized by multiple methods) of another new class (the "GxSet"), alongside other slots for data-derived values (averages, etc.) and more documentation/notes ...
>
> Thanks in advance for any hints, Wolfgang
>
> > require(Biobase); require(IRanges); require(affy)
> > # the toy data
> > eset1 <- new("ExpressionSet", exprs=matrix(1,10,4))
> > pData(eset1) <- data.frame("class"=c(1,2,2,2))
> >
> > eset2 <- new("ExpressionSet", exprs=matrix(3,10,4))
> > pData(eset2) <- data.frame("class"=c(1,2,2,2))
> >
> > # making the modified class
> > setClass("GxSimpleList",contains="SimpleList")

I think the idea is

setClass("SimpleExpressionSetList", contains="SimpleList",
         prototype=prototype(elementType="ExpressionSet"))

and then you're done...

> listData1 <- list(A=new("ExpressionSet"), B=new("ExpressionSet"))
> listData2 <- list(A=new("ExpressionSet"), B=matrix())
> new("SimpleExpressionSetList", listData=listData1)
SimpleExpressionSetList of length 2
names(2): A B
> new("SimpleExpressionSetList", listData=listData2)
Error in validObject(.Object) :
  invalid class "SimpleExpressionSetList" object: the 'listData' slot must be a list containing ExpressionSet objects

> [1] "GxSimpleList"
> > getClass("GxSimpleList")
> Class "GxSimpleList" [in ".GlobalEnv"]
>
> Slots:
>
> Name:         listData elementMetadata     elementType        metadata
> Class:            list             ANY       character            list
>
> Extends:
> Class "SimpleList", directly
> Class "Sequence", by class "SimpleList", distance 2
> Class "Annotated", by class "SimpleList", distance 3
> >
> > # for the "initialize" I didn't understand how to formulate it in my case
> > # (as I don't know how many elements there will be, nor their names)
> > setMethod("initialize","GxSimpleList", function(.object,...)
> >     listData = listDataNew(lapply(list(.object,...) == "ExpressionSet") ))
> Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
>   in method for ‘initialize’ with signature ‘.Object="GxSimpleList"’: formal arguments (.Object = "GxSimpleList", ... = "GxSimpleList") omitted in the method definition cannot be in the signature
> >
> > setMethod("initialize","GxSimpleList", function(.object,...)
> >     {.object <- callNextMethod(.object,...)})
> Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
>   in method for ‘initialize’ with signature ‘.Object="GxSimpleList"’: formal arguments (.Object = "GxSimpleList", ... = "GxSimpleList") omitted in the method definition cannot be in the signature
> >
> > # I guess the check for ExpressionSets should go into validity
> > setValidity("GxSimpleList", function(object) {   # experimental
> +     if(sum(!(unlist(lapply(object,function(x) class(x))) %in% "ExpressionSet")) >0) "A 'GxSimpleList' object should contain elements of class 'ExpressionSet' only !"
> +     #same as ?# assayDataValidMembers(class(object), rep("ExpressionSet",length(object)))
> + })
> Class "GxSimpleList" [in ".GlobalEnv"]
>
> Slots:
>
> Name:         listData elementMetadata     elementType        metadata
> Class:            list             ANY       character            list
>
> Extends:
> Class "SimpleList", directly
> Class "Sequence", by class "SimpleList", distance 2
> Class "Annotated", by class "SimpleList", distance 3
> >
> > # what happens ..
> > lst1 = SimpleList(a=eset1, b=eset2)   # OK
> >
> > lst2 = new("GxSimpleList",a=eset1, b=eset2)   # error (due to missing "initialize" ?)
> Error in initialize(value, ...) : invalid names for slots of class "GxSimpleList": a, b
> > lst3 = GxSimpleList(a=eset1, b=eset2)   # error (due to missing "initialize" ?)
> Error: could not find function "GxSimpleList"
> >
> > # for completeness ...
> > sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252 LC_NUMERIC=C
> [5] LC_TIME=French_France.1252
>
> attached base packages:
> [1] grDevices datasets splines graphics stats tcltk utils methods base
>
> other attached packages:
> [1] affy_1.28.0 IRanges_1.8.0 Biobase_2.10.0 svSocket_0.9-50 TinnR_1.0.3 R2HTML_2.2 Hmisc_3.8-3 survival_2.35-8
>
> loaded via a namespace (and not attached):
> [1] affyio_1.18.0 cluster_1.13.1 grid_2.12.0 lattice_0.19-13 preprocessCore_1.12.0 svMisc_0.9-60
> [7] tools_2.12.0
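The `conformMethod` errors quoted in this message most likely come from spelling the method's first argument `.object` while the `initialize` generic uses `.Object`; with the `elementType` prototype, no custom `initialize` is needed anyway. Below is a minimal sketch under stated assumptions: it uses a current Bioconductor, where `SimpleList`, `DataFrame` and `mcols()` are provided by S4Vectors (in 2010 they lived in IRanges, and the accessor was `elementMetadata`). The constructor taking `...` accepts any number of named ExpressionSets; the dimension check in `setValidity()` and the method names `rma` / `mas5` are additions for illustration, not code from the thread.

```r
library(Biobase)
library(S4Vectors)  # assumption: current home of SimpleList / DataFrame / mcols

setClass("SimpleExpressionSetList", contains = "SimpleList",
         prototype = prototype(elementType = "ExpressionSet"))

# Extra check (illustrative addition): all elements share the same dimensions.
setValidity("SimpleExpressionSetList", function(object) {
    dims <- unique(lapply(object@listData, function(x) unname(dim(x))))
    if (length(dims) > 1L)
        "all ExpressionSet elements must have identical dimensions"
    else TRUE
})

# Constructor: any number of elements, with names chosen by the caller.
SimpleExpressionSetList <- function(...)
    new("SimpleExpressionSetList", listData = list(...))

eset1 <- new("ExpressionSet", exprs = matrix(1, 10, 4))
eset2 <- new("ExpressionSet", exprs = matrix(3, 10, 4))

lst <- SimpleExpressionSetList(rma = eset1, mas5 = eset2)
mcols(lst) <- DataFrame(method = c("RMA", "MAS5"))   # per-element annotation
sel <- lst[mcols(lst)$method == "RMA"]               # select elements by method

# A non-ExpressionSet element is rejected by the inherited elementType check.
bad <- tryCatch(SimpleExpressionSetList(a = eset1, b = matrix(0, 2, 2)),
                error = function(e) e)
```

Because the constructor simply forwards `list(...)` to the `listData` slot, results from normalization methods that do not exist yet can be added later without touching the class definition.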
On 11/24/2010 05:47 AM, Wolfgang RAFFELSBERGER wrote:
> Dear Martin,
>
> thanks again - I've got things working as you explained.
>
> Just to make sure I understood completely: everything is now streamlined for storing the multiple ExpressionSets for the various methods employed (the first slot in my GxSet). The next step is to review how I'm storing the "derived" data (e.g. averages, SEM, ... for each of the methods above). Here I've tried a few things, but as far as I understand, there is no existing class close enough to my case (ideally a "SimpleListList" = list of SimpleLists). So I made a new class containing multiple SimpleList objects (code below):
>
> setClass("GxAvData", representation(avSI="SimpleList", expressed="SimpleList", SEM="SimpleList",
>     FC="SimpleList", FiltFin="SimpleList", FiltSI="SimpleList", FiltOther="SimpleList"))
>
> I've also tried to use the SimpleMatrixList object, since all my (final) data are nothing but matrices, but I didn't get this working. Does this matter much? Or should I rather define a general "SimpleListList" (list of SimpleLists) first, and derive my specific class ("GxAvData") from it?

It seems like your class has a well-defined number of 'SimpleList' slots, so your setClass above seems appropriate. If I define

setClass("SimpleMatrixList", contains="SimpleList",
         prototype=prototype(elementType="matrix"))

SimpleMatrixList <- function(...)
    new("SimpleMatrixList", listData=list(...))

things seem to work?

> mlst <- SimpleMatrixList(a=matrix(0, 5, 5), b=matrix(1, 5, 5))
> mlst[["b"]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    1    1    1    1    1
[3,]    1    1    1    1    1
[4,]    1    1    1    1    1
[5,]    1    1    1    1    1
> mlst <- SimpleMatrixList(c=data.frame())
Error in validObject(.Object) :
  invalid class "SimpleMatrixList" object: the 'listData' slot must be a list containing matrix objects

Martin

> Thanks for all your helpful comments,
>
> Wolfgang
>
> PS: Hope you had a good trip back to the US.
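To tie the pieces of this thread together, the slots of `GxAvData` could be typed as `SimpleMatrixList` rather than plain `SimpleList`. The following is only a sketch under stated assumptions, not the poster's actual class: it keeps two of the seven slots for brevity, loads S4Vectors as the current home of `SimpleList` (IRanges in 2010), and the helper `groupStats()`, the method name `rma` and the group labels are hypothetical.

```r
library(S4Vectors)  # assumption: SimpleList is provided here in current Bioconductor

setClass("SimpleMatrixList", contains = "SimpleList",
         prototype = prototype(elementType = "matrix"))
SimpleMatrixList <- function(...)
    new("SimpleMatrixList", listData = list(...))

# Reduced GxAvData: each slot holds one matrix per normalization method.
setClass("GxAvData",
         representation(avSI = "SimpleMatrixList", SEM = "SimpleMatrixList"))
setValidity("GxAvData", function(object) {
    if (!identical(names(object@avSI), names(object@SEM)))
        "slots 'avSI' and 'SEM' must hold the same methods, in the same order"
    else TRUE
})

# Hypothetical helper: per-group means and standard errors over replicate columns.
groupStats <- function(m, group) {
    cols <- split(seq_len(ncol(m)), group)
    avg <- do.call(cbind, lapply(cols, function(j)
        rowMeans(m[, j, drop = FALSE])))
    sem <- do.call(cbind, lapply(cols, function(j)
        apply(m[, j, drop = FALSE], 1, sd) / sqrt(length(j))))
    list(avg = avg, sem = sem)
}

# Toy data: 2 genes x 4 samples, two replicates per group.
m <- matrix(c(1, 3, 5, 7,
              2, 4, 6, 8), nrow = 2, byrow = TRUE)
st <- groupStats(m, group = c("ctrl", "ctrl", "trt", "trt"))

av <- new("GxAvData",
          avSI = SimpleMatrixList(rma = st$avg),
          SEM  = SimpleMatrixList(rma = st$sem))
av@avSI[["rma"]]   # group means: genes in rows, groups in columns
```

Typing the slots this way means a data.frame or list slipped into `avSI` is caught when the object is built, and the validity method keeps the derived slots aligned by method name.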