Question: Using Ringo
10.8 years ago by
Joern Toedling wrote:
Hello Christoph,

since this issue might be of interest to other users of Ringo as well, I have CCed this to the Bioconductor mailing list. Please find my answers to your remarks below.

> However, at one point I seem to have a misconception about how data should
> be processed with Ringo: Based on your previous e-mail, I concluded that it
> would be best to have separate RGList and ExpressionSet objects for each
> separate analysis, such as for different histone modifications, and also to
> hold promoter array slide 1 (chr to chr10) and slide 2 (chr10 to chrY) in
> separate objects. That is, only biological replicates of the same histone
> modification on the same array slide are packaged into a single RGList and
> ExpressionSet object.

Sorry, apparently my answer and the example in the vignette were slightly misleading here. I do not see why different histone modifications measured on the same array platform should be kept in separate RGLists. I usually keep them in the same RGList, unless there is a very good reason to separate the histone modifications (or TF-ChIP data or other hybridizations), such as a very strong batch effect (huge differences between raw data that were measured on different days and/or in different labs etc.). You should check for such an effect during the quality assessment step on the raw data (boxplots of the raw data distributions etc.).

There is another advantage in combining the histone modifications into a single RGList. VSN and other between-array normalizations aim at making different arrays more consistent and comparable to each other. If you supply only one array per RGList, the arrays are never normalized against each other, so comparisons between the resulting MALists/ExpressionSets will be sub-optimal.

Samples measured on different array platforms, such as the second part of the genome represented by different probes on a separate array, however, should be kept in separate RGLists.
I would then normalize the two RGLists separately and obtain the MALists of both:

    MA1 <- preprocess(RG1[RG1$genes$Status=="Probe", ], returnMAList=TRUE)
    MA2 <- preprocess(RG2[RG2$genes$Status=="Probe", ], returnMAList=TRUE)

and then combine the MALists:

    MA.comb <- rbind(MA1, MA2)             # results in one MAList
    X.comb <- Ringo:::asExprSet(MA.comb)   # results in one ExpressionSet

Then you should also generate one common probeAnno environment out of your two files. The script "makeProbeAnno.R" in the 'scripts' directory of the package contains such an example, too. Since I considerably streamlined that script after the BioC2.0 release, please use the development version of Ringo from
http://www.bioconductor.org/packages/2.1/bioc/html/Ringo.html

> This works perfectly fine for those cases where I have at least two
> biological replicates. In these cases, preprocess() runs without error and
> returns a normalized ExpressionSet. However, for some chromatin
> modifications I have only a single array but no replicates, and for these
> the normalization fails with the following error message:
>
> > MA <- preprocess(RG[RG$genes$Status=="Probe", ]) # normalization
> > excluding any random probes, which are spotted in duplicate and cause
> > trouble
> Normalizing...
> vsn: 385301 x 2 matrix (1 stratum). 100% done.
> Error in colnames<-(*tmp*, value = "1") :
>   attempt to set colnames on object with less than two dimensions
> In addition: Warning messages:
> 1: The function 'vsn' is deprecated, could you please use 'vsn2' instead.
> 2: The exprSet class is deprecated, use ExpressionSet instead
> 3: The exprSet class is deprecated, use ExpressionSet instead
> 4: The exprSet class is deprecated, use ExpressionSet instead
> 5: The exprSet class is deprecated, use ExpressionSet instead
> 6: The exprSet class is deprecated, use ExpressionSet instead
>
> What solution would you suggest? Am I getting something terribly wrong?
No, this is not worrying at all. It is due to R's behavior of coercing matrices with only one column into vectors unless you explicitly stop it from doing so, and limma's "normalizeBetweenArrays" does not. Leaving aside the question of how to normalize when you only have one sample: if you only have one sample in the MAList, you should manually convert the element 'M' of the MAList into a matrix before converting it into an ExpressionSet, like this:

    MA.comb$M <- as.matrix(MA.comb$M)
    X.comb <- Ringo:::asExprSet(MA.comb)   # results in one ExpressionSet

I will add such a line into the "asExprSet" function, too. Thank you for pointing this out.

Regards,
Joern

--
Joern Toedling
EMBL - European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge CB10 1SD
United Kingdom
Phone +44(0)1223 492566
Email toedling at ebi.ac.uk
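
P.S. For readers who want to see the single-column coercion discussed above in isolation, here is a minimal base-R sketch. It does not use Ringo or limma; the toy matrix `M` and the column name "sample1" are made up for illustration and only stand in for the 'M' element of an MAList with one sample.

```r
# Toy stand-in for the 'M' element of an MAList holding a single sample
# (values and the column name "sample1" are invented for this example)
M <- matrix(c(0.5, -1.2, 2.0), ncol = 1, dimnames = list(NULL, "sample1"))

# Default subsetting drops the column dimension and returns a plain vector
v <- M[c(1, 3), ]
is.matrix(v)

# Setting column names on that vector is what triggers the colnames error;
# tryCatch() is used here only so that the script keeps running
failed <- tryCatch({ `colnames<-`(v, "1"); FALSE }, error = function(e) TRUE)

# Option 1: keep the matrix structure while subsetting
kept <- M[c(1, 3), , drop = FALSE]

# Option 2: turn the vector back into a one-column matrix afterwards,
# which is what the as.matrix() workaround above does
restored <- as.matrix(v)
colnames(restored) <- "1"
```

Both options leave you with a two-dimensional object, so the later assignment of column names succeeds.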