Hi
I have an old array that is not annotated within the Bioc annotation packages. I tried the affy and oligo packages, but I am running into issues applying a non-standard (invariantset) normalization. We have good biological reasons to believe that total transcriptional counts should vary between samples, so I want to compare rma to other normalization approaches.
affy::expresso with invariantset normalization falls over:
expresso(abatch,bgcorrect.method="none",normalize.method="invariantset",pmcorrect.method="pmonly",summary.method="liwong")
background correction: none
normalization: invariantset
PM/MM correction : pmonly
expression values: liwong
background correcting...done.
normalizing...
Error in smooth.spline(ref[i.set], data[i.set]) :
missing or infinite values in inputs are not allowed
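The message says smooth.spline received NA or infinite inputs, so one thing worth checking is whether the probe intensities already contain such values before normalization. The following is only a minimal base-R sketch on a toy matrix; `pm_mat` and its values are hypothetical stand-ins for a probes-by-samples intensity matrix, and it does not establish that this is the cause of the failure above.

```r
# Toy stand-in for a probes x samples intensity matrix (hypothetical values).
pm_mat <- matrix(c(120, 95, NA, 310,
                   130, 64, 88, Inf),
                 nrow = 4,
                 dimnames = list(paste0("probe", 1:4), c("s1", "s2")))

# Count non-finite entries (NA, NaN, Inf, -Inf) in each sample.
n_bad <- colSums(!is.finite(pm_mat))

# Names of the probes that have at least one non-finite value.
bad_probes <- rownames(pm_mat)[rowSums(!is.finite(pm_mat)) > 0]

print(n_bad)
print(bad_probes)
```

If every count is zero on the real data, the non-finite values are presumably produced inside the normalization step rather than in the input.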
I know the affy package is older, so I tried oligo. I built my own custom pd package. Both my own package and the brainarray MBNI annotation fail. Both of these have class GenericPDInfo, and the vignette and man pages do not address this case in much detail.
After searching this message board, I used target="mps1" (not mentioned in the package documentation). Both oligo::rma and oligo::fitProbeLevelModel return a result, but I want to compare rma with non-standard approaches. However, oligo::summarize will not accept a "GenericFeatureSet", only a matrix or ff_matrix.
eSetTest <- oligo::summarize(eNormTest, method="medianpolish", verbose=TRUE)
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function 'summarize' for signature '"GenericFeatureSet"'
The following works:
eNormTest <- oligo::normalize(obatch_bgcorrected, method="invariantset", target="mps1")
eSetTest <- oligo::summarize(exprs(eNormTest),method="medianpolish", verbose=TRUE)
However, exprs(eNormTest) of course drops the probe-level information, so summarize no longer knows which probes belong to which probeset. I therefore need to supply the probe-to-probeset mapping from the pd package (MBNI annotation).
I can start reading the code of the pd package and AnnotationDbi, but I am spending more time than I wanted on something that should be very simple. Maybe I am missing a really easy workaround and am wasting time needlessly.
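To make concrete what a probeset-aware summarization needs, here is a rough base-R sketch of median-polish summarization driven by an explicit probe-to-probeset mapping. It uses stats::medpolish on toy data and is not oligo's implementation; `probe_mat`, `probe2set` and `summarize_medpolish` are hypothetical names introduced only for illustration.

```r
# Toy normalized probe-level intensities: 6 probes x 3 samples (hypothetical).
set.seed(1)
probe_mat <- matrix(2^runif(18, min = 6, max = 12), nrow = 6,
                    dimnames = list(paste0("p", 1:6), paste0("s", 1:3)))

# Hypothetical probe -> probeset mapping; with real data this would have to
# come from the annotation (pd) package.
probe2set <- c(p1 = "setA", p2 = "setA", p3 = "setA",
               p4 = "setB", p5 = "setB", p6 = "setB")

summarize_medpolish <- function(mat, mapping) {
  # Group probe names by the probeset they map to.
  groups <- split(rownames(mat), mapping[rownames(mat)])
  res <- vapply(groups, function(ids) {
    # Median polish on the log2 scale; the probeset value for each sample
    # is the overall effect plus that sample's column effect.
    fit <- medpolish(log2(mat[ids, , drop = FALSE]), trace.iter = FALSE)
    unname(fit$overall + fit$col)
  }, numeric(ncol(mat)))
  res <- t(res)                    # probesets in rows, samples in columns
  colnames(res) <- colnames(mat)
  res
}

probeset_mat <- summarize_medpolish(probe_mat, probe2set)
print(probeset_mat)
```

The result has one row per probeset and one column per sample, which is the shape an expression matrix would need for downstream comparison of normalization methods.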
Thanks
Aedin
There isn't at present much help. The GenericArray infrastructure is supposed to be a generalization of the code to allow things like the MBNI arrays (and any random future Array that Affy might foist upon us) to automatically be accommodated.
The old school way of doing things was to take all of Affy's library files and then generate tables in the DB that were named according to the data from Affy. As an example:
Where all those xxx_mps tables map the probes to probesets, depending on the summarization level you wanted to use. But this requires each array type to have its own methods, so we have a profusion of methods for e.g., rma:
The tables in a GenericPDInfoPackage are generic:
and if there are multiple summarization levels, you can simply add more featureSetK/mpsKpm/mpsKmm triplets to define the probe -> probeset mappings. So this stops the profusion of array types and methods, but it adds the inevitable question of 'so how are the mps1 and mps2 targets different for this array?', which will obviously have a profusion of its own, perhaps even worse?
The function getMPSInfo is intended to do a join between the featureSetK and mpsKpm tables (in your case featureSet1 and mps1pm) and return a data.frame that maps the fid (or what I conventionally call the probe ID) to the fsetid and man_fsetid, which are the probeset level IDs, in order to do the summarization.
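As a rough illustration of the kind of join described here, the sketch below uses base R's merge on two toy data frames. The column layout (fsetid and man_fsetid in featureSet1, fid and fsetid in mps1pm) is an assumption based on the description above, the values are made up, and this is not the actual getMPSInfo code.

```r
# Toy versions of the two tables; column layout and values are assumptions.
featureSet1 <- data.frame(fsetid = c(1L, 2L),
                          man_fsetid = c("setA", "setB"),
                          stringsAsFactors = FALSE)
mps1pm <- data.frame(fid = c(11L, 12L, 13L, 21L, 22L),
                     fsetid = c(1L, 1L, 1L, 2L, 2L))

# Join on fsetid so each probe (fid) carries its probeset-level IDs.
mps_info <- merge(mps1pm, featureSet1, by = "fsetid")
mps_info <- mps_info[order(mps_info$fid), c("fid", "fsetid", "man_fsetid")]
rownames(mps_info) <- NULL

print(mps_info)
```

Each row of the result pairs one probe with its probeset, which is the mapping a summarization step would group by.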
And the mps0 target is sort of a joke, I imagine. Or maybe it's just a placeholder that has no inherent meaning as yet. The target argument isn't used, so you could put whatever you want and still get the right thing.
I don't understand your question. Are you saying that these lines:
run regardless of normalize being TRUE or FALSE? That doesn't seem likely - a bug of that magnitude (e.g., R totally ignoring that if statement) would have been caught by now.
Or do I misunderstand your question?