ID <- getMainProbes(OligoEset)
annot <- select(mogene10sttranscriptcluster.db, featureNames(ID),
c("SYMBOL","GENENAME","ENTREZID")) # 36631
I am stuck trying to merge "annot" with "OligoEset". I would like to have the annotated, normalized data set in a data frame or a .txt/.xls file to analyze.
You actually don't want the annotated and normalized data in any of those forms. If you are going to use Bioconductor for the analysis, then you need to learn to use the tools that are supplied.
The ExpressionSet containing your data is a perfect input to, say, the limma package. So you now need to define what comparisons you want to make, and express them as a design matrix. See the limma user's guide.
What you would tend to do is something like
design <- model.matrix(~<args go here>)
fit <- lmFit(data.oligo, design)
fit2 <- eBayes(fit)
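To make the design matrix step concrete, here is a minimal sketch using only base R. The group labels and the two-versus-two layout are hypothetical, not taken from your experiment; substitute a factor that describes your own arrays, in the same order as the columns of your ExpressionSet.

```r
# Hypothetical layout: two control and two treated arrays, in column order.
group <- factor(c("control", "control", "treated", "treated"),
                levels = c("control", "treated"))

# One row per array. The intercept column estimates the control mean,
# and the second column estimates treated minus control.
design <- model.matrix(~ group)
print(design)
```

A matrix built this way is what you would hand to lmFit() as the design argument.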
You will have duplicates in your annot data.frame, so you have to deal with that. The most naive thing you could do is choose the first one:
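As a sketch of that naive approach, the toy data.frame below stands in for what select() returns. The probe IDs and gene symbols are made up for illustration, and I am assuming the probeset IDs sit in a column called PROBEID.

```r
# Toy stand-in for the annotation table; "probe2" maps to two genes.
annot <- data.frame(
  PROBEID = c("probe1", "probe2", "probe2", "probe3"),
  SYMBOL  = c("GeneA", "GeneB", "GeneB2", NA),
  stringsAsFactors = FALSE
)

# Keep only the first row seen for each probeset ID.
annot_first <- annot[!duplicated(annot$PROBEID), ]
print(annot_first)
```

Keeping the first match is arbitrary, so any probeset that maps to several genes silently loses the others.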
but assuming I want a collated file so that non-R users can read and understand it, I'm thinking of a file that contains the normalized data, such that a layman can compare values and make their own analysis.
Sure. If you are already using affycoretools, see ?writeFit.
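If you would rather do it by hand, a rough base R sketch follows. The expression matrix and annotation here are toy objects with invented values and names; in practice the matrix would come from exprs() on your ExpressionSet, and the annotation would already be reduced to one row per probeset.

```r
# Toy normalized expression matrix: rows are probesets, columns are arrays.
expr <- matrix(c(7.1, 8.3, 5.9, 7.4, 8.0, 6.2),
               nrow = 3,
               dimnames = list(c("probe1", "probe2", "probe3"),
                               c("array1", "array2")))

# Toy annotation, one row per probeset, deliberately in a different order.
annot <- data.frame(PROBEID = c("probe3", "probe1", "probe2"),
                    SYMBOL  = c("GeneC", "GeneA", "GeneB"),
                    stringsAsFactors = FALSE)

# Reorder the annotation to follow the rows of the expression matrix,
# and stop if the two do not line up exactly.
annot <- annot[match(rownames(expr), annot$PROBEID), ]
stopifnot(identical(annot$PROBEID, rownames(expr)))

# Bind annotation and expression values side by side, then write a
# tab-delimited text file (written to a temporary directory here).
collated <- cbind(annot, expr)
out_file <- file.path(tempdir(), "normalized_annotated.txt")
write.table(collated, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

The match() step matters: binding the two objects without checking the row order would attach the wrong gene to each probeset.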
I am not, in general, enthusiastic about giving normalized data to 'laymen' so they can make their own analyses. In other words, generating summarized data from raw CEL files is not usually the part of the analysis that requires the most sophistication (although the QC part does take some base knowledge). Instead, fitting models to the data and ensuring that statistically unsophisticated collaborators understand what was done and why is the main deliverable in my line of work.
Because of that, I much prefer giving people either HTML or Excel spreadsheets that already contain the comparisons they wanted. The ReportingTools package makes it very easy to generate HTML tables that are simple to work with. The openxlsx package lets you output Excel spreadsheets directly, which circumvents Excel's tendency to convert gene symbols that look like dates into actual dates when people import data incorrectly (as an example, SEPT1 is helpfully converted to 9/1/2015, because obviously).
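For the Excel route, something along these lines should work. This is only a sketch: the results table is a toy with invented values, the file goes to a temporary directory, and the code is skipped if openxlsx is not installed.

```r
# Toy results table; SEPT1 and MARCH1 are the kind of symbol Excel may
# turn into a date when a text file is imported carelessly.
results <- data.frame(SYMBOL = c("SEPT1", "MARCH1", "GeneA"),
                      logFC  = c(1.2, -0.8, 0.3),
                      stringsAsFactors = FALSE)

out_file <- file.path(tempdir(), "results.xlsx")

if (requireNamespace("openxlsx", quietly = TRUE)) {
  # Character columns are written as text cells, so the symbols stay as typed.
  openxlsx::write.xlsx(results, out_file, overwrite = TRUE)

  # Read the file back to confirm the symbols survived the round trip.
  back <- openxlsx::read.xlsx(out_file)
  print(back$SYMBOL)
}
```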
Thanks James,
but assuming I want a collated file so that non-R users can read and understand it, I'm thinking of a file that contains the normalized data, such that a layman can compare values and make their own analysis.
Is there a way I can get this done?