Hi, this is probably a really stupid question, but I'm trying to create an ExpressionSet object from some NanoString assay results. I keep running into an error when adding featureData for the genes. The error message is:
Error in validObject(.Object) : invalid class “ExpressionSet” object: featureNames differ between assayData and featureData
I've successfully added the phenoData and assayData to the object, and this is what the object currently looks like:
ExpressionSet (storageMode: lockedEnvironment)
assayData: 760 features, 24 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: Cx-F-SCI-2 Cx-F-SCI-3 ... Cx-M-Sham-24 (24 total)
  varLabels: Sample Group Sex Surgery
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
The assayData has 760 genes and 24 samples (matrix named "counts"). I also have a gene annotation file (turned into an AnnotatedDataFrame) with details on the 760 genes being tested (named as gene.details). It seems to me that the rownames match up perfectly. But when I try to troubleshoot the issue with the command
```
all(rownames(counts) == rownames(gene.details))
```
the result is FALSE.
But if I use the command:
```
setequal.vector(rownames(counts), rownames(gene.details))
```
the result is TRUE. Why do the two commands give completely different results? How can I get the two to match up?
My first question would be, am I using the right package here? I saw online that there is another package called SummarizedExperiment, but it seems to be mostly used for RNA-seq analysis. The second question would be, what am I missing here with matching the featureData to the featureNames? I'm still quite a noob with data analysis in RStudio. Send me a message if there's anything that I left out.
Would it be a bad idea to use ExpressionSet for NanoString data? Still, the thing nags me to no end. Also, I'm trying to run a PLS-DA on the data as part of the downstream analysis, for better differentiation between groups.
Normally I would use `edgeR` for NanoString data, in which case I would make a `DGEList`. Alternatively, as I already mentioned, you could use a `RangedSummarizedExperiment` with `DESeq2`. But in the latter case you don't make the `RangedSummarizedExperiment` by hand, instead using the function `DESeqDataSetFromMatrix`, which handles all the finicky details for you.
But you seem fixated on using an `ExpressionSet`, so do note that you already have the answer. Your `featureData` object doesn't match your `assayData`, and when you test for equality you get the same answer: the element-wise comparison returns FALSE while the set comparison returns TRUE, which means both contain the same names but in a different order. The only recourse is to ensure they are identical (same names, same order), which seems pretty clear?
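To make the difference between the two checks concrete, here is a minimal base-R sketch. It assumes the row names are the same set in a different order, which is what the FALSE/TRUE pair suggests. The toy `counts` matrix and `gene.details` data frame below are made-up stand-ins, not the real objects, and `setequal` from base R is used in place of `setequal.vector`.

```r
# Toy stand-ins for the objects in the question (names and values are made up).
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(c("GeneA", "GeneB", "GeneC"), c("S1", "S2")))
gene.details <- data.frame(Class = c("x", "y", "z"),
                           row.names = c("GeneC", "GeneA", "GeneB"))

# Element-wise comparison is order-sensitive, so it fails here ...
elementwise_before <- all(rownames(counts) == rownames(gene.details))  # FALSE
# ... while a set comparison ignores order.
same_set <- setequal(rownames(counts), rownames(gene.details))         # TRUE

# Reorder the annotation rows so they follow the count matrix.
idx <- match(rownames(counts), rownames(gene.details))
stopifnot(!anyNA(idx))  # every gene in counts must have an annotation row
gene.details <- gene.details[idx, , drop = FALSE]

# Now the names agree in both content and order.
identical_after <- identical(rownames(counts), rownames(gene.details))  # TRUE
```

With a real `AnnotatedDataFrame`, the same reordering can most likely be done by indexing with the row names, e.g. `gene.details[rownames(counts), ]`, before passing it as `featureData`; treat that as an assumption to check against your own objects rather than a guaranteed fix.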