(Sorry if this information is provided elsewhere; I have looked high and low with no success.)
I have an RNA-seq data set of 22 samples that can be compared to one another in a number of different combinations. Some of the samples are essentially treatment subsets of others. I have used DESeq2 on the countdata (along with the associated colData) to generate rld, dds and res objects. Some samples have biological duplicates, others quadruplicates.
I can generate the DE results for various sample combinations using contrasts, but what I also want is a list of genes that are 'expressed' in one set of samples but (for example) not in another. I realise that what counts as 'expressed' depends on how it is defined and what cutoff is used, but my questions are:
1) Is it valid to use the rlog-transformed matrix (assay(rld)) for this purpose? I am concerned that this data is subject to normalization based on the experimental design input, and so would differ with different designs.
If not, which package would be recommended to obtain such data from the count matrix?
2) If yes to the above: are there any error estimates or p-values associated with this data? (There appears to be such data in the DESeqDataSet, but I can't work out how to get at it.) The intention is to use it as a cutoff, to exclude expression values (lfc) for which the error is too high.
3) Is it valid to use the mean of replicates to obtain a value for a particular cell type?
4) I have tried to use the rlog function on the countdata matrix, which the manual suggests you can, but I get an error:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘sizeFactors’ for signature ‘"data.frame"’