I was wondering whether anybody has experienced this issue or can chime in on a very frustrating problem I am having. I suspect I have not specified my groups correctly, because the log fold changes and base means in my results look completely different from what the normalized counts suggest.
The issue is as follows: I already have data that suggests roughly what the fold changes should be, and I know for certain what the counts are in one set of samples versus another (e.g. control vs. tumour).
I have run the raw counts through DESeq2 to generate a differential expression analysis, but the results table is a little confusing and I am worried that I am not specifying the contrast correctly. My code is as follows:
all(rownames(SampledataALL) == colnames(countsALL))  # returns TRUE
dds<-DESeqDataSetFromMatrix(countData = countsALL, colData = SampledataALL, design = ~ Type.tumour_normal+Age+Genotype)
colData(dds) ... All samples line up (colnames of counts match rownames of metadata).
dds$groups <- factor(paste0(dds$Genotype, dds$Model, dds$Age))
design(dds) <- ~ groups
dds <- DESeq(dds, parallel = TRUE)
resultsNames(dds)
I then specify the contrast I would like to see results for:
Res <- results(dds, contrast = c("groups", "WTE16.5", "knockoutE16.5"), independentFiltering = FALSE)
I then get a results table for the (apparently correct) contrast:
log2 fold change (MLE): groups NormalE16.5 vs knockoutE16.5
Wald test p-value: groups NormalE16.5 vs knockoutE16.5
DataFrame with 32075 rows and 6 columns
Now here is the problem: Looking at the expression counts for a gene of interest: ENSMUSG00000014704
plotCounts(dds, gene = "ENSMUSG00000014704", intgroup = c("Age", "Genotype"), normalized = TRUE, returnData = TRUE)
Normal E16.5: 480.821761, 415.469461
Knockout: 7.0, 163.53, 20.4 and 124+366... (written out quickly, not pasted as for normal E16.5)
However, what is really clear is that the average counts for normal are much higher than for the knockout. Now here is the issue: the results table shows the following for this gene:
                   baseMean log2FoldChange     lfcSE    stat    pvalue padj
ENSMUSG00000014704 64.34613       1.297509 0.9962282 1.30242 0.1927723    1
Firstly, the baseMean is much lower than the mean of the counts for this gene across the samples in the contrast of interest (which should be around 230). Is there a reason for this? Secondly, the second column (1.29) is the log2 fold change. Ignoring the significance, may I just confirm that this means expression is higher in the normal samples than in the knockout by a log2 fold change of 1.29, and not the other way around? If it is higher in normal then this makes sense! I just want to be sure that the fold change is always reported with respect to the first group named in the contrast, so that any log fold change is the change in normal compared to knockout.
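To make the two questions concrete, here is the rough sanity check I have in mind, written in base R so it does not depend on my dds object. It is only a sketch and rests on assumptions I would like confirmed: that for contrast = c("groups", A, B) the reported value is log2(A / B), and that baseMean is averaged over every sample in the dataset rather than only the two groups in the contrast. The variable names are made up for illustration, and the knockout vector contains only the three values I typed out unambiguously above, so the numbers will not reproduce the DESeq2 estimate.

```r
# Normalized counts quoted above for ENSMUSG00000014704.
# Assumption: the knockout vector is incomplete (only the values written
# out unambiguously), so these numbers are illustrative, not the real fit.
normal_e16   <- c(480.821761, 415.469461)
knockout_e16 <- c(7.0, 163.53, 20.4)

# Crude log2 ratio of group means, with the first group named in the
# contrast (normal) as numerator and the second (knockout) as denominator.
crude_lfc <- log2(mean(normal_e16) / mean(knockout_e16))

# A positive value means higher in the numerator group.
direction <- if (crude_lfc > 0) "higher in normal" else "higher in knockout"
print(crude_lfc)
print(direction)

# Swapping numerator and denominator only flips the sign.
swapped_lfc <- log2(mean(knockout_e16) / mean(normal_e16))

# The reported 1.297509 is on the log2 scale; this converts it to a
# linear fold change.
linear_fold <- 2^1.297509
print(linear_fold)

# Mean over just these two groups, which is what I was comparing baseMean
# against. If baseMean covers all samples in dds, the two need not agree.
contrast_only_mean <- mean(c(normal_e16, knockout_e16))
print(contrast_only_mean)
```

If the sign convention is as I assume, a positive log2FoldChange in my table would indeed mean higher expression in normal, and the low baseMean could simply reflect low counts in the samples outside this contrast.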
Sorry for the long post but any clarification would be much appreciated!