I'm using DESeq2 to compare amplicon counts between two conditions. My input is a raw count table (as required), and because the dataset contains many zeros, I applied a previously suggested workaround:
dds_lettuce <- DESeqDataSetFromMatrix(countData = countData_lettuce, colData = metaData_lettuce, design = ~source, tidy = TRUE)
# deal with many 0's in the dataset:
dds_lettuce <- dds_lettuce[rowSums(counts(dds_lettuce)) > 5, ]
cts <- counts(dds_lettuce)
geoMeans <- apply(cts, 1, function(row) if (all(row == 0)) 0 else exp(mean(log(row[row != 0]))))
dds_lettuce <- estimateSizeFactors(dds_lettuce, geoMeans = geoMeans)
dds_lettuce <- DESeq(dds_lettuce)
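For anyone unfamiliar with the workaround, here is a minimal base-R sketch of what the geoMeans step computes. The toy matrix, the amplicon names, and the helper name geo_mean_nonzero are made up for illustration and are not taken from my data:

```r
# Hypothetical toy counts (rows = amplicons, columns = samples); not real data.
cts_toy <- matrix(
  c(0, 0, 4, 16,
    0, 0, 0,  0,
    2, 8, 2,  8),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ampA", "ampB", "ampC"),
                  c("ctrl1", "ctrl2", "trt1", "trt2"))
)

# Geometric mean over the non-zero entries of a row; an all-zero row gets 0.
# (geo_mean_nonzero is an illustrative name, not a DESeq2 function.)
geo_mean_nonzero <- function(row) {
  if (all(row == 0)) 0 else exp(mean(log(row[row != 0])))
}

geoMeans_toy <- apply(cts_toy, 1, geo_mean_nonzero)
print(geoMeans_toy)  # approximately 8, 0 and 4 for ampA, ampB and ampC

# For comparison: a plain geometric mean collapses to 0 as soon as one
# sample has a zero count, because log(0) is -Inf.
plain_ampA <- exp(mean(log(cts_toy["ampA", ])))
print(plain_ampA)
```

As far as I understand it, the point of the workaround is that an amplicon with a zero in any sample would otherwise have a geometric mean of 0 and be ignored when the size factors are estimated.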
However, for some of the genes I'm getting very high log2FoldChange values (>20), along with low pvalue and padj values. What could be causing this? Looking at the raw counts for these genes clearly shows that they are present only in the treatment group (an average of 4500 vs. 0 in the control).
Thank you, Barak