filtering out lowly expressed genes by groups
1
0
Entering edit mode
sohail.mbio ▴ 10
@335cd804
Last seen 13 months ago
United States

Hello DESeq2 community,

I'm working on a complex RNA-seq experiment with a large design, involving 5 different timepoints, 2 different stimulation conditions, and 2 disease states, resulting in a total of 96 samples. In my initial preprocessing steps, I've filtered out genes with low expression (<100 counts).

However, as I delve into differential expression analysis, I've encountered a situation where I'm observing many differentially expressed genes (DEGs) with adjusted p-values (padj) < 0.1 in certain comparisons. What's intriguing is that most of these genes have zero transcript counts in the majority of samples within those specific comparison groups. These genes were not filtered out initially because they exhibit expression in other groups. For example (plot below), Deseq2 DE show significance in first two groups, even there was only 1 sample with few transcripts. enter image description here

Now my question is how to get rid of these DEGs? As these are not really DEGs.

RNASeqData DESeq2 • 1.0k views
ADD COMMENT
0
Entering edit mode

Is there a question?

ADD REPLY
0
Entering edit mode

My question is how to avoid these DEGs? There are so many DEGs that are not actually expressed at T0. Is there any filter ? my current code is `dds <- DESeqDataSetFromMatrix(countData = cts, colData = coldata, design = ~ Disease + Treatment + Timepoint)

dds <- dds[ rowSums(counts(dds)) > 100, ] dds <- DESeq(dds) dds$group <- factor(paste0(dds$Treatment, dds$Timepoint, dds$Disease)) design(dds) <- ~ group dds <- DESeq(dds)

resultsNames(dds)

res<- results(dds, contrast = c("group","unstimT0CRSwNP", "unstimT0HC"), independentFiltering=TRUE)

`

ADD REPLY
0
Entering edit mode
@james-w-macdonald-5106
Last seen 3 days ago
United States

Something like

results(dds, c("condition","T0_unstim_CRWnP","T0_unstim_HC"))[rowMeans(assays(dds, "counts")[,colData(dds)$condition %in% c("T0_unstim_CRWnP","T0_unstim_HC")] > 10,]

Should do the trick, I imagine. You could even make a function

fun <- function(deseqobj, contvec, cutoff) 
        results(deseqobj, contvec)[rowMeans(assays(deseqobj, "counts")[,colData(dds)[,contvec[1]] %in% contvec[-1])] > cutoff,]

## and run it

fun(dds, c("condition","T0_unstim_CRWnP","T0_unstim_HC"))

None of this is tested, btw, so you might need to adjust.

ADD COMMENT
0
Entering edit mode

You might want to filter first though.

fun <- function(deseqobj, contvec, cutoff) {
            lildds <- dds[rowMeans(assays(deseqobj, "counts")[,colData(dds)[,contvec[1]] %in% contvec[-1])] > cutoff,], 
             results(lildds, contvec)
}

Which won't invalidate your FDR values.

ADD REPLY

Login before adding your answer.

Traffic: 449 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6