Filtering after a contrast
rattigak • 0

I'm new to R and DESeq2, so apologies if I've made silly mistakes. I have three conditions with three replicates each: control, infection and attenuated infection. I ran the analysis with all three factor levels and applied a contrast afterwards. I want to look at what's different between infection and attenuated infection (p.adjust<0.01, lfcThreshold=1), and take this output to make a heatmap that includes all conditions.

The contrast is:

res = results(dds, contrast=c("condition","infection","attenuated infection"))

I can filter on padj:

sig = which(res$padj < 0.05)

I can't figure out how to filter on log fold change after I have used the contrast. I can filter on it before, but then I can't apply the contrast to the output of this:

resBigFC <- results(dds, lfcThreshold=1, altHypothesis="greaterAbs")


deseq2 r filtering bioconductor • 792 views

I'm not sure I understand the confusion, so in general: however you ultimately call the results function (i.e. by specifying an lfcThreshold or otherwise), you will be given a data.frame-like table that has a log2FoldChange column, a pvalue column, and a padj column.

You can subset these "as usual" to get what you're after. If you want to filter on log2FoldChange you can:

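res <- results(dds, ...)
# keep genes with at least a two-fold change in either direction
subset(res, abs(log2FoldChange) >= 1)

# keep genes with an adjusted p-value of at most 0.10
subset(res, padj <= 0.10)

# apply both filters at once
subset(res, abs(log2FoldChange) >= 1 & padj < 0.10)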

or ...
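One thing to watch when filtering: DESeq2 can report padj as NA for some genes (for example, those removed by independent filtering), and the different subsetting idioms treat those rows differently. A small sketch on a made-up table that only mimics the relevant columns of a results() output (the gene names and values are invented for illustration):

```r
# Toy table standing in for a results() output; all values are made up.
res <- data.frame(
  log2FoldChange = c(2.5, -1.8, 0.3, 1.2, -3.0),
  padj           = c(0.001, 0.04, 0.20, NA, 0.008),
  row.names      = paste0("gene", 1:5)
)

# subset() treats an NA condition as FALSE, so gene4 is dropped.
sig_subset <- subset(res, abs(log2FoldChange) >= 1 & padj < 0.10)

# Plain logical indexing keeps an all-NA row where the condition is NA.
sig_bracket <- res[abs(res$log2FoldChange) >= 1 & res$padj < 0.10, ]

# Wrapping the condition in which() discards the NA entries again.
sig_which <- res[which(abs(res$log2FoldChange) >= 1 & res$padj < 0.10), ]
```

So subset() and which() give the same three genes here, while bare bracket indexing returns an extra row of NAs.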



Thanks for that, it's a big help. I wanted to filter on both log fold change and padj for 'Infection' vs 'Attentuated Infection', but run the analysis with all levels so I could compare these DE genes with the control. Most changes are infection-specific, and without this filtering I couldn't see clear patterns caused or suppressed by the live parasite vs the Control.

Sigdif <- subset(res, abs(log2FoldChange) >= 1 & padj < 0.10)

I used this to get a list of genes that were DE between 'Infection' and 'Attentuated Infection', which I merged with the vsd-transformed counts. The resulting matrix was then used for heatmaps.
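For anyone following along, the merge step can be sketched roughly as below. This is only an illustration: vsd_mat is a hypothetical stand-in for the transformed count matrix (what assay(vsd) would return), and sig_genes stands in for rownames(Sigdif); the numbers are random.

```r
set.seed(42)

# Hypothetical stand-in for assay(vsd): 6 genes x 9 samples of transformed counts.
sample_names <- paste0(rep(c("Control", "Infection", "Attenuated"), each = 3), "_", 1:3)
vsd_mat <- matrix(rnorm(54, mean = 8), nrow = 6,
                  dimnames = list(paste0("gene", 1:6), sample_names))

# Hypothetical DE gene IDs, as rownames(Sigdif) would give them.
sig_genes <- c("gene2", "gene5", "gene6")

# Keep only the DE genes, but all nine samples (including Control).
heat_mat <- vsd_mat[sig_genes, , drop = FALSE]

# Centre each gene on its own mean so the heatmap shows relative changes.
heat_mat <- heat_mat - rowMeans(heat_mat)
```

heat_mat can then be passed to a plotting function such as heatmap().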

One more question: I read this on the Bioconductor help page and want to check that I've understood it correctly:

"If there are more than 2 levels for this variable, results will extract the results table for a comparison of the last level over the first level"

If I do the following

colData(ddsHTSeq)$condition<-factor(colData(ddsHTSeq)$condition, levels=c('Infection','Control','Attentuated Infection'))

Are the pvalue, padj and log2FoldChange then calculated by contrasting my 'Attentuated Infection' condition against my 'Infection' condition?

I was confused because I saw this in the vignette for DESeq2:

"By default, R will choose a reference level for factors based on alphabetical order. Then, if you never tell the DESeq2 functions which level you want to compare against (e.g. which level represents the control group), the comparisons will be based on the alphabetical order of the levels. There are two solutions: you can either explicitly tell results which comparison to make using the contrast argument (this will be shown later), or you can explicitly set the factors levels. Setting the factor levels can be done in two ways, either using factor:

dds$condition <- factor(dds$condition, levels=c("untreated","treated"))

...or using relevel, just specifying the reference level:

dds$condition <- relevel(dds$condition, ref="untreated")
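The level ordering itself can be checked in plain R without DESeq2. A small sketch using the level names from this thread (the vector here is just the three names, not real sample data):

```r
condition <- c("Infection", "Control", "Attentuated Infection")

# Default: levels are sorted alphabetically.
f_default <- factor(condition)
levels(f_default)
# "Attentuated Infection" "Control" "Infection"

# relevel() moves the chosen reference to the front and keeps the rest in order.
f_ref <- relevel(f_default, ref = "Control")
levels(f_ref)
# "Control" "Attentuated Infection" "Infection"

# Explicit ordering, as in the colData(ddsHTSeq)$condition call above.
f_explicit <- factor(condition,
                     levels = c("Infection", "Control", "Attentuated Infection"))
first_level <- levels(f_explicit)[1]
last_level  <- tail(levels(f_explicit), 1)
```

Going by the help-page sentence quoted earlier ("last level over the first level"), the explicit ordering would make the default comparison last_level over first_level, i.e. 'Attentuated Infection' over 'Infection'.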


Thanks for your help again



My bad, I just realised that head(res) gives that info

res <- results(dds)
res <- res[order(res$padj),]


Yeah, you can see the contrast that was tested in the result table itself. Although results has a well-defined default (based on the ordering of the levels in the factor) when you call it without the contrast parameter, I'd never feel comfortable relying on it.

I'd much rather specify the precise combination of levels I want to test and pass that into the contrast parameter (and you should, too! ;-)

In your case, to specifically test for 'Infection' vs 'Control', you would:

IvsC <- results(dds, contrast=c('condition', 'Infection', 'Control'))
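The same results call also accepts lfcThreshold and altHypothesis, so the contrast and the fold-change test from the original question can be combined in one step. A minimal sketch on simulated counts, assuming DESeq2 is installed; the simulated dataset and the three level names are placeholders, not the poster's data:

```r
library(DESeq2)

set.seed(1)
# Simulated counts stand in for the real experiment (assumption).
dds <- makeExampleDESeqDataSet(n = 200, m = 9)

# Replace the simulated two-level factor with three hypothetical levels,
# three replicates each; the design stays ~ condition.
dds$condition <- factor(rep(c("Control", "Infection", "Attenuated"), each = 3),
                        levels = c("Control", "Infection", "Attenuated"))
dds <- DESeq(dds)

# Test |log2 fold change| > 1 for Infection vs Attenuated in a single call.
res <- results(dds,
               contrast = c("condition", "Infection", "Attenuated"),
               lfcThreshold = 1,
               altHypothesis = "greaterAbs",
               alpha = 0.01)

# Genes passing the adjusted p-value cutoff for that thresholded test.
sig <- subset(res, padj < 0.01)
```

Note that this tests against the threshold, which is stricter than filtering on log2FoldChange after an ordinary test, so the two approaches will not generally give the same gene list.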

The help page for ?results is quite detailed, so I'd take some (more) time to read through it. You can also construct a contrast that formally tests the average expression of your two "Infection" conditions against Control by passing an appropriately constructed list object into the contrast argument (instead of the character vector shown here).
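To show what such an "average of two conditions vs Control" contrast computes, here is a sketch with an ordinary linear model on made-up values for one gene. It uses lm rather than DESeq2, and the numbers and short level names are invented for illustration:

```r
# Toy expression values for one gene: three replicates per condition (made up).
condition <- factor(rep(c("Control", "Infection", "Attenuated"), each = 3),
                    levels = c("Control", "Infection", "Attenuated"))
y <- c(5.0, 5.2, 4.8,  8.1, 7.9, 8.0,  6.4, 6.6, 6.5)

# With Control as reference, the coefficients are:
# (Intercept), conditionInfection, conditionAttenuated
fit <- lm(y ~ condition)

# Weight each "vs Control" coefficient by 1/2 to average the two infection effects.
w <- c(0, 0.5, 0.5)
avg_vs_control <- sum(w * coef(fit))

# The same quantity computed directly from the group means.
group_means <- tapply(y, condition, mean)
manual <- mean(group_means[c("Infection", "Attenuated")]) - group_means[["Control"]]
```

In DESeq2 the analogous call would be along the lines of results(dds, contrast=list(c(nameA, nameB)), listValues=c(1/2, -1)), where nameA and nameB are placeholders for the two "vs Control" coefficient names reported by resultsNames(dds).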


