Question: Filtering after a contrast
15 months ago, rattigak wrote:

I'm new to R and DESeq2, so apologies if I've made silly mistakes. I have three conditions with three replicates each: control, infection and attenuated infection. I have carried out the analysis using 3 factor levels and used a contrast afterwards. I want to look at what's different between infection and attenuated infection (p.adjust<0.01, lfcThreshold=1), and take this output to make a heatmap including all conditions.

The contrast is:

res = results(dds, contrast=c("condition","infection","attenuated infection"))

I can filter on the adjusted p-value:

sig = which(res$padj < 0.05)

I can't figure out how to filter on log fold change after I have used the contrast. I can do it before, but then I can't use the contrast on the output from this.

resBigFC <- results(dds, lfcThreshold=1, altHypothesis="greaterAbs")


15 months ago, Steve Lianoglou wrote:

I'm not sure I understand the confusion, so in general: however you ultimately call the results function (i.e. by specifying an lfcThreshold or otherwise), you will be given a data.frame-like table that has a log2FoldChange column, a pvalue column, and a padj column.

You can subset these "as usual" to get what you're after. If you want to filter on log2FoldChange you can:

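res <- results(dds, ...)
# keep genes with an absolute log2 fold change of at least 1
subset(res, abs(log2FoldChange) >= 1)

# keep genes with an adjusted p-value of at most 0.10
subset(res, padj <= 0.10)

# apply both filters at once
subset(res, abs(log2FoldChange) >= 1 & padj < 0.10)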

or ...
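
The filters above can be tried out on a made-up table. This is only a sketch in base R: the data frame `toy`, its gene names and its values are hypothetical stand-ins for a real results table. One thing it illustrates is that `subset` drops rows where the condition evaluates to NA, which matters if some genes have an NA padj.

```r
# Hypothetical stand-in for a results table (all values are made up)
toy <- data.frame(
  log2FoldChange = c(2.3, -1.7, 0.4, -1.5, 1.1),
  padj           = c(0.001, 0.03, 0.20, NA, 0.15),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Large effect in either direction, regardless of significance
big_fc <- subset(toy, abs(log2FoldChange) >= 1)

# Both criteria at once; geneD has a large fold change but an NA padj,
# so the condition is NA for that row and subset() drops it
sig_big <- subset(toy, abs(log2FoldChange) >= 1 & padj < 0.10)

rownames(big_fc)
rownames(sig_big)
```

The row names of the filtered table are the gene identifiers, so they can be used directly to pull the same genes out of another matrix.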



Thanks for that, it's a big help. I wanted to filter on both log fold change and adjusted p-value for 'Infection' vs 'Attentuated Infection', but while running all levels so that I could compare these DE genes with the control. Most changes are infection-specific, and without this filtering I couldn't see the nice patterns caused or suppressed by the live parasite vs the control.

Sigdif <- subset(res, abs(log2FoldChange) >= 1 & padj < 0.10)

I used this to get a list of genes that were DE between 'Infection' and 'Attentuated Infection', which I merged with the vsd-transformed counts. The resulting matrix was then used for heatmaps.
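
A rough sketch of that merge-then-heatmap step, using base R only. Everything here is hypothetical: `vsd_mat` is a random matrix standing in for the transformed counts (genes in rows, samples in columns), and `de_genes` stands in for the row names of the filtered results table.

```r
# Hypothetical transformed-count matrix: 6 genes x 9 samples (random values)
set.seed(1)
samples <- paste0(rep(c("Control", "Infection", "Attenuated"), each = 3),
                  "_", 1:3)
vsd_mat <- matrix(rnorm(6 * 9, mean = 8), nrow = 6,
                  dimnames = list(paste0("gene", 1:6), samples))

# Hypothetical DE gene names taken from the filtered contrast
de_genes <- c("gene2", "gene5", "gene6")

# Keep only the DE genes, but across all samples (including controls)
heat_mat <- vsd_mat[rownames(vsd_mat) %in% de_genes, , drop = FALSE]

# Centre each gene so colours show deviation from that gene's own mean
heat_centred <- heat_mat - rowMeans(heat_mat)

# Draw the heatmap; scaling is switched off because rows are already centred
heatmap(heat_centred, scale = "none")
```

Selecting rows by name rather than by position keeps the gene list and the count matrix in sync even if the two tables are sorted differently.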

One more question: I read this on the Bioconductor help page and want to check if I've understood it correctly:

"If there are more than 2 levels for this variable, results will extract the results table for a comparison of the last level over the first level"

If I do the following

colData(ddsHTSeq)$condition<-factor(colData(ddsHTSeq)$condition, levels=c('Infection','Control','Attentuated Infection'))

Are the p-value, adjusted p-value, log fold change... then calculated by contrasting my 'Attentuated Infection' condition vs my 'Infection' condition?

I was confused, as I saw this in the vignette for DESeq2:

"By default, R will choose a reference level for factors based on alphabetical order. Then, if you never tell the DESeq2 functions which level you want to compare against (e.g. which level represents the control group), the comparisons will be based on the alphabetical order of the levels. There are two solutions: you can either explicitly tell results which comparison to make using the contrast argument (this will be shown later), or you can explicitly set the factors levels. Setting the factor levels can be done in two ways, either using factor:

dds$condition <- factor(dds$condition, levels=c("untreated","treated"))

...or using relevel, just specifying the reference level:

dds$condition <- relevel(dds$condition, ref="untreated")
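
How the level order changes can be checked on a plain factor, without any DESeq2 object. This is a small base R sketch; the vector `condition` is hypothetical, with the level names spelled as in the code earlier in this thread.

```r
# Hypothetical sample annotation, level names as used in this thread
condition <- c("Infection", "Control", "Attentuated Infection", "Control")

# Default: levels are sorted alphabetically, so the first (reference)
# level is "Attentuated Infection" and the last is "Infection"
f_default <- factor(condition)
levels(f_default)

# relevel() moves the chosen level to the front and keeps the rest in order
f_ref <- relevel(f_default, ref = "Control")
levels(f_ref)

# Spelling out the full order makes both the first and last level explicit
f_explicit <- factor(condition,
                     levels = c("Control", "Attentuated Infection", "Infection"))
levels(f_explicit)
```

Printing `levels()` before fitting is a cheap way to see which level is first and which is last.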


Thanks for your help again.


written 15 months ago by rattigak

My bad, I just realised that head(res) gives that info:

res <- results(dds)
res <- res[order(res$padj),]

written 15 months ago by rattigak

Yeah, you can see the contrast tested in the result table itself. Although results has a defined behavior (based on the ordering of the levels in the factor) when you omit the contrast parameter in your call, I'd never feel comfortable using it that way.

I'd much rather specify the precise combination of levels I want to test and pass that into the contrast parameter (and you should, too! ;-)

In your case, to specifically test for 'Infection' vs 'Control', you would:

IvsC <- results(dds, contrast=c('condition', 'Infection', 'Control'))

The help page for ?results is quite detailed, so I'd take some (more) time to read through it. You can also construct a contrast that formally tests the average expression among your two "Infection" conditions vs Control, by passing an appropriately constructed list object into the contrast argument (instead of the character vector shown here).


written 15 months ago by Steve Lianoglou