DEseq2 comparison order
1
0
Entering edit mode
blur ▴ 10
@acd4e23e
Last seen 3.8 years ago

I have a stupid question I have two conditions that I want to compare - control and treatment

I enter them like this into the package:

sampleFiles <-c("treatment1","treatment2","treatment3","control1","control2","control3")
sampleCondition <-c("treatment","treatment","treatment","control","control","control")

But now I don't know what the comparison does - the first condition vs. the second? I mean if I get log2FoldChange = 3 does it mean that the treatment generates more expression than the control for this gene?

I know this is silly, but I want to be absolutely sure... Thanks

DESeq2 • 7.8k views
ADD COMMENT
1
Entering edit mode
Kevin Blighe ★ 4.0k
@kevin
Last seen 28 days ago
Republic of Ireland

Hey,

Most importantly, you have not shown your DESeq2 code; however, I will answer as best as I can.

If, in your metadata that is supplied to DESeq2, 'control' is set as the reference level in your 'sampleCondition' column, then DESeq2 will use control as the denominator for determining fold-changes. Then, you'll see a situation like this: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#quick-start

In this case, a log2 value of +3 would indicate that the gene in question is more highly expressed (8 times on the linear scale) in 'condition' (or 'treated', for you) when compared to 'untreated' ('control' for you).

You can also manually set the order of comparison like this: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#differential-expression-analysis, i.e..:

res <- results(dds, contrast=c("condition","treated","untreated"))

Here, the numerator is 'treated'; while the denominator is 'untreated'.



If in further doubt, just take a quick look via a box-and-whiskers:

boxplot(assay(vst)['VCAM1',] ~ colData(dds)[,'dex'])

h

Let's check the results table and do 'trt' versus 'untrt':

res <- results(dds,
 contrast = c('dex','trt','untrt'))
res <- lfcShrink(dds,
  contrast = c('dex','trt','untrt'), res=res, type = 'normal')

res['VCAM1',]
log2 fold change (MAP): dex trt vs untrt 
Wald test p-value: dex trt vs untrt 
DataFrame with 1 row and 6 columns
       baseMean log2FoldChange     lfcSE      stat      pvalue        padj
      <numeric>      <numeric> <numeric> <numeric>   <numeric>   <numeric>
VCAM1   509.995       -3.45198  0.177923   -19.307 4.69235e-83 4.72958e-80

Looks good.

Kevin

ADD COMMENT
0
Entering edit mode

Dear Kevin, Thank you for your help. I was using this to determine the baseline:

dds$condition <- relevel(dds$condition, ref = "untreated")

I have tried also:

 dds$condition <- factor(dds$condition, levels = c("untreated","treated"))

but it did not seem to work (might be my lack of experience, though)

I will use your suggestion to check using box-and-whiskers make sure

Here is my full code:

directory <-"RNA_seq"
sampleFiles <-c("treat1","treat2","treat3","control1","control2","control3")
sampleCondition <- c("trt","trt","trt","cntrl","cntrl","cntrl")
sampleTable <- data.frame(sampleName = sampleFiles,fileName = sampleFiles,condition = sampleCondition)
sampleTable$condition <- factor(sampleTable$condition)
library("DESeq2")
ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable,directory = directory, design= ~ condition)
ddsHTSeq
dds <- DESeq(ddsHTSeq)
dds$condition <-relevel(dds$condition, ref="cntrl")
res <- results(dds)
res
resOrdered <- res[order(res$pvalue),]
resSig <- subset(resOrdered, padj < 0.1)
resSig
summary(resSig)
par(mfrow=c(1,1))
with(resSig, plot(log2FoldChange, -log10(pvalue), pch=20, main="Volcano plot", xlim=c(-3,3)))
with(subset(res, padj<.01 ), points(log2FoldChange, -log10(pvalue), pch=20, col="blue"))
with(subset(res, padj<.01 & abs(log2FoldChange)>2), points(log2FoldChange, -log10(pvalue), pch=20, col="red"))
ADD REPLY
1
Entering edit mode

Hi, I think that you need to change this line:

sampleTable$condition <- factor(sampleTable$condition)

...to:

sampleTable$condition <- factor(sampleTable$condition, levels = c('cntrl','trt'))


Or, you could change this line:

dds$condition <- relevel(dds$condition, ref="cntrl")

...to:

colData(dds)[,'condition'] <- relevel(colData(dds)[,'condition'], ref = 'cntrl')


Another problem may be where you are running the results() function. You seem to be running it 'blindly'. Can you take a look at the example in the vignette, and also the extra related function lfcShrink()? - see http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#quick-start



If still in doubt, after you have run all of the code above, what is the output of:

str(colData(dds))

?

ADD REPLY
0
Entering edit mode

Hi,

Thanks for the reply!

I changed the line:

sampleTable$condition <- factor(sampleTable$condition, levels = c('cntrl','trt'))

I don't seem to have lfcShrink function in my R version...

Also, here is what I get when I run str(colData(dds))

> str(colData(dds))
Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  ..@ rownames       : chr [1:6] "trt" "trt" "trt" "control" ...
  ..@ nrows          : int 6
  ..@ listData       :List of 2
  .. ..$ condition : Factor w/ 2 levels "cntrl","trt": 2 2 2 1 1 1
  .. ..$ sizeFactor: Named num [1:6] 1.914 1.379 0.751 1.37 0.952 ...
  .. .. ..- attr(*, "names")= chr [1:6] "trt" "trt" "trt" "control" ...
  ..@ elementType    : chr "ANY"
  ..@ elementMetadata:Formal class 'DataFrame' [package "S4Vectors"] with 6 slots
  .. .. ..@ rownames       : NULL
  .. .. ..@ nrows          : int 2
  .. .. ..@ listData       :List of 2
  .. .. .. ..$ type       : chr [1:2] "input" "intermediate"
  .. .. .. ..$ description: chr [1:2] "" "a scaling factor for columns"
  .. .. ..@ elementType    : chr "ANY"
  .. .. ..@ elementMetadata: NULL
  .. .. ..@ metadata       : list()
  ..@ metadata       : list()
ADD REPLY
1
Entering edit mode

You must be using an old version of DESeq2 if lfcShrink() is not available. Why would you be using an old version? Which R version are you using? If you run sessioninfo(), you should see this information.

This following line indicates that your variable is already ordered correctly, i.e., with 'cntrl' as the reference level:

  .. ..$ condition : Factor w/ 2 levels "cntrl","trt": 2 2 2 1 1 1

So, there should be no problem.

ADD REPLY
1
Entering edit mode

Thank you so much! I will update my DESeq2 :)

ADD REPLY

Login before adding your answer.

Traffic: 836 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6