DESeq2: dealing with outliers
nikmehr22 • 0
@nikmehr22-13526

Dear DESeq2 Experts,

I found a miRNA that is significantly differentially expressed between two groups.

Below is the output from DESeq2:

ID     baseMean          log2FoldChange     lfcSE              stat               pvalue                padj
miRNA  100.985551384495  -0.57460722765581  0.083369216154338  -6.89231894170701  5.48901333343412e-12  3.67763893340086e-09

However, when I look at the figure generated by plotCounts(), the difference seems to be driven mainly by an outlier in the first group.

Here is the link to the figure:

https://figshare.com/articles/t1_png/5219959

My question is: why didn't DESeq2 replace the outlier?

Which functions or arguments should I use to prevent such findings?

Thanks for your time,

Nikmehr

@mikelove

Hi Nikmehr,

It's possible that the count was indeed replaced. By default, plotCounts() shows the original counts; this can be changed with its replaced argument (replaced=TRUE).

What are the trimmed means of the normalized counts in the two groups? E.g.:

# with returnData=TRUE, plotCounts() returns a data frame whose counts column is named "count"
dat <- plotCounts(dds, gene, "group", returnData=TRUE)
mean(dat$count[dat$group == "first"], trim=.05)
mean(dat$count[dat$group == "second"], trim=.05)
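
To see why comparing trimmed means is informative here, the following is a minimal base-R sketch using made-up normalized counts. The vectors first and second are hypothetical and are not data from this thread. It only illustrates how a single extreme value can dominate the ordinary mean while barely affecting the trimmed mean.

```r
# Hypothetical normalized counts for one gene (made-up values, for illustration only)
first  <- c(35, 40, 38, 42, 36, 39, 41, 37, 5000)  # one extreme value
second <- c(48, 50, 47, 52, 49, 51, 46, 53, 50)

# Ordinary means: the single extreme value dominates the first group
plain_first  <- mean(first)
plain_second <- mean(second)

# Trimmed means: with 9 values, trim = 0.15 drops floor(9 * 0.15) = 1 value
# from each end before averaging, so the extreme value is excluded
trimmed_first  <- mean(first,  trim = 0.15)
trimmed_second <- mean(second, trim = 0.15)

print(c(plain_first = plain_first, plain_second = plain_second))
print(c(trimmed_first = trimmed_first, trimmed_second = trimmed_second))
```

In this toy case the ordinary mean suggests the first group is higher, while the trimmed means show the opposite direction, which is the kind of discrepancy the trimmed-mean check is meant to reveal.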

Perhaps plotCounts should plot replaced counts using a different color or shape to make it more obvious which counts have been replaced.

That's a good idea.
nikmehr22 • 0
@nikmehr22-13526

Hi Michael,

Thanks for the update. You are right: the difference between the trimmed means is significant.

Here are the statistics:

Group  First   Second
Mean   37.80   48.02
SD     20.90   43.23
SEM    1.60    3.33
N      171     169

Which argument should I specify so that plotCounts() shows the outlier-replaced normalized counts?

I added replaced=TRUE, but it does not seem to work: the generated plot still shows the outlier.

> dat <- plotCounts(dds, gene, "group", replaced=TRUE)
Warning message:
In .local(object, ...) :
  there are no assays named 'replaceCounts', using original.
calling DESeq() will replace outliers if they are detected and store this assay.

 

That point is not necessarily an outlier as far as DESeq2 is concerned, in that its removal may not affect the log2 fold change at all.
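
For context, DESeq2 flags outliers using Cook's distance, which measures how much a single sample influences the fitted coefficients. As documented for results(), the default cutoff is the 0.99 quantile of the F(p, m - p) distribution, where p is the number of model coefficients and m the number of samples. The sketch below computes that cutoff for the sample sizes reported in this thread, assuming a simple design with an intercept and one group effect (p = 2); that design is an assumption, since the thread does not state it. The commented lines show how the per-sample Cook's distances could be inspected, assuming dds has been through DESeq() and gene is the row name of interest.

```r
# Assumed design: ~ group, i.e. intercept + one group coefficient (not stated in the thread)
p <- 2
# Sample sizes reported in the thread: 171 + 169
m <- 171 + 169

# Default Cook's distance cutoff: 0.99 quantile of F(p, m - p)
cooks_cutoff <- qf(0.99, df1 = p, df2 = m - p)
print(cooks_cutoff)

# With DESeq2 loaded and a fitted `dds`, Cook's distances are stored as an assay:
# cooks <- assays(dds)[["cooks"]]
# which(cooks[gene, ] > cooks_cutoff)   # samples flagged for this gene, if any
```

If no sample exceeds the cutoff for the gene, no count is replaced, which is consistent with the warning above that no replaceCounts assay exists.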

I understand, but a plot is required, and that point is perplexing for some people.

nikmehr22 • 0
@nikmehr22-13526

I think another option is to extract the normalized counts from DESeq2, winsorize them, and then use the winsorized values for further exploratory analysis or plots. Please let me know if this does not make sense.
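
A minimal sketch of that idea in base R is shown below. The helper winsorize() and the example vector x are hypothetical names and made-up values, and the 5%/95% limits are an arbitrary choice for illustration, not a recommendation from this thread. The commented lines indicate how it might be applied to normalized counts, assuming dds is a DESeqDataSet with size factors already estimated.

```r
# Hypothetical helper: clamp values below/above the given quantiles to those quantiles
winsorize <- function(x, probs = c(0.05, 0.95)) {
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

# Made-up normalized counts for one gene: 40 ordinary values plus one extreme value
x <- c(30:69, 5000)
w <- winsorize(x)

print(range(x))  # original range, including the extreme value
print(range(w))  # range after winsorizing

# With DESeq2, the same helper could be applied row by row (for plotting only):
# norm <- counts(dds, normalized = TRUE)
# norm_w <- t(apply(norm, 1, winsorize))
```

Note that the winsorized values would only be used for exploration and plotting here; the DESeq2 test itself is still run on the original counts.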

 
