Question

Protein differential Expression analysis

cardin.julie • 0

@cardinjulie-12735

I have had very good results with DESeq2 for my RNA-seq analysis. As far as I understand, it is a tool that normalises count data from sequencing so that samples are comparable.

I have a new project involving protein counts.

I have a couple of data sets. For each sample we have:

rows with protein names (instead of genes), with their respective counts.

My goal is again to test for differential expression between treated groups and controls.

I wonder if I can use DESeq2 to do differential expression analysis for proteins?

Or is the correction factor that DESeq2 uses to normalise RNA-seq counts specific to DNA sequencing, and so not applicable to proteins?

Is there a tool that does the same thing as DESeq2 but for proteins?

differential expression proteomics • 2.9k views

score 2 · Answer 1 · 2018-01-07

Julie

There is no "in-principle" reason why DESeq2 shouldn't also produce useful results for count data from technologies that are not based on DNA sequencing. There are two issues to check:

the normalization (a.k.a. size factors)
the error model (Gamma-Poisson, GP)

Both of these are quite generic, and whether they are appropriate for your data depends on the particular dataset rather than on the technology that produced it.
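
To make the normalization point concrete, here is a minimal sketch of median-of-ratios size factors, the general idea behind DESeq2's `estimateSizeFactors`. It is plain Python for illustration only, not DESeq2's own code; the function name `size_factors` and the toy counts are made up. Nothing in it refers to sequencing: it only assumes that most proteins do not change between samples.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per protein), each a list of per-sample counts.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    # Keep only proteins with nonzero counts in every sample, so logs are defined.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no protein has nonzero counts in all samples")
    # Log geometric mean per protein across samples (a pseudo-reference sample).
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of sample j to the pseudo-reference, per protein.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy data (hypothetical): sample 2 is sample 1 measured at twice the depth.
toy_counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 3],  # contains a zero, so it is excluded from the estimate
]
toy_factors = size_factors(toy_counts)
```

Dividing each sample's counts by its factor puts the samples on a common scale. Whether that is adequate for a given protein dataset is exactly what the MA plots are meant to show.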

Regarding the normalization, can you show us MA plots between replicates (and also, between different conditions)? Also include the line of M=0.
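
For reference, the quantities on an MA plot can be computed as in the sketch below. This is an illustration under assumptions: the helper `ma_values`, the pseudocount of 1 and the toy replicate counts are all made up. M is the log2 ratio between two samples and A is their mean log2 intensity.

```python
import math
from statistics import median

def ma_values(x, y, pseudocount=1.0):
    """Return a list of (A, M) pairs for two count vectors over the same proteins.

    M = log2(y) - log2(x)        (log ratio, plotted on the vertical axis)
    A = (log2(x) + log2(y)) / 2  (mean log intensity, horizontal axis)
    A pseudocount is added so that zero counts do not break the logarithm.
    """
    pairs = []
    for a, b in zip(x, y):
        la = math.log2(a + pseudocount)
        lb = math.log2(b + pseudocount)
        pairs.append(((la + lb) / 2, lb - la))
    return pairs

# Toy replicates (hypothetical): rep2 is roughly twice as deep as rep1.
rep1 = [7, 15, 31, 63]
rep2 = [15, 31, 63, 127]
pairs = ma_values(rep1, rep2)
median_m = median(m for _, m in pairs)
```

In this unnormalized toy example the median of M sits at 1 rather than 0, which is the kind of offset that a size factor should remove. After normalization, the bulk of the points between replicates should scatter around the M=0 line across the whole range of A.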

Regarding the error model, you will want to run model-fit diagnostics to see whether the residuals for each protein across replicates and conditions (after fitting the model) are reasonably consistent with the GP, in particular that they look unimodal. There is one argument for peace of mind: if you have enough replicates (or rather, degrees of freedom) that you can actually "see" deviations from the GP assumption (i.e. dozens or more), then you don't really need a parametric method, and you could just as well switch to something non-parametric, without any of DESeq2's shrinkage or distributional assumptions. If you do not have that many, then deviations too small to detect cannot matter much.
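
As one example of what "something non-parametric" could look like when there are many replicates, here is a minimal sketch of a two-sided permutation test on the difference in group means for a single protein. It is only one possible choice (a rank-based test would be another), and the function name, the number of permutations and the toy values are all made up.

```python
import random
from statistics import mean

def permutation_pvalue(treated, control, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for |mean(treated) - mean(control)|.

    Makes no distributional assumption; it only assumes that samples are
    exchangeable between groups under the null hypothesis.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_t = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(mean(pooled[:n_t]) - mean(pooled[n_t:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Toy normalized values for one protein (hypothetical), 12 replicates per group.
control = list(range(10, 22))
treated_shifted = list(range(30, 42))
p_shifted = permutation_pvalue(treated_shifted, control)
p_same = permutation_pvalue(control, control)
```

With only a handful of replicates per group, the number of distinct label permutations is too small for such a test to reach small p-values, which is where the shrinkage and distributional assumptions of a parametric method earn their keep.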

Kind regards
Wolfgang