Question

DESeq2 for translation efficiency when there are normalization issues

GFM ▴ 20

Hello,
We are designing a ribosome footprinting experiment in which we have cells with or without treatment. For the same samples we will also have total RNA-seq.

Here are the samples we will have:
1. Cells without treatment - total RNA protocol (TruSeq)
2. Cells without treatment - ribosome footprint (TruSeq Ribo Profile kit)
3. Cells with treatment - total RNA protocol (TruSeq)
4. Cells with treatment - ribosome footprint (TruSeq Ribo Profile kit)

We would like to compare translation efficiency with and without treatment. We plan to fit a model with two factors and their interaction. The two factors are:
1. treatment: with/without
2. library prep: total/ribosome footprint

We think the treatment might globally change both the amount of RNA in the cell and the amount of ribosome-bound RNA.

In that situation we believe the assumptions of DESeq2 normalization are violated, so we are considering adding ERCC spike-ins to the samples for normalization.

We have the following workflow in mind:
1. Normalize all total RNA samples together, using the ERCC spike-ins.
2. Normalize all ribosome footprint samples together, using the ERCC spike-ins.
3. Combine the two data sets and pass them to DESeq2 without performing normalization inside DESeq2.
4. In DESeq2, define a model with two factors: library prep (total/ribosome footprint) and treatment (with/without).
5. Include the interaction between the factors. The interaction will indicate changes in translation efficiency in response to the treatment.
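
As a minimal sketch of the two-factor design with interaction described in steps 4 and 5, the model matrix can be inspected in base R before any counts exist. The column names `assay` and `treatment`, the level names, and the two replicates per group are illustrative assumptions, not part of the planned experiment.

```r
# Hypothetical sample sheet: two replicates per group (replicate count is an assumption)
coldata <- data.frame(
  assay     = factor(rep(c("total", "ribo"), each = 4),
                     levels = c("total", "ribo")),
  treatment = factor(rep(rep(c("untreated", "treated"), each = 2), times = 2),
                     levels = c("untreated", "treated"))
)

# The same formula would be supplied as the design of the DESeq2 analysis
design_formula <- ~ assay + treatment + assay:treatment

# Base R expansion of the formula into model-matrix columns
mm <- model.matrix(design_formula, data = coldata)
print(colnames(mm))
```

The last column is the interaction term. On the log scale its coefficient corresponds to (ribo/total for treated) / (ribo/total for untreated), which is the change in translation efficiency.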

Do you think that normalizing each data set separately before DESeq2, and then combining them for DESeq2, is a good approach?

Thank you.

deseq2 translation efficiency spikes ribosome profiling • 2.1k views

ADD COMMENT • link 7.5 years ago GFM ▴ 20

Thanks a lot for the quick reply.

I have some questions:

1. I think we must at least normalize each data set (ribo and total) separately in some way before extracting the interaction from DESeq2 (ideally with DESeq2 itself, but the assumptions of its normalization may be violated). Am I correct?
We are not keen on adding ERCC spike-ins in proportion to the number of cells and using them for normalization, but if there are global changes in response to the treatment, we don't have many alternatives. We thought of normalizing each data set (ribo / total) separately using the ERCC spike-ins. Any other suggestions?

2. If there are no global changes in RNA content (or in ribosome-bound RNA) in response to the treatment, it is OK to normalize the whole data set (ribo and total samples together) inside DESeq2. Is this correct?

Regarding the recent post, do you mean the ERCC post:
https://support.bioconductor.org/p/88413/?

Thanks a lot for the great support.

ADD REPLY • link 7.5 years ago GFM ▴ 20

The other post must not have had a descriptive title, because I can't find it either now.

If you are only concerned with finding genes where the ratio of ratios of ribo/total is not equal to 1, in theory you don't need to estimate size factors (you can set them to 1). If there is a factor by which the ribo samples are always higher than the total samples, it will cancel out when taking the ratio of the ratios, correct?
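
The cancellation can be written out for a single gene. The symbols here are assumptions introduced for illustration: $R$ and $T$ are the expected ribo and total counts, the subscripts mark treated (trt) and untreated (ctl) samples, and $c$ is a factor shared by every ribo library.

```latex
\[
\frac{(c\,R_{\mathrm{trt}}) / T_{\mathrm{trt}}}{(c\,R_{\mathrm{ctl}}) / T_{\mathrm{ctl}}}
= \frac{R_{\mathrm{trt}} / T_{\mathrm{trt}}}{R_{\mathrm{ctl}} / T_{\mathrm{ctl}}}
\]
```

The factor $c$ drops out only because it is the same in the numerator and the denominator, that is, only when it is identical for all ribo libraries.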

ADD REPLY • link 7.5 years ago Michael Love 41k

I thought we must do some normalization in order to compare the samples. If we look at the ratio (ribo/total for treated) / (ribo/total for untreated) and, for example, the ribo-treated sample has much higher coverage, we might get biased ratios. Am I wrong?
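
A small numerical sketch of this concern, using made-up expected counts for one gene whose translation efficiency does not change. The helper `ratio_of_ratios` and all numbers are hypothetical.

```r
# Hypothetical expected counts for one gene with NO change in translation efficiency
total_untreated <- 100
total_treated   <- 100
ribo_untreated  <- 50
ribo_treated    <- 50

# (ribo/total for treated) / (ribo/total for untreated)
ratio_of_ratios <- function(ribo_t, total_t, ribo_u, total_u) {
  (ribo_t / total_t) / (ribo_u / total_u)
}

# All libraries at equal depth
baseline <- ratio_of_ratios(ribo_treated, total_treated,
                            ribo_untreated, total_untreated)

# Every ribo library is 3x deeper: the assay-wide factor cancels
assay_wide <- ratio_of_ratios(3 * ribo_treated, total_treated,
                              3 * ribo_untreated, total_untreated)

# Only the ribo-treated library is 3x deeper: the ratio is biased
one_library <- ratio_of_ratios(3 * ribo_treated, total_treated,
                               ribo_untreated, total_untreated)

# Dividing that library by its relative depth removes the bias
corrected <- ratio_of_ratios(3 * ribo_treated / 3, total_treated,
                             ribo_untreated, total_untreated)

print(c(baseline = baseline, assay_wide = assay_wide,
        one_library = one_library, corrected = corrected))
```

An assay-wide factor leaves the ratio of ratios unchanged, whereas a depth difference in a single library shifts it by that same factor unless libraries are normalized within each assay.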

ADD REPLY • link 7.5 years ago GFM ▴ 20

Sorry, you're right, I wrote too quickly. You need to normalize between libraries of the same assay type. There is actually a previous thread which addressed this question, and I have some code for normalizing the different assay types separately within one DESeqDataSet:

A: Ribosome profiling analysis in DEseq2/limma
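
The idea of normalizing each assay type separately can be sketched in base R. This is not the code from the linked thread: `median_of_ratios` is a hand-written stand-in for the median-of-ratios estimator (DESeq2 provides `estimateSizeFactorsForMatrix()` for a count matrix), and the toy counts and sample names are invented for illustration.

```r
# Median-of-ratios size factors in base R; genes with a zero count in any
# library are skipped because their log geometric mean is not finite
median_of_ratios <- function(counts) {
  log_counts <- log(counts)
  log_geo_means <- rowMeans(log_counts)
  keep <- is.finite(log_geo_means)
  apply(log_counts[keep, , drop = FALSE], 2,
        function(col) exp(median(col - log_geo_means[keep])))
}

# Toy counts: total_2 is sequenced 2x deeper than total_1,
# ribo_2 is sequenced 4x deeper than ribo_1
total_profile <- c(10, 20, 40, 80, 160)
ribo_profile  <- c(5, 30, 20, 100, 90)
counts <- cbind(
  total_1 = total_profile, total_2 = 2 * total_profile,
  ribo_1  = ribo_profile,  ribo_2  = 4 * ribo_profile
)
rownames(counts) <- paste0("gene", 1:5)
assay <- c("total", "total", "ribo", "ribo")

# Estimate size factors within each assay type only
sf <- setNames(numeric(ncol(counts)), colnames(counts))
for (a in unique(assay)) {
  idx <- assay == a
  sf[idx] <- median_of_ratios(counts[, idx, drop = FALSE])
}
print(sf)

# With a DESeqDataSet `dds` built from these counts, the vector could then
# be attached with: sizeFactors(dds) <- sf
```

Because each assay is normalized on its own, depth differences between libraries of the same assay are removed while any assay-wide difference between ribo and total is left alone. For spike-in based normalization, the same function could be applied to the ERCC rows only; DESeq2 exposes a `controlGenes` argument for that purpose.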

ADD REPLY • link 7.5 years ago Michael Love 41k

(Edit: This is wrong. See below.) ~~Normalization for sequencing depth is always performed. The size factors are an additional normalization on top of that, which is not necessary in this case if you only care about the interaction.~~

ADD REPLY • link 7.5 years ago Ryan C. Thompson ★ 7.9k

DESeq2 is a bit different from edgeR, where the normalization factor is a deviation on top of sequencing depth. In DESeq2, the size factor is the only normalization going on. My point is that, if the deviation of the ratio of ratios from 1 is global, and one doesn't want to remove it with the size factor estimation, one could set the size factors to 1.

ADD REPLY • link 7.5 years ago Michael Love 41k

Thanks for the correction.

ADD REPLY • link 7.5 years ago Ryan C. Thompson ★ 7.9k

Thanks a lot for the answers.

ADD REPLY • link 7.5 years ago GFM ▴ 20

score 0 · Answer 1 · 2016-11-01

There was a recent post on the support site which is relevant to this.

Basically, if you are looking only at the ratio of ratios (the interaction term), and you think there are global changes, in which case size factor estimation would remove all the signal, you can skip the size factor calculation. The interaction term will give you (ribo/total for treated) / (ribo/total for untreated). Does this make sense? You can try it out on a public data set or even on simulated data to see if it works for your proposed experiment.

The way to skip size factor estimation is to set `sizeFactors(dds) <- rep(1, ncol(dds))`.
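
A minimal sketch of trying this on simulated data, as suggested. The sample layout (two replicates per group), the factor and level names, and the simulated negative binomial counts are all assumptions for illustration. Note that the comments in this thread point out that libraries of the same assay type still need to be normalized against each other, so fixing all size factors at 1 assumes comparable depth within each assay.

```r
library(DESeq2)

set.seed(1)

# Simulated counts: 200 genes, 8 samples, no true effects (hypothetical data)
counts <- matrix(rnbinom(200 * 8, mu = 100, size = 10), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("s", 1:8)))

# Hypothetical sample sheet: two replicates per assay-by-treatment group
coldata <- data.frame(
  assay     = factor(rep(c("total", "ribo"), each = 4),
                     levels = c("total", "ribo")),
  treatment = factor(rep(rep(c("untreated", "treated"), each = 2), times = 2),
                     levels = c("untreated", "treated")),
  row.names = colnames(counts)
)

dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata,
                              design = ~ assay + treatment + assay:treatment)

# Skip size factor estimation by fixing every size factor at 1
sizeFactors(dds) <- rep(1, ncol(dds))

# DESeq() keeps size factors that are already set
dds <- DESeq(dds)

# The interaction coefficient is the log2 ratio of ratios
print(resultsNames(dds))
res <- results(dds, name = "assayribo.treatmenttreated")
```

The `log2FoldChange` column of `res` then estimates log2 of (ribo/total for treated) / (ribo/total for untreated) for each gene.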