Question

Help with DESeq2 design matrix


MGuny • 0

@mguny-22523

Last seen 6.2 years ago

Austria, BOKU

Hi,

I would like to hear some opinions and get some support on how to generate the design for DE Analysis.

First, a short description of the data:

The data come from cell line development. Three different host cell lines were treated/processed in the same way and showed improvement. We are now interested in the difference between each improved subclone and its original cell line, to look for possible targets for rational cell line development.

[Figure: VSD-normalised PCA plot]

This is the PCA plot of the VSD-normalised data, and it looks as expected: each cell line clusters together, and the subclones differ from their hosts.

Now to my question:

Given the different biological context (3 different cell lines), should I subset the matrix and investigate each cell line on its own, or should I keep all of them in one big matrix and then set up the design according to levelC1 to levelC3, respectively?

 > sample_ann
                   ID        levels condition  cellline       levelC1       levelC2       levelC3
1      Cellline1.rep1     Cellline1      host Cellline1     Cellline1     Cellline1     Cellline1
2      Cellline1.rep2     Cellline1      host Cellline1     Cellline1     Cellline1     Cellline1
3  Cellline1_imp.rep1 Cellline1_imp  subclone Cellline1 Cellline1_imp     Cellline1     Cellline1
4  Cellline1_imp.rep2 Cellline1_imp  subclone Cellline1 Cellline1_imp     Cellline1     Cellline1
5      Cellline2.rep1     Cellline2      host Cellline2     Cellline2     Cellline2     Cellline2
6      Cellline2.rep2     Cellline2      host Cellline2     Cellline2     Cellline2     Cellline2
7  Cellline2_imp.rep1 Cellline2_imp  subclone Cellline2     Cellline2 Cellline2_imp     Cellline2
8  Cellline2_imp.rep2 Cellline2_imp  subclone Cellline2     Cellline2 Cellline2_imp     Cellline2
9      Cellline3.rep1     Cellline3      host Cellline3     Cellline3     Cellline3     Cellline3
10     Cellline3.rep2     Cellline3      host Cellline3     Cellline3     Cellline3     Cellline3
11 Cellline3_imp.rep1 Cellline3_imp  subclone Cellline3     Cellline3     Cellline3 Cellline3_imp
12 Cellline3_imp.rep2 Cellline3_imp  subclone Cellline3     Cellline3     Cellline3 Cellline3_imp

The standard pipeline in my group is to keep samples that belong together and were sequenced together in one matrix. However, this is the first time that we have sequenced 3 distinct cell lines, and we could argue for either approach.

Thanks!

deseq2 • 624 views

ADD COMMENT • link updated 6.2 years ago by mona.alee101 • 0 • written 6.2 years ago by MGuny • 0

score 0 · Answer 1 · 2019-12-09


Michael Love 43k

@mikelove

Last seen 3 days ago

United States

Do you expect the subclone vs. host effect to be the same across cell lines, or different per cell line? If the latter, use

~line + line:condition

This gives a line-specific condition effect. condition should take the values subclone or host, and both line and condition should be factors.
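As a rough illustration of what this design encodes, the sketch below builds the corresponding model matrix in base R. It assumes that `line` in the formula corresponds to the `cellline` column of the `sample_ann` table in the question; the data frame is rebuilt here by hand (only the two relevant columns) so the snippet is self-contained, and no DESeq2 calls are made.

```r
# Sample annotation mirroring the sample_ann table in the question:
# 3 cell lines x 2 conditions x 2 replicates = 12 samples.
sample_ann <- data.frame(
  cellline  = factor(rep(c("Cellline1", "Cellline2", "Cellline3"), each = 4)),
  condition = factor(rep(rep(c("host", "subclone"), each = 2), times = 3),
                     levels = c("host", "subclone"))  # host is the reference level
)

# Design with a separate subclone-vs-host effect for each cell line.
design <- model.matrix(~ cellline + cellline:condition, data = sample_ann)

# One intercept, two cell line offsets, and one
# subclone-vs-host column per cell line.
colnames(design)
```

With a DESeq2 object `dds` built with this design and run through `DESeq()`, `resultsNames(dds)` lists the fitted coefficients, and each line-specific effect can then be pulled out with `results(dds, name = ...)` using the name that `resultsNames()` reports.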

ADD COMMENT • link 6.2 years ago Michael Love 43k


Thank you for the fast reply.

Based on the PCA, I would indeed assume that the subclone vs. host effect differs per cell line. However, I am still unsure why, in this case, it would be better to keep the whole matrix rather than subset it so that each cell line has its own matrix.

ADD REPLY • link 6.2 years ago MGuny • 0


That question is actually answered in the vignette FAQ.

The above design would work. An equivalent design (giving the same answers) is to combine line and condition into one factor called group, with levels line1host, line1subclone, etc., and then use the contrast argument of results() to make the comparisons. Use whichever is easier.
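A minimal base R sketch of the combined-factor approach, assuming the `cellline` and `condition` columns from the question's `sample_ann` table. The level names produced here (`Cellline1_host`, `Cellline1_subclone`, and so on) are an illustrative naming choice, not something prescribed in the thread. The rank check at the end is one way to see that the two designs span the same column space, which is why they give the same fits.

```r
# Sample annotation mirroring the question's sample_ann table.
sample_ann <- data.frame(
  cellline  = factor(rep(c("Cellline1", "Cellline2", "Cellline3"), each = 4)),
  condition = factor(rep(rep(c("host", "subclone"), each = 2), times = 3),
                     levels = c("host", "subclone"))
)

# Combine cell line and condition into a single factor (hypothetical naming).
sample_ann$group <- factor(paste(sample_ann$cellline, sample_ann$condition, sep = "_"))
levels(sample_ann$group)

# Model matrices for the two parameterisations.
design_group  <- model.matrix(~ group, data = sample_ann)
design_nested <- model.matrix(~ cellline + cellline:condition, data = sample_ann)

# If both designs span the same column space, stacking them side by side
# does not increase the rank beyond that of either design alone.
rank_of <- function(m) qr(m)$rank
ranks <- c(group    = rank_of(design_group),
           nested   = rank_of(design_nested),
           combined = rank_of(cbind(design_group, design_nested)))
ranks
```

With a DESeq2 object fitted using `~ group`, a comparison within one cell line would then look like `results(dds, contrast = c("group", "Cellline1_subclone", "Cellline1_host"))`, where the second element is the numerator level and the third the denominator level.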

ADD REPLY • link 6.2 years ago Michael Love 43k