[DESeq2] Multi-factor designs
2
0
Entering edit mode
ofonov ▴ 10
@ofonov-10155
Last seen 7.7 years ago

I have RNA-seq dataset with 3 factors:

  • Treatment: control , treatment (2 levels)
  • Cell lines: ABC.1, Colo320DM, HCT.8, OU31, OVCAR4, RKO (6 levels)
  • Groups: I, II (2 levels, so that cell lines ABC.1, Colo320DM, OVCAR4, OU31 belong to group I and RKO, HCT.8 to group II).

I would like to find out following:

  1. List of genes which respond to treatment within groups I and II
  2. Genes (biomarkers) unique to each group (group I vs group II).

Which design formula would be appropriate for this case?

RNA-seq workflow at Bioconductor suggests to use

~ group + treatment + group:treatment

I have tried it and it does not seem to work, there are no results produced. I would like to add cell lines to the formula, so that e.g. cell line A untreated is compared to treated cell line A, but not to the bulk of cell lines belonging to one group. 

Is there a way of doing it in DESEq2?

 

deseq2 multifactorial design • 5.8k views
ADD COMMENT
1
Entering edit mode
@mikelove
Last seen 1 day ago
United States

Can you post this:

as.data.frame(colData(dds))

also make sure you are using the latest version of DESeq2 (v1.10): 

packageVersion("DESeq2")

ADD COMMENT
0
Entering edit mode
ofonov ▴ 10
@ofonov-10155
Last seen 7.7 years ago

Hi! Thanks for the reply, here is output of the commands:

> packageVersion("DESeq2") [1] ‘1.10.1’

> as.data.frame(colData(dds)) names cell.line treatment responders 1 1_1 Colo320DM CONTROL RESP 2 1_2 Colo320DM CONTROL RESP 3 1_3 Colo320DM TREATED RESP 4 1_4 Colo320DM TREATED RESP 5 2_1 ABC.1 CONTROL RESP 6 2_2 ABC.1 CONTROL RESP 7 2_3 ABC.1 TREATED RESP 8 2_4 ABC.1 TREATED RESP 9 3_1 OU31 CONTROL RESP 10 3_2 OU31 CONTROL RESP 11 3_3 OU31 TREATED RESP 12 3_4 OU31 TREATED RESP 13 4_1 OVCAR4 CONTROL RESP 14 4_2 OVCAR4 CONTROL RESP 15 4_3 OVCAR4 TREATED RESP 16 4_4 OVCAR4 TREATED RESP 17 8_1 RKO CONTROL NRESP 18 8_2 RKO CONTROL NRESP 19 8_3 RKO TREATED NRESP 20 8_4 RKO TREATED NRESP 21 9_1 HCT CONTROL NRESP 22 9_2 HCT CONTROL NRESP 23 9_3 HCT TREATED NRESP 24 9_4 HCT TREATED NRESP

ADD COMMENT
0
Entering edit mode

(A quick note on the support site: there are two ways to reply: Comments or Answers. If you want to start a threaded discussion you can use Comment. If you have an answer for the original post, you would use Answer.)

Where did the other 3 cell lines go? You said there were 9, but I only see 6.

For your aim to find "genes (biomarkers) unique to each group (group I vs group II)" do you want to find genes which have different response to treatment across group? Or just different baseline (control) expression across group? Or something else?

ADD REPLY
0
Entering edit mode

Hi, thanks for pointing out that there are 6 cell lines, for this analysis I have removed 3 cell lines which are known to be "partial responders" and can affect the analysis (I have corrected the original question).

I would like to find genes for which the change in regulation differs between responders and non-responders (across groups). For instance, genes that are upregulated in one group and donwregulated in another. I would like DEseq2 first to compare treated and untreated samples within cell lines and then compare responders vs non-responders.

It seems to me that it has to be done in a way, which is described in section 3.12.1 Linear combinations of DEseq2 vignette, where it is suggested to add ind.n row to colData, but I am not sure about the right way of doing this.

ADD REPLY
0
Entering edit mode

It just takes a little management, because the interaction term between group (responders) and cell.line produces some columns which are all zero and must be removed manually.

You should add a new column cell.line.nested to colData which has levels A,B,C,D for responders and levels A,B for non-responders. Now create a matrix with:

mm <- model.matrix(~ group + group:cell.line.nested + group:treatment, colData(dds))

Then go in and remove the columns with all zeros (see vignette section "Levels without samples" immediately following the linear combinations section).

You will then provide this modified model matrix to the 'full' argument:

dds <- DESeq(dds, full=mm)

You should be able to test the individual treatment effects in responders and non-responders using the 'name' argument, and you can contrast them using the list-style of the 'contrast' argument. And this controls for cell line effects, so you are comparing the treatment effects within cell lines.

ADD REPLY
0
Entering edit mode

HI Michael,  I'm working with a similar design where I have a model with 3 factors, two fixed factors, SEX (M, F) and DIET (Low and high). Then the random factor that is Strain 1, 2, 3.

I'm using DESeq2_1.12.4  

> as.data.frame(colData(dds))
             sex diet strain    name sizeFactor replaceable
C2          male  H     2   M3HPS  0.6105888       FALSE
C3          male  H     3   M6HPS  1.0305482       FALSE
C4        female  H     1   F2HPS  1.2311071       FALSE
C5        female  H     2   F3HPS  0.5622835       FALSE
C6        female  H      3   F6HPS  1.1050964       FALSE
C7          male  L     1   M2LPS  1.2206942        TRUE
C8          male  L     3   M6LPS  1.3228398        TRUE
C9          male  L     2   M3LPS  0.6330985        TRUE
C10       female  L     1   F2LPS  1.3312426        TRUE
C11       female  L      2   F3LPS  1.2465671        TRUE
C12       female  L      3   F6LPS  1.8290932        TRUE
Maria_2HM   male  H      1 M2HPS_b  0.7746767       FALSE
Maria_2LM   male  L     1 M2LPS_b  0.9076435        TRUE

I'm most interested in the effect of diet, so use LRT to analyze the effect diet, full = ~ sex + strain + diet , reduced = ~ sex + strain. However analyzing each factor I know there is a difference among the strains and between sexes. The difference among the strains will allow me to generalize the effect of the diets to the population.

I have also evaluated the effect of diet*strain

full = ~ sex + strain + diet + diet*strain , reduced = ~ sex + strain + diet

my references are: Female, High diet, strain 1

here I get the following

[1,] "Intercept"         
[2,] "sex_male_vs_female"
[3,] "strain_2_vs_1"     
[4,] "strain_3_vs_1"     
[5,] "diet_L_vs_H"   
[6,] "strain2.dietL"   
[7,] "strain3.dietL"    

How could I get the effect of diet within strain.  The effect of low diet vs high diet in strain 1? Low diet vs high diet in strain 2?Low diet vs high diet in strain 3? . I would like to see the effect of the diet in females vs males.  Any help will be more than appreciated.

 

 

 

ADD REPLY
0
Entering edit mode

hi,

I'd recommend you speak with a local statistician to discuss the interpretation of coefficients and how to pull out your comparisons of interest. There are many possible designs to use here, and the modeling choices are up to really.

ADD REPLY

Login before adding your answer.

Traffic: 349 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6