Question: Multifactor model design for DE analysis (DESeq2 & edgeR)
5.3 years ago, Mathieu Bahin wrote:
Hi all,

I am using DESeq2 and edgeR to perform DE analysis on paired samples on a dog cancer project. Sorry if the question is redundant but I can't find one very similar to my case.

I have been designing models with 2 factors: condition (control / tumor) and patient ID (to match the paired samples). I used the model '~sample_id + condition' until now but I would like to add a third factor, the breed.

Is that then correct to use '~sample_id + breed + condition' if my goal is to analyse the DE between control and tumor samples taking into account the individual variabilities (with the sample ID factor) and the breed variability (with the breed factor).

Here is an example of a sample table I could have:

          Patient ID   Condition   Breed
Sample1   1            Control     Breed1
Sample2   2            Control     Breed2
Sample3   3            Control     Breed1
Sample4   4            Control     Breed2
Sample5   1            Tumor       Breed1
Sample6   2            Tumor       Breed2
Sample7   3            Tumor       Breed1
Sample8   4            Tumor       Breed2
Tags: cancer, edger, deseq2
Answer: Multifactor model design for DE analysis (DESeq2 & edgeR)
5.3 years ago, Simon Anders (Zentrum für Molekularbiologie, Universität Heidelberg) wrote:
Hi Mathieu

On 19/08/14 10:01, Mathieu Bahin wrote:
> I have been designing models with 2 factors: condition (control /
> tumor) and patient ID (to match the paired samples). I used the model
> '~sample_id + condition' until now but I would like to add a third
> factor, the breed.
> Is that then correct to use '~sample_id + breed + condition' if my
> goal is to analyse the DE between control and tumor samples taking
> into account the individual variabilities (with the sample ID factor)
> and the breed variability (with the breed factor).

No. This would make breed another blocking factor, besides patient_id. But it does not offer any new information, because all samples from the same patient are from the same breed, so the patient_id factor already captures all variation associated with this. Therefore, there is no need to account for breed if you just want to see the overall effect of cancer.

If, however, you want to know for which genes the expression change due to cancer _depends_ on breed, you are looking for an _interaction_ between breed and condition and should hence use:

   ~ patient_id + condition + breed:condition

(BTW, I renamed your factor from "sample_id" to "patient_id": After all, you have two samples from each patient.)

> Another question: If I use the pairwise information, I don't have
> replicates because I only have two sample (one control, one tumor)
> for each patient. Is it better to use it (and then have no replicates)
> or not (and then have replicates for 'control' and 'tumor' samples) ?

Of course, you still have replicates. You have several dogs. This is the whole point of the paired design. If you omitted the "patient_id" factor, you would drastically lose inferential power.

  Simon
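
The point that breed adds no information once patient_id is in the model can be checked numerically. What follows is a minimal sketch in plain Python (not R, so it runs without DESeq2 or edgeR), assuming treatment-style dummy coding built by hand from the sample table in the question. The helper names `design` and `rank` are illustrative only, and the columns a particular tool derives from these formulas may differ from this hand-built coding. The sketch compares the number of columns of each design matrix with its rank; a rank below the column count means at least one coefficient cannot be estimated.

```python
from fractions import Fraction

# Sample table from the question: (patient, condition, breed)
samples = [
    (1, "Control", "Breed1"), (2, "Control", "Breed2"),
    (3, "Control", "Breed1"), (4, "Control", "Breed2"),
    (1, "Tumor", "Breed1"), (2, "Tumor", "Breed2"),
    (3, "Tumor", "Breed1"), (4, "Tumor", "Breed2"),
]

def design(columns):
    """Build a design matrix: one row per sample, one entry per column function."""
    return [[Fraction(int(col(s))) for col in columns] for s in samples]

def rank(matrix):
    """Matrix rank by exact Gaussian elimination over the rationals."""
    rows = [list(row) for row in matrix]
    found = 0  # number of pivots found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(found, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # this column depends on the earlier ones
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for i in range(found + 1, len(rows)):
            factor = rows[i][c] / rows[found][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[found])]
        found += 1
    return found

# Hand-built dummy columns (assumed coding: patient 1, Control, Breed1 as baselines)
intercept = lambda s: 1
patients = [lambda s, p=p: s[0] == p for p in (2, 3, 4)]
tumor = lambda s: s[1] == "Tumor"
breed2 = lambda s: s[2] == "Breed2"
tumor_x_breed2 = lambda s: tumor(s) and breed2(s)

designs = {
    "patient_id + condition": [intercept, *patients, tumor],
    "patient_id + breed + condition": [intercept, *patients, breed2, tumor],
    "patient_id + condition + breed:condition": [intercept, *patients, tumor, tumor_x_breed2],
}

for name, cols in designs.items():
    print(f"~ {name}: {len(cols)} columns, rank {rank(design(cols))}")
```

With this coding, adding breed as a main effect gives 6 columns but rank 5, because the Breed2 column is exactly the sum of the patient 2 and patient 4 columns. The tumor-by-breed interaction column is not a combination of the others, so that design has 6 columns and rank 6.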