Paired analysis (tumor vs normal) under three different conditions with edgeR
takumima

Hi all,

I have 18 samples from a single individual: tumor and paired normal tissue, each under three different conditions, with three replicates per tissue/condition combination.

 $df

 sample             group   tissue
 sample1-T-cond1    cond1   T
 sample1-T-cond1    cond1   T
 sample1-T-cond1    cond1   T
 sample1-T-cond2    cond2   T
 sample1-T-cond2    cond2   T
 sample1-T-cond2    cond2   T
 sample1-T-cond3    cond3   T
 sample1-T-cond3    cond3   T
 sample1-T-cond3    cond3   T
 sample1-N-cond1    cond1   N
 sample1-N-cond1    cond1   N
 sample1-N-cond1    cond1   N
 sample1-N-cond2    cond2   N
 sample1-N-cond2    cond2   N
 sample1-N-cond2    cond2   N
 sample1-N-cond3    cond3   N
 sample1-N-cond3    cond3   N
 sample1-N-cond3    cond3   N

My goal is to compare T versus N (tissue), in each condition (group). In other words, I would like to get the list of DE genes in tumor samples, and I'd like to know if this list is the same in the three different conditions. I can't figure out how to manage the replicates and how to set the DGEList object and design matrix:

 geneList = DGEList(counts=round(geneData), genes=rownames(geneData), group = df$group)
 design = model.matrix(~df$group+df$tissue)

Is this solution right?

Thanks

Aaron Lun

Paste them together and use a one-way layout.

group <- paste0(df$group, ".", df$tissue)
design <- model.matrix(~0 + group)
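
For reference, a minimal base-R sketch of what this one-way layout looks like for the 18 samples in the question. The `df` built here is a stand-in reconstructed from the table in the question (assumption: three replicates per condition/tissue combination, in the order shown); only `model.matrix()` from base R is used.

```r
# Stand-in for the sample table in the question (assumed layout:
# 3 conditions x 2 tissues x 3 replicates = 18 samples).
df <- data.frame(
  group  = rep(rep(c("cond1", "cond2", "cond3"), each = 3), times = 2),
  tissue = rep(c("T", "N"), each = 9),
  stringsAsFactors = FALSE
)

# One combined factor with six levels, one per condition/tissue pair.
group <- factor(paste0(df$group, ".", df$tissue))

# No-intercept design: one indicator column per level, so each
# coefficient describes a single condition/tissue combination.
design <- model.matrix(~0 + group)

colnames(design)  # the names that contrasts must refer to
table(group)      # number of replicates per combination
```

Every row of `design` contains exactly one 1, and the column names (`groupcond1.T`, `groupcond1.N`, and so on) are the names used when writing contrasts.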

And if you want to compare tissues in each condition, you can just do:

con <- makeContrasts(
    groupcond1.T - groupcond1.N, # T vs N in cond1
    groupcond2.T - groupcond2.N, # ... in cond2
    groupcond3.T - groupcond3.N, # ... in cond3
    levels=design)

Take the desired column and pass it as contrast= in glmQLFTest() to test an individual comparison. Or you can supply contrast=con to do an ANODEV (analysis of deviance), i.e., to identify any differences between T and N in any condition.
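
A rough end-to-end sketch of the two options, assuming edgeR is installed. The counts are simulated with no real signal, purely so the code runs; the contrast names (`TvsN.cond1` etc.) and object names are made up for this illustration and do not come from the thread.

```r
library(edgeR)  # also attaches limma, which provides makeContrasts()

set.seed(1)
# Simulated data for illustration only: 1000 genes x 18 samples.
group <- factor(rep(c("cond1.T", "cond2.T", "cond3.T",
                      "cond1.N", "cond2.N", "cond3.N"), each = 3))
counts <- matrix(rnbinom(1000 * 18, mu = 50, size = 10), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:18)))

design <- model.matrix(~0 + group)
con <- makeContrasts(
  TvsN.cond1 = groupcond1.T - groupcond1.N,
  TvsN.cond2 = groupcond2.T - groupcond2.N,
  TvsN.cond3 = groupcond3.T - groupcond3.N,
  levels = design)

y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y, design)          # drop genes with too few counts
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)

# One comparison at a time: T vs N within cond1 only.
res_cond1 <- glmQLFTest(fit, contrast = con[, "TvsN.cond1"])

# All three columns at once: T vs N in any condition.
res_any <- glmQLFTest(fit, contrast = con)

topTags(res_cond1)
topTags(res_any)
```

The single-column test gives one log-fold change per gene, while the joint test reports one log-fold change per contrast column and a single p-value per gene.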

The formulation of design above assumes that there's no batch structure that requires additional blocking factors. If there is, then you'll need to include the necessary terms as additive factors.
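
As an illustration of additive blocking, here is a base-R sketch with a hypothetical `batch` factor. Nothing in the question says such a factor exists; it is assumed here that each of three runs contains one sample from every condition/tissue combination.

```r
group <- factor(rep(c("cond1.T", "cond2.T", "cond3.T",
                      "cond1.N", "cond2.N", "cond3.N"), each = 3))

# Hypothetical blocking factor (an assumption for this sketch):
# replicate 1, 2, 3 of every combination were processed in runs b1, b2, b3.
batch <- factor(rep(c("b1", "b2", "b3"), times = 6))

# The group columns are kept as before; batch enters as extra additive columns.
design <- model.matrix(~0 + group + batch)

colnames(design)
qr(design)$rank == ncol(design)  # checks that the design is full rank
```

Because the `group` columns keep their names, the same T-vs-N contrasts can be written against this design; the batch columns simply get a zero weight in each contrast.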


It might be worth mentioning that the design matrix shown in the question is also correct, but it's a bit harder to work out which coefficient to look at for each comparison of interest.


Yes, the original design matrix is valid but assumes that the effects of group and tissue are additive. My design sacrifices some residual d.f. to eliminate this assumption, which is important if the tumor/normal difference is condition-specific (i.e., there is a non-zero interaction between group and tissue). In contrast, the original design assumes that the tumor-normal difference is the same for all conditions.
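
To make the difference concrete, a small base-R sketch comparing the two designs on a stand-in for the sample table (assumed layout: 3 conditions x 2 tissues x 3 replicates). The object names are made up for this illustration.

```r
df <- data.frame(
  group  = factor(rep(rep(c("cond1", "cond2", "cond3"), each = 3), times = 2)),
  tissue = factor(rep(c("T", "N"), each = 9), levels = c("N", "T"))
)
df$combined <- factor(paste0(df$group, ".", df$tissue))

# Additive design from the question: a single tissueT coefficient,
# i.e. one tumor/normal difference shared by all conditions.
additive <- model.matrix(~group + tissue, data = df)

# One-way design: a separate coefficient for every combination.
oneway <- model.matrix(~0 + combined, data = df)

# Full interaction model, for comparison with the one-way layout.
full <- model.matrix(~group * tissue, data = df)

colnames(additive)
nrow(df) - ncol(additive)  # nominal residual d.f. under the additive design
nrow(df) - ncol(oneway)    # nominal residual d.f. under the one-way design

# The one-way and interaction designs span the same column space,
# so they describe the same model with different coefficients.
qr(cbind(oneway, full))$rank == ncol(oneway)
```

The one-way layout uses two more coefficients than the additive design, which is the residual d.f. it gives up in exchange for allowing a condition-specific tumor/normal difference.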


Thank you @Aaron Lun.

I made two more comparisons, namely conditions 2 and 3 versus cond1:

cond1vscond2=(groupcond1.T-groupcond1.N)-(groupcond2.T-groupcond2.N)
cond1vscond3=(groupcond1.T-groupcond1.N)-(groupcond3.T-groupcond3.N)

So, when I use geneQLFT = glmQLFTest(geneQLF, contrast=con), I get the following:

> summary(decideTestsDGE(geneQLFT, adjust.method="BH"))
   LR test on 3 degrees of freedom
NotSig                           1773
Sig                              13657

Accordingly, I conclude that 13657 genes are DE across all paired samples. This suggests that the different conditions affect the results differently, but which one is best?
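
A small base-R sketch that may help when reading that output. It assumes (the post does not show this) that `con` held the three per-condition contrasts plus the two new difference contrasts; the helper `mk()` and the column names `TvsN.cond1` etc. are made up for this illustration. It checks how many independent comparisons such a contrast matrix actually contains.

```r
# Coefficient names of the one-way design.
lv <- c("groupcond1.N", "groupcond1.T", "groupcond2.N",
        "groupcond2.T", "groupcond3.N", "groupcond3.T")

# Hypothetical helper: contrast vector with +1 on `plus` and -1 on `minus`.
mk <- function(plus, minus) {
  v <- setNames(numeric(length(lv)), lv)
  v[plus] <- 1
  v[minus] <- -1
  v
}

TvsN1 <- mk("groupcond1.T", "groupcond1.N")
TvsN2 <- mk("groupcond2.T", "groupcond2.N")
TvsN3 <- mk("groupcond3.T", "groupcond3.N")

con5 <- cbind(
  TvsN.cond1   = TvsN1,
  TvsN.cond2   = TvsN2,
  TvsN.cond3   = TvsN3,
  cond1vscond2 = TvsN1 - TvsN2,  # difference of tumor/normal differences
  cond1vscond3 = TvsN1 - TvsN3
)

# The two difference contrasts are linear combinations of the first three,
# so the five columns contain only three independent comparisons.
qr(con5)$rank

# On their own, the two difference contrasts form a 2 d.f. comparison.
con_int <- con5[, c("cond1vscond2", "cond1vscond3")]
qr(con_int)$rank
```

Under that assumption, a joint test on all five columns is still a test for any tumor/normal difference in any condition, which would be consistent with the 3 degrees of freedom reported above. Whether the tumor/normal difference changes between conditions is a separate question; one way to address it would be to pass only the two difference columns (or one of them at a time) to `contrast=`.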
