Question: edgeR design between and within subjects
0
gravatar for b.nota
23 months ago by
b.nota320
Netherlands
b.nota320 wrote:

Hello,

I am trying to do edgeR analysis, such as described in 3.5 of the manual. My experiment is very alike, but I don't exactly understand how to get all the contrasts of my interests. Here some code:

> data.frame(Genotype, Cell, Patient)
   Genotype Cell Patient
1        WT  TCM       1
2        WT  TEM       1
3        WT  TRM       1
4        WT  TCM       2
5        WT  TEM       2
6        WT  TRM       2
7        WT  TCM       3
8        WT  TEM       3
9        WT  TRM       3
10      MUT  TCM       1
11      MUT  TEM       1
12      MUT  TRM       1
13      MUT  TCM       2
14      MUT  TEM       2
15      MUT  TRM       2
16      MUT  TCM       3
17      MUT  TEM       3
18      MUT  TRM       3
> design <- model.matrix(~Genotype+Genotype:Cell+Genotype:Patient)
> colnames(design)
[1] "(Intercept)"         "GenotypeMUT"         "GenotypeWT:CellTEM"
[4] "GenotypeMUT:CellTEM" "GenotypeWT:CellTRM"  "GenotypeMUT:CellTRM"
[7] "GenotypeWT:Patient"  "GenotypeMUT:Patient"

Now to get some genes of interest:

# Genes respond different between TEM vs TCM in WT
lrt <- glmLRT(fit, coef="GenotypeWT:CellTEM")
topTags(lrt)

# Genes respond different between TEM vs TCM in MUT
lrt <- glmLRT(fit, coef="GenotypeMUT:CellTEM")
topTags(lrt)
# Genes respond different between TRM vs TCM in MUT
lrt <- glmLRT(fit, coef="GenotypeMUT:CellTRM")
topTags(lrt)

# Genes respond different between TRM in MUT vs TRM in WT
lrt <- glmLRT(fit, contrast=c(0,0,0,0,-1,1,0,0))
topTags(lrt)

I hope I understood this correctly.

First Question: Is it possible with this design to get genes that respond different between TRM vs TEM in MUT? Or do I have to make another design first for this specific contrast?

Second question: Is it possible to get genes that respond different between TCM in MUT vs TCM in WT?

 

ADD COMMENTlink modified 23 months ago by Aaron Lun24k • written 23 months ago by b.nota320
Answer: edgeR design between and within subjects
2
gravatar for Aaron Lun
23 months ago by
Aaron Lun24k
Cambridge, United Kingdom
Aaron Lun24k wrote:

Fiddling around with interaction terms is confusing. Make your life easier, and do this instead:

grouping <- paste0(Genotype, ".", Cell)
patient <- paste0(Genotype, Patient)
# ... assuming WT/Mut patients are different people.
# also ensures 'patient' is a factor, not a covariate.

design <- model.matrix(~0 + patient + grouping)
design <- design[,!grepl("TCM$", colnames(design))] # get to full rank

Your first 6 terms are the patient-specific blocking terms, which are not scientifically interesting. The next 2 terms represent the log-fold change of mutant TEM/TRM cells over mutant TCM cells. The last 2 terms represent the log-fold change of wild-type TEM/TRM cells over wild-type TCM cells. To compare cell types within each genotype, you can drop each of the last four terms individually to compare cell types to TCM, or use makeContrasts to set up the desired comparison between TEM and TRM:

con <- makeContrasts(groupingMUT.TEM - groupingMUT.TRM, levels=design)

However, you cannot use edgeR to directly compare between mutant and wild-type cell types, regardless of how you parametrize the design matrix. This is because the differences in genotypes are confounded with the patient factor when both are present as fixed effects in the design matrix. If you want to compare between genotypes, you can only do so "indirectly". For example, you could test whether the differences between TEM and TCM are the same between mutant and wild-type genotypes:

con <- makeContrasts(groupingMUT.TEM - groupingWT.TEM, levels=design)

... keeping in mind that grouping*.TEM already represents the log-fold change between TEM and TCM. If you want to directly compare cell types between genotypes, you will need to use limma and voom with duplicateCorrelation.

ADD COMMENTlink modified 23 months ago • written 23 months ago by Aaron Lun24k

Thanks for your explanation. So for the last part, which is only possible as indirect comparison (MUT.TEM vs WT.TEM). What will be the meaning of the fold changes in the topTags results? So I understand that the patient blocking does not make sense for such comparison (since WT patient 1 is not the same as MUT patient 1). Alternatively, do you think I can safely make a new design without patient blocking for this comparison? I don't wanna go to limma, since this is count data from ATAC-seq and not 'normal' RNAseq.

ADD REPLYlink modified 23 months ago • written 23 months ago by b.nota320
1

I assume you're referring to MUT.TEM vs WT.TEM. The log-fold changes for an indirect comparison will be the TEM/TCM log-fold change in mutant, minus the TEM/TCM log-fold change in the wild-types.

If you want to compare TCM in MUT against TCM in WT, I would suggest performing an edgeR analysis with only the TCM samples, using a single factor for the genotype. This avoids any problems due to unmodelled correlations between samples from the same patient, as each patient now only contributes one sample to the data subset.

However, if you want to analyze all cell types together, it is unlikely that you can avoid blocking on patient. Individual-specific effects are particularly prominent in human patient data, you will have to deal with it somehow.

ADD REPLYlink modified 23 months ago • written 23 months ago by Aaron Lun24k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 116 users visited in the last hour