Considering both batch effect and patient-to-patient variation in edgeR analysis
@ndeshpande-8759
Last seen 9.2 years ago
Australia

Hi all,

I have a question about how to define the design matrix for my edgeR analysis.

My experimental design includes 9 patients, with one sample per condition per patient, i.e. two samples per patient ('C1D1' and 'EOT').

The RNA-seq data were generated in 2 batches ('BATCH1' and 'BATCH2'). For every patient, both samples were generated in a single batch (i.e. either BATCH1 or BATCH2).

Our main question is to identify genes that change between conditions C1D1 and EOT across the nine patients.

My data.frame looks like this:

meta <- data.frame(
  row.names=colnames(counttable),
  condition=c("C1D1", "C1D1", "C1D1", "C1D1", "C1D1", "C1D1", "C1D1", "C1D1", "C1D1","EOT","EOT","EOT","EOT","EOT","EOT","EOT","EOT","EOT"),
  libType=c("Batch1","Batch2","Batch2","Batch2","Batch1","Batch2","Batch2","Batch1","Batch2","Batch1","Batch1","Batch1","Batch2","Batch2","Batch2","Batch2","Batch2","Batch2"),
  patient=c('AW','DD','DG','EC','EL','GR','LA','NR','RL','AW','NR','EL','DD','DG','EC','GR','LA','RL'))

I could see a prominent batch effect using the plotMDS function, which was largely (but not completely) resolved using "logCPMc <- removeBatchEffect(logCPM, libType)".

I assume that using a design matrix such as edesign <- model.matrix(~libType+condition) should take care of the batch effect. Can someone confirm whether I am defining the matrix correctly?

 

My final design matrix also includes the 'patient' factor to account for patient-to-patient variation:

edesign <- model.matrix(~libType+patient+condition)

Again, I am not sure if this is the correct way.

edesign

   (Intercept) libTypeBatch2 patientDD patientDG patientEC patientEL patientGR patientLA patientNR patientRL conditionEOT
1            1             0         0         0         0         0         0         0         0         0            0
2            1             1         1         0         0         0         0         0         0         0            0
3            1             1         0         1         0         0         0         0         0         0            0
4            1             1         0         0         1         0         0         0         0         0            0
5            1             0         0         0         0         1         0         0         0         0            0
6            1             1         0         0         0         0         1         0         0         0            0
7            1             1         0         0         0         0         0         1         0         0            0
8            1             0         0         0         0         0         0         0         1         0            0
9            1             1         0         0         0         0         0         0         0         1            0
10           1             0         0         0         0         0         0         0         0         0            1
11           1             0         0         0         0         0         0         0         1         0            1
12           1             0         0         0         0         1         0         0         0         0            1
13           1             1         1         0         0         0         0         0         0         0            1
14           1             1         0         1         0         0         0         0         0         0            1
15           1             1         0         0         1         0         0         0         0         0            1
16           1             1         0         0         0         0         1         0         0         0            1
17           1             1         0         0         0         0         0         1         0         0            1
18           1             1         0         0         0         0         0         0         0         1            1

 

However, I get an error:

e <- estimateGLMCommonDisp(e, edesign)
Error in glmFit.default(y, design = design, dispersion = dispersion, offset = offset,  :
  Design matrix not of full rank.  The following coefficients not estimable:
 patientRL

 

Can someone shed more light on this error and/or suggest how I can do the analysis differently?

I'd appreciate any feedback.

Regards,

Nandan

EdgeR batch effect patient rnaseq
Aaron Lun ★ 28k
@alun
Last seen 9 hours ago
The city by the bay

There's no point putting in the libType factor, as the batch effect is fully absorbed into the patient-specific blocking factors. Consider an example gene where all Batch2 samples have a 2-fold increase in expression. You don't need a specific coefficient to account for this batch effect, as the 2-fold increase will be absorbed by the patient coefficients for all patients in the second batch. In summary, use:

design <- model.matrix(~patient+condition)

This should avoid the error in estimateGLMCommonDisp. Your previous matrix wasn't of full rank because the libType coefficient was redundant with the patient coefficients, for reasons described above.
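
The redundancy described above can be checked in base R, without edgeR. The following is a minimal sketch that rebuilds the factors from the sample table in the question and compares the two design matrices. The variable names (full, reduced, batch2_patients, and so on) are chosen here for illustration and are not part of the original analysis.

```r
# Sample annotation copied from the question (18 samples, 9 patients).
condition <- factor(rep(c("C1D1", "EOT"), each = 9))
libType <- factor(c("Batch1","Batch2","Batch2","Batch2","Batch1","Batch2",
                    "Batch2","Batch1","Batch2","Batch1","Batch1","Batch1",
                    "Batch2","Batch2","Batch2","Batch2","Batch2","Batch2"))
patient <- factor(c("AW","DD","DG","EC","EL","GR","LA","NR","RL",
                    "AW","NR","EL","DD","DG","EC","GR","LA","RL"))

# Batch is nested within patient: every patient appears in exactly one batch.
batch_by_patient <- table(patient, libType)
nested <- all(rowSums(batch_by_patient > 0) == 1)

# Design from the question (with libType) and the suggested design (without).
full <- model.matrix(~ libType + patient + condition)
reduced <- model.matrix(~ patient + condition)

# Numerical rank of each design matrix.
rank_full <- qr(full)$rank
rank_reduced <- qr(reduced)$rank

# The libTypeBatch2 column is the sum of the patient columns for the
# patients sequenced in Batch2, which is why one coefficient is not estimable.
batch2_ids <- rownames(batch_by_patient)[batch_by_patient[, "Batch2"] > 0]
batch2_patients <- paste0("patient", batch2_ids)
redundant <- all(full[, "libTypeBatch2"] == rowSums(full[, batch2_patients]))

cat("full design:    columns =", ncol(full), " rank =", rank_full, "\n")
cat("reduced design: columns =", ncol(reduced), " rank =", rank_reduced, "\n")
cat("libTypeBatch2 equals sum of Batch2 patient columns:", redundant, "\n")
```

In the reduced design the condition effect is the last coefficient (conditionEOT), which is the coefficient that glmLRT tests by default, so the paired C1D1 vs EOT comparison should not need any extra contrast specification.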

@ndeshpande-8759
Last seen 9.2 years ago
Australia

Thanks, Aaron, for the explanation.

I will use this design.

Cheers,

Nandan
