**150**wrote:

Dear Bioconductor community,

I am currently struggling with fitting a linear model to Illumina BeadChp HT12.v4 data. I have different cell generations (G1, G2, G3 etc) taken form the same clone. I assume these as biological replicates. For each generation I have different measurements. I assume these as technical replicates. Microarray data were measured in two different batches. Finally, I have a specific mutation that I am looking into: T790M: this is present in some of the samples and not present in others. Part of my targets file looks like this:

Sample | BiolRep | T790M | Batch | CellClone |

CL75_G1.2 | 1 | pos | 1 | CL75 |

CL75_G2 | 2 | pos | 1 | CL75 |

CL75_G2 | 2 | pos | 1 | CL75 |

CL75_G2 | 2 | pos | 1 | CL75 |

CL75_G3 | 3 | neg | 1 | CL75 |

CL75_G3 | 3 | neg | 1 | CL75 |

CL75_G3 | 3 | neg | 1 | CL75 |

CL86_G4 | 4 | neg | 0 | CL86 |

CL86_G4 | 4 | neg | 0 | CL86 |

CL86_G4 | 4 | neg | 0 | CL86 |

CL86_G5 | 5 | neg | 0 | CL86 |

CL86_G5 | 5 | neg | 0 |
CL86 |

I want to find differentially expressed genes between T790M positive (pos) and T790M negative (neg) samples. I am following limma's user guide for analysis of BeadArrays (section 17.3). I have reached the state where I need to factorize by my replicates. The guide says:"The study involves multiple cell types for the same patient..."

ct <- factor(targets$CellType)

and then the DE genes between these cell types are detected.

In my case it is different. I tried to factorize based on the Sample and on the CellClone and include the batch in my design matrix:

ct<-factor(targets$Sample) design <- model.matrix(~0+ct+targets$Batch)

but I realized that if I follow the example given by limma, I then have to detect the differences of the variables that rcorrespond to the colnames of my design matrix, which would be CL75_G.2, CL86_G5 etc..This is not the comparison that I want.

I then tried

ct<-factor(targets$T790M) design <- model.matrix(~0+ct+targets$Batch) dupcor <- duplicateCorrelation(gefR, design, block=targets$Sample) #cor between samples of the same clone dupcor$consensus.correlation [1] 0.0723

then give the pos, neg and batch as column names of my design matrix and try to detect the DE genes between T790M pos and T790M neg, using the suggested procedure.

fit <- lmFit(myMatrix, design, block=targets$Sample, correlation=dupcor$consensus.correlation) # the block indicates the technical replicates. contrasts <- makeContrasts('pos-neg',levels=design) fit2 <- contrasts.fit(fit, contrasts) fit2 <- eBayes(fit2) summary(decideTests(fit2, method="global"))

The outcome was 0 DE genes (4597 are the total genes that are truly expressed on my array, according to their Detection p-value)

pos-neg -1 0 0 4597 1 0

I cannot imagine this result is true. I am sure I am missing something but I cannot understand what... The level of complexity of my experiment is grater than the complexity of the limma's example...I would really appreciate any comments/answers/insights from you.

Thank you very much!

Eleni

**17k**• written 22 months ago by Eleni Christodoulou •

**150**