**40** wrote:

I have a question about how to analyze a mix of biological and semi-technical replicates.

The experiment I am analyzing consists of 3 cell lines × 3 replicates per cell line × 2 conditions (18 samples). The 3 replicates of each cell line were treated, processed and sequenced independently, so they aren't "hard" technical replicates, but they are also not biological replicates in the way the 3 cell lines are. They correlate more strongly with each other (and cluster more closely in a PCA) than with the other biological replicates (cell lines). The experiment is paired: each sample is split and treated with treatments A and B. The 3 cell lines are sequenced together in 3 groups (the replicate group, `repl_group`, in the table below).

What is the best way to analyze these data? Is a paired analysis (~condition + pair) OK? Or should I average the semi-technical replicates? How else could I account for the different levels of correlation between replicates and between cell lines?

Analyzing with ~condition + pair or ~condition + cell_line yields DEGs fairly similar to those from analyzing only one replicate group, with consistent GO enrichment (but many more DEGs). However, I wonder whether using the semi-technical replicates in the same way as the biological replicates is increasing the type I error. It doesn't seem to be, judging by the consistent GO fold-enrichment of some interesting terms.

Thank you!

```
condition cell_line repl_group pair
A C1 1 c1-1
A C2 1 c2-1
A C3 1 c3-1
A C1 2 c1-2
A C2 2 c2-2
A C3 2 c3-2
A C1 3 c1-3
A C2 3 c2-3
A C3 3 c3-3
B C1 1 c1-1
B C2 1 c2-1
B C3 1 c3-1
B C1 2 c1-2
B C2 2 c2-2
B C3 2 c3-2
B C1 3 c1-3
B C2 3 c2-3
B C3 3 c3-3
```
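
To make the two designs concrete, here is a minimal Python sketch (standard library only) that rebuilds the sample sheet above and counts the model columns and residual degrees of freedom for each formula. It assumes plain treatment (dummy) coding with an intercept; the helper names (`dummy_columns`, `design_matrix`, `matrix_rank`, `summarize`) are illustrative and not taken from any package. It is only bookkeeping on the design and says nothing about which model is statistically appropriate.

```python
from fractions import Fraction
from itertools import product

# Rebuild the 18-row sample sheet: 2 conditions x 3 replicate groups x 3 cell lines.
samples = [
    {"condition": cond, "cell_line": f"C{c}", "repl_group": g, "pair": f"c{c}-{g}"}
    for cond, g, c in product("AB", (1, 2, 3), (1, 2, 3))
]


def dummy_columns(rows, factor):
    """Indicator columns for one factor; the first sorted level is the reference."""
    levels = sorted({r[factor] for r in rows})
    return [[1 if r[factor] == lvl else 0 for r in rows] for lvl in levels[1:]]


def design_matrix(rows, factors):
    """Intercept plus dummy-coded columns, returned as a list of sample rows."""
    cols = [[1] * len(rows)]
    for f in factors:
        cols.extend(dummy_columns(rows, f))
    return [list(row) for row in zip(*cols)]


def matrix_rank(matrix):
    """Rank by exact Gaussian elimination on fractions (no rounding issues)."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rank = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # column is a linear combination of earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][c] != 0:
                scale = m[i][c] / m[rank][c]
                m[i] = [a - scale * b for a, b in zip(m[i], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def summarize(factors):
    """Column count, rank and residual degrees of freedom for one design."""
    x = design_matrix(samples, factors)
    rk = matrix_rank(x)
    return {"samples": len(x), "columns": len(x[0]), "rank": rk,
            "residual_df": len(x) - rk}


for design in (["condition", "pair"],
               ["condition", "cell_line"],
               ["condition", "cell_line", "pair"]):
    print("~" + " + ".join(design), summarize(design))
```

With this coding, ~condition + pair uses 10 columns and leaves 8 residual degrees of freedom, while ~condition + cell_line uses 4 columns and leaves 14. Adding both cell_line and pair gives 12 columns but still rank 10, because each pair belongs to exactly one cell line, so the cell_line effect is already absorbed by pair.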