I am working on RNAseq data of parasites from three populations (p1, p2, p3) during its interaction between two different hosts (h1 and h2). so finally I have 6 conditions (3 populations X 2 hosts) and each condition have 3 replicates. It has been observed that p1h1, p2h1, p3h1 and p2h2 have same phenotype (S) while p1h2 and p2h2 have another phenotype (R).
when it comes to DE design, I used two methods:
Method 1. subset read data and calculate DE genes for each factor (parasite, host) independently.
for example: to find DE genes during interaction with different hosts in parasite p1
dds <- DESeqDataSetFromMatrix(countData = ReadCount[which(parasite==p1),], colData = condition, design= ~ host ) dds <- DESeq(dds) HostDiff.p1 <- results(dds, contrast=("host", "h1","h2") # host effect on p1: 230 DE genes (padj < 0.01)
Method 2: using the whole data, calculate DE genes using multiple factor model including interaction term; then extract DE result for each factor (parasite, host)
dds <- DESeqDataSetFromMatrix(countData = ReadCount[which(parasite==p1),], colData = condition, design= ~ parasite + host + parasite:host ) dds <- DESeq(dds) HostEffect <- results(dds, contrast=list("hosth1","hosth2")) # main host effect: 1 DE HostEffect.p1 <- results(cds, contrast=list("parasitep1.hosth1","parasitep1.hosth2")) # interaction term on parasite p1: 62 DE HostEffect2.p1 <- results(cds, contrast=list(c("hosth1","parasitep1.hosth1"),c("hosth2","parasitep1.hosth2")) # host term on parasite p1: 117 DE
I found that number of differential expressed genes due to host difference are quite different based on two methods. HostEffect2.p1 and HostDiff.p1 shared only 35 contigs.
Then, here are my questions:
1. why were DE genes quite different based on two methods, although testing the same factor?
2. which method is more reliable?
3. What is difference on biological meaning between 'interaction term' result (HostEffect.p1) and 'host term' result (HostEffect2.p1) ? According to the
results help page,
HostEffect.p1 means "host-parasite interaction effect for host in parasite p1" while
HostEffect2.p1 is the sum of "main effect for host" and "interaction effect for host" in parasite p1. This explanation is not that clear to me. Since parasite factor is fixed in interaction term of p1, interaction difference relied only on host effect. while "main effect for host" is still the host effect but just a more general comparison among all parasites. Then if I want to know host effect on parasite p1, which one should I choose,