Hi all,
I am struggling with a statistical question about RNA-seq data. I collected sets of cells from 6 different persons (donors) and extracted RNA from them. There are 2 conditions, treated and untreated, so I ended up with 12 libraries: 6 treated and 6 untreated. My goal is to find genes that are differentially expressed between the conditions. After receiving the sequencing data (2x50 bp Illumina paired-end sequencing), I noticed a clear batch/individual effect across the 6 sets of samples: gene counts are higher in some persons than in others, which adds extra variation to the counts.
I used the DESeq2 package, first without the batch factor in the design and then with it. I get more differentially expressed genes when the batch factor is included.
So, do you think I should include the batch factor (the different donors) in the design to increase the sensitivity for detecting differences due to condition, or not?
Any feedback on alternative methods I could use would be very much appreciated!
Thank you so much.
Regards, Laurent
Here is my sampleTable:
   sampleName     fileName       condition batch
1  treated1.dat   treated1.dat   treated   D1
2  treated2.dat   treated2.dat   treated   D2
3  treated3.dat   treated3.dat   treated   D3
4  treated4.dat   treated4.dat   treated   D4
5  treated5.dat   treated5.dat   treated   D5
6  treated6.dat   treated6.dat   treated   D6
7  untreated1.dat untreated1.dat untreated D1
8  untreated2.dat untreated2.dat untreated D2
9  untreated3.dat untreated3.dat untreated D3
10 untreated4.dat untreated4.dat untreated D4
11 untreated5.dat untreated5.dat untreated D5
12 untreated6.dat untreated6.dat untreated D6
Your sample table is unreadable. Please fix this first. (Use the formatting button with "101" on it)
Hey Simon, I sorted it out on the user's behalf.
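On the design question itself: each donor contributes one treated and one untreated library, so this is a paired design. The toy sketch below may help show why accounting for the donor tends to raise sensitivity. It is not DESeq2 and does not use the real counts. The log2 values for a single gene are invented for illustration, and a plain t statistic stands in for DESeq2's negative binomial model. In DESeq2 the corresponding choice would be a design formula such as `~ batch + condition`, with `batch` and `condition` being the column names from the sampleTable above.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical log2 expression values for ONE gene, invented for illustration.
# Donor baselines differ a lot (the batch effect); treatment adds roughly +1.
donors = ["D1", "D2", "D3", "D4", "D5", "D6"]
untreated = [4.0, 6.0, 8.0, 5.0, 9.0, 7.0]
treated = [5.1, 6.9, 9.2, 5.8, 10.1, 8.0]


def unpaired_t(a, b):
    """Welch t statistic: ignores which donor each sample came from."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def paired_t(a, b):
    """Paired t statistic: compares each donor only with itself."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


t_unpaired = unpaired_t(treated, untreated)
t_paired = paired_t(treated, untreated)
print(f"unpaired t = {t_unpaired:.2f}, paired t = {t_paired:.2f}")
```

In this toy case the donor-to-donor spread swamps the treatment effect when the pairing is ignored, while the within-donor differences are very consistent, so the paired statistic is much larger. That is the same mechanism by which adding the donor term to the design can yield more differentially expressed genes.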