Dear all,
I am trying to run a differential expression analysis on 200 tumor samples and 170 paired normal tissue samples, each normal sample coming from the same patient as one of the tumors.
We intend to include all 200 patients in the same analysis (same paper), and I am wondering how I can analyze them together in a statistically acceptable way without losing the 30 tumor samples that have no paired normal tissue.
The metadata would then look like this (for this example, imagine that I have 20 tumor samples instead of 200: 17 patients are paired and 3 are unpaired):
   sample patient   type
 1      1       1 Normal
 2      2       2 Normal
 3      3       3 Normal
 4      4       4 Normal
 5      5       5 Normal
 6      6       6 Normal
 7      7       7 Normal
 8      8       8 Normal
 9      9       9 Normal
10     10      10 Normal
11     11      11 Normal
12     12      12 Normal
13     13      13 Normal
14     14      14 Normal
15     15      15 Normal
16     16      16 Normal
17     17      17 Normal
18     18       1  Tumor
19     19       2  Tumor
20     20       3  Tumor
21     21       4  Tumor
22     22       5  Tumor
23     23       6  Tumor
24     24       7  Tumor
25     25       8  Tumor
26     26       9  Tumor
27     27      10  Tumor
28     28      11  Tumor
29     29      12  Tumor
30     30      13  Tumor
31     31      14  Tumor
32     32      15  Tumor
33     33      16  Tumor
34     34      17  Tumor
35     35      18  Tumor
36     36      19  Tumor
37     37      20  Tumor
Can I put them all together in the same design? Does this look right to you?
Since I want to run the unpaired and paired (when available) comparisons at the same time, can I define the model matrix like this?
design <- model.matrix(~ patient + type)
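To make the question concrete, here is a minimal sketch of how that design could be built for the toy example. It assumes the metadata sits in a data frame called `meta` (a hypothetical name) and that `patient` is coded as a factor. If `patient` were left numeric, `model.matrix` would treat it as a single continuous covariate rather than as a blocking factor.

```r
# Hypothetical reconstruction of the 20-patient toy example:
# 17 normal samples (patients 1-17) followed by 20 tumor samples (patients 1-20).
meta <- data.frame(
  sample  = 1:37,
  patient = factor(c(1:17, 1:20)),
  type    = factor(rep(c("Normal", "Tumor"), times = c(17, 20)),
                   levels = c("Normal", "Tumor"))  # Normal is the reference level
)

# Patient as a blocking factor plus the tumor-vs-normal effect.
design <- model.matrix(~ patient + type, data = meta)

# One intercept, 19 patient columns, and one typeTumor column.
dim(design)
colnames(design)[ncol(design)]

# Check that the design is full rank, i.e. all coefficients are estimable.
qr(design)$rank == ncol(design)
```

Note that in this layout patients 18 to 20 each get a patient coefficient that is fitted from a single sample, so it is worth checking how much those unpaired samples actually contribute to the `typeTumor` estimate.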
After that, the main idea is to compare the resulting log2 fold changes in order to find subgroups of patients with distinct expression patterns. Does that sound acceptable?
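As a rough illustration of what I mean by comparing fold changes across patients, the sketch below computes a tumor-minus-normal log2 difference per paired patient and clusters the patients on those values. Everything here is an assumption for illustration only: `logexpr` is a simulated matrix of log2 expression values (genes in rows, samples in columns), and the choice of Euclidean distance, `hclust`, and two clusters is arbitrary rather than a recommended method.

```r
set.seed(1)  # simulated values only, not real data

patient <- c(1:17, 1:20)
type    <- rep(c("Normal", "Tumor"), times = c(17, 20))

# Simulated log2 expression matrix: 50 hypothetical genes x 37 samples.
n_genes <- 50
logexpr <- matrix(rnorm(n_genes * length(patient)), nrow = n_genes,
                  dimnames = list(paste0("gene", seq_len(n_genes)),
                                  paste0("s", seq_along(patient))))

# Patients that have both a normal and a tumor sample.
paired <- intersect(patient[type == "Normal"], patient[type == "Tumor"])

# Column index of the normal and the tumor sample for each paired patient.
normal_col <- sapply(paired, function(p) which(patient == p & type == "Normal"))
tumor_col  <- sapply(paired, function(p) which(patient == p & type == "Tumor"))

# Per-patient log2 fold change: tumor minus matched normal, gene by gene.
lfc <- logexpr[, tumor_col] - logexpr[, normal_col]
colnames(lfc) <- paste0("patient", paired)

# Cluster patients on their fold-change profiles and cut into 2 groups.
hc     <- hclust(dist(t(lfc)))
groups <- cutree(hc, k = 2)
table(groups)
```

The three unpaired patients have no matched normal sample, so no per-patient fold change can be computed for them this way.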
Thank you in advance,
David.
Sorry, I have just updated the question, as I noticed I wasn't following the R Markdown guidelines.
I hope someone can help now.
Cheers!