I have a dataset where for the same patient and time we have extracted different samples from different locations.
What would be the best way to encode this in a MultiAssayExperiment (MAE)?
Patient | Time | Region | State of the region
---|---|---|---
A | 0 | A | Healthy
A | 0 | B | Injured
A | 0 | C | Injured
A | 24 | A | Healthy
A | 24 | B | Injured
The sampleMap provides a "many-to-one" mapping, but when the phenotypes belong to each sample, how should I store them? I have several variables related to the patient (sex, age at diagnosis, disease, C-reactive protein, treatment followed, antibiotics, ...) and some related mainly to the sample (date of extraction, region extracted, state of that region, endoscopic score of the region, type of sample, ...).
The only way I can think of is to use a combination of Patient, Time and Location as the sample ID, something like paste(Patient, Time, Location, sep = "_"), but that would duplicate the patient information in order to store the sample information correctly.
Is there any better solution?
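To make the question concrete, here is a minimal sketch in base R of the composite ID idea, with patient-level and sample-level variables kept in separate tables so nothing is duplicated. It uses toy data that mirrors the table above. The `Sex` column, the `RNAseq` assay name and the object names (`samples`, `patient_info`, `sample_info`, `sample_map`) are made up for illustration. The last table only borrows the three column names of a MAE sampleMap (`assay`, `primary`, `colname`). It is not a full MultiAssayExperiment, and the sketch does not settle where the sample-level table should live inside one.

```r
# Toy data mirroring the table above. The Sex column is a hypothetical
# example of a patient-level variable.
samples <- data.frame(
  Patient = c("A", "A", "A", "A", "A"),
  Time    = c(0, 0, 0, 24, 24),
  Region  = c("A", "B", "C", "A", "B"),
  State   = c("Healthy", "Injured", "Injured", "Healthy", "Injured"),
  Sex     = "F",
  stringsAsFactors = FALSE
)

# sep = "_" joins the columns element-wise, giving one ID per row
# (collapse = "_" would instead glue all rows into a single string).
samples$SampleID <- paste(samples$Patient, samples$Time, samples$Region,
                          sep = "_")

# Patient-level variables: one row per patient, no duplication.
patient_info <- unique(samples[, c("Patient", "Sex")])
rownames(patient_info) <- patient_info$Patient

# Sample-level variables: one row per sample, keyed by the composite ID.
sample_info <- samples[, c("SampleID", "Time", "Region", "State")]

# Many-to-one link from samples to patients, using the column names of a
# MAE sampleMap. "RNAseq" is a hypothetical assay name.
sample_map <- data.frame(
  assay   = "RNAseq",
  primary = samples$Patient,
  colname = samples$SampleID,
  stringsAsFactors = FALSE
)
```

With this layout the patient table has a single row for patient A, while the five samples each keep their own row and point back to the patient through `primary`.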
Hi Levi, I didn't explain myself well, sorry.
I have some samples linked to a location (biopsies from 5 regions) and some that aren't (stools) [or that are always from the same region]. I have two assays for the biopsies (RNA-seq and 16S-seq) and one assay for the stools (16S-seq).
My main goal is to understand the relationship between assays. However, the biopsy regions behave differently, so the relationship between assays could depend on the region of the biopsy. At the same time, it would be interesting to see whether the relationship between biopsies and stools (RNA-seq to 16S-seq, 16S-seq to 16S-seq, or between all the assays) is shared across patients. I was considering having just one row per patient in order to see these common relationships between assays.
I hope I have explained myself a bit better. Many thanks