Hello out there.
I'm currently trying to use the pathifier package on a microarray dataset. The dataset looks like this:
| Genes | Cell-line-A | Cell-line-A-Treated | Cell-line-B | Cell-line-B-Treated | Cell-line-C | Cell-line-C-Treated |
| --- | --- | --- | --- | --- | --- | --- |
| Gene1 | exprs.val | exprs.val | exprs.val | exprs.val | exprs.val | exprs.val |
| Gene2 | exprs.val | exprs.val | exprs.val | exprs.val | exprs.val | exprs.val |
There are 22,000 genes in total. The point is that, as you can see, the dataset has only one biological replicate per cell line and only one biological replicate per treatment.
Question #1:
Currently I pass `TRUE, FALSE, TRUE, FALSE, TRUE, FALSE` as the `normals` argument of `quantify_pathways_deregulation()`. I don't think this is the right approach, though, because the three cell lines differ somewhat from one another. My second thought is to split the dataset into three sub-datasets, one per cell line, and run the function three times, once per cell line, with `normals` set to `TRUE, FALSE`. What is your opinion on that?
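To make the two options concrete, here is a rough sketch of what I mean. It uses only base R and a made-up toy matrix: the object names (`exprs`, `cell_line`, `per_line`) and the simplified column names are mine, not anything from the package, and I have not checked whether pathifier behaves sensibly with only two samples per run.

```r
# Toy stand-in for my data: 5 genes x 6 samples (random values, not real data).
set.seed(1)
samples <- c("A", "A_Treated", "B", "B_Treated", "C", "C_Treated")
exprs <- matrix(rnorm(5 * 6, mean = 8), nrow = 5,
                dimnames = list(paste0("Gene", 1:5), samples))

# Cell line of each column, and which columns are untreated ("normal").
cell_line <- sub("_Treated$", "", samples)
normals   <- !grepl("_Treated$", samples)

# Option 1: a single run on all six samples would use this vector.
print(normals)

# Option 2: one sub-matrix and one normals vector per cell line.
per_line <- lapply(split(seq_along(samples), cell_line), function(idx) {
  list(data = exprs[, idx, drop = FALSE], normals = normals[idx])
})
print(per_line$A$normals)
```

With option 2, each element of `per_line` would supply the `data` and `normals` arguments for one separate call to `quantify_pathways_deregulation()`, with the remaining arguments unchanged.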
Question #2:
While reading the pathifier documentation to configure it properly for my dataset, I saw that they suggest setting the `min_std` argument to the technical noise:
"min_std: The minimal allowed standard deviation of each gene. Genes with a lower standard deviation are divided by min_std instead of their actual standard deviation. (Recommended: set min_std to be the technical noise)."
But how do I calculate it? For my first run I just used the value 0.2254005 from the example provided with the package, but this might not be right for my dataset.
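For reference, this is how I currently understand the effect of `min_std`, based only on the sentence quoted above. It is a sketch of my reading of the documentation, not the package's actual code, and the per-gene standard deviations are invented for illustration.

```r
# Made-up per-gene standard deviations, for illustration only.
gene_sd <- c(Gene1 = 0.05, Gene2 = 0.30, Gene3 = 1.20)

# Value copied from the example shipped with the package.
min_std <- 0.2254005

# Genes whose sd is below min_std are divided by min_std instead;
# all other genes are divided by their own sd.
divisor <- pmax(gene_sd, min_std)
print(divisor)
```

So a value that is too high for my data would flatten many genes to the same scale, and a value that is too low would let noisy, nearly constant genes be inflated, which is why I would like to estimate it properly.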
Any idea or hint is welcome.