The paper that you mentioned is a general introduction to Pigengene. Our following paper is more focused on Bayesian networks (BNs): Agrahari, Rupesh, et al. "Applications of Bayesian network models in predicting types of hematological malignancies." Scientific Reports 8.1 (2018): 6951. I answer your questions below:
1) How to use pigengene for pre-calculated eigengenes for consensus modules?
This can be done by following the steps inside the one.step.pigengene() function. Specifically,
- Use check.pigengene.input() for QC:
c1 <- check.pigengene.input(Data=Data, Labels=Labels, na.rm=TRUE)
Data <- c1$Data
Labels <- c1$Labels
- Use compute.pigengene() to compute a pigengene object.
- Train your model using learn.bn().
2) If I use your pipeline, you have shown only two conditions i.e aml and mds but I have 5 conditions, can pigengene work with that?
Yes. For example, I made a mock example with a third condition from the compute.pigengene() example:
library(Pigengene)
data(aml)
data(mds)
data(eigengenes33)
d1 <- rbind(aml,mds)
Labels <- c(rep("AML",nrow(aml)),rep("MDS",nrow(mds)))
names(Labels) <- rownames(d1)
## Let's add another condition:
Labels[1:60] <- "Mock"
modules33 <- eigengenes33$modules[colnames(d1)]
## Computing:
pigengene <- compute.pigengene(Data=d1, Labels=Labels, modules=modules33,
saveFile="pigengene.RData", doPlot=TRUE, verbose=3)
class(pigengene)
plot(pigengene, DiseaseColors=1:3, fontsize=12)
Fitting a BN is also possible. I checked this by running the learn.bn() example with the mock Labels above.
3) Can I perform logistic regression with eigengenes and different phases of disease which are coded as 0 or 1?
Yes. In the paper, we showed that eigengenes are informative features (biomarkers), which can be used efficiently in different predictive models including decision trees, BNs, etc. Logistic regression is not an exception.
How many modules do you have? If you have many modules (20-30) and few samples (5-10), and you have no prior idea of which modules to use, then using all modules can lead to overfitting. See the “Rule of Ten”.
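To make the Rule of Ten concrete, here is a minimal sketch in base R. It uses simulated eigengenes, not real data; the sizes (90 samples, 25 modules) and the names ME1, ME2, ... are assumptions for illustration only. The rule suggests roughly one predictor per ten samples in the less frequent class, so the sketch caps the number of modules before fitting. Taking the first few columns is only a placeholder for a choice you would make from prior knowledge.

```r
set.seed(1)

## Simulated stand-in for an eigengene matrix (samples x modules).
## The sizes and the ME* names are assumptions, not real data.
n <- 90
moduleNum <- 25
eigengenes <- matrix(rnorm(n * moduleNum), nrow = n,
                     dimnames = list(paste0("S", 1:n), paste0("ME", 1:moduleNum)))

## Binary phase coded as 0/1; here it depends on two modules by construction.
phase <- rbinom(n, 1, plogis(1.5 * eigengenes[, "ME1"] - eigengenes[, "ME2"]))

## Rule of Ten: about one predictor per 10 samples of the rarer class.
events <- min(sum(phase == 1), sum(phase == 0))
maxPredictors <- max(1, floor(events / 10))

## Placeholder choice: the first few modules. In practice, choose them from
## prior knowledge, not by screening on the same samples.
selected <- colnames(eigengenes)[seq_len(maxPredictors)]

d1 <- data.frame(phase = phase, eigengenes[, selected, drop = FALSE])
fit <- glm(phase ~ ., data = d1, family = binomial)
print(summary(fit)$coefficients)
```

If you have more than two phases, you would need either one such model per pair of phases or a multinomial model, and the per-model sample size shrinks accordingly.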
4) I want to see the BN network of consensus eigengenes and how they change in different phases of depression.
You can use the draw.bn() function to plot a BN. However, training a BN needs many samples (at least hundreds), and I guess you have only tens of samples. Training a BN for a specific phase of depression would then be even more difficult because of the limited number of samples. I recommend you use all your samples to train ONE BN; the information you are looking for will then be in the dependency table of the Disease variable. Specifically, I recommend you set use.Disease=TRUE and use.Effect=FALSE. This is the opposite of the approach in our Scientific Reports paper (see “The BN design” in Supplementary Note S1).
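A rough sketch of training one BN on all samples and drawing it is below. It follows my recollection of the learn.bn() help example with the mock third condition added, so treat the details as assumptions: that eigengenes33 carries the aml and mds eigengene matrices, that the consensus network is stored at learnt$consensus1$BN, and the folder name "bnMock" and bnNum=10 are arbitrary choices for a quick run. Please check ?learn.bn and ?draw.bn for your installed version.

```r
library(Pigengene)
data(eigengenes33)

## Pre-computed eigengenes shipped with the package
## (assumed to be stored as eigengenes33$aml and eigengenes33$mds).
amlE <- eigengenes33$aml
mdsE <- eigengenes33$mds
eigengenes <- rbind(amlE, mdsE)
Labels <- c(rep("AML", nrow(amlE)), rep("MDS", nrow(mdsE)))
names(Labels) <- rownames(eigengenes)

## Mock third condition, as in the compute.pigengene() example above.
Labels[1:60] <- "Mock"

## Train ONE BN on all samples, with Disease included and Effect excluded.
## "bnMock" is a hypothetical output folder; bnNum is kept small for speed.
learnt <- learn.bn(Data=eigengenes, Labels=Labels, bnPath="bnMock",
                   bnNum=10, use.Disease=TRUE, use.Effect=FALSE)

## Assumed location of the consensus network in the returned list.
bn <- learnt$consensus1$BN
drawn <- draw.bn(bn)
```

With your own data, Data would be your samples-by-modules matrix of consensus eigengenes and Labels the named vector of the 5 phases.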
Thanks
I have around 20 to 26 modules. I am giving a range because the number of modules can vary depending on the merge tree cut height selected. I have 20 samples per phase, except for 2 phases for which I have 15 samples.
Based on what I understand from your text, I first need to compute pigengenes using the compute.pigengene() function. Here I will label the 5 phases and provide the data and the computed eigengenes as input.
Next, I will use the output of the compute.pigengene() function, i.e. the "pigengene" object, in the learn.bn() function, right?
Can you be more precise about how to use the other parameters in this case?
Where can I find the dependency table of the Disease variable in the example you have used in your package?
Finally, once everything goes well in the learn.bn step, I will draw the network using draw.bn, right?
Thanks
Ram