Question: Pigengene : BN network using consensus module
0
7 months ago by
rammohanshukla10 wrote:

Hi Habil,

I performed consensus WGCNA analysis to find conserved modules across 5 phases of a disease.

Now, as a next step, I would like to use the eigengenes from these conserved modules and make a Bayesian network similar to what you have made in your papers and use the disorders to see how the different phases (control, Episode1, etc) are associated with different modules.

My questions are:

1) How to use pigengene for pre-calculated eigengenes for consensus modules?

2) If I use your pipeline, you have shown only two conditions, i.e., AML and MDS, but I have 5 conditions. Can Pigengene work with that?

3) Can I perform logistic regression with eigengenes and different phases of disease which are coded as 0 or 1?

4) I want to see the BN of consensus eigengenes and how they change in different phases of depression.

Ram

pigengene • 247 views
modified 7 months ago by Habil Zare170 • written 7 months ago by rammohanshukla10
Answer: Pigengene : BN network using consensus module
3
7 months ago by
Habil Zare170
United States/Austin Area
Habil Zare170 wrote:

The paper that you mentioned is a general introduction to Pigengene. Our following paper is more focused on Bayesian networks (BNs): Agrahari, Rupesh, et al. "Applications of Bayesian network models in predicting types of hematological malignancies." Scientific Reports 8.1 (2018): 6951. I answer your questions below:

1) How to use pigengene for pre-calculated eigengenes for consensus modules?
This can be done if you inspect the one.step.pigengene() function. Specifically,
- Use check.pigengene.input() for QC:
    c1 <- check.pigengene.input(Data=Data, Labels=Labels, na.rm=TRUE)
    Data <- c1$Data
    Labels <- c1$Labels
- Use compute.pigengene() to compute a pigengene object.
- Train your model using learn.bn().
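
For eigengenes that are already computed, the last step can be sketched as follows. This is a minimal sketch modeled on the learn.bn() example shipped with the package, under the assumption that learn.bn() accepts an eigengene matrix directly through its Data argument; check ?learn.bn before relying on it. The eigengenes33 data stand in for your own consensus eigengenes, and bnNum=10 is only there to keep the run short.

```r
library(Pigengene)

## Stand-in for pre-computed consensus eigengenes:
## samples in rows, module eigengenes in columns.
data(eigengenes33)
amlE <- eigengenes33$aml
mdsE <- eigengenes33$mds
eigengenes <- rbind(amlE, mdsE)

## One label per sample, named by sample ID.
Labels <- c(rep("AML", nrow(amlE)), rep("MDS", nrow(mdsE)))
names(Labels) <- rownames(eigengenes)

## Learn the BN structure from the eigengene matrix.
## bnNum=10 keeps this sketch fast; it is too small for a real analysis.
learnt <- learn.bn(Data=eigengenes, Labels=Labels,
                   bnPath="bnExample", bnNum=10, seed=1)
bn <- learnt$consensus1$BN
```

The resulting bn object is what draw.bn() plots.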

2) If I use your pipeline, you have shown only two conditions, i.e., AML and MDS, but I have 5 conditions. Can Pigengene work with that?
Yes, e.g., I made a mock example from the compute.pigengene() example:
    library(Pigengene)
    data(aml)
    data(mds)
    data(eigengenes33)
    d1 <- rbind(aml,mds)
    Labels <- c(rep("AML",nrow(aml)),rep("MDS",nrow(mds)))
    names(Labels) <- rownames(d1)
    ## Let's add another condition:
    Labels[1:60] <- "Mock"
    modules33 <- eigengenes33$modules[colnames(d1)]
    ## Computing:
    pigengene <- compute.pigengene(Data=d1, Labels=Labels, modules=modules33, saveFile="pigengene.RData", doPlot=TRUE, verbose=3)
    class(pigengene)
    plot(pigengene, DiseaseColors=1:3, fontsize=12)
Fitting a BN is also possible. I checked the learn.bn() example using the above mock Labels.

3) Can I perform logistic regression with eigengenes and different phases of disease which are coded as 0 or 1?
Yes. In the paper, we showed that eigengenes are informative features (biomarkers), which can be used efficiently in different predictive models including decision trees, BNs, etc. Logistic regression is not an exception. How many modules do you have? If you have many modules (20-30), few samples (5-10), and no prior idea of which modules to use, then using all modules can lead to overfitting. See the “Rule of Ten”.

4) I want to see the BN network of consensus eigengenes and how they change in different phases of depression.
You can use the draw.bn() function to plot a BN. However, training a BN needs many samples (at least hundreds), and I guess you have only tens of samples. Training a BN for a specific phase of depression would be even more difficult because of the limited number of samples. I recommend you use all your samples to train ONE BN; the information you are looking for will then be in the dependency table of the Disease variable. Specifically, I recommend you set use.Disease=TRUE and use.Effect=FALSE. This is the opposite of the approach in our Scientific Reports paper (see “The BN design” in Supplementary Note S1).
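
The logistic-regression point in (3) can be sketched in base R. This is a minimal illustration on simulated data, not the analysis from the paper: the phase names, sample sizes, and the eigengene columns ME1 to ME3 are all hypothetical. One phase is coded as 1 and every other phase as 0, and the last lines check the rule of ten by counting events in the rarer class per predictor.

```r
set.seed(1)

## Hypothetical design: 5 phases with 20 samples each.
phases <- rep(c("Control", "Episode1", "Episode2", "Episode3", "Remission"),
              each = 20)
n <- length(phases)

## Hypothetical eigengenes; ME1 is shifted upward in Episode1
## so that the toy model has a signal to find.
eig <- data.frame(ME1 = rnorm(n), ME2 = rnorm(n), ME3 = rnorm(n))
eig$ME1 <- eig$ME1 + 2 * (phases == "Episode1")

## One-vs-rest coding: Episode1 is 1, every other phase is 0.
d <- cbind(eig, y = as.integer(phases == "Episode1"))

fit <- glm(y ~ ME1 + ME2 + ME3, data = d, family = binomial())
print(summary(fit)$coefficients)

## Rule of ten: events in the rarer class divided by the number of predictors.
events <- min(sum(d$y == 1), sum(d$y == 0))
epv <- events / ncol(eig)
cat("events per variable:", epv, "\n")
```

With 20 events and 3 predictors the ratio is already below 10, which shows how quickly 20 or more modules would exhaust a data set of this size.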
modified 7 months ago • written 7 months ago by Habil Zare170

Thanks,

I have around 20 to 26 modules. I am giving a range because the number of modules can vary depending on the merge tree cut height selected. I have 20 samples per phase, except for 2 phases for which I have 15 samples.

Based on what I understand from your text, I first need to compute pigengenes using the compute.pigengene() function. Here I will label the 5 phases and provide the data and the computed eigengenes as input.

Next, I will use the output of the compute.pigengene() function, i.e., the "pigengene" object, in the learn.bn() function, right? Can you be more precise about how to use the other parameters in this case?

Where can I find the dependency table of the Disease variable in the example you have used in your package?

Finally, once everything goes well in the learn.bn step, I will draw the network using draw.bn, right?

Thanks,
Ram

written 7 months ago by rammohanshukla10
Answer: Pigengene : BN network using consensus module
3
7 months ago by
Habil Zare170
United States/Austin Area
Habil Zare170 wrote:

- The input to the compute.pigengene() function includes data and labels, NOT the eigengenes. They are computed by this function.
- The parameters are explained in the manual of each function, e.g., see ?learn.bn. Is there a particular parameter you have a question on?
- After you have the BN structure learned (see example(learn.bn)), you can fit it to the data using:
    fit <- bnlearn::bn.fit(x=learnt$consensus1$BN, data=learnt$consensus1$Data, method="bayes", iss=10)
  And now fit$Disease is the conditional probability table. Also, see the tables corresponding to children of this node, e.g., fit$ME9.
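
To see what such fitted tables look like without running the full pipeline, here is a minimal sketch using bnlearn alone. The data, the level names, and the single arc Disease -> ME1 are hypothetical; in practice the structure and the discretized data would come from learn.bn().

```r
library(bnlearn)
set.seed(1)

## Hypothetical discretized data: a two-level label and one three-level
## eigengene whose distribution depends on the label.
n <- 200
Disease <- factor(sample(c("Control", "Case"), n, replace = TRUE))
p_low <- ifelse(Disease == "Case", 0.6, 0.2)
ME1 <- factor(ifelse(runif(n) < p_low, "low",
                     sample(c("mid", "high"), n, replace = TRUE)),
              levels = c("low", "mid", "high"))
toy <- data.frame(Disease = Disease, ME1 = ME1)

## Fix the structure by hand for this toy example.
net <- model2network("[Disease][ME1|Disease]")

## Bayesian parameter estimates with imaginary sample size 10.
fit <- bn.fit(x = net, data = toy, method = "bayes", iss = 10)

print(fit$Disease)  # table for the root node
print(fit$ME1)      # one column per Disease level
cpt <- fit$ME1$prob # the same table as a plain R object
```

Each column of cpt is a probability distribution over the levels of ME1 for one value of Disease, so each column sums to 1.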
written 7 months ago by Habil Zare170

Thanks again. I think I am now understanding the concepts, but I have a few more questions.

Suppose I have two conditions: 1) control and 2) schizophrenia patient. I believe we cannot have 0 and 1 codes assigned to control and schizophrenia; instead, the program will compute a conditional probability between the two conditions and label it as "disease", right? Can't we have modules which are children of schizophrenia instead? Can I add a column of 0s and 1s representing the disease next to the modules in the eigengene matrix (as in example(learn.bn)) and then run learn.bn? Please see this paper, where the authors have constructed a BN with effects (for instance, schizophrenia in the present example) as one of the nodes.

How do I interpret the output of fit$ME9, which is given below?

    > fit$ME9

      Parameters of node ME9 (multinomial distribution)

    Conditional probability table:

                        Disease
    ME9                        AML       MDS
      (-0.029,-0.00833]  0.1433172 0.6015779
      (-0.00833,0.00635] 0.2447665 0.1518738
      (0.00635,0.0366]   0.6119163 0.2465483

What are the ranges in column ME9?

Does the bnNum parameter in the learn.bn function represent the number of permutations? I think this question depends on my understanding of Bayesian networks, which is very poor. Can you provide a reference paper which can give a simple "Bayesian network for Dummies" kind of explanation?

Is there a way I can bring and edit the graph in Cytoscape or any such program?

written 7 months ago by rammohanshukla10
Answer: Pigengene : BN network using consensus module
0
7 months ago by
Habil Zare170
United States/Austin Area
Habil Zare170 wrote:

1- In the learn.bn function, Labels is a named vector of characters. It can have "0" and "1" values if you prefer that. I am not sure I get your question right. "Disease" is just the name of the variable that has Labels values. The name of the variable does not matter. Modules can be a child of this node.

2- In Fig. 3a of the paper you referred to, 3 different conditions (i.e., relevant AD traits) are associated with each case: 1) β-amyloid, 2) tau tangles, and 3) cognitive decline. These conditions are modeled using 3 nodes in the BN: Amyloid, tau, and CogDec, respectively. You have only 1 condition, which can take 2 values: schizophrenia or control.

3- What are the ranges in column ME9?
The value of each eigengene is discretized into 3 levels because use.Hartemink is TRUE by default. See the manual.

4- The only parent of ME9 is Disease; therefore, based on the Markov property, given the disease type, the probability of each level of ME9 is determined independently of the rest of the network.
The fit$ME9 is the conditional probability table, e.g., if the disease is AML, the levels of the ME9 eigengene are observed with probabilities: 0.14, 0.25, 0.61, respectively.
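
The interval labels and the column-wise probabilities in such a table can be reproduced by hand in base R. In this minimal sketch the eigengene values are simulated, and equal-frequency tertiles are used as a simple stand-in for the Hartemink discretization, so the breakpoints will not match what learn.bn() produces.

```r
set.seed(1)

## Hypothetical eigengene values for two groups with shifted means.
disease <- rep(c("AML", "MDS"), each = 100)
me <- rnorm(200, mean = ifelse(disease == "AML", 0.01, -0.01), sd = 0.01)

## Discretize into 3 levels. Tertile breaks are a stand-in here,
## NOT the Hartemink method that learn.bn uses by default.
breaks <- quantile(me, probs = c(0, 1/3, 2/3, 1))
level <- cut(me, breaks = breaks, include.lowest = TRUE)

## P(level | disease): normalize each disease column to sum to 1.
cpt <- prop.table(table(ME = level, Disease = disease), margin = 2)
print(round(cpt, 2))
```

Reading down a column gives the probability of each eigengene level for that disease type, which is how the AML column above is interpreted.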

5- To fit a BN to data, we start from a random structure and try to maximize a score. We repeat this process bnNum times. So, the more the better because there will be more chances that you find a "good" network. For 10-20 modules, bnNum=1000 should be OK, and <100 might be too few.
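
The effect of repeating the search can be illustrated with bnlearn directly. This is a generic sketch of score-based learning from random starting structures on the learning.test data set that ships with bnlearn; it does not reproduce how learn.bn() runs or combines its bnNum networks, and the choice of 5 runs is arbitrary.

```r
library(bnlearn)
set.seed(1)

## Small discrete data set shipped with bnlearn.
data(learning.test)

## Hill climbing from several random starting DAGs, recording each final score.
n_runs <- 5
scores <- numeric(n_runs)
for (i in seq_len(n_runs)) {
  start <- random.graph(nodes = names(learning.test))
  net <- hc(learning.test, start = start, score = "bde")
  scores[i] <- score(net, learning.test, type = "bde")
}

## The best score over the first k runs can only stay equal or improve as k grows.
best_so_far <- cummax(scores)
print(best_so_far)
```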

6- A reference paper which can give a simple "Bayesian network for Dummies" kind of explanation?
Have a look at the Wikipedia page on BNs, and if it is not simple enough, let me know and I will see if I can find something simpler.

7- Is there a way I can bring and edit the graph in Cytoscape or any such program?
No, unless you are willing to look at the draw.bn() code, learn how to get the network data, and use one of the many tools that have been developed for plotting fancy graphs.
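
One possible route to the network data is to export the arcs as an edge list, which Cytoscape can import as a table. This is a minimal sketch that assumes the learned network is a bnlearn object of class "bn" (as the bn.fit() call earlier in the thread suggests); the toy network, the column names, and the file name are hypothetical.

```r
library(bnlearn)

## Hypothetical network; with Pigengene, learnt$consensus1$BN would take its place.
bn <- model2network("[Disease][ME9|Disease][ME3|ME9][ME1|Disease:ME3]")

## arcs() returns a two-column character matrix with columns "from" and "to".
edges <- as.data.frame(arcs(bn), stringsAsFactors = FALSE)
names(edges) <- c("source", "target")

## Write a plain edge list that can be imported as a network table.
out_file <- file.path(tempdir(), "bn_edges.csv")
write.csv(edges, out_file, row.names = FALSE, quote = FALSE)
print(edges)
```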