Pigengene: Bayesian network using consensus modules
@rammohanshukla-11495

Hi Habil,

I performed consensus WGCNA analysis to find conserved modules across 5 phases of a disease. 

As a next step, I would like to use the eigengenes from these conserved modules to build a Bayesian network, similar to the ones in your papers, and use it to see how the different phases (control, Episode1, etc.) are associated with different modules.

My questions are:

1) How to use pigengene for pre-calculated eigengenes for consensus modules?

2) Your pipeline shows only two conditions, i.e., AML and MDS, but I have 5 conditions. Can pigengene work with that?

3) Can I perform logistic regression with eigengenes and different phases of disease, which are coded as 0 or 1?

4) I want to see the BN of consensus eigengenes and how they change in different phases of depression.

Thanks in advance for your kind help

Ram

 

Habil Zare ▴ 200
@habil-zare-7836

The paper that you mentioned is a general introduction to Pigengene. Our following paper is more focused on Bayesian networks (BNs): Agrahari, Rupesh, et al. "Applications of Bayesian network models in predicting types of hematological malignancies." Scientific Reports 8.1 (2018): 6951. I answer your questions below:

1) How to use pigengene for pre-calculated eigengenes for consensus modules?
You can do this by following the steps inside the one.step.pigengene() function. Specifically:
- Use check.pigengene.input() for QC:
c1 <- check.pigengene.input(Data=Data, Labels=Labels, na.rm=TRUE)
Data <- c1$Data
Labels <- c1$Labels

- Use compute.pigengene() to compute a pigengene object.
- Train your model using learn.bn().

2) Your pipeline shows only two conditions, i.e., AML and MDS, but I have 5 conditions. Can pigengene work with that?
Yes, e.g., I made a mock example from the compute.pigengene() example:
library(Pigengene)
data(aml)
data(mds)
data(eigengenes33)
d1 <- rbind(aml,mds)
Labels <- c(rep("AML",nrow(aml)),rep("MDS",nrow(mds)))
names(Labels) <- rownames(d1)
## Let's add another condition:
Labels[1:60] <- "Mock"
modules33 <- eigengenes33$modules[colnames(d1)]
## Computing:
pigengene <- compute.pigengene(Data=d1, Labels=Labels, modules=modules33,
   saveFile="pigengene.RData", doPlot=TRUE, verbose=3)
class(pigengene)
plot(pigengene, DiseaseColors=1:3, fontsize=12)

Fitting a BN is also possible. I checked the learn.bn() example using the above mock Labels.

3) Can I perform logistic regression with eigengenes and different phases of disease which are coded as 0 or 1?
Yes. In the paper, we showed that eigengenes are informative features (biomarkers), which can be used efficiently in different predictive models including decision trees, BNs, etc. Logistic regression is no exception.

How many modules do you have? If you have many modules (20-30), few samples (5-10), and no prior idea of which modules to use, then using all modules can lead to overfitting. See the “Rule of Ten”.
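A minimal sketch of the logistic regression idea in base R, under loud assumptions: the eigengene values are simulated, the phase names beyond Control and Episode1 are hypothetical, and one phase is coded as 1 against all the others as 0. It also shows the "Rule of Ten" arithmetic, i.e., roughly one predictor per ten samples in the smaller class.

```r
set.seed(1)

# Simulated stand-in data (NOT real eigengenes): 95 samples, 3 modules.
n <- 95
phase <- rep(c("Control", "Episode1", "Episode2", "Episode3", "Episode4"),
             each = 19)  # phase names other than Control/Episode1 are hypothetical
eigengenes <- data.frame(ME1 = rnorm(n), ME2 = rnorm(n), ME3 = rnorm(n))

# Shift ME1 in one phase so that the model has a signal to find.
eigengenes$ME1 <- eigengenes$ME1 + ifelse(phase == "Episode1", 1.5, 0)

# One-vs-rest coding: 1 for the phase of interest, 0 otherwise.
eigengenes$y <- as.integer(phase == "Episode1")

# Rule of Ten: about one predictor per ten samples in the smaller class.
events <- min(sum(eigengenes$y), sum(1 - eigengenes$y))
maxPredictors <- max(1, floor(events / 10))
print(maxPredictors)

# With 19 events, a single eigengene is about all the model can support.
fit <- glm(y ~ ME1, data = eigengenes, family = binomial())
print(summary(fit)$coefficients)
```

With 15-20 samples per phase, this arithmetic suggests one or two eigengenes per model, so some form of module selection would be needed before fitting.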

4) I want to see the BN network of consensus eigengenes and how they change in different phases of depression.
You can use the draw.bn() function to plot a BN. However, training a BN needs many samples (at least hundreds), and I guess you have only tens of samples. Training a BN for a specific phase of depression would be even more difficult because of the limited number of samples. I recommend you use all your samples to train ONE BN; the information you are looking for will then be in the dependency table of the Disease variable. Specifically, I recommend you set use.Disease=TRUE and use.Effect=FALSE. This is the opposite of the approach in our Scientific Reports paper (see “The BN design” in Supplementary Note S1).

 


Thanks

I have around 20 to 26 modules. I give a range because the number of modules varies with the merge tree cut height selected. I have 20 samples per phase, except for 2 phases for which I have 15 samples.

Based on what I understand from your text, I first need to compute pigengenes using the compute.pigengene() function. Here I will label the 5 phases and provide the data and the computed eigengenes as input.

Next, I will take the output of the compute.pigengene() function, i.e., the "pigengene" object, and use it in the learn.bn() function, right?

Can you be more precise about how to use the other parameters in this case?

Where can I find the dependency table of the Disease variable in the example you have used in your package?

Finally, once everything goes well in the learn.bn step, I will draw the network using draw.bn, right?

Thanks

Ram

Habil Zare ▴ 200
@habil-zare-7836

- The input to the compute.pigengene() function includes data and labels, NOT the eigengenes. They are computed by this function.

- The parameters are explained in the manual of each function, e.g., see ?learn.bn. Is there a particular parameter you have a question on?

- After you have learned the BN structure (see example(learn.bn)), you can fit it to the data using:

fit <- bnlearn::bn.fit(x=learnt$consensus1$BN, data=learnt$consensus1$Data, method="bayes", iss=10)

And now fit$Disease is the conditional probability table. Also, see the tables corresponding to children of this node, e.g., fit$ME9.
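To illustrate what such a conditional probability table contains, here is a minimal base R sketch that does not use Pigengene or bnlearn. The labels and the eigengene are simulated, and the tertile cut is only a stand-in for whatever discretization the package applies, so read it as an illustration of how to read the table rather than a reproduction of bn.fit().

```r
set.seed(1)

# Simulated stand-in data: 60 "AML" and 40 "MDS" samples, one eigengene.
Disease <- factor(rep(c("AML", "MDS"), times = c(60, 40)))
ME <- rnorm(100, mean = ifelse(Disease == "AML", 0.01, -0.01), sd = 0.01)

# Discretize the eigengene into three levels at its tertiles
# (an assumption for illustration, not the package's own method).
breaks <- quantile(ME, probs = c(0, 1/3, 2/3, 1))
level <- cut(ME, breaks = breaks, include.lowest = TRUE)

# P(level | Disease): count the samples, then normalize within each column.
cpt <- prop.table(table(ME = level, Disease = Disease), margin = 2)
print(round(cpt, 2))
```

Each column sums to 1, so a column reads as "given this disease type, how likely is each eigengene level". These are plain relative frequencies; with method="bayes", the iss argument adds a prior, so the fitted values are pulled slightly toward uniform.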

 


Thanks again,

I think I now understand the concepts, but I have a few more questions.

Suppose I have two conditions: 1) control and 2) schizophrenia patient. I believe we cannot assign the codes 0 and 1 to control and schizophrenia; instead, the program will compute a conditional probability between the two conditions and label it as "disease", right? Can't we have modules that are children of schizophrenia instead? Can I add a column of 0s and 1s representing the disease next to the modules in eigengenes (as in example(learn.bn)) and then run learn.bn? Please see this paper, where the authors constructed a BN with effects (for instance, schizophrenia in the present example) as one of the nodes.

How to interpret the output of fit$ME9 which is given below:

> fit$ME9

  Parameters of node ME9 (multinomial distribution)

Conditional probability table:
 
                    Disease
ME9                        AML       MDS
  (-0.029,-0.00833]  0.1433172 0.6015779
  (-0.00833,0.00635] 0.2447665 0.1518738
  (0.00635,0.0366]   0.6119163 0.2465483

What are the ranges in column ME9?

Does the bnNum parameter of the learn.bn function represent the number of permutations? I think this question stems from my understanding of Bayesian networks, which is very poor. Can you provide a reference paper that gives a simple "Bayesian network for Dummies" kind of explanation?

Is there a way I can bring and edit the graph in Cytoscape or any such program?

 

 

Habil Zare ▴ 200
@habil-zare-7836

1- In the learn.bn function, Labels is a named character vector. It can have "0" and "1" values if you prefer that. I am not sure I understood your question correctly. "Disease" is just the name of the variable that holds the Labels values; the name of the variable does not matter. Modules can be children of this node.

2- In Fig. 3a of the paper you referred to, three different conditions (i.e., relevant AD traits) are associated with each case: 1) β-amyloid, 2) tau tangles, and 3) cognitive decline. These conditions are modeled using three nodes in the BN: Amyloid, tau, and CogDec, respectively. You have only one condition, which can take two values: schizophrenia or control.

3- What are the ranges in column ME9?
The value of each eigengene is discretized into 3 levels because use.Hartemink is TRUE by default. See the manual.

4- The only parent of ME9 is Disease. Therefore, based on the Markov property, given the disease type, the probability of each level of ME9 is determined independently of the rest of the network. fit$ME9 is the conditional probability table; e.g., if the disease is AML, the three levels of the ME9 eigengene are observed with probabilities of approximately 0.14, 0.24, and 0.61, respectively.

5- To fit a BN to data, we start from a random structure and try to maximize a score. We repeat this process bnNum times, so the larger bnNum is, the better the chance of finding a "good" network. For 10-20 modules, bnNum=1000 should be OK, while fewer than 100 might be too few.

6- A reference paper which can give a simple "Bayesian network for Dummies" kind of explanation?
Have a look at the Wikipedia page on BNs, and if it is not simple enough, let me know and I will see if I can find something simpler.

7- Is there a way I can bring and edit the graph in Cytoscape or any such program?
No, unless you are willing to look at the draw.bn() code, learn how to get the network data, and use one of the many tools developed for plotting fancy graphs.
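As a rough sketch of that last route: once the arcs of the network are available as a two-column from/to table (as far as I know, bnlearn::arcs() returns this layout for a learned structure), they can be written to a tab-delimited file, which Cytoscape should be able to import as a network table. The arcs and the file name below are hypothetical and serve only to show the file format.

```r
# Hypothetical arcs in a from/to layout (NOT taken from a learned network).
arcs <- data.frame(from = c("Disease", "Disease", "ME9"),
                   to   = c("ME9", "ME3", "ME5"),
                   stringsAsFactors = FALSE)

# Write a tab-delimited edge list; the file name is an arbitrary choice.
edgeFile <- file.path(tempdir(), "bn_edges.tsv")
write.table(arcs, file = edgeFile, sep = "\t", quote = FALSE,
            row.names = FALSE)

# Read the file back to confirm that it has one row per arc.
readBack <- read.delim(edgeFile, stringsAsFactors = FALSE)
print(readBack)
```

When importing, the from column would be mapped to the source node and the to column to the target node.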

 

 
