I am hoping to conduct a weighted gene coexpression network analysis across four groups defined by two tissue types and gender (M/F) in frogs. Which WGCNA tutorial from the site (http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/) do you suggest using as a guide? Will the package properly analyze my data if I have RNA-seq data from a total of 28 samples, with 10 of one tissue type and 18 of the other? I worry that the recommended minimum of 15 samples applies to EACH data set, and not to the collective total.
Thank you, Mr. Langfelder, for your time and response. I will be looking at four gonad and four brain tissue samples in female and male Loggerhead turtles. The raw data will have run through FastQC, Trimmomatic, and an RNA-Seq De novo assembly using Trinity.
I am hoping to look at the resulting transcriptome reads for significant differences in genes that may be unique to either the tissue or gender of a Loggerhead turtle. In essence, I am looking for patterns across the following 2x2 table:
In other words,
With those research questions in mind, which WGCNA tutorial/s should I refer to for guidance? Thank you in advance!
That's a lot of questions. First, I would run a standard association analysis (regression models for individual genes). You can run the analysis in several different ways, regressing gene expression on tissue, gender, and the tissue-by-gender interaction. This will give you genes that are differentially expressed with respect to tissue, genes that are differentially expressed with respect to gender, and genes whose differential expression with respect to gender changes with tissue (and vice versa).
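As a rough illustration of such a per-gene regression, here is a minimal sketch in Python with NumPy. It is not taken from any WGCNA tutorial. The data are synthetic, and the sample layout, 0/1 coding, and variable names are assumptions made for the example. Gene expression is the response (Y), and the design matrix (X) holds an intercept, tissue, gender, and their product.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not data from this thread):
# 16 samples, 4 in each tissue-by-gender cell.
tissue = np.repeat([0.0, 0.0, 1.0, 1.0], 4)   # 0 = gonad, 1 = brain
gender = np.repeat([0.0, 1.0, 0.0, 1.0], 4)   # 0 = female, 1 = male
n_samples, n_genes = tissue.size, 50

# Y: samples x genes matrix of (log-scale) expression values.
expr = rng.normal(size=(n_samples, n_genes))
expr[:, 0] += 3.0 * tissue            # gene 0: shifted in brain
expr[:, 1] += 3.0 * tissue * gender   # gene 1: gender effect only in brain

# X: intercept, tissue, gender, tissue-by-gender interaction.
design = np.column_stack([np.ones(n_samples), tissue, gender, tissue * gender])

# Ordinary least squares for all genes at once; coef has shape (4, n_genes).
coef, _, _, _ = np.linalg.lstsq(design, expr, rcond=None)

# Per-coefficient t statistics from the residual variance of each gene.
resid = expr - design @ coef
df = n_samples - design.shape[1]
sigma2 = (resid ** 2).sum(axis=0) / df
xtx_inv = np.linalg.inv(design.T @ design)
se = np.sqrt(np.outer(np.diag(xtx_inv), sigma2))
t_stats = coef / se   # rows: intercept, tissue, gender, interaction
```

With this coding, the tissue row of `t_stats` tests the tissue difference in females, the gender row tests the gender difference in gonad, and the interaction row tests whether the gender difference changes between tissues. For real RNA-seq data a dedicated package is likely a better choice than hand-rolled least squares.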
In the first pass, I would focus on tissue differences since these are likely to be stronger than gender differences. Run WGCNA on the entire data set and relate module eigengenes to the same variables as the genes. If it turns out that tissue dominates the signal, you can run another analysis in which you remove the effect of tissue using ComBat or similar (essentially treating tissue as an unwanted batch effect). For all of these you can follow Tutorial I of WGCNA, but always use signed networks (the tutorial uses unsigned networks); multiple functions take the argument networkType, which you want to set to "signed hybrid".
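To make the two ideas above concrete, here is a small NumPy sketch of a signed hybrid adjacency (the correlation raised to a power where it is positive, zero otherwise) and of a module eigengene (the first principal component of the standardized module expression) correlated with a trait. This illustrates the concepts only and is not the WGCNA implementation. The data are synthetic, the power of 6 is a placeholder, the "module" is picked by hand rather than detected by clustering, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data (an assumption): 16 samples x 6 genes.
n_samples, n_genes = 16, 6
tissue = np.repeat([0.0, 1.0], 8)          # 0/1 coded trait
expr = rng.normal(size=(n_samples, n_genes))
expr[:, :3] += 2.0 * tissue[:, None]       # genes 0-2: toy tissue-driven module


def signed_hybrid_adjacency(expr, power):
    """Adjacency = cor^power for positive correlations, 0 otherwise."""
    cor = np.corrcoef(expr, rowvar=False)
    return np.where(cor > 0, cor, 0.0) ** power


def module_eigengene(module_expr):
    """First left singular vector of the standardized samples x genes matrix."""
    z = (module_expr - module_expr.mean(axis=0)) / module_expr.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # The sign of a singular vector is arbitrary; orient it so that it
    # correlates positively with the average standardized expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene


adjacency = signed_hybrid_adjacency(expr, power=6)   # power is a placeholder
eigengene = module_eigengene(expr[:, :3])            # hand-picked module
r = np.corrcoef(eigengene, tissue)[0, 1]             # eigengene-trait correlation
```

In a real analysis the soft-thresholding power would be chosen from the data, and modules would come from clustering the network rather than from a hand-picked set of columns.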
Finally, if you're not satisfied, you can run WGCNA in each tissue separately (or at least where you have enough samples) and look for modules associated with gender in the particular tissue. To do cross-tissue comparisons, you could look at preservation of network modules (look up "Is my network module preserved" and the associated tutorials) to find modules that are unique to one tissue. If you suspect that there may be modules that are common to both tissues, you can run consensus WGCNA treating each tissue as a separate data set (again, assuming you have enough samples), then relate the consensus modules to gender in each tissue. Consensus module analysis is described in Tutorial II. Again, make sure you use signed networks.
That just about covers all possible WGCNA analyses that I can think of, so it's a lot of work, and your sample numbers are quite low, which will only make it more challenging.
Peter
Dear Dr. Langfelder,
Thank you for taking the time to respond. A couple of follow-up questions for this WGCNA novice, if you don't mind:
1. Should I be using the weighted or nonweighted correlation analysis?
2. In terms of the standard association analysis, can you link me to the specific tutorial section that will result in a gene expression regression? I am confused as to what the X and Y coordinates of the regression model would be...
3. Is there a specific command line for running WGCNA on samples, or is it the whole combination of 1) data input and cleaning + 2) network construction and module detection + 3) relating modules to external clinical traits and identifying important genes + 4) interfacing network analysis with other data such as functional annotation and gene ontology + 5) network visualization using WGCNA functions + 6) export of networks to external software?
Thank you once again!
Connie
Sorry for my late reply. We always recommend weighted network analysis (if you meant signed or unsigned, use signed).
For standard association analysis, you can look at Tutorial III, section 4. However, standard association analysis is not really the focus of WGCNA, so you may get better results from other approaches (such as lmFit in limma).
I don't really understand your third question about running WGCNA on samples. In general, I preprocess (clean etc) each data set once, then analyze it in possibly several ways. Each analysis involves the 5 steps (2 through 6) that you wrote above, although I usually skip the network visualization step since it isn't really needed, and I often don't need to export networks to external software.
Peter