I am working with the WGCNA package on a bacterial species. Everything seems OK, but when I checked the co-expression of 4 genes (belonging to one locus) that are already known to be co-transcribed, only 2 of them were placed in the same module and the other two were placed in two different modules.
I cannot understand the reason.
I want to know whether I can rely on WGCNA for the remaining cases.
I also want to know the reasons for observing such cases.
Well, it's difficult to say what is going on without seeing the data.
However, one thing you can do is to check the correlations among the 4
genes in your data. Create a matrix of the expression of the 4 genes
and use cor() on them to see whether they are really co-expressed in
your data. If they are strongly co-expressed, it could be that you
need to make the module identification a bit more conservative (lower
deepSplit argument). There could be other issues as well, but you can
start with these two suggestions.
Peter
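
A minimal sketch of the correlation check suggested above, in R. The expression matrix here is simulated so that the example runs on its own; the gene names (geneA to geneD), the object name datExpr, and the simulated values are all assumptions for illustration, not the poster's data. It assumes the usual WGCNA layout of samples in rows and genes in columns.

```r
# Simulated stand-in for the real expression matrix (an assumption):
# rows are samples, columns are genes. Replace with your own data.
set.seed(1)
n_samples <- 252
shared <- rnorm(n_samples)  # common signal driving geneC and geneD only

datExpr <- cbind(
  geneA     = rnorm(n_samples),
  geneB     = rnorm(n_samples),
  geneC     = shared + rnorm(n_samples, sd = 0.2),
  geneD     = shared + rnorm(n_samples, sd = 0.2),
  otherGene = rnorm(n_samples)
)

# Hypothetical identifiers of the four genes in the locus
locus_genes <- c("geneA", "geneB", "geneC", "geneD")

# Pairwise Pearson correlations among the four genes;
# "pairwise.complete.obs" tolerates missing values in individual samples
locus_cor <- cor(datExpr[, locus_genes], use = "pairwise.complete.obs")
print(round(locus_cor, 3))

# If the four genes do turn out to be strongly correlated, a lower deepSplit
# could be tried when detecting modules, for example (not run here, and
# softPower is a placeholder for whatever power was chosen):
# net <- WGCNA::blockwiseModules(datExpr, power = softPower, deepSplit = 1)
```

With real data, only the construction of datExpr and locus_genes would change; the cor() call stays the same.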
I computed the correlations among the four genes. The correlations among genes A, B, C and D were as follows:

Pair   Correlation
D-C     0.971778
C-B    -0.5559
B-A    -0.19767
D-B    -0.56853
D-A     0.326305
C-A     0.37033

I had included 252 samples (heterogeneous conditions). Do you think those negative or low positive correlations are due to the heterogeneity in sample conditions?
Another question is how I can explain the observed results in my paper. Previous work has frequently indicated that these 4 genes are co-transcribed because they are in the same locus.
Well, the correlations answer your question why the genes ended up in
different modules, presumably except for D and C which would obviously end
up in the same module. Sorry, I cannot help you with the data
interpretation or troubleshooting.
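
To make the link between the reported correlations and the module assignment concrete, here is a rough sketch in base R. It places the six reported values in a symmetric matrix and applies the unsigned soft-thresholding transform |cor|^power that WGCNA uses to build its network. The power of 6 is an assumption (the value actually used in the analysis is not given in this thread), as is the choice of an unsigned network.

```r
# Reported pairwise correlations among genes A, B, C and D
genes <- c("A", "B", "C", "D")
cors <- matrix(1, nrow = 4, ncol = 4, dimnames = list(genes, genes))

# Helper that fills both symmetric entries for one gene pair
set_pair <- function(m, g1, g2, value) {
  m[g1, g2] <- value
  m[g2, g1] <- value
  m
}

cors <- set_pair(cors, "D", "C",  0.971778)
cors <- set_pair(cors, "C", "B", -0.5559)
cors <- set_pair(cors, "B", "A", -0.19767)
cors <- set_pair(cors, "D", "B", -0.56853)
cors <- set_pair(cors, "D", "A",  0.326305)
cors <- set_pair(cors, "C", "A",  0.37033)

# Unsigned soft-thresholded adjacency; power = 6 is an assumed value
power <- 6
adj <- abs(cors)^power
print(round(adj, 3))
```

Under these assumptions, only the D-C pair keeps a high connection strength after soft thresholding, while every other pair shrinks to nearly zero, which is consistent with only D and C landing in the same module.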
Dear Peter,
First of all thank you for the comments.
Regards
Nazanin