Question

EGAD R-package: How to generate a 2-column matrix with gene-interactions required for "make_annotations()"-function?

m4rc.b3ringer • 0

Dear everyone,

I am trying to build and evaluate co-expression networks based on RNA-seq data. A co-expression network starts from a gene-expression matrix in which each row is a gene and each column is a sample and/or treatment, so each cell holds the expression value of one gene in one sample/treatment. These expression values are then correlated, giving a pairwise correlation matrix in which every gene is correlated with every other gene. Finally, the correlation matrix is transformed into an adjacency matrix that expresses the weighted relationship between each pair of genes as a value between 0 and 1.
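As a rough illustration of the steps described above, here is a minimal base-R sketch. The toy expression values, the Spearman correlation and the use of the absolute correlation as adjacency are all assumptions made for the example; the post does not say which correlation measure or transformation is actually used.

```r
# Toy expression matrix: 5 genes (rows) x 6 samples (columns); values are made up
set.seed(1)
expr <- matrix(rnorm(30), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("sample", 1:6)))

# Pairwise gene-gene correlation; cor() correlates columns, so transpose first
cor_mat <- cor(t(expr), method = "spearman")

# One possible transformation to [0, 1] (assumption): absolute correlation
adj <- abs(cor_mat)

dim(adj)    # genes x genes
range(adj)  # all weights lie between 0 and 1
```

Other transformations (for example rank-standardising the correlations) would give a different weighting but the same genes-by-genes shape.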

Using the EGAD R-package (see Resources below) one can create and evaluate such co-expression networks.

Problem

To evaluate a co-expression network, one has to generate a binary annotation matrix. Such a matrix has genes as rows and annotations (e.g. GO terms) as columns, and each cell is either 1 or 0, indicating whether or not the gene belongs to the annotation. The problem is that, to generate this annotation matrix, one has to provide a 2-column interaction matrix in which each row represents one gene-gene interaction. For the examples in the EGAD user guide (see Resources below), such an interaction matrix is provided, but it is not explained how to generate one for a co-expression network, where interactions are weighted rather than binary. One could threshold the adjacency matrix to turn the co-expression network into a binary network, e.g. treat all interactions with weight >= 0.7 as real connections (binary = 1) and everything with weight < 0.7 as no connection (binary = 0). But that seems to make the entire analysis depend on the choice of threshold.
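For concreteness, the thresholding idea can be sketched in base R as follows. The adjacency values and gene names are made up, and 0.7 is simply the arbitrary cut-off mentioned above; this is an illustration of the idea, not a recommended procedure.

```r
# Toy weighted adjacency matrix (symmetric, values in [0, 1]); values are made up
genes <- paste0("gene", 1:4)
adj <- matrix(c(1.0, 0.9, 0.2, 0.8,
                0.9, 1.0, 0.3, 0.1,
                0.2, 0.3, 1.0, 0.75,
                0.8, 0.1, 0.75, 1.0),
              nrow = 4, byrow = TRUE, dimnames = list(genes, genes))

threshold <- 0.7  # arbitrary cut-off

# Keep each gene pair once (upper triangle) if its weight passes the threshold
keep <- upper.tri(adj) & adj >= threshold
idx <- which(keep, arr.ind = TRUE)

# 2-column interaction matrix: one row per retained gene-gene pair
interactions <- cbind(rownames(adj)[idx[, "row"]],
                      colnames(adj)[idx[, "col"]])
interactions
```

Re-running this with a different `threshold` changes which pairs survive, which is exactly the dependence on the cut-off described above.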

Question

How does one derive, from a co-expression network, the 2-column gene-interaction matrix required as input for the "make_annotations()" function?

Resources

EGAD R-package (https://www.bioconductor.org/packages/release/bioc/html/EGAD.html)
Paper by Ballouz et al. (2016, doi: 10.1093/bioinformatics/btw695)
EGAD user guide (https://www.bioconductor.org/packages/release/bioc/vignettes/EGAD/inst/doc/EGAD.pdf)

EGAD_1.16.0 EGAD R network analysis • 1.0k views

Written 3.7 years ago by m4rc.b3ringer; updated 2.1 years ago by Andrés

Hello, have you had any luck? I'm also using WGCNA to make co-expression networks, and would like to try the EGAD package to test which tweaks "improve" my networks.

Two possible workarounds I'm considering are:

Setting as "connection" any two genes belonging to the same module.
Setting adjacency=0 (zero) any two genes NOT belonging to the same module, while leaving the intramodular non-binary adjacencies.
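Both workarounds can be sketched in base R roughly as follows. The adjacency values and the module assignment are invented for the example (the module labels only mimic the colour names WGCNA typically uses), so treat this as an illustration of the idea rather than a tested recipe.

```r
# Toy weighted adjacency matrix; values are made up
genes <- paste0("gene", 1:4)
adj <- matrix(c(1.0, 0.9, 0.2, 0.8,
                0.9, 1.0, 0.3, 0.1,
                0.2, 0.3, 1.0, 0.75,
                0.8, 0.1, 0.75, 1.0),
              nrow = 4, byrow = TRUE, dimnames = list(genes, genes))

# Hypothetical module assignment, one label per gene
modules <- c(gene1 = "blue", gene2 = "blue", gene3 = "brown", gene4 = "brown")

# TRUE where both genes sit in the same module
same_module <- outer(modules, modules, "==")

# Workaround 1: binary network, connection = same module (no self-connections)
binary_net <- same_module * 1
diag(binary_net) <- 0

# Workaround 2: keep weighted adjacencies within modules, zero out the rest
masked_adj <- adj * same_module
```

Here `binary_net` discards all weights, whereas `masked_adj` keeps the intramodular weights and only removes the cross-module ones.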

Thanks in advance.

Andrés, 2.1 years ago

score 0 · Answer 1 · 2020-07-24

Dear everyone,

The data passed to the EGAD::make_annotations() function may also be a two-column table in which each row pairs a gene (first column) with one of its annotations (second column). The annotation can be anything you want to test for, but using GO terms or KEGG terms is recommended.

A binary annotation matrix can then be generated like so:

#annotation file
head(data)

gene GO_term
gene1 GO:0018742
gene2 GO:0042803
gene3 GO:0009636

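#list of genes (unique entries of the first column; rownames(data) would only work if the row names are the gene names)
genes <- unique(data[,1])
head(genes)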

[1] "gene1" "gene2" "gene3"

#list of all involved GO-terms
goterms <- unique(data[,2])
head(goterms)

[1] "GO:0018742" "GO:0042803" "GO:0009636"

#generate binary annotation matrix as input for downstream functions
annotations <- EGAD::make_annotations(data, genes, goterms)
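To make the shape of the result concrete, here is a base-R sketch that builds the same kind of genes-by-terms 0/1 matrix by hand. It is not EGAD's implementation, only an illustration of the expected output; the extra fourth row (gene1 with a second GO term) is a hypothetical addition to show a gene carrying more than one annotation.

```r
# Toy annotation table: one gene-term pair per row (fourth row is hypothetical)
data <- data.frame(
  gene    = c("gene1", "gene2", "gene3", "gene1"),
  GO_term = c("GO:0018742", "GO:0042803", "GO:0009636", "GO:0042803"),
  stringsAsFactors = FALSE
)

genes   <- unique(data$gene)
goterms <- unique(data$GO_term)

# Start from an all-zero genes x terms matrix
annotations <- matrix(0, nrow = length(genes), ncol = length(goterms),
                      dimnames = list(genes, goterms))

# Set a 1 for every (gene, term) pair listed in the table
annotations[cbind(match(data$gene, genes),
                  match(data$GO_term, goterms))] <- 1
annotations
```

Each row is a gene, each column a GO term, and a 1 marks membership, matching the description of the binary annotation matrix in the question.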