Data input for aracne2regulon function in viper package
maya.kappil ▴ 20

Hello,

I have run the viper workflow using the test data set, but I'm having trouble running it on my own data with a regulatory network generated by ARACNe-AP. Unfortunately, ARACNe-AP no longer seems to output the adj files used in the viper vignette. Instead, I have a text file that represents a consolidated ARACNe-AP network based on 100 bootstraps:

      Regulator   Target          MI       pvalue
    1     RNF11 FAM177A1   0.4596049 0.0000000000
    2  SERPINE2 SERPING1   0.5120350 0.0000000000
    3    GPRC5A     RRAD   0.6129169 0.0000000000

However, I think a formatting issue may be preventing the expression data from being matched to the ARACNe network when the regulon object is generated. Would you have any advice on how I can resolve this?

Create expression eSet:

    exprs <- as.matrix(read.table("RICHSnormbatch.txt", header=TRUE, sep="\t", row.names=1, as.is=TRUE, check.names=F))
    head(exprs[1:5,1:5])
    pData <- read.csv("../../Covariates.csv", row.names=1, header=TRUE)
    phenoData <- new("AnnotatedDataFrame", data=pData)
    dset <- ExpressionSet(assayData=exprs, phenoData=phenoData, annotation="hg19")
    dset

    ExpressionSet (storageMode: lockedEnvironment)
    assayData: 12135 features, 200 samples
      element names: exprs
    protocolData: none
    phenoData
      sampleNames: 21001 21004 ... 22078 (200 total)
      varLabels: ID idx ... U_MoBT (53 total)
      varMetadata: labelDescription
    featureData: none
    experimentData: use 'experimentData(object)'
    Annotation: hg19

    str(rownames(exprs))
     chr [1:12135] "NOC2L" "KLHL17" "HES4" "ISG15" "AGRN"

    str(tmp)
    'data.frame': 43095 obs. of 3 variables:
     $ tf    : Factor w/ 179 levels "\"ABHD12\"","\"AFG3L1P\"",..: 133 145 62 44 60 169 75 79 86 172 ...
     $ target: Factor w/ 11522 levels "\"A2LD1\"","\"A2M\"",..: 3372 8671 8373 11374 1817 10933 9535 7678 9794 575 ...
     $ mi    : num 0.301 0.336 0.402 0.177 0.212 ...

    tmp <- tmp[rowSums(matrix(as.matrix(tmp[, 1:2]) %in% rownames(exprs), nrow(tmp), 2)) == 2, ]
    tmp
    [1] tf     target mi
    <0 rows> (or 0-length row.names)

sessionInfo:

    sessionInfo()
    R version 3.5.1 (2018-07-02)
    Platform: i386-w64-mingw32/i386 (32-bit)
    Running under: Windows 7 (build 7601) Service Pack 1

    Matrix products: default

    locale:
    [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] grid      parallel  stats     graphics  grDevices utils     datasets  methods
    [9] base

    other attached packages:
    [1] viper_1.16.0        Biobase_2.42.0      Rgraphviz_2.26.0    graph_1.60.0
    [5] BiocGenerics_0.28.0 tidyr_0.8.3         dplyr_0.8.0.1       minet_3.40.0

    loaded via a namespace (and not attached):
     [1] fansi_0.4.0        splines_3.5.1      R6_2.4.0           assertthat_0.2.0
     [5] utf8_1.1.4         e1071_1.7-0.1      knitr_1.21         survival_2.42-3
     [9] cli_1.0.1          tidyselect_0.2.5   pillar_1.3.1       segmented_1.0-0
    [13] compiler_3.5.1     tibble_2.0.1       lattice_0.20-35    pkgconfig_2.0.2
    [17] Matrix_1.2-14      purrr_0.3.1        KernSmooth_2.23-15 rstudioapi_0.9.0
    [21] MASS_7.3-50        glue_1.3.0         xfun_0.5           stats4_3.5.1
    [25] BiocManager_1.30.4 magrittr_1.5       rlang_0.3.1        yaml_2.2.0
    [29] tools_3.5.1        mixtools_1.1.0     crayon_1.3.4       class_7.3-15
    [33] Rcpp_1.0.0

A little late, but in case it helps anyone else -- I had to do three things to use output from ARACNe-AP with aracne2regulon:

1. drop the p-value column,
2. remove the header, and
3. set format="3col" in the call to aracne2regulon().
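The three steps above could be sketched in base R roughly as follows. This is only an illustration: the inline data, the temporary file names and the assumption that the ARACNe-AP output is tab-delimited with a `Regulator`/`Target`/`MI`/`pvalue` header (and no index column) are all hypothetical stand-ins, not taken from an actual ARACNe-AP run.

```r
# Hypothetical stand-in for the consolidated ARACNe-AP network file
# (assumed tab-delimited, with a header and no index column).
network_lines <- c(
  "Regulator\tTarget\tMI\tpvalue",
  "RNF11\tFAM177A1\t0.4596049\t0.0000000000",
  "SERPINE2\tSERPING1\t0.5120350\t0.0000000000",
  "GPRC5A\tRRAD\t0.6129169\t0.0000000000"
)
network_file <- tempfile(fileext = ".txt")
writeLines(network_lines, network_file)

# Read the network; quote = "" keeps any quote characters visible as data.
net <- read.table(network_file, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE, quote = "")

# Step 1: drop the p-value column, keeping regulator, target and MI.
net3 <- net[, c("Regulator", "Target", "MI")]

# Step 2: write the file back without a header (and without row names
# or quotes, so no extra column or quote characters are introduced).
regulon_input <- tempfile(fileext = ".txt")
write.table(net3, regulon_input, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Step 3 (not run here; needs viper and an ExpressionSet such as dset):
# regul <- aracne2regulon(regulon_input, dset, format = "3col")
```

Writing with `row.names = FALSE` and `quote = FALSE` is a deliberate choice in this sketch, since an extra leading column or quoted gene names would change what ends up in the three columns.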
Hello Keith, I am currently trying to use my ARACNe-AP-generated networks in viper. I did all three steps you mentioned above, but my problem lies in the length of the network and the gene expression matrix. When I use the gene expression matrix as is and supply the 3col ARACNe-AP output, I get the following error:

    Error in tapply(1:nrow(tmp), as.vector(tmp$tf), function(pos, tmp) { :
    arguments must have same length


I can subset the gene expression matrix according to the regulators in the network, but I don't think that is very reasonable, since it would no longer be the same gene expression matrix that I fed into ARACNe-AP. Did you come across the same problem, and if so, did you subset the gene expression matrix?


Hi, I'm afraid not, so I don't think I can be much help, unfortunately. In the past, the authors of ARACNe-AP were generally quite responsive and helpful though, so you might consider reaching out to them directly. Best of luck!


Thank you!


Hi Luna_P, I think I am in a very similar situation. I modified my network file as described above, but I still get the same "arguments must have same length" error. Did you come up with a solution, by any chance? Many thanks. Best, Mika


As @Keith Hughitt mentioned, the ARACNe network.txt output file has to be pre-processed before running the aracne2regulon() function:

1. drop the p-value column,
2. remove the header,
3. remove the index, and
4. set format="3col" in the call to aracne2regulon()

Removing the index solved the "arguments must have same length" error.
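One way to check for a leftover index column, sketched here in base R, is to count the fields on each line of the network file before passing it to aracne2regulon(). The file contents below are hypothetical, and the tab delimiter is an assumption about how the file was written.

```r
# Hypothetical network file that still carries a leading index column
# (for example, one written with row names switched on).
indexed_file <- tempfile(fileext = ".txt")
writeLines(c(
  "1\tRNF11\tFAM177A1\t0.4596049",
  "2\tSERPINE2\tSERPING1\t0.5120350",
  "3\tGPRC5A\tRRAD\t0.6129169"
), indexed_file)

# A 3col file should have exactly three fields on every line.
n_fields <- count.fields(indexed_file, sep = "\t", quote = "")
print(table(n_fields))

# If every line has four fields, treat the first one as the index and drop it.
if (all(n_fields == 4)) {
  net <- read.table(indexed_file, header = FALSE, sep = "\t",
                    stringsAsFactors = FALSE, quote = "")
  net <- net[, -1]
  clean_file <- tempfile(fileext = ".txt")
  write.table(net, clean_file, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}
```

The same `count.fields()` check can be repeated on the cleaned file to confirm that every line now has three fields.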

Hope it helps!

Theo

reef103 • 0

Hi Luna_P, I'm sorry I didn't see this before. The error might be due to a mismatch between the genes in the network (regulators and targets) and the gene IDs in your expression matrix (row names). Can you please check whether you are using the same gene IDs in the ARACNe network and the expression matrix? Best, Mariano
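A quick way to run that check is to compute the fraction of network genes that appear among the expression row names. The sketch below uses small hypothetical stand-ins for both objects. The `str(tmp)` output in the original question shows factor levels such as `"\"ABHD12\""`, which suggests literal quote characters inside the gene names, so the example also strips quotes and surrounding whitespace. That may be only one of several possible causes.

```r
# Hypothetical stand-ins for the expression row names and the network table.
expr_ids <- c("NOC2L", "KLHL17", "HES4", "ISG15", "AGRN")
net <- data.frame(
  tf     = c("\"HES4\"", "\"ISG15\""),   # names carrying literal quotes
  target = c("\"AGRN\"", "\"NOC2L\""),
  mi     = c(0.301, 0.336),
  stringsAsFactors = FALSE
)

# Fraction of network genes found among the expression row names, as-is.
net_ids <- unique(c(net$tf, net$target))
overlap_raw <- mean(net_ids %in% expr_ids)

# Remove literal quote characters and surrounding whitespace, then recheck.
clean_ids <- function(x) trimws(gsub("\"", "", as.character(x), fixed = TRUE))
net$tf <- clean_ids(net$tf)
net$target <- clean_ids(net$target)
net_ids_clean <- unique(c(net$tf, net$target))
overlap_clean <- mean(net_ids_clean %in% expr_ids)

print(c(raw = overlap_raw, cleaned = overlap_clean))
```

An overlap near zero before cleaning and near one afterwards would point to a formatting problem in the identifiers rather than to genuinely different gene sets.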


Hi Mariano, thank you for your input. I have checked the gene IDs as you suggested, and they are the same. Would you be able to share a reproducible example (matrix and network) that I could run and compare to my input files, please? Many thanks, Mikael


Hi Mariano,

I'm in the same boat. I'm getting a bunch of errors when trying to use an aracne.network-provided network with expression data keyed by Entrez gene ID, and I don't know whether the problem is related to my input files. The gene IDs are the same, but I'm getting the following error:

Error in cor(t(expset[rownames(expset) %in% tf, ]), t(expset[rownames(expset) %in%  :
incompatible dimensions


My problem is very similar to the others in this thread. Can you share a reproducible example with me, preferably one built from text files? I suspect the cause is my file formatting, with indexes getting mixed up.