Hello,
I have run the viper workflow using the test data set, but I'm having trouble running it on my own data with a regulatory network generated with ARACNe-AP. Unfortunately, ARACNe-AP no longer seems to output adj files as used in the viper vignette. Instead I have a text file that represents a consolidated ARACNe-AP network based on 100 bootstraps:
  Regulator   Target        MI       pvalue
1     RNF11 FAM177A1 0.4596049 0.0000000000
2  SERPINE2 SERPING1 0.5120350 0.0000000000
3    GPRC5A     RRAD 0.6129169 0.0000000000
However, I think there might be a formatting issue that prevents matching between the expression data and the ARACNe network when generating the regulon object. Would you have any advice on how I can resolve this issue?
Load library
library(viper)
create expression eSet
exprs <- as.matrix(read.table("RICHSnormbatch.txt", header=TRUE, sep="\t", row.names=1, as.is=TRUE, check.names=F))
head(exprs[1:5,1:5])
pData <- read.csv("../../Covariates.csv", row.names=1, header=TRUE)
phenoData <- new("AnnotatedDataFrame", data=pData)
dset <- ExpressionSet(assayData=exprs, phenoData=phenoData, annotation="hg19")
dset
ExpressionSet (storageMode: lockedEnvironment)
assayData: 12135 features, 200 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: 21001 21004 ... 22078 (200 total)
  varLabels: ID idx ... U_MoBT (53 total)
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation: hg19
set location of ARACNe network
adjfile <- "viper/RICHS_ARACNe.txt"
regul <- aracne2regulon(adjfile, dset, format="3col", verbose = TRUE)
Loading the dataset...
Generating the regulon objects...
Error in tapply(1:nrow(tmp), tmp$tf, function(pos, tmp) { :
  arguments must have same length
Looking at the source code, the network text file may not be in the correct format:
adjfile <- "viper/RICHS_ARACNe.txt"
tmp <- t(sapply(strsplit(readLines(adjfile), "\t"), function(x) x[1:3]))
head(tmp)
     [,1]           [,2]           [,3]
[1,] "\"RNF11\""    "\"FAM177A1\"" "0.45960485911494"
[2,] "\"SERPINE2\"" "\"SERPING1\"" "0.512034971750754"
[3,] "\"GPRC5A\""   "\"RRAD\""     "0.612916935813727"
aracne <- data.frame(tf = tmp[, 1], target = tmp[, 2], mi = as.numeric(tmp[, 3])/max(as.numeric(tmp[, 3])))
head(aracne)
          tf     target        mi
1    "RNF11" "FAM177A1" 0.3013416
2 "SERPINE2" "SERPING1" 0.3357176
3   "GPRC5A"     "RRAD" 0.4018612
tmp <- aracne[!is.na(aracne$mi), ]
head(aracne)
          tf     target        mi
1    "RNF11" "FAM177A1" 0.3013416
2 "SERPINE2" "SERPING1" 0.3357176
3   "GPRC5A"     "RRAD" 0.4018612
str(rownames(exprs))
 chr [1:12135] "NOC2L" "KLHL17" "HES4" "ISG15" "AGRN"
str(tmp)
'data.frame': 43095 obs. of 3 variables:
 $ tf    : Factor w/ 179 levels "\"ABHD12\"","\"AFG3L1P\"",..: 133 145 62 44 60 169 75 79 86 172 ...
 $ target: Factor w/ 11522 levels "\"A2LD1\"","\"A2M\"",..: 3372 8671 8373 11374 1817 10933 9535 7678 9794 575 ...
 $ mi    : num 0.301 0.336 0.402 0.177 0.212 ...
tmp <- tmp[rowSums(matrix(as.matrix(tmp[, 1:2]) %in% rownames(exprs), nrow(tmp), 2)) == 2, ]
tmp
[1] tf target mi
<0 rows> (or 0-length row.names)
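The head(tmp) output above shows gene symbols that still carry literal double quotes (for example "\"RNF11\""), while rownames(exprs) are unquoted, which would explain why the %in% filter leaves zero rows. A minimal sketch of that check on toy data follows; the values and the variable names `genes` and `edges` are made up for illustration and are not taken from the real files.

```r
# Toy stand-ins: gene symbols as they would appear as expression row names,
# and network edges whose names still carry literal quote characters.
genes <- c("RNF11", "FAM177A1", "SERPINE2", "SERPING1")
edges <- data.frame(
  tf     = c("\"RNF11\"", "\"SERPINE2\""),
  target = c("\"FAM177A1\"", "\"SERPING1\""),
  mi     = c(0.4596049, 0.5120350),
  stringsAsFactors = FALSE
)

# With the embedded quotes, no regulator matches a row name.
matched_before <- sum(edges$tf %in% genes)

# Strip the literal quotes from both name columns, then re-check.
edges$tf     <- gsub("\"", "", edges$tf, fixed = TRUE)
edges$target <- gsub("\"", "", edges$target, fixed = TRUE)
matched_after <- sum(edges$tf %in% genes & edges$target %in% genes)

print(c(before = matched_before, after = matched_after))
```

If the same pattern holds in the real file, writing the network without quotes (or stripping them before calling aracne2regulon) should let the names match.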
sessionInfo
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets  methods
[9] base

other attached packages:
[1] viper_1.16.0        Biobase_2.42.0      Rgraphviz_2.26.0    graph_1.60.0
[5] BiocGenerics_0.28.0 tidyr_0.8.3         dplyr_0.8.0.1       minet_3.40.0

loaded via a namespace (and not attached):
 [1] fansi_0.4.0        splines_3.5.1      R6_2.4.0           assertthat_0.2.0
 [5] utf8_1.1.4         e1071_1.7-0.1      knitr_1.21         survival_2.42-3
 [9] cli_1.0.1          tidyselect_0.2.5   pillar_1.3.1       segmented_1.0-0
[13] compiler_3.5.1     tibble_2.0.1       lattice_0.20-35    pkgconfig_2.0.2
[17] Matrix_1.2-14      purrr_0.3.1        KernSmooth_2.23-15 rstudioapi_0.9.0
[21] MASS_7.3-50        glue_1.3.0         xfun_0.5           stats4_3.5.1
[25] BiocManager_1.30.4 magrittr_1.5       rlang_0.3.1        yaml_2.2.0
[29] tools_3.5.1        mixtools_1.1.0     crayon_1.3.4       class_7.3-15
[33] Rcpp_1.0.0
A little late, but in case it helps anyone else: I had to do three things to use output from ARACNe-AP with aracne2regulon():
1. drop the p-value column,
2. remove the header, and
3. set format="3col" in the call to aracne2regulon().
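The three steps could be sketched in base R as below. This is an illustration under assumptions, not the package authors' recommended procedure: the input layout (header row, row index, quoted names, p-value column) is assumed from the consolidated table shown in the question, and the toy rows and temporary file paths are placeholders. The sketch also drops the quotes and the row index, since both come up elsewhere in this thread.

```r
# Write a toy file mimicking the assumed layout of the consolidated network:
# header row, row index, quoted gene names, and a p-value column.
raw_file <- tempfile(fileext = ".txt")
writeLines(c(
  "\"Regulator\"\t\"Target\"\t\"MI\"\t\"pvalue\"",
  "\"1\"\t\"RNF11\"\t\"FAM177A1\"\t0.4596049\t0",
  "\"2\"\t\"SERPINE2\"\t\"SERPING1\"\t0.5120350\t0",
  "\"3\"\t\"GPRC5A\"\t\"RRAD\"\t0.6129169\t0"
), raw_file)

# read.table() uses the first column as row names when the header has one
# field fewer than the data rows, and it removes the surrounding quotes.
net <- read.table(raw_file, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Keep regulator, target and MI only; write without header, row names or quotes.
clean_file <- tempfile(fileext = ".txt")
write.table(net[, c("Regulator", "Target", "MI")], clean_file,
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

cleaned <- readLines(clean_file)
print(cleaned)
```

The cleaned file would then be passed as the first argument, for example aracne2regulon(clean_file, dset, format="3col"), where dset is the ExpressionSet built from the same expression matrix.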
Hello Keith,
I am currently trying to use my ARACNE-AP generated networks in Viper. I did all three steps you mentioned above but my problem lies in the length of the network and the gene expression matrix. When I try to use the gene expression matrix as is and give the 3col aracne-ap output I get the following error:
I can subset the gene expression matrix according to the regulators in the network, but I don't think that is very reasonable, since it wouldn't be the same gene expression matrix that I fed into ARACNE-AP. I was wondering whether you came across the same problem and, if so, whether you subset the gene expression matrix?
Thank you in advance!
Hi, I'm afraid not, so I don't think I can be much help, unfortunately. In the past, the authors of ARACNe-AP were generally quite responsive and helpful though, so you might consider reaching out to them directly. Best of luck!
Thank you!
Hi Luna_P, I think I am in a very similar situation. I get the same error message, and I modified my network file as described above, but I still get the "arguments must have same length" message. Did you come up with a solution by any chance? Many thanks. Best, Mika
As @Keith Hughitt mentioned, the ARACNe network.txt output file has to be pre-processed before running the aracne2regulon() function, and format="3col" has to be set in the call to aracne2regulon(). Removing the index solved the "arguments must have same length" error.
Hope it helps!
Theo