DEsubs - Error in read.table during DEsubs command
2
0
Entering edit mode
Talip ▴ 10
@talip-zengin-14290
Last seen 3 days ago
Türkiye

Hi,

I am using DEsubs for subpathway search for my differentially expressed genes. I get error below when running the DEsubs command. I used this code:

if (.Platform[['OS.type']] == 'unix') { 
options('DEsubs_CACHE'=file.path(path.expand("~"), 'DEsubs') ) }

library("DEsubs")

GeneExp_Matrix <- get(load("GeneExp_Matrix.rda"))
sign_genes <- get(load("sign_genes.rda"))

myRankedList <- sign_genes[c('adj.P.Val')]

DEsubs.out <- DEsubs(org='hsa',
                     mRNAexpr=GeneExp_Matrix,
                     mRNAnomenclature='ensembl_gene_id',
                     pathways='All',
                     DEtool=NULL, DEpar=0.05,
                     CORtool='pearson', CORpar=0.6,
                     subpathwayType='community.walktrap',
                     rankedList=myRankedList,
                     verbose=TRUE)

Pruning nodes...Error in read.table(file, dec = ".", sep = " ") : 
'file' must be a character string or connection

 

I checked the file structures of mine and the results are below:

 

str(GeneExp_Matrix)
num [1:56830, 1:124] 10139 0 6725 1173 2273 ...
- attr(*, "dimnames")=List of 2
  ..$ : chr [1:56830] "ENSG00000000003" "ENSG00000000005" "ENSG00000000419" "ENSG00000000457" ...
  ..$ : chr [1:124] "TCGA-38-4625-01A-01R-1206-07" "TCGA-38-4625-11A-01R-1758-07" "TCGA-38-4626-01A-01R-1206-07" "TCGA-38-4626-11A-01R-1758-07" ...

str(myRankedList)
'data.frame':    7292 obs. of  1 variable:
$ adj.P.Val: num  5.40e-52 4.91e-51 4.91e-51 3.29e-50 2.76e-49 ...

 

I checked also the files in toy-data of the package as below:

 

load(system.file('extdata', 'data.RData', package='DEsubs'))
str(mRNAexpr)
 num [1:6880, 1:8] 0 114 1647 31 128 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:6880] "226" "229" "230" "217" ...
  ..$ : chr [1:8] "Sample1" "Sample2" "Sample3" "Sample4" ...

str(rankedList)
 Named num [1:5023] 0.84 0.129 0.915 0.958 0.442 ...
 - attr(*, "names")= chr [1:5023] "226" "229" "230" "217" ...

 

Note: The rownames of rankedList in toy-data is Entrez Gene ID but I want to use Ensembl Gene ID.

How can I fix this error?

Thanks for any help.

 

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: i686-pc-linux-gnu (32-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=tr_TR.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=tr_TR.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=tr_TR.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=tr_TR.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] DEsubs_1.4.0   locfit_1.5-9.1

loaded via a namespace (and not attached):
  [1] jsonlite_1.5               munsell_0.5.0              latticeExtra_0.6-28       
  [4] dplyr_0.7.6                DESeq_1.30.0               pillar_1.1.0              
  [7] pkgconfig_2.0.1            compiler_3.4.3             DBI_0.7                   
 [10] lazyeval_0.2.1             pkgmaker_0.27              tibble_1.4.2              
 [13] R6_2.2.2                   yaml_2.2.0                 graph_1.56.0              
 [16] BiocGenerics_0.24.0        rngtools_1.3.1             data.table_1.11.4         
 [19] annotate_1.56.2            xtable_1.8-2               gdata_2.18.0              
 [22] circlize_0.4.4             tools_3.4.3                stringr_1.3.1             
 [25] shape_1.4.4                bibtex_0.4.2               AnnotationDbi_1.40.0      
 [28] EBSeq_1.18.0               Biobase_2.38.0             registry_0.5              
 [31] foreach_1.4.4              digest_0.6.15              KernSmooth_2.23-15        
 [34] GenomeInfoDb_1.14.0        codetools_0.2-15           withr_2.1.2               
 [37] stats4_3.4.3               devtools_1.13.6            base64enc_0.1-3           
 [40] tidyselect_0.2.4           gplots_3.0.1               genefilter_1.60.0         
 [43] pheatmap_1.0.10            testthat_2.0.0             scales_1.0.0              
 [46] memoise_1.1.0              stringi_1.2.4              doRNG_1.7.1               
 [49] limma_3.34.9               assertthat_0.2.0           GlobalOptions_0.1.0       
 [52] purrr_0.2.5                GenomeInfoDbData_1.0.0     lattice_0.20-35           
 [55] bit64_0.9-7                Rcpp_0.12.18               caTools_1.17.1.1          
 [58] bindrcpp_0.2.2             Hmisc_4.1-1                Formula_1.2-3             
 [61] ggplot2_3.0.0              htmlTable_1.12             grid_3.4.3                
 [64] blob_1.1.0                 GenomicRanges_1.30.3       plyr_1.8.4                
 [67] survival_2.41-3            RBGL_1.54.0                edgeR_3.20.9              
 [70] DelayedArray_0.4.1         acepack_1.4.1              matrixStats_0.54.0        
 [73] rpart_4.1-12               NBPSeq_0.3.0               glue_1.3.0                
 [76] magrittr_1.5               igraph_1.2.2               S4Vectors_0.16.0          
 [79] blockmodeling_0.3.1        SummarizedExperiment_1.8.1 gridExtra_2.3             
 [82] samr_2.0                   htmlwidgets_1.2            checkmate_1.8.5           
 [85] parallel_3.4.3             htmltools_0.3.6            RSQLite_2.0               
 [88] rstudioapi_0.7             knitr_1.20                 nnet_7.3-12               
 [91] gtable_0.2.0               zlibbioc_1.24.0            colorspace_1.3-2          
 [94] geneplotter_1.56.0         cluster_2.0.6              gtools_3.8.1              
 [97] XVector_0.18.0             RCurl_1.95-4.11            DESeq2_1.18.1             
[100] bitops_1.0-6               RColorBrewer_1.1-2         Matrix_1.2-11             
[103] foreign_0.8-69             bit_1.1-12                 doParallel_1.0.11         
[106] IRanges_2.12.0             bindr_0.1.1                reshape2_1.4.3            
[109] XML_3.98-1.9               iterators_1.0.9            splines_3.4.3             
[112] rlang_0.2.2                BiocParallel_1.12.0        backports_1.1.2           
[115] qvalue_2.10.0            
DEsubs error • 1.5k views
ADD COMMENT
0
Entering edit mode
@panos-balomenos-9147
Last seen 4.8 years ago
Greece/University of Patras

Hi Talip,

The error seems to have originated from passing a data frame as a ranked list rather than a named vector.

I was able to reproduce your error with the package's example as follows:

load(system.file('extdata', 'data.RData', package='DEsubs')) 
DEsubs.run <- DEsubs(
    org='hsa',
    mRNAexpr=mRNAexpr,
    mRNAnomenclature='entrezgene',
    pathways='All',
    DEtool=NULL, DEpar=0.05,
    CORtool='pearson', CORpar=0.7,
    subpathwayType='community.walktrap',
    rankedList=data.frame('adj.P.Val'=rankedList),
    verbose=TRUE)

Pruning nodes...Error in read.table(file, dec = ".", sep = " ") : 
'file' must be a character string or connection

So, in your case, you just have to replace 

myRankedList <- sign_genes[c('adj.P.Val')]

with

myRankedList <- setNames(sign_genes[, 'adj.P.Val'], rownames(sign_genes))

, which converts the single column of the data frame to a named vector, assuming the identifiers are stored as rownames.

Best regards,

Panos

 

ADD COMMENT
0
Entering edit mode

Hi Panos,

Thanks very much for your reply. I changed my code as you advised, it works now. But the DEsubs.out object does not have subnetworks as result. The structure is as below:

> str(myRankedList)
 Named num [1:7292] 5.40e-52 4.91e-51 4.91e-51 3.29e-50 2.76e-49 ...
 - attr(*, "names")= chr [1:7292] "ENSG00000224215" "ENSG00000198358" "ENSG00000204305" "ENSG00000248551" ...

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='ensembl_gene_id',
+                      pathways='All',
+                      DEtool=NULL,
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=myRankedList,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...done.
Performing subpathway analysis ( community.walktrap )...done.

> str(DEsubs.out)
List of 4
 $ org             : chr "hsa"
 $ mRNAnomenclature: chr "ensembl_gene_id"
 $ edgeList        : logi[0 , 1:3]
 $ DEgenes         : Named num(0)
  ..- attr(*, "names")= chr(0)

Is the problem depending on using of Ensembl Gene IDs? Should I use Entrez Gene IDs?

ADD REPLY
0
Entering edit mode

Hi,

I tried again with entrez gene IDs as below, but subnetworks are not generated.

> str(myRankedList)
Named num [1:4412] 4.91e-51 2.76e-49 3.30e-49 2.43e-48 5.59e-48 ...
- attr(*, "names")= chr [1:4412] "177" "857" "6445" "142683" ...

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='entrezgene',
+                      pathways='All',
+                      DEtool=NULL,
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=myRankedList,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...done.
Performing subpathway analysis ( community.walktrap )...done.

> str(DEsubs.out)
List of 4
$ org             : chr "hsa"
$ mRNAnomenclature: chr "entrezgene"
$ edgeList        : logi[0 , 1:3]
$ DEgenes         : Named num(0)
  ..- attr(*, "names")= chr(0)

Then I tried with DEtool option with voom+limma and edgeR, there was problems for voom+limma but edgeR option worked.

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='ensembl_gene_id',
+                      pathways='All',
+                      DEtool='voom+limma',
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=NULL,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...Error in if (type != 3 && CR[i] * reg > thr) { :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
2: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
3: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
4: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
5: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero
6: In cor(exprEdgelistSource[i, ], exprEdgelistDestin[i, ], method = CORtool) :
  the standard deviation is zero

> DEsubs.out <- DEsubs(org='hsa',
+                      mRNAexpr=GeneExp_Matrix,
+                      mRNAnomenclature='ensembl_gene_id',
+                      pathways='All',
+                      DEtool='edgeR',
+                      DEpar=0.05,
+                      CORtool='pearson',
+                      CORpar=0.7,
+                      subpathwayType='community.walktrap',
+                      rankedList=NULL,
+                      verbose=TRUE)
Pruning nodes...done.
Pruning edges...done.
Performing subpathway analysis ( community.walktrap )...done.

> str(DEsubs.out)
List of 5
 $ org                           : chr "hsa"
 $ mRNAnomenclature              : chr "ensembl_gene_id"
 $ edgeList                      :'data.frame':    9 obs. of  5 variables:
  ..$ E1      : chr [1:9] "51806" "8674" "1437" "3567" ...
  ..$ E2      : chr [1:9] "816" "8773" "3559" "149233" ...
  ..$ edgeType: num [1:9] 1 1 1 1 1 1 1 1 1
  ..$ pathway : chr [1:9] "04114" "04130" "04630" "04630" ...
  ..$ corr    : num [1:9] 0.987 0.787 0.776 0.753 0.751 ...
 $ DEgenes                       : Named num [1:17] 4.47e-06 2.53e-02 1.28e-02 8.43e-11 2.13e-06 ...
  ..- attr(*, "names")= chr [1:17] "816" "8773" "5294" "2798" ...
 $ subAnalysis.community.walktrap:List of 8
  ..$ sub1: chr [1:3] "940" "5294" "2322"
  ..$ sub2: chr [1:2] "3567" "149233"
  ..$ sub3: chr [1:2] "1437" "3559"
  ..$ sub4: chr [1:2] "51806" "816"
  ..$ sub5: chr [1:2] "114548" "3553"
  ..$ sub6: chr [1:2] "8674" "8773"
  ..$ sub7: chr [1:2] "2796" "2798"
  ..$ sub8: chr [1:2] "405753" "50506"

But I used voom+limma separately and I want to give significant genes with adjusted p values and Ensembl IDs to DEsubs. How can I fix this problem?

ADD REPLY
0
Entering edit mode
@panos-balomenos-9147
Last seen 4.8 years ago
Greece/University of Patras

Hi Talip,

If you update to the latest version 1.6.3, you can supply the adjusted P-values with any type of identifier (from the ones specified in the manual). Give it a couple of days until the necessary changes have been propagated. Meanwhile, you can simply convert the identifiers to Entrez Ids manually: 

if ( package.version('DEsubs') < '1.6.3' )
{
    require('org.Hs.eg.db') 
    geneInfo <- AnnotationDbi::select( 
        x=get('org.Hs.eg.db'), keys=names(myRankedList), keytype='ENSEMBL', columns='ENTREZID' )
    map <- setNames( geneInfo[, 'ENTREZID'], geneInfo[, 'ENSEMBL'] )
    names(myRankedList) <- map[names(myRankedList)]   
}
ADD COMMENT
0
Entering edit mode

Thanks for your interest. I am waiting for new version.

ADD REPLY
0
Entering edit mode

I have tried new version and it works now. Thanks very much for your interest.

ADD REPLY

Login before adding your answer.

Traffic: 588 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6