TCGAanalyze_DMR error when calculating mean methylation difference between two groups.
Entering edit mode
Last seen 2.7 years ago


I'm using TCGAanalyze_DMR to compare matched datasets. The TCGA files represent tumor/normal samples. The following code will download and prepare each file into a single RangedSummarizedExperiment object:


query.met.lusc <- GDCquery(project = "TCGA-LUSC", legacy = TRUE, 
    data.category = "DNA methylation",
    platform = "Illumina Human Methylation 450",
    barcode = c("TCGA-43-6771-01A-11D-1818-05", "TCGA-43-6771-11A-01D-1818-05"))

met.lusc.450 <- GDCprepare(query = query.met.lusc,
                        save = TRUE,
                        save.filename = "LUSCmet-paired.rda" ,
                        summarizedExperiment = TRUE)

When I try to run TCGAanalyze_DMR on the object met.lusc.450 it stops because of an error in the rowMeans function.

result <- TCGAanalyze_DMR(met.lusc.450, groupCol = "barcode", 
                          group1 = "TCGA-43-6771-01A-11D-1818-05",
                          group2 = "TCGA-43-6771-11A-01D-1818-05")

Calculating the diference between the mean methylation of the groups...
Error in rowMeans(m[, idx1], na.rm = TRUE) : 
  'x' must be an array of at least two dimensions.

The error occurs in the file methylation.R in the TCGAbiolinks package. The code in methylation.R that raises the error is:

m <- assay(data)
idx1 <- which(colData(data)[,groupCol] == group1)
idx2 <- which(colData(data)[,groupCol] == group2)
mean.g1 <- rowMeans(m[,idx1], na.rm = TRUE)
mean.g2 <- rowMeans(m[,idx2], na.rm = TRUE)
diffmean <- mean.g2 - mean.g1

If I run my SummarizedExperiment object directly in this code I get the following:

m <- assay(met.lusc.450)

           TCGA-43-6771-01A-11D-1818-05 TCGA-43-6771-11A-01D-1818-05
cg00000029                    0.3120898                    0.2771894
cg00000165                    0.3568547                    0.1523665
cg00000236                    0.9145645                    0.8747849
cg00000289                    0.7837589                    0.7889066
cg00000292                    0.8549143                    0.7540823
cg00000321                    0.3683245                    0.2493837

[1] 395843      2

The object 'm' is a matrix class with two dimensions. The data are Beta values for each experimental group, i.e. tumor/normal. If I run the rest of my code directly I get:

idx1 <- which(colData(met.lusc.450)[,"barcode"] == "TCGA-43-6771-01A-11D-1818-05") # tumor
idx2 <- which(colData(met.lusc.450)[,"barcode"] == "TCGA-43-6771-11A-01D-1818-05") # normal

mean.g1 <- rowMeans(m[,idx1], na.rm = TRUE)
mean.g2 <- rowMeans(m[,idx2], na.rm = TRUE)
Error in rowMeans(m[, idx1], na.rm = TRUE) : 
  'x' must be an array of at least two dimensions

My questions (finally!):

1) The object passed to rowMeans (m[,idx1]) is a numeric vector of one dimension. This explains the error. Is this what TCGAanalyze_DMR expects to find?

cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.3120898  0.3568547  0.9145645  0.7837589  0.8549143  0.3683245 

2) If I run rowMeans after casting 'm' as a dataframe it seems to work but I'm not sure if the result makes sense.

mean.g1 <- rowMeans(data.frame(m[,idx1]), na.rm = TRUE)
mean.g2 <- rowMeans(data.frame(m[,idx2]), na.rm = TRUE)
diffmean <- mean.g2 - mean.g1

cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.3120898  0.3568547  0.9145645  0.7837589  0.8549143  0.3683245 

cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.2771894  0.1523665  0.8747849  0.7889066  0.7540823  0.2493837 

  cg00000029   cg00000165   cg00000236   cg00000289   cg00000292   cg00000321 
-0.034900310 -0.204488265 -0.039779578  0.005147694 -0.100831923 -0.118940805 

Any help is appreciated!

sessionInfo() R version 3.5.2 (2018-12-20) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale: [1] enUS.UTF-8/enUS.UTF-8/enUS.UTF-8/C/enUS.UTF-8/en_US.UTF-8

attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] SummarizedExperiment1.12.0 DelayedArray0.8.0 BiocParallel1.16.6 matrixStats0.54.0
[5] Biobase2.42.0 GenomicRanges1.34.0 GenomeInfoDb1.18.2 IRanges2.16.0
[9] S4Vectors0.20.1 TCGAbiolinks2.10.4 AnnotationHub2.14.5 BiocGenerics0.28.0

loaded via a namespace (and not attached): [1] backports1.1.3 circlize0.4.5 aroma.light3.12.0 plyr1.8.4
[5] selectr0.4-1 ConsensusClusterPlus1.46.0 lazyeval0.2.1 splines3.5.2
[9] ggplot23.1.0 sva3.30.1 digest0.6.18 foreach1.4.4
[13] htmltools0.3.6 magrittr1.5 memoise1.1.0 cluster2.0.7-1
[17] doParallel1.0.14 limma3.38.3 ComplexHeatmap1.20.0 Biostrings2.50.2
[21] readr1.3.1 annotate1.60.1 sesameData1.0.0 R.utils2.8.0
[25] prettyunits1.0.2 colorspace1.4-0 blob1.1.1 rvest0.3.2
[29] ggrepel0.8.0 xfun0.5 dplyr0.8.0.1 crayon1.3.4
[33] RCurl1.95-4.12 jsonlite1.6 genefilter1.64.0 zoo1.8-4
[37] survival2.43-3 iterators1.0.10 glue1.3.0 survminer0.4.3
[41] gtable0.2.0 sesame1.0.0 zlibbioc1.28.0 XVector0.22.0
[45] GetoptLong0.1.7 wheatmap0.1.0 shape1.4.4 scales1.0.0
[49] DESeq1.34.1 DBI1.0.0 edgeR3.24.3 ggthemes4.1.0
[53] Rcpp1.0.0 xtable1.8-3 progress1.2.0 cmprsk2.2-7
[57] bit1.1-14 matlab1.0.2 km.ci0.5-2 preprocessCore1.44.0
[61] httr1.4.0 RColorBrewer1.1-2 pkgconfig2.0.2 XML3.98-1.19
[65] R.methodsS31.7.1 locfit1.5-9.1 DNAcopy1.56.0 tidyselect0.2.5
[69] rlang0.3.1 later0.8.0 AnnotationDbi1.44.0 munsell0.5.0
[73] tools3.5.2 downloader0.4 generics0.0.2 RSQLite2.1.1
[77] ExperimentHub1.8.0 broom0.5.1 evaluate0.13 stringr1.4.0
[81] yaml2.2.0 knitr1.22 bit640.9-7 survMisc0.5.5
[85] purrr0.3.1 randomForest4.6-14 EDASeq2.16.3 nlme3.1-137
[89] mime0.6 R.oo1.22.0 xml21.2.0 biomaRt2.38.0
[93] compiler3.5.2 rstudioapi0.9.0 curl3.3 interactiveDisplayBase1.20.0 [97] tibble2.0.1 geneplotter1.60.0 stringi1.3.1 GenomicFeatures1.34.6
[101] lattice0.20-38 Matrix1.2-16 KMsurv0.1-5 pillar1.3.1
[105] BiocManager1.30.4 GlobalOptions0.1.0 data.table1.12.0 bitops1.0-6
[109] httpuv1.4.5.1 rtracklayer1.42.2 R62.4.0 latticeExtra0.6-28
[113] hwriter1.3.2 promises1.0.1 ShortRead1.40.0 gridExtra2.3
[117] codetools0.2-16 assertthat0.2.0 rjson0.2.20 GenomicAlignments1.18.1
[121] Rsamtools1.34.1 GenomeInfoDbData1.2.0 mgcv1.8-27 hms0.4.2
[125] grid3.5.2 tidyr0.8.3 rmarkdown1.12 ggpubr0.2
[129] shiny_1.2.0

TCGAbiolinks SummarizedExperiment TCGAanalyze_DMR methylation R • 522 views

Login before adding your answer.

Traffic: 424 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6