Question: TCGAanalyze_DMR error when calculating mean methylation difference between two groups.
0
gravatar for gerald.mccollam
6 months ago by
gerald.mccollam0 wrote:

Hi,

I'm using TCGAanalyze_DMR to compare matched datasets. The TCGA files represent tumor/normal samples. The following code will download and prepare each file into a single RangedSummarizedExperiment object:

library(TCGAbiolinks)
library(SummarizedExperiment)

query.met.lusc <- GDCquery(project = "TCGA-LUSC", legacy = TRUE, 
    data.category = "DNA methylation",
    platform = "Illumina Human Methylation 450",
    barcode = c("TCGA-43-6771-01A-11D-1818-05", "TCGA-43-6771-11A-01D-1818-05"))
GDCdownload(query.met.lusc)

met.lusc.450 <- GDCprepare(query = query.met.lusc,
                        save = TRUE,
                        save.filename = "LUSCmet-paired.rda" ,
                        summarizedExperiment = TRUE)

When I try to run TCGAanalyze_DMR on the object met.lusc.450 it stops because of an error in the rowMeans function.

result <- TCGAanalyze_DMR(met.lusc.450, groupCol = "barcode", 
                          group1 = "TCGA-43-6771-01A-11D-1818-05",
                          group2 = "TCGA-43-6771-11A-01D-1818-05")

Group1:TCGA-43-6771-01A-11D-1818-05
Group2:TCGA-43-6771-11A-01D-1818-05
Calculating the diference between the mean methylation of the groups...
Error in rowMeans(m[, idx1], na.rm = TRUE) : 
  'x' must be an array of at least two dimensions.

The error occurs in the file methylation.R in the TCGAbiolinks package. The code in methylation.R that raises the error is:

m <- assay(data)
idx1 <- which(colData(data)[,groupCol] == group1)
idx2 <- which(colData(data)[,groupCol] == group2)
mean.g1 <- rowMeans(m[,idx1], na.rm = TRUE)
mean.g2 <- rowMeans(m[,idx2], na.rm = TRUE)
diffmean <- mean.g2 - mean.g1

If I run my SummarizedExperiment object directly in this code I get the following:

m <- assay(met.lusc.450)
head(m)

           TCGA-43-6771-01A-11D-1818-05 TCGA-43-6771-11A-01D-1818-05
cg00000029                    0.3120898                    0.2771894
cg00000165                    0.3568547                    0.1523665
cg00000236                    0.9145645                    0.8747849
cg00000289                    0.7837589                    0.7889066
cg00000292                    0.8549143                    0.7540823
cg00000321                    0.3683245                    0.2493837

dim(m)
[1] 395843      2

The object 'm' is a matrix class with two dimensions. The data are Beta values for each experimental group, i.e. tumor/normal. If I run the rest of my code directly I get:

idx1 <- which(colData(met.lusc.450)[,"barcode"] == "TCGA-43-6771-01A-11D-1818-05") # tumor
idx2 <- which(colData(met.lusc.450)[,"barcode"] == "TCGA-43-6771-11A-01D-1818-05") # normal

mean.g1 <- rowMeans(m[,idx1], na.rm = TRUE)
mean.g2 <- rowMeans(m[,idx2], na.rm = TRUE)
Error in rowMeans(m[, idx1], na.rm = TRUE) : 
  'x' must be an array of at least two dimensions

My questions (finally!):

1) The object passed to rowMeans (m[,idx1]) is a numeric vector of one dimension. This explains the error. Is this what TCGAanalyze_DMR expects to find?

head(m[,idx1])
cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.3120898  0.3568547  0.9145645  0.7837589  0.8549143  0.3683245 

2) If I run rowMeans after casting 'm' as a dataframe it seems to work but I'm not sure if the result makes sense.

mean.g1 <- rowMeans(data.frame(m[,idx1]), na.rm = TRUE)
mean.g2 <- rowMeans(data.frame(m[,idx2]), na.rm = TRUE)
diffmean <- mean.g2 - mean.g1

head(mean.g1)
cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.3120898  0.3568547  0.9145645  0.7837589  0.8549143  0.3683245 

head(mean.g2)
cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.2771894  0.1523665  0.8747849  0.7889066  0.7540823  0.2493837 

head(diffmean)
  cg00000029   cg00000165   cg00000236   cg00000289   cg00000292   cg00000321 
-0.034900310 -0.204488265 -0.039779578  0.005147694 -0.100831923 -0.118940805 

Any help is appreciated!


sessionInfo() R version 3.5.2 (2018-12-20) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale: [1] enUS.UTF-8/enUS.UTF-8/enUS.UTF-8/C/enUS.UTF-8/en_US.UTF-8

attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] SummarizedExperiment1.12.0 DelayedArray0.8.0 BiocParallel1.16.6 matrixStats0.54.0
[5] Biobase2.42.0 GenomicRanges1.34.0 GenomeInfoDb1.18.2 IRanges2.16.0
[9] S4Vectors0.20.1 TCGAbiolinks2.10.4 AnnotationHub2.14.5 BiocGenerics0.28.0

loaded via a namespace (and not attached): [1] backports1.1.3 circlize0.4.5 aroma.light3.12.0 plyr1.8.4
[5] selectr0.4-1 ConsensusClusterPlus1.46.0 lazyeval0.2.1 splines3.5.2
[9] ggplot23.1.0 sva3.30.1 digest0.6.18 foreach1.4.4
[13] htmltools0.3.6 magrittr1.5 memoise1.1.0 cluster2.0.7-1
[17] doParallel1.0.14 limma3.38.3 ComplexHeatmap1.20.0 Biostrings2.50.2
[21] readr1.3.1 annotate1.60.1 sesameData1.0.0 R.utils2.8.0
[25] prettyunits1.0.2 colorspace1.4-0 blob1.1.1 rvest0.3.2
[29] ggrepel0.8.0 xfun0.5 dplyr0.8.0.1 crayon1.3.4
[33] RCurl1.95-4.12 jsonlite1.6 genefilter1.64.0 zoo1.8-4
[37] survival2.43-3 iterators1.0.10 glue1.3.0 survminer0.4.3
[41] gtable0.2.0 sesame1.0.0 zlibbioc1.28.0 XVector0.22.0
[45] GetoptLong0.1.7 wheatmap0.1.0 shape1.4.4 scales1.0.0
[49] DESeq1.34.1 DBI1.0.0 edgeR3.24.3 ggthemes4.1.0
[53] Rcpp1.0.0 xtable1.8-3 progress1.2.0 cmprsk2.2-7
[57] bit1.1-14 matlab1.0.2 km.ci0.5-2 preprocessCore1.44.0
[61] httr1.4.0 RColorBrewer1.1-2 pkgconfig2.0.2 XML3.98-1.19
[65] R.methodsS31.7.1 locfit1.5-9.1 DNAcopy1.56.0 tidyselect0.2.5
[69] rlang0.3.1 later0.8.0 AnnotationDbi1.44.0 munsell0.5.0
[73] tools3.5.2 downloader0.4 generics0.0.2 RSQLite2.1.1
[77] ExperimentHub1.8.0 broom0.5.1 evaluate0.13 stringr1.4.0
[81] yaml2.2.0 knitr1.22 bit640.9-7 survMisc0.5.5
[85] purrr0.3.1 randomForest4.6-14 EDASeq2.16.3 nlme3.1-137
[89] mime0.6 R.oo1.22.0 xml21.2.0 biomaRt2.38.0
[93] compiler3.5.2 rstudioapi0.9.0 curl3.3 interactiveDisplayBase1.20.0 [97] tibble2.0.1 geneplotter1.60.0 stringi1.3.1 GenomicFeatures1.34.6
[101] lattice0.20-38 Matrix1.2-16 KMsurv0.1-5 pillar1.3.1
[105] BiocManager1.30.4 GlobalOptions0.1.0 data.table1.12.0 bitops1.0-6
[109] httpuv1.4.5.1 rtracklayer1.42.2 R62.4.0 latticeExtra0.6-28
[113] hwriter1.3.2 promises1.0.1 ShortRead1.40.0 gridExtra2.3
[117] codetools0.2-16 assertthat0.2.0 rjson0.2.20 GenomicAlignments1.18.1
[121] Rsamtools1.34.1 GenomeInfoDbData1.2.0 mgcv1.8-27 hms0.4.2
[125] grid3.5.2 tidyr0.8.3 rmarkdown1.12 ggpubr0.2
[129] shiny_1.2.0

ADD COMMENTlink modified 6 months ago • written 6 months ago by gerald.mccollam0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 399 users visited in the last hour