TCGAanalyze_DMR error when calculating mean methylation difference between two groups.
0
0
Entering edit mode
@geraldmccollam-20422
Last seen 4.8 years ago

Hi,

I'm using TCGAanalyze_DMR to compare matched datasets. The TCGA files represent tumor/normal samples. The following code will download and prepare each file into a single RangedSummarizedExperiment object:

library(TCGAbiolinks)
library(SummarizedExperiment)

query.met.lusc <- GDCquery(project = "TCGA-LUSC", legacy = TRUE, 
    data.category = "DNA methylation",
    platform = "Illumina Human Methylation 450",
    barcode = c("TCGA-43-6771-01A-11D-1818-05", "TCGA-43-6771-11A-01D-1818-05"))
GDCdownload(query.met.lusc)

met.lusc.450 <- GDCprepare(query = query.met.lusc,
                        save = TRUE,
                        save.filename = "LUSCmet-paired.rda" ,
                        summarizedExperiment = TRUE)

When I try to run TCGAanalyze_DMR on the object met.lusc.450 it stops because of an error in the rowMeans function.

result <- TCGAanalyze_DMR(met.lusc.450, groupCol = "barcode", 
                          group1 = "TCGA-43-6771-01A-11D-1818-05",
                          group2 = "TCGA-43-6771-11A-01D-1818-05")

Group1:TCGA-43-6771-01A-11D-1818-05
Group2:TCGA-43-6771-11A-01D-1818-05
Calculating the diference between the mean methylation of the groups...
Error in rowMeans(m[, idx1], na.rm = TRUE) : 
  'x' must be an array of at least two dimensions.

The error occurs in the file methylation.R in the TCGAbiolinks package. The code in methylation.R that raises the error is:

m <- assay(data)
idx1 <- which(colData(data)[,groupCol] == group1)
idx2 <- which(colData(data)[,groupCol] == group2)
mean.g1 <- rowMeans(m[,idx1], na.rm = TRUE)
mean.g2 <- rowMeans(m[,idx2], na.rm = TRUE)
diffmean <- mean.g2 - mean.g1

If I run my SummarizedExperiment object directly in this code I get the following:

m <- assay(met.lusc.450)
head(m)

           TCGA-43-6771-01A-11D-1818-05 TCGA-43-6771-11A-01D-1818-05
cg00000029                    0.3120898                    0.2771894
cg00000165                    0.3568547                    0.1523665
cg00000236                    0.9145645                    0.8747849
cg00000289                    0.7837589                    0.7889066
cg00000292                    0.8549143                    0.7540823
cg00000321                    0.3683245                    0.2493837

dim(m)
[1] 395843      2

The object 'm' is a matrix class with two dimensions. The data are Beta values for each experimental group, i.e. tumor/normal. If I run the rest of my code directly I get:

idx1 <- which(colData(met.lusc.450)[,"barcode"] == "TCGA-43-6771-01A-11D-1818-05") # tumor
idx2 <- which(colData(met.lusc.450)[,"barcode"] == "TCGA-43-6771-11A-01D-1818-05") # normal

mean.g1 <- rowMeans(m[,idx1], na.rm = TRUE)
mean.g2 <- rowMeans(m[,idx2], na.rm = TRUE)
Error in rowMeans(m[, idx1], na.rm = TRUE) : 
  'x' must be an array of at least two dimensions

My questions (finally!):

1) The object passed to rowMeans (m[,idx1]) is a numeric vector of one dimension. This explains the error. Is this what TCGAanalyze_DMR expects to find?

head(m[,idx1])
cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.3120898  0.3568547  0.9145645  0.7837589  0.8549143  0.3683245 

2) If I run rowMeans after casting 'm' as a dataframe it seems to work but I'm not sure if the result makes sense.

mean.g1 <- rowMeans(data.frame(m[,idx1]), na.rm = TRUE)
mean.g2 <- rowMeans(data.frame(m[,idx2]), na.rm = TRUE)
diffmean <- mean.g2 - mean.g1

head(mean.g1)
cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.3120898  0.3568547  0.9145645  0.7837589  0.8549143  0.3683245 

head(mean.g2)
cg00000029 cg00000165 cg00000236 cg00000289 cg00000292 cg00000321 
 0.2771894  0.1523665  0.8747849  0.7889066  0.7540823  0.2493837 

head(diffmean)
  cg00000029   cg00000165   cg00000236   cg00000289   cg00000292   cg00000321 
-0.034900310 -0.204488265 -0.039779578  0.005147694 -0.100831923 -0.118940805 

Any help is appreciated!


sessionInfo() R version 3.5.2 (2018-12-20) Platform: x86_64-apple-darwin15.6.0 (64-bit) Running under: macOS Sierra 10.12.6

Matrix products: default BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale: [1] enUS.UTF-8/enUS.UTF-8/enUS.UTF-8/C/enUS.UTF-8/en_US.UTF-8

attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages: [1] SummarizedExperiment1.12.0 DelayedArray0.8.0 BiocParallel1.16.6 matrixStats0.54.0
[5] Biobase2.42.0 GenomicRanges1.34.0 GenomeInfoDb1.18.2 IRanges2.16.0
[9] S4Vectors0.20.1 TCGAbiolinks2.10.4 AnnotationHub2.14.5 BiocGenerics0.28.0

loaded via a namespace (and not attached): [1] backports1.1.3 circlize0.4.5 aroma.light3.12.0 plyr1.8.4
[5] selectr0.4-1 ConsensusClusterPlus1.46.0 lazyeval0.2.1 splines3.5.2
[9] ggplot23.1.0 sva3.30.1 digest0.6.18 foreach1.4.4
[13] htmltools0.3.6 magrittr1.5 memoise1.1.0 cluster2.0.7-1
[17] doParallel1.0.14 limma3.38.3 ComplexHeatmap1.20.0 Biostrings2.50.2
[21] readr1.3.1 annotate1.60.1 sesameData1.0.0 R.utils2.8.0
[25] prettyunits1.0.2 colorspace1.4-0 blob1.1.1 rvest0.3.2
[29] ggrepel0.8.0 xfun0.5 dplyr0.8.0.1 crayon1.3.4
[33] RCurl1.95-4.12 jsonlite1.6 genefilter1.64.0 zoo1.8-4
[37] survival2.43-3 iterators1.0.10 glue1.3.0 survminer0.4.3
[41] gtable0.2.0 sesame1.0.0 zlibbioc1.28.0 XVector0.22.0
[45] GetoptLong0.1.7 wheatmap0.1.0 shape1.4.4 scales1.0.0
[49] DESeq1.34.1 DBI1.0.0 edgeR3.24.3 ggthemes4.1.0
[53] Rcpp1.0.0 xtable1.8-3 progress1.2.0 cmprsk2.2-7
[57] bit1.1-14 matlab1.0.2 km.ci0.5-2 preprocessCore1.44.0
[61] httr1.4.0 RColorBrewer1.1-2 pkgconfig2.0.2 XML3.98-1.19
[65] R.methodsS31.7.1 locfit1.5-9.1 DNAcopy1.56.0 tidyselect0.2.5
[69] rlang0.3.1 later0.8.0 AnnotationDbi1.44.0 munsell0.5.0
[73] tools3.5.2 downloader0.4 generics0.0.2 RSQLite2.1.1
[77] ExperimentHub1.8.0 broom0.5.1 evaluate0.13 stringr1.4.0
[81] yaml2.2.0 knitr1.22 bit640.9-7 survMisc0.5.5
[85] purrr0.3.1 randomForest4.6-14 EDASeq2.16.3 nlme3.1-137
[89] mime0.6 R.oo1.22.0 xml21.2.0 biomaRt2.38.0
[93] compiler3.5.2 rstudioapi0.9.0 curl3.3 interactiveDisplayBase1.20.0 [97] tibble2.0.1 geneplotter1.60.0 stringi1.3.1 GenomicFeatures1.34.6
[101] lattice0.20-38 Matrix1.2-16 KMsurv0.1-5 pillar1.3.1
[105] BiocManager1.30.4 GlobalOptions0.1.0 data.table1.12.0 bitops1.0-6
[109] httpuv1.4.5.1 rtracklayer1.42.2 R62.4.0 latticeExtra0.6-28
[113] hwriter1.3.2 promises1.0.1 ShortRead1.40.0 gridExtra2.3
[117] codetools0.2-16 assertthat0.2.0 rjson0.2.20 GenomicAlignments1.18.1
[121] Rsamtools1.34.1 GenomeInfoDbData1.2.0 mgcv1.8-27 hms0.4.2
[125] grid3.5.2 tidyr0.8.3 rmarkdown1.12 ggpubr0.2
[129] shiny_1.2.0

TCGAbiolinks SummarizedExperiment TCGAanalyze_DMR methylation R • 1.2k views
ADD COMMENT

Login before adding your answer.

Traffic: 768 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6