Minfi can't combineArrays
@sophie-marion-de-proce-11984
University of Edinburgh

Hi,

I'm trying to combine methylation RGsets using minfi's combineArrays. For this specific command, I have one 450k and one EPIC RGset. The same command worked on the same datasets a few months ago, but now I get the error message shown below. What am I doing wrong? Could it be due to the updated minfi version (1.38 vs 1.36)?

Thanks!

Sophie.

> rgSetProstate
class: RGChannelSet 
dim: 574981 1735 
metadata(0):
assays(2): Green Red
rownames(574981): 10600322 10600328 ... 74810485 74810492
rowData names(0):
colnames(1735): 6057833166_R02C02 6057833166_R04C02 ...
  GSM4199991_202226400157_R07C01 GSM4199992_202226400157_R08C01
colData names(1): ArrayTypes
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19

> rgSetEPIC
class: RGChannelSet 
dim: 1051539 240 
metadata(0):
assays(2): Green Red
rownames(1051539): 1600101 1600111 ... 99810990 99810992
rowData names(0):
colnames(240): GSM2998021_201868500150_R01C01
  GSM2998022_201868500150_R03C01 ... GSM4199991_202226400157_R07C01
  GSM4199992_202226400157_R08C01
colData names(0):
Annotation
  array: IlluminaHumanMethylationEPIC
  annotation: ilm10b4.hg19

> rgSetCombined <- combineArrays(rgSetProstate,rgSetEPIC, outType = "IlluminaHumanMethylation450k",verbose = TRUE)
[convertArray] Casting as IlluminaHumanMethylation450k
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function 'colData<-' for signature '"RGChannelSet", "character"'


> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.5 (Nitrogen)

Matrix products: default
BLAS:   /gpfs/igmmfs01/software/pkg/el7/apps/R/4.0.3/lib64/R/lib/libRblas.so
LAPACK: /gpfs/igmmfs01/software/pkg/el7/apps/R/4.0.3/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.0
 [2] IlluminaHumanMethylationEPICmanifest_0.3.0        
 [3] IlluminaHumanMethylation450kmanifest_0.4.0        
 [4] minfi_1.38.0                                      
 [5] bumphunter_1.34.0                                 
 [6] locfit_1.5-9.4                                    
 [7] iterators_1.0.13                                  
 [8] foreach_1.5.1                                     
 [9] Biostrings_2.60.0                                 
[10] XVector_0.32.0                                    
[11] SummarizedExperiment_1.22.0                       
[12] Biobase_2.52.0                                    
[13] MatrixGenerics_1.4.0                              
[14] matrixStats_0.59.0                                
[15] GenomicRanges_1.44.0                              
[16] GenomeInfoDb_1.28.0                               
[17] IRanges_2.26.0                                    
[18] S4Vectors_0.30.0                                  
[19] BiocGenerics_0.38.0                               

loaded via a namespace (and not attached):
  [1] rjson_0.2.20              ellipsis_0.3.2           
  [3] siggenes_1.66.0           mclust_5.4.7             
  [5] base64_2.0                rstudioapi_0.13          
  [7] bit64_4.0.5               AnnotationDbi_1.54.0     
  [9] fansi_0.5.0               xml2_1.3.2               
 [11] codetools_0.2-18          splines_4.1.0            
 [13] sparseMatrixStats_1.4.0   cachem_1.0.5             
 [15] scrime_1.3.5              Rsamtools_2.8.0          
 [17] annotate_1.70.0           dbplyr_2.1.1             
 [19] png_0.1-7                 HDF5Array_1.20.0         
 [21] readr_1.4.0               compiler_4.1.0           
 [23] httr_1.4.2                assertthat_0.2.1         
 [25] Matrix_1.3-4              fastmap_1.1.0            
 [27] limma_3.48.0              prettyunits_1.1.1        
 [29] tools_4.1.0               glue_1.4.2               
 [31] GenomeInfoDbData_1.2.6    dplyr_1.0.6              
 [33] rappdirs_0.3.3            doRNG_1.8.2              
 [35] Rcpp_1.0.6                vctrs_0.3.8              
 [37] rhdf5filters_1.4.0        multtest_2.48.0          
 [39] preprocessCore_1.54.0     nlme_3.1-152             
 [41] rtracklayer_1.52.0        DelayedMatrixStats_1.14.0
 [43] stringr_1.4.0             lifecycle_1.0.0          
 [45] restfulr_0.0.13           rngtools_1.5             
 [47] XML_3.99-0.6              beanplot_1.2             
 [49] zlibbioc_1.38.0           MASS_7.3-54              
 [51] hms_1.1.0                 rhdf5_2.36.0             
 [53] GEOquery_2.60.0           RColorBrewer_1.1-2       
 [55] yaml_2.2.1                curl_4.3.1               
 [57] memoise_2.0.0             biomaRt_2.48.0           
 [59] reshape_0.8.8             stringi_1.6.2            
 [61] RSQLite_2.2.7             genefilter_1.74.0        
 [63] BiocIO_1.2.0              GenomicFeatures_1.44.0   
 [65] filelock_1.0.2            BiocParallel_1.26.0      
 [67] rlang_0.4.11              pkgconfig_2.0.3          
 [69] bitops_1.0-7              nor1mix_1.3-0            
 [71] lattice_0.20-44           purrr_0.3.4              
 [73] Rhdf5lib_1.14.0           GenomicAlignments_1.28.0 
 [75] bit_4.0.4                 tidyselect_1.1.1         
 [77] plyr_1.8.6                magrittr_2.0.1           
 [79] R6_2.5.0                  generics_0.1.0           
 [81] DelayedArray_0.18.0       DBI_1.1.1                
 [83] pillar_1.6.1              survival_3.2-11          
 [85] KEGGREST_1.32.0           RCurl_1.98-1.3           
 [87] tibble_3.1.2              crayon_1.4.1             
 [89] utf8_1.2.1                BiocFileCache_2.0.0      
 [91] progress_1.2.2            grid_4.1.0               
 [93] data.table_1.14.0         blob_1.2.1               
 [95] digest_0.6.27             xtable_1.8-4             
 [97] tidyr_1.1.3               illuminaio_0.34.0        
 [99] openssl_1.4.4             askpass_1.1              
[101] quadprog_1.5-8
@james-w-macdonald-5106
United States

That error appears to come from minfi:::.harmonizeDataFrames. What do you get if you do

z <- minfi:::.harmonizeDataFrames(minfi:::.pDataFix(colData(rgSetProstate)), minfi:::.pDataFix(colData(rgSetEPIC)))

Note that there are three colons (:) and a period (.) between minfi and each function name. If I do

> example(combineArrays)
## and then do
>  z <- minfi:::.harmonizeDataFrames(minfi:::.pDataFix(colData(RGsetEx)), 
                                     minfi:::.pDataFix(colData(RGsetEPIC)))

## I get

> z
$x
DataFrame with 6 rows and 13 columns
                  Sample_Name Sample_Well Sample_Plate Sample_Group     Pool_ID
                  <character> <character>  <character>  <character> <character>
5723646052_R02C02    GroupA_3          H5           NA       GroupA          NA
5723646052_R04C01    GroupA_2          D5           NA       GroupA          NA
5723646052_R05C02    GroupB_3          C6           NA       GroupB          NA
5723646053_R04C02    GroupB_1          F7           NA       GroupB          NA
5723646053_R05C02    GroupA_1          G7           NA       GroupA          NA
5723646053_R06C02    GroupB_2          H7           NA       GroupB          NA
                       person       age         sex      status       Array
                  <character> <integer> <character> <character> <character>
5723646052_R02C02         id3        83           M      normal      R02C02
5723646052_R04C01         id2        58           F      normal      R04C01
5723646052_R05C02         id3        83           M      cancer      R05C02
5723646053_R04C02         id1        75           F      cancer      R04C02
5723646053_R05C02         id1        75           F      normal      R05C02
5723646053_R06C02         id2        58           F      cancer      R06C02
                        Slide               Basename              filenames
                  <character>            <character>            <character>
5723646052_R02C02  5723646052 ../extdata/572364605.. ../extdata/572364605..
5723646052_R04C01  5723646052 ../extdata/572364605.. ../extdata/572364605..
5723646052_R05C02  5723646052 ../extdata/572364605.. ../extdata/572364605..
5723646053_R04C02  5723646053 ../extdata/572364605.. ../extdata/572364605..
5723646053_R05C02  5723646053 ../extdata/572364605.. ../extdata/572364605..
5723646053_R06C02  5723646053 ../extdata/572364605.. ../extdata/572364605..

$y
DataFrame with 3 rows and 13 columns
                    Sample_Name Sample_Well Sample_Plate Sample_Group
                    <character> <character>  <character>  <character>
200144450018_R04C01   NA12878.1          NA           NA       Group1
200144450019_R07C01   NA12878.2          NA           NA       Group2
200144450021_R05C01   NA12878.3          NA           NA       Group3
                        Pool_ID      person       age         sex      status
                    <character> <character> <integer> <character> <character>
200144450018_R04C01          NA          NA        NA          NA          NA
200144450019_R07C01          NA          NA        NA          NA          NA
200144450021_R05C01          NA          NA        NA          NA          NA
                          Array        Slide               Basename
                    <character>  <character>            <character>
200144450018_R04C01      R04C01 200144450018 ../extdata/200144450..
200144450019_R07C01      R07C01 200144450019 ../extdata/200144450..
200144450021_R05C01      R05C01 200144450021 ../extdata/200144450..
                                 filenames
                               <character>
200144450018_R04C01 ../extdata/200144450..
200144450019_R07C01 ../extdata/200144450..
200144450021_R05C01 ../extdata/200144450..

It looks like you are getting a character value, rather than a DataFrame, for one or more of the items in that list. If so, you probably don't have the same columns in the colData slots of your two RGChannelSets.
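A quick way to check for mismatched columns is to compare the column names directly. The following is only a sketch: compare_columns is a hypothetical helper, and plain data.frames stand in for colData(rgSetProstate) and colData(rgSetEPIC) so that it runs without minfi. With real objects, names(colData(x)) should supply the same kind of input.

```r
# Hypothetical helper: report which metadata columns two tables do not share.
# Plain data.frames stand in here for the colData of two RGChannelSets.
compare_columns <- function(x, y) {
  list(
    shared = intersect(names(x), names(y)),
    only_x = setdiff(names(x), names(y)),
    only_y = setdiff(names(y), names(x))
  )
}

# Stand-in for a previously combined set: one column, ArrayTypes.
cd_combined <- data.frame(
  ArrayTypes = rep("IlluminaHumanMethylation450k", 3),
  stringsAsFactors = FALSE
)
# Stand-in for a freshly read set with no columns (like "colData names(0)").
cd_new <- data.frame(row.names = c("s1", "s2"))

res <- compare_columns(cd_combined, cd_new)
print(res)
```

Any name listed under only_x or only_y is a column that one object has and the other lacks.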


Thank you very much for looking into this! Yes, the output is different, as shown below. I realise now that I was using my previous combined dataset (rgSetCombined) with the original rgSetEPIC, and these have different colData: the combined dataset has colData names(1): ArrayTypes, but rgSetEPIC has colData names(0). So my question now is how to combine a previously combined dataset, which has colData names(1): ArrayTypes, with a newly created dataset, which has no colData columns. I tried manually setting the colData for rgSet450k, but combineArrays gives me the same error message. I have copied the output below.

> z <- minfi:::.harmonizeDataFrames(minfi:::.pDataFix(colData(rgSetProstate)), minfi:::.pDataFix(colData(rgSetEPIC)))
> z
$x
DataFrame with 1735 rows and 1 column
                                           ArrayTypes
                                          <character>
6057833166_R02C02              IlluminaHumanMethyla..
6057833166_R04C02              IlluminaHumanMethyla..
6164655052_R05C02              IlluminaHumanMethyla..
6164655053_R02C02              IlluminaHumanMethyla..
6164655053_R05C02              IlluminaHumanMethyla..
...                                               ...
GSM4199988_202226400157_R04C01 IlluminaHumanMethyla..
GSM4199989_202226400157_R05C01 IlluminaHumanMethyla..
GSM4199990_202226400157_R06C01 IlluminaHumanMethyla..
GSM4199991_202226400157_R07C01 IlluminaHumanMethyla..
GSM4199992_202226400157_R08C01 IlluminaHumanMethyla..
$y
  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [26] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [51] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [76] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[101] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[126] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[151] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[176] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[201] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[226] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA

> colData(rgSet450k) <- cbind(colData(rgSet450k), ArrayTypes=rep("IlluminaHumanMethylation450k",1186))
> rgSet450k
class: RGChannelSet 
dim: 622399 1186 
metadata(0):
assays(2): Green Red
rownames(622399): 10600313 10600322 ... 74810490 74810492
rowData names(0):
colnames(1186): 7786915164_R01C01 7786915164_R01C02 ...
  GSM2430485_9934398075_R05C02 GSM2430486_9934398075_R06C02
colData names(1): ArrayTypes
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19
> rgSetProstate
class: RGChannelSet 
dim: 574981 1735 
metadata(0):
assays(2): Green Red
rownames(574981): 10600322 10600328 ... 74810485 74810492
rowData names(0):
colnames(1735): 6057833166_R02C02 6057833166_R04C02 ...
  GSM4199991_202226400157_R07C01 GSM4199992_202226400157_R08C01
colData names(1): ArrayTypes
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19
> rgSetCombined <- combineArrays(rgSetProstate,rgSet450k,
+                   outType = "IlluminaHumanMethylation450k",
+                   verbose = TRUE)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘colData<-’ for signature ‘"RGChannelSet", "character"’
> z <- minfi:::.harmonizeDataFrames(minfi:::.pDataFix(colData(rgSetProstate)), minfi:::.pDataFix(colData(rgSet450k)))
> z
$x
DataFrame with 1735 rows and 1 column
                                           ArrayTypes
                                          <character>
6057833166_R02C02              IlluminaHumanMethyla..
6057833166_R04C02              IlluminaHumanMethyla..
6164655052_R05C02              IlluminaHumanMethyla..
6164655053_R02C02              IlluminaHumanMethyla..
6164655053_R05C02              IlluminaHumanMethyla..
...                                               ...
GSM4199988_202226400157_R04C01 IlluminaHumanMethyla..
GSM4199989_202226400157_R05C01 IlluminaHumanMethyla..
GSM4199990_202226400157_R06C01 IlluminaHumanMethyla..
GSM4199991_202226400157_R07C01 IlluminaHumanMethyla..
GSM4199992_202226400157_R08C01 IlluminaHumanMethyla..

$y
   [1] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
   [3] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
   [5] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
   [7] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
   [9] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [11] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [13] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [15] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [17] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [19] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [21] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [23] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [25] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [27] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [29] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [31] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [33] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [35] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [37] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [39] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [41] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [43] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [45] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [47] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [49] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [51] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [53] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [55] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [57] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [59] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [61] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [63] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [65] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [67] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [69] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [71] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [73] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [75] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [77] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [79] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [81] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [83] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [85] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [87] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [89] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [91] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [93] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [95] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [97] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
  [99] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"
 [101] "IlluminaHumanMethylation450k" "IlluminaHumanMethylation450k"

[...] continues until [1186]
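One possible reading of the output above, offered as an assumption rather than something confirmed in this thread: when a table with a single column is subset by column name without drop = FALSE, R returns a bare vector instead of a table. That would be consistent with $y appearing as a character vector even though both objects now have the same single column. The sketch below shows the behaviour with a base R data.frame; whether minfi:::.harmonizeDataFrames subsets this way internally is not verified here.

```r
# Single-column subsetting in base R: the default drops to a plain vector.
one_col <- data.frame(
  ArrayTypes = rep("IlluminaHumanMethylation450k", 3),
  stringsAsFactors = FALSE
)

dropped <- one_col[, "ArrayTypes"]                # becomes a character vector
kept    <- one_col[, "ArrayTypes", drop = FALSE]  # stays a one-column table

print(class(dropped))
print(class(kept))
```

If this is indeed the cause, any colData with exactly one shared column could trigger the same 'colData<-' error, while zero columns or several columns would not.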


What you have in your colData doesn't match what I would normally expect. The minfi:::.pDataFix function is supposed to generate a DataFrame containing only "Slide", "Array", "Sample_Name", "Basename", and "SampleID", so it's odd that your combined RGChannelSet has only 'ArrayTypes'. Did you change that?

I would also not expect a colData for the newly generated EPIC array to be empty. It should look like the example data that I presented above. So the next step is to figure out why you are ending up with RGChannelSets that don't meet expectations.


I didn't realise that wasn't the normal RGset format. I created them by using the idat files in specified folders. I can't remember the reason, but I am using several datasets from GEO and TCGA, and it was difficult to combine it all together. And I created a separate sample sheet, which is also time-consuming. But there may be a better way to do this?

rgSet450k <- read.metharray.exp(base="/path/to/idatFiles/",recursive=TRUE,verbose=TRUE,force=TRUE)

Ah, I get it. Usually you have a csv file that comes from the Illumina software and you use read.metharray.sheet to generate a 'targets' file that has all the stuff in it that I expected. If you are getting the data from a bunch of different places it might be a pain to make a fake csv file, so using the 'base' argument instead is probably the way to go.
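If a sample sheet is wanted anyway, a minimal one can be derived from the IDAT file names. This is a sketch under stated assumptions: make_targets is a hypothetical helper, and it assumes file names of the form <Slide>_<Array>_Grn.idat, optionally with a prefix such as a GEO accession. The resulting data frame has a Basename column, which is what read.metharray.exp expects in its targets argument.

```r
# Hypothetical helper: build a minimal 'targets' table from IDAT file names.
# Assumes names end in <Slide>_<Array>_Grn.idat, optionally prefixed,
# e.g. GSM4199991_202226400157_R07C01_Grn.idat.
make_targets <- function(idat_paths) {
  # Keep one entry per sample by looking only at the green-channel files.
  grn <- grep("_Grn\\.idat(\\.gz)?$", idat_paths, value = TRUE)
  basename_col <- sub("_Grn\\.idat(\\.gz)?$", "", grn)
  stem <- basename(basename_col)
  pattern <- "^(.*_)?([0-9]+)_(R[0-9]{2}C[0-9]{2})$"
  ok <- grepl(pattern, stem)
  data.frame(
    Sample_Name = stem,
    Slide = ifelse(ok, sub(pattern, "\\2", stem), NA_character_),
    Array = ifelse(ok, sub(pattern, "\\3", stem), NA_character_),
    Basename = basename_col,
    stringsAsFactors = FALSE
  )
}

# Made-up paths for illustration only; no files are read here.
paths <- c(
  "/data/GSM4199991_202226400157_R07C01_Grn.idat",
  "/data/GSM4199991_202226400157_R07C01_Red.idat",
  "/data/6057833166_R02C02_Grn.idat"
)
targets <- make_targets(paths)
print(targets)

# With minfi installed and real files on disk, this table could be passed on:
# rgSet <- minfi::read.metharray.exp(targets = targets)
```

Reading every dataset through a table like this should give each RGChannelSet the same colData columns, which may avoid the mismatch discussed above.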


After some trial and error, what solved this for me was to remove the colData column from rgSetProstate instead of adding the same column to rgSet450k, so that neither rgSet has any column in colData.

Thank you!

> colData(rgSetProstate) <- colData(rgSetProstate)[,-1]
> z <- minfi:::.harmonizeDataFrames(minfi:::.pDataFix(colData(rgSetProstate)), minfi:::.pDataFix(colData(rgSet450k)))
> z
$x
DataFrame with 1735 rows and 0 columns

$y
DataFrame with 1186 rows and 0 columns
> rgSetProstate
class: RGChannelSet 
dim: 574981 1735 
metadata(0):
assays(2): Green Red
rownames(574981): 10600322 10600328 ... 74810485 74810492
rowData names(0):
colnames(1735): 6057833166_R02C02 6057833166_R04C02 ...
  GSM4199991_202226400157_R07C01 GSM4199992_202226400157_R08C01
colData names(0):
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19
> rgSet450k
class: RGChannelSet 
dim: 622399 1186 
metadata(0):
assays(2): Green Red
rownames(622399): 10600313 10600322 ... 74810490 74810492
rowData names(0):
colnames(1186): 7786915164_R01C01 7786915164_R01C02 ...
  GSM2430485_9934398075_R05C02 GSM2430486_9934398075_R06C02
colData names(0):
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19
> rgSetCombined <- combineArrays(rgSetProstate,rgSet450k,
+                   outType = "IlluminaHumanMethylation450k",
+                   verbose = TRUE)
> rgSetCombined
class: RGChannelSet 
dim: 574981 2921 
metadata(0):
assays(2): Green Red
rownames(574981): 10600322 10600328 ... 74810485 74810492
rowData names(0):
colnames(2921): 6057833166_R02C02 6057833166_R04C02 ...
  GSM2430485_9934398075_R05C02 GSM2430486_9934398075_R06C02
colData names(1): ArrayTypes
Annotation
  array: IlluminaHumanMethylation450k
  annotation: ilmn12.hg19