I am reading Illumina HumanHT-12 v4 Expression BeadChip data with read.idat from the limma package. The import appears to work without problems, but the resulting object has many (3270, to be exact) empty strings in genes$Symbol.
What may be causing this?
> idatfiles <- list.files( path = "../array", pattern = ".idat$", full.names = TRUE )
> bgxfile <- list.files( path = "../array", pattern = ".bgx$", full.names = TRUE )
> x <- read.idat( idatfiles, bgxfile, dateinfo = T )
> length( which( y$genes$Symbol == "", arr.ind = F ) )
[1] 3270
> y$genes[8446,]
         Probe_Id Array_Address_Id Symbol
8446 ILMN_1906423          5310327
And here is one example of the corresponding annotation from the bgx file:
Homo sapiens Unigene Hs.390407 ILMN_89369 HS.390407 Hs.390407 Hs.390407 27828963 BX097705 ILMN_1906423 0005310327 S 640 GAGAGGCAGGGTGAAGAGGTCGAAGGAGCCTGAGTTAGCAGGGATGAGCA 2 - 87520225-87520274 BX097705 NCI_CGAP_Kid5 Homo sapiens cDNA clone IMAGp998E053890, mRNA sequence
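To make the check above reproducible without the array files, here is a minimal sketch in base R. It assumes a toy data frame standing in for y$genes; apart from the probe quoted above (ILMN_1906423 / 5310327), the rows, probe IDs and symbols are made up for illustration.

```r
# Toy stand-in for y$genes. Only the ILMN_1906423 row comes from the post;
# the other two rows are hypothetical.
genes <- data.frame(
  Probe_Id         = c("ILMN_0000001", "ILMN_1906423", "ILMN_0000002"),
  Array_Address_Id = c(1110001L, 5310327L, 1110002L),
  Symbol           = c("GENE_A", "", "GENE_B"),
  stringsAsFactors = FALSE
)

# Treat both empty strings and NA as "no symbol"
no_symbol <- is.na(genes$Symbol) | genes$Symbol == ""

# How many probes lack a symbol, and which ones
n_missing   <- sum(no_symbol)
missing_ids <- genes$Probe_Id[no_symbol]

print(n_missing)
print(missing_ids)
```

On the real data, the same two lines can be run against y$genes. To see whether the symbols are already blank in the manifest rather than lost during import, it may help to compare with the probe table returned by illuminaio::readBGX(bgxfile), which, as far as I know, is what read.idat uses to read the annotation.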
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux stretch/sid
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets methods
[9] base
other attached packages:
[1] GO.db_3.3.0 SummarizedExperiment_1.2.3 GenomicRanges_1.24.3
[4] GenomeInfoDb_1.8.7 RColorBrewer_1.1-2 pheatmap_1.0.8
[7] ggplot2_2.2.0 pathview_1.12.0 gage_2.22.0
[10] org.Hs.eg.db_3.3.0 AnnotationDbi_1.34.4 IRanges_2.6.1
[13] S4Vectors_0.10.3 Biobase_2.32.0 BiocGenerics_0.18.0
[16] illuminaio_0.14.0 limma_3.28.21
loaded via a namespace (and not attached):
[1] Rcpp_0.12.8 plyr_1.8.4 XVector_0.12.1 tools_3.3.2
[5] zlibbioc_1.18.0 digest_0.6.10 base64_2.0 RSQLite_1.1
[9] memoise_1.0.0 tibble_1.2 gtable_0.2.0 png_0.1-7
[13] KEGGgraph_1.30.0 graph_1.50.0 DBI_0.5-1 Rgraphviz_2.16.0
[17] curl_2.3 httr_1.2.1 Biostrings_2.40.2 grid_3.3.2
[21] R6_2.2.0 XML_3.98-1.5 org.Bt.eg.db_3.3.0 scales_0.4.1
[25] KEGGREST_1.12.3 assertthat_0.1 colorspace_1.3-1 openssl_0.9.5
[29] lazyeval_0.2.0 munsell_0.4.3
Thanks. I will investigate some other probes, but I am now less anxious about the subject.