Probe summarization of Human HT-12 V4 BeadChip arrays
Seymoo • 0
Last seen 3.4 years ago

I am not familiar with Illumina arrays and need some hints, because I am trying to work with a data set from the Human HT-12 V4 BeadChip array deposited at GEO: "GSE73255"

I am following two approaches to get the data:

As explained in the beadarray package:


url <- ""
download.file(paste(url, filenm, sep=""), destfile=filenm)

gse <- getGEO(filename=filenm)


As explained in GEO:

gset <- getGEO("GSE73255", GSEMatrix =TRUE, getGPL=FALSE)
if (length(gset) > 1) idx <- grep("GPL6947", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]
gset <- exprs(gset)

Based on pData(gset)$data_processing, this file has been normalized with the Bioconductor (3.0) lumi pipeline using loess normalization, if I am not mistaken?!

When I try to summarize the expression to have one probe per gene using beadarray as follows:



summaryData <- as(gse, "ExpressionSetIllumina")

or

summaryData <- as(gset, "ExpressionSetIllumina")

I get the error: Error in object@channelData[[1]] : subscript out of bounds

What am I doing wrong at this stage?

I would also like to know whether I can use the raw data and perform RMA normalization on this type of data.

I would appreciate it if anyone could help me with the answer.


beadarray bead chip illuminahumanv4.db normalization RMA • 1.2k views

I can only say that I use limma code for importing and normalising this type of array, and then use the limma avereps function to average probes to genes. It is easy; I never got on well with beadarray for some reason.
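To illustrate the averaging step mentioned above, here is a minimal sketch of what avereps does. The expression matrix and the gene labels are made up for the example; with real data the matrix would be the normalized log2 intensities and the IDs would come from the probe annotation.

```r
library(limma)

# Toy log2 expression matrix: 5 probes x 3 arrays (made-up values).
expr <- rbind(
  probe1 = c(8, 8.5, 9),
  probe2 = c(10, 10.5, 11),
  probe3 = c(6, 6.5, 7),
  probe4 = c(7, 7.5, 8),
  probe5 = c(9, 9.5, 10)
)
colnames(expr) <- paste0("array", 1:3)

# Hypothetical gene label for each probe (same order as the rows of expr).
gene_symbol <- c("GENE_A", "GENE_A", "GENE_B", "GENE_C", "GENE_C")

# avereps() averages the rows that share an ID, giving one row per gene.
gene_avg <- avereps(expr, ID = gene_symbol)
print(gene_avg)
```

The result has one row per distinct gene label, in order of first appearance.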


Thanks for the hints, Chris! I have always worked with Affy arrays, so I do not have much of an idea about the bead arrays. But I am going to look into what you have suggested.


What version of Bioconductor are you using? It seems fine for me on Bioconductor 3.6.


gse <- getGEO("GSE33126")[[1]]
eset <- as(gse, "ExpressionSetIllumina")

I wouldn't recommend averaging the probes for the same gene though. Some of the probes on these arrays can be badly annotated, so by averaging you can dilute the signal for the gene. If you really want one measurement for a gene, what I usually do is pick the probe with the highest variance.
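A rough sketch of the highest-variance approach in base R, using a made-up matrix and made-up gene labels. With a real object the matrix would come from exprs() and the gene labels from a column of the feature annotation; which column holds the gene symbol is an assumption you would need to check for your own data.

```r
# Toy log2 expression matrix: 5 probes x 3 arrays (made-up values).
expr <- rbind(
  probe1 = c(8, 8, 8),
  probe2 = c(6, 8, 10),
  probe3 = c(7, 7.5, 8),
  probe4 = c(5, 9, 13),
  probe5 = c(9, 9.5, 10)
)
colnames(expr) <- paste0("array", 1:3)

# Hypothetical gene label for each probe (same order as the rows of expr).
gene_symbol <- c("GENE_A", "GENE_A", "GENE_B", "GENE_C", "GENE_C")

# Variance of each probe across arrays.
probe_var <- apply(expr, 1, var)

# For each gene, keep the row index of its most variable probe.
keep <- vapply(
  split(seq_along(probe_var), gene_symbol),
  function(i) i[which.max(probe_var[i])],
  integer(1)
)

# One row per gene; remember which probe was chosen for each.
expr_top <- expr[keep, , drop = FALSE]
chosen_probe <- rownames(expr_top)
rownames(expr_top) <- names(keep)
print(chosen_probe)
print(expr_top)
```

Probes without a gene label would need to be dropped or handled separately before the split.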

By converting the GEOquery object to a beadarray one, you get all the information about the probe annotation.




Hi @Mark,

I am using

Using Bioconductor 3.4 (BiocInstaller 1.24.0), R 3.3.2 (2016-10-31)

I have not managed to solve the problem yet. I managed to download the data matrix with

gse <- getGEO("GSE73255", GSEMatrix = FALSE)


eset <- as(gse, "ExpressionSetIllumina")

gives the previous error!
Would it be possible for you to try with `GSE73255` instead? I would also appreciate it if you could explain how I can pick the probe with the highest variance for each gene.


Last seen 36 minutes ago
United States

These data are from Illumina arrays, so by definition you cannot run RMA! That algorithm is intended for Affymetrix arrays, not Illumina.

You can use getGEOSuppFiles to download the raw data, but those data are simply a file in which the beads have been summarized to an average detection value, along with the detection p-value. You don't get the IDAT files, and you will have to figure out how to stuff those data into a useful container. Probably the easiest thing to do would be to extract the AVG_Signal columns, put them into a limma EList object, and then normalize using a loess normalization.
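As a rough sketch of that last step, the code below uses a simulated table in place of the downloaded file. The column names (ID_REF, S1.AVG_Signal, S1.Detection.Pval and so on) are assumptions for illustration, so check the header of the real supplementary file. It also assumes that cyclic loess, which is the loess variant limma offers for single-channel data, is an acceptable choice here.

```r
library(limma)

set.seed(1)
n_probes <- 200

# Simulated stand-in for the summarized bead-level table (made-up values).
base_signal <- 2^rnorm(n_probes, mean = 8, sd = 1.5)
raw <- data.frame(
  ID_REF = paste0("ILMN_", seq_len(n_probes)),
  S1.AVG_Signal = base_signal * 2^rnorm(n_probes, 0, 0.2),
  S1.Detection.Pval = runif(n_probes),
  S2.AVG_Signal = 1.3 * base_signal * 2^rnorm(n_probes, 0, 0.2),
  S2.Detection.Pval = runif(n_probes),
  S3.AVG_Signal = 0.8 * base_signal * 2^rnorm(n_probes, 0, 0.2),
  S3.Detection.Pval = runif(n_probes)
)

# Pull out the AVG_Signal columns as a probes x arrays matrix.
sig_cols <- grep("AVG_Signal", colnames(raw), value = TRUE)
E <- as.matrix(raw[, sig_cols])
rownames(E) <- raw$ID_REF

# Store log2 intensities and the probe IDs in a limma EList.
el <- new("EList", list(
  E = log2(E),
  genes = data.frame(ProbeID = raw$ID_REF, stringsAsFactors = FALSE)
))

# Cyclic loess normalization between arrays (single-channel data).
el_norm <- normalizeBetweenArrays(el, method = "cyclicloess")
print(dim(el_norm$E))
```

The detection p-values are ignored here; they could be used beforehand to filter out probes that are not expressed on any array.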

If you don't know what all that means, you would be better off to find somebody local who can help, as this is a non-trivial exercise for a newcomer.

