Question

How to read Illumina data used in the MAQC project (GSE5350)

naf • 0

Hello,

I am currently analysing Affymetrix and Illumina data from the MAQC project. I used the following commands to read and preprocess (normalisation, background correction, ...) the Affymetrix CEL files:

files <- list.files('DONNEES-AFFYMETRIX/')
files <- paste0('DONNEES-AFFYMETRIX/',files)
rawfiles <- ReadAffy(filenames=files)
expressionset <- gcrma(rawfiles)

I don't know which commands to use to read AND preprocess the Illumina files (e.g. ILM_1_A1.txt) downloaded from NCBI:

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5350

I started with the same commands as for the Affymetrix CEL files, namely:

files <- list.files('DONNEES-ILLUMINA/')
files <- paste0('DONNEES-ILLUMINA/',files)

and tried to read the files with read.ilmn, but without success. The file content looks like this:

TargetID    ILM_1_A1    BEAD_STDEV-A1   Avg_NBEADS-A1   Detection-A1
GI_10047089-S   59.4    3.3 45  0.8431114
GI_10047091-S   90.4    3.1 50  0.99802241
GI_10047093-S   539.8   13.5    43  1
GI_10047099-S   563.6   21.2    32  1

I read about "rsn", "ssn", "lumiR" and "lumiR.batch" for preprocessing (normalisation, ...), but I couldn't use them because I can't read the data to start with. I have spent quite a long time searching the internet for answers but couldn't find any. Could you perhaps help me with that? It would be really great!

microarray normalization Illumina data MACQ project gene expression • 1.9k views

ADD COMMENT • link updated 6.7 years ago by Gordon Smyth 53k • written 6.7 years ago by naf • 0

It is worth noting that read.ilmn is a limma package function, while the other functions you mention are in the lumi package.

ADD REPLY • link 6.7 years ago Gordon Smyth 53k

Gordon Smyth · Answer 1 · 2019-05-06

Wei Shi ★ 3.6k

Australia/Melbourne

The MAQC Illumina microarray data deposited on GEO are not in the original format, so the default argument values of the read.ilmn function won't work. Below is an example command to read in one of the files:

library(limma)
x <- read.ilmn("ILM_1_A1.txt",probeid="TargetID",expr="ILM_")
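
To see which column names the read.ilmn call has to match, it can help to look at the header line first. The following is only a minimal base-R sketch: it writes a small tab-delimited file that mimics the rows shown in the question (the temporary file location and the "TargetID" header are assumptions inferred from the probeid argument above, not a copy of the real GEO file) and then lists the column names.

```r
# Recreate a tiny file in the layout shown in the question.
# Assumption: the first column header is "TargetID", as implied by probeid="TargetID".
demo_file <- file.path(tempdir(), "ILM_1_A1.txt")
writeLines(c(
  "TargetID\tILM_1_A1\tBEAD_STDEV-A1\tAvg_NBEADS-A1\tDetection-A1",
  "GI_10047089-S\t59.4\t3.3\t45\t0.8431114",
  "GI_10047091-S\t90.4\t3.1\t50\t0.99802241",
  "GI_10047093-S\t539.8\t13.5\t43\t1",
  "GI_10047099-S\t563.6\t21.2\t32\t1"
), demo_file)

# Read the header plus one data row; check.names = FALSE keeps
# names such as "Detection-A1" exactly as they appear in the file.
hdr <- names(read.delim(demo_file, nrows = 1, check.names = FALSE))
print(hdr)

# The probe ID column and the expression column(s) the call needs to match.
probe_col <- hdr[1]
expr_cols <- grep("^ILM_", hdr, value = TRUE)
```

Here probe_col corresponds to the probeid argument and the "ILM_" prefix of expr_cols to the expr argument in the call above.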

ADD COMMENT • link updated 6.7 years ago by Gordon Smyth 53k • written 6.7 years ago by Wei Shi ★ 3.6k

Thank you very much, Wei Shi. But how would you read the files as a batch, as I did for the Affymetrix files:

files <- list.files('DONNEES-AFFYMETRIX/')
files <- paste0('DONNEES-AFFYMETRIX/',files)
rawfiles <- ReadAffy(filenames=files)
expressionset <- gcrma(rawfiles)

What would be the equivalent for reading the Illumina files together as a batch (the last two code lines above, starting with rawfiles)? Also, do I need to preprocess the data (normalisation etc.) before I use it?

Thanks a lot for your help. I have been looking everywhere for a solution!

ADD REPLY • link updated 6.7 years ago by Gordon Smyth 53k • written 6.7 years ago by naf • 0

score 0 · Answer 2 · 2019-05-06

Gordon Smyth 53k

WEHI, Melbourne, Australia

There isn't a ready-made solution for reading many single-array Illumina data files, such as those from GEO, because Illumina raw data files normally contain data from all the microarrays in one file. The beauty of R, however, is that you can easily solve problems like this in a few lines of code. In this case, just read all the files separately with Wei's code and store them in a list:

library(limma)
Input <- list()
for (f in files) Input[[f]] <- read.ilmn(f, probeid="TargetID", expr="ILM_")

Then cbind them together into one EListRaw object:

x <- do.call(cbind, Input)

Then normalize:

y <- neqc(x)

Now you're ready to analyze. y is an EList object and the normalized log-expression values are in y$E. Most limma functions will operate on y as an object.

The above code runs in a few seconds on my PC.
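
Put together, the steps above might look like the following sketch. The directory name comes from the question; the helper name read_maqc_illumina and the "\\.txt$" file pattern are assumptions made for illustration only. list.files(full.names = TRUE) is used so that every entry of files is a path that read.ilmn can open from the current working directory.

```r
# Hypothetical helper (not part of limma): read every single-array
# file in `dir` and combine them into one EListRaw object.
# limma must be installed; functions are called with the limma:: prefix.
read_maqc_illumina <- function(dir, pattern = "\\.txt$") {
  # full.names = TRUE returns paths that include the directory
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) stop("no files matching '", pattern, "' in ", dir)
  # One EListRaw per file, using the column names of the GEO files
  input <- lapply(files, limma::read.ilmn, probeid = "TargetID", expr = "ILM_")
  # Combine the arrays column-wise into a single EListRaw
  do.call(cbind, input)
}

# Usage, assuming the GEO files were saved in DONNEES-ILLUMINA/:
# x <- read_maqc_illumina("DONNEES-ILLUMINA")
# y <- limma::neqc(x)  # normalization step from the answer above
# dim(y$E)             # probes x arrays matrix of normalized log-expression values
```

Combining with cbind assumes that every file lists the same probes in the same order, which is worth checking before normalizing.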

ADD COMMENT • link 6.7 years ago Gordon Smyth 53k

Gordon, this is great! Thanks a lot for your answer. I am busy trying it out now...

ADD REPLY • link 6.7 years ago naf • 0

It is actually working! Thank you Gordon, you saved my life :-).
Now I just need to harmonize probe names between the Affymetrix and Illumina platforms. There are packages for that. What a long process before working on the actual data and getting results!

ADD REPLY • link 6.7 years ago naf • 0

I'm glad it worked for you. If my answer solved the problem for you, then you could mark it as "accepted" by clicking on the icon next to my answer. Leaving the answer "unliked" and "unaccepted" is interpreted as indicating that you didn't find it helpful. It's up to you, of course, but that's how the site is intended to work.

ADD REPLY • link 6.7 years ago Gordon Smyth 53k