Question

Help on Loading AgilentData into LIMMA

Gordon Smyth 50k

@gordon-smyth

WEHI, Melbourne, Australia

You need to state the versions of R and limma that you are using. See "While sending mail to the list", Item 4 of the posting guide at http://www.bioconductor.org/docs/postingGuide.html, for a suggestion of how to do this.

Gordon

> [BioC] Help on Loading AgilentData into LIMMA
> Nataliya Yeremenko eremenko at science.uva.nl
> Tue Nov 1 23:43:30 CET 2005
>
> I need help on loading data to LIMMA
> I'm working with 44K Agilent microarrays.
>
> I'm novice in BioConductor, so I'm trying to follow instructions on the packages
> to deal with my data.
>
> I'm trying to load Feature Extraction raw data with
>
> RG <- read.maimages(files = targets$fileName, path = loadPath, source = "agilent")
>
> and data seems to be loaded fine, where targets is experiment description file.
> It is loading all my 20 replicates,
> but when I checked RG data set with dim(RG) or show(RG)
> I realized that only 6195 rows have been loaded.
> It is the same situation independent on how many arrays I'm loading 1 or 28.
> Is there limit for number of spots in the package?
>
> I've tried to start as well with arrayMagic package
> and again didn't succeed to load even single array,
> After 10 min of loading I see error message stating
> that line 17000 has not proper amount of arguments.
>
> Thank you in advance for any suggestion
>
> --
> Dr. Nataliya Yeremenko
>
> Universiteit van Amsterdam
> Faculty of Science
> IBED/AMB (Aquatische Microbiologie)
> Nieuwe Achtergracht 127
> NL-1018WS Amsterdam
> the Netherlands
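For readers unsure how to collect the version information requested above, the following is a minimal sketch using base R only. The variable names `r_version` and `limma_version` are illustrative, and the limma check is guarded so the snippet also runs where limma is not installed.

```r
# Print the R version, platform, and the attached/loaded packages.
print(sessionInfo())

# Keep the R version string for pasting into a post.
r_version <- R.version.string
message("R version: ", r_version)

# Report the limma version only if limma is installed (guarded on purpose).
if (requireNamespace("limma", quietly = TRUE)) {
  limma_version <- as.character(utils::packageVersion("limma"))
  message("limma version: ", limma_version)
} else {
  message("limma is not installed in this library")
}
```

Pasting the printed output into the message lets the list see at a glance whether a known bug in a particular R or limma release could apply.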

limma arrayMagic • 835 views

updated 18.5 years ago by Nataliya Yeremenko ▴ 100 • written 18.5 years ago by Gordon Smyth 50k

score 0 · Answer 1 · 2005-11-03

The problem with Agilent special characters (backslash sequences) that Sean mentions was caused by a bug in R 2.1 which has been fixed in R 2.2.0, hence my question about version numbers.

You should upgrade to R 2.2.0 and the current version of limma anyway (from CRAN) because versions 2.3.X of limma have a 5-fold speed improvement in reading Agilent files, thanks to work by Marcus Davy.

Gordon

score 0 · Answer 2 · 2005-11-08

> Date: Tue, 08 Nov 2005 00:45:45 +0100
> From: Nataliya Yeremenko <eremenko at science.uva.nl>
> Subject: [BioC] Help on Loading AgilentData into LIMMA
> To: BioC Mailing List <bioconductor at stat.math.ethz.ch>
>
> I'm coming back to my problem of import of Agilent data into the Bioconductor
> limma package.
> Version of R is 2.2.0
> Limma as well is the newest possible as I've installed Bioconductor only
> two weeks ago.

Thanks for reporting the R version. Your limma is not the newest possible, however. You presumably have limma 2.2.0 whereas limma 2.3.3 is available on CRAN. Please do upgrade limma from CRAN as I suggested to you a couple of days ago. Not only will you be trying out the current software, but you'll find that limma will read your Agilent files many times faster. See the User's Guide Section 2.1 on the difference between installing from Bioconductor and CRAN.

> Each "target" file is Agilent 44K Human oligo microarray,
> produced by FeatureExtraction 7.5.
> I'm importing data into limma with:
>
> RG <- read.maimages(files = targets$fileName, path = loadPath, source = "agilent")
>
> Afterwards checking the dimensions of RG with dim(RG) - 6195 rows only,
> with no difference how many target files I've been loading.
>
> I go further and checked the same function on another data set -
> Agilent custom 11K oligo microarrays extracted as well with Feature Extraction 7.5
> (with the same default settings of Feature extraction procedure as for 44K).
> And to my surprise the target files have been loaded completely into LIMMA.
> Dim(RG) - 8635 rows.
>
> So the problem is that of 44K - maybe target files are to big?

As Sean Davis has already mentioned, file size is very unlikely to be the problem. There is no size limit.

We had guessed before, when you hadn't reported your R version, that you might be experiencing a known bug in R 2.1 which made it impossible to read Agilent files containing backslash sequences. However, you're using R 2.2.0. You seem to be experiencing a new problem we have not seen before. If you email one of your 44K input files to me directly (zip it up first) then we will troubleshoot it.

Gordon

> Does anybody have any suggestions?
>
> --
> Dr. Nataliya Yeremenko
>
> Universiteit van Amsterdam
> Faculty of Science
> IBED/AMB (Aquatische Microbiologie)
> Nieuwe Achtergracht 127
> NL-1018WS Amsterdam
> the Netherlands
>
> tel. + 31 20 5257089
> fax + 31 20 5257064

score 0 · Answer 3 · 2005-11-08

Dear Gordon
Dear David

Thanx a lot for the suggestion - it works fine now.
I'm only starter in the field so much more questions will come soon, meanwhile I'll try different possibilities offered by LIMMA.
I downloaded and installed manually newer version of LIMMA from CRAN as you suggested.

Regards

--
Dr. Nataliya Yeremenko

Universiteit van Amsterdam
Faculty of Science
IBED/AMB (Aquatische Microbiologie)
Nieuwe Achtergracht 127
NL-1018WS Amsterdam
the Netherlands

tel. + 31 20 5257089
fax + 31 20 5257064

score 0 · Answer 4 · 2005-11-09

Dear Nataliya,

David Pritchard (U Washington) has written off-line and has diagnosed the problem. The Agilent 44K arrays contain a single double-quote character on the last line that limma reads, which converts the rest of the file into a gene description. The solution is to tell limma not to look for quote characters in your file, that is you should use

RG <- read.maimages(files=targets$fileName, path=loadPath, source="agilent", quote="")

Hopefully this will work for you

Gordon
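The effect of a stray quote character can be reproduced with base R alone. The sketch below is an illustration under stated assumptions, not limma's own code: the toy file, its column names, and its values are invented, and `read.delim` stands in for the file reader. It shows that an unbalanced double-quote makes the parser absorb the remaining lines into one field, that `quote = ""` restores all rows, and how to locate lines with an odd number of quote characters.

```r
# Toy tab-delimited file standing in for a Feature Extraction export.
# All column names and values are invented for illustration.
toy_file <- tempfile(fileext = ".txt")
rows <- sprintf("A_%d\tgene %d\t%d", 1:10, 1:10, 10L * (1:10))
# Row 7 carries a single, unbalanced double-quote in its description.
rows[7] <- "A_7\t5\" UTR probe\t70"
writeLines(c("ProbeName\tDescription\tSignal", rows), toy_file)

# Default quoting: the stray quote opens a quoted string that never closes,
# so the following lines are absorbed into one field (R warns about this).
bad <- suppressWarnings(read.delim(toy_file, stringsAsFactors = FALSE))

# Quoting disabled: every line is parsed as its own row.
good <- read.delim(toy_file, quote = "", stringsAsFactors = FALSE)

cat("rows with default quoting:", nrow(bad), "\n")
cat("rows with quote = \"\":", nrow(good), "\n")

# Locate lines that contain an odd number of double-quote characters.
lines <- readLines(toy_file)
n_quotes <- lengths(regmatches(lines, gregexpr("\"", lines, fixed = TRUE)))
odd_lines <- which(n_quotes %% 2 == 1)
cat("lines with unbalanced quotes:", odd_lines, "\n")
```

Running the same odd-quote check on a real input file before loading it is a cheap way to tell whether `quote=""` is likely to be needed.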