Question: Processing Agilent data with limma
9 days ago, Agaz Hussain Wani wrote:

I am trying to perform a differential expression analysis of the Agilent dataset GSE9210 using limma. The following is the example code:

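library(limma)
# Targets (SDRF) table listing one raw data file per sample
target <- file.path("/path/to/file")
SDRF <- read.delim(target, check.names=FALSE, stringsAsFactors=FALSE)
# Read the single-channel (green) Agilent intensities
x <- read.maimages(SDRF[,"File"], source="agilent", green.only=TRUE)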


Some of the problems:

1) There are 58 samples in the study, but only the 50 samples GSM232970-GSM233019 are processed.

On reaching sample GSM233020, it gives an error:

Error in RG[[a]][, i] <- obj[, columns[[a]]] :
  number of items to replace is not a multiple of replacement length

Maybe 1) is related to "File format for single channel analysis of Agilent microarray data with Limma"?

Is there any way to process all the samples from a single study in one batch?

2) I tried to process the files in batches, that is, GSM232970-GSM233019 in one batch and the remaining samples in another. That worked, but the second batch is read as character data, so I am not able to run

y <- backgroundCorrect(x, method="normexp")

which generates

Error in E - Eb : non-numeric argument to binary operator

The reason for the error is obvious, but why are the values read as character rather than numeric? An example is shown below.

an object of class "EListRaw"
     GSM233020 GSM233021
[1,] "1452"    "1373.5"
[2,] "77"      "109.5"  
[3,] "86"      "131"    
[4,] "320"     "898"    
[5,] "137.5"   "236"    
14904 more rows ...

     GSM233020 GSM233021
[1,] "45"      "50"     
[2,] "44"      "51"     
[3,] "44"      "51"     
[4,] "44"      "52"     
[5,] "45"      "52"     
14904 more rows ...

GSM233020 GSM233020.txt
GSM233021 GSM233021.txt

  Row Col ProbeUID ControlType    ProbeName        GeneName  SystematicName
1   1   1        0           1 BrightCorner    BrightCorner    BrightCorner
2   1   2        1          -1    (-)3xSLv1 NegativeControl NegativeControl
3   1   3        2           0 A_23_P146576       NM_021996       NM_021996
4   1   4        3           0 A_23_P125016    A_23_P125016    A_23_P125016
5   1   5        4           0  A_23_P28555          STAMBP       NM_006463
3 Homo sapiens globoside alpha-1,3-N-acetylgalactosaminyltransferase 1 (GBGT1), mRNA
4                                                                            Unknown
5             Homo sapiens STAM binding protein (STAMBP), transcript variant 1, mRNA
14904 more rows ...

[1] "agilent"



8 days ago, James W. MacDonald (United States) wrote:

Providing stylized code that isn't really what you ran is not helpful. Unless you really have a path on your computer called /path/to/file, which would be sort of cool.

As for question #1, you get an error reading in a file. Did you try reading in just that one file? How about all files but that one? I find that I cannot read that one file in, but I can read in all the other files, which doesn't seem like a question for Bioconductor, but instead for the GEO curators.



  1.  I tried reading all the files at once, and it stopped at GSM233020.txt, as mentioned above.

  2.  I can read the sample GSM233020.txt with x <- read.maimages(SDRF[,"File"][51], source="agilent", green.only=TRUE), but it generates the data shown above. Even if I start from that sample separately, it reads the files: x <- read.maimages(SDRF[,"File"][51:57], source="agilent", green.only=TRUE).


Reply written 6 days ago by Agaz Hussain Wani

No, you can't read GSM233020.txt. Sure, you get a result, but the result is wrong. As I told you in my answer, the end of that file is corrupted.

Reply written 6 days ago by Gordon Smyth
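
One way to spot a file that "reads" but gives a wrong result, as described in the reply above, is to read each file on its own and check that the intensities come back numeric rather than character. The following is only a sketch: has_numeric_intensities() and check_file() are hypothetical helper names, not limma functions, and the check catches only the character-matrix symptom shown in the question, not every possible kind of corruption.

```r
# Hypothetical helper: TRUE if an EListRaw-like object holds numeric
# foreground (E) and, when present, numeric background (Eb) intensities.
has_numeric_intensities <- function(x) {
  is.numeric(x$E) && (is.null(x$Eb) || is.numeric(x$Eb))
}

# Hypothetical helper: read a single file on its own and report whether it
# looks usable. A read error (or a missing file) is also reported as FALSE.
check_file <- function(f) {
  tryCatch(
    has_numeric_intensities(
      limma::read.maimages(f, source = "agilent", green.only = TRUE)
    ),
    error = function(e) FALSE
  )
}

# Possible usage, assuming SDRF is the targets table from the question:
# usable <- vapply(SDRF[, "File"], check_file, logical(1))
# SDRF[!usable, "File"]   # files that need a closer look
```

A file flagged this way would still need to be inspected by hand or re-obtained, since the check cannot repair it.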

Thank you for providing the useful information.

Reply written 6 days ago by Agaz Hussain Wani

6 days ago, Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

The file GSM233020.txt on GEO is corrupted, and therefore can't be read correctly by limma or any other program. As James has said, all the other files for that GEO series are fine.

I suggest you write to the original authors of the series and ask for the correct raw data file. Alternatively, just omit that file and read all the others.
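
The second option, omitting the corrupted file and reading the rest, might look roughly like the sketch below. The three file names are only an illustrative stand-in for SDRF[,"File"] from the question, and the read is guarded so that it runs only when limma is installed and the files are actually present locally.

```r
# Illustrative file list; in the question this would be SDRF[, "File"].
files <- c("GSM233019.txt", "GSM233020.txt", "GSM233021.txt")

# The file reported as corrupted on GEO in this thread.
bad <- "GSM233020.txt"

# Keep every file except the corrupted one. Matching on the base name means
# the filter also works if the entries include a directory path.
keep <- files[basename(files) != bad]

# Read and background-correct the remaining arrays in one batch, but only if
# limma is available and all the files exist locally.
if (requireNamespace("limma", quietly = TRUE) && all(file.exists(keep))) {
  x <- limma::read.maimages(keep, source = "agilent", green.only = TRUE)
  y <- limma::backgroundCorrect(x, method = "normexp")
}
```

The same filter should be applied to the rows of the targets table, so that the sample annotation stays aligned with the arrays that were read.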
