Question: Processing Agilent data with limma
Agaz Hussain Wani wrote, 8 months ago:

I am trying to perform differential expression analysis of the Agilent dataset GSE9210 using limma. The following is example code:

target = file.path("/path/to/file")
SDRF <- read.delim(target,check.names=FALSE,stringsAsFactors=FALSE)
x <- read.maimages(SDRF[,"File"], source="agilent", green.only=TRUE )


Some of the problems:

1) There are 58 samples in the study, but only the 50 samples GSM232970-GSM233019 are processed.

On reaching sample GSM233020, it gives an error:

Error in RG[[a]][, i] <- obj[, columns[[a]]] :
  number of items to replace is not a multiple of replacement length

Maybe 1) is related to the thread "File format for single channel analysis of Agilent microarray data with Limma"?

Is there any way to process all the samples from a single study in one batch?

2) I tried to process the files in batches, that is, GSM232970-GSM233019 in one batch and the remaining samples in another. That worked, but the intensities for the second batch are read as character type, so I am not able to run

y <- backgroundCorrect(x, method="normexp")

which generates

Error in E - Eb : non-numeric argument to binary operator

The cause of the error is obvious, but why are the values read as character rather than numeric? An example is shown below.

an object of class "EListRaw"
     GSM233020 GSM233021
[1,] "1452"    "1373.5"
[2,] "77"      "109.5"  
[3,] "86"      "131"    
[4,] "320"     "898"    
[5,] "137.5"   "236"    
14904 more rows ...

     GSM233020 GSM233021
[1,] "45"      "50"     
[2,] "44"      "51"     
[3,] "44"      "51"     
[4,] "44"      "52"     
[5,] "45"      "52"     
14904 more rows ...

GSM233020 GSM233020.txt
GSM233021 GSM233021.txt

  Row Col ProbeUID ControlType    ProbeName        GeneName  SystematicName
1   1   1        0           1 BrightCorner    BrightCorner    BrightCorner
2   1   2        1          -1    (-)3xSLv1 NegativeControl NegativeControl
3   1   3        2           0 A_23_P146576       NM_021996       NM_021996
4   1   4        3           0 A_23_P125016    A_23_P125016    A_23_P125016
5   1   5        4           0  A_23_P28555          STAMBP       NM_006463
3 Homo sapiens globoside alpha-1,3-N-acetylgalactosaminyltransferase 1 (GBGT1), mRNA
4                                                                            Unknown
5             Homo sapiens STAM binding protein (STAMBP), transcript variant 1, mRNA
14904 more rows ...

[1] "agilent"



James W. MacDonald (United States) wrote, 8 months ago:

Providing stylized code that isn't really what you ran is not helpful. Unless you really have a path on your computer called /path/to/file, which would be sort of cool.

As for question #1, you get an error reading in a file. Did you try reading in just that one file? How about all files but that one? I find that I cannot read that one file in, but I can read in all the other files, which doesn't seem like a question for Bioconductor, but instead for the GEO curators.
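
One way to carry out the check suggested above is to read the files one at a time and record which ones fail. The following is a minimal sketch, not code from this thread: `find_unreadable` and its `reader` argument are hypothetical names, and the helper assumes the reader returns an object with a numeric `E` component, as `read.maimages` does for single-channel data.

```r
# Hypothetical helper: try each file separately and return the ones that
# either raise an error or yield non-numeric intensities.
# `reader` is any function that takes one file name and returns an object
# with an `E` component (an assumption of this sketch).
find_unreadable <- function(files, reader) {
  ok <- vapply(files, function(f) {
    tryCatch({
      obj <- reader(f)
      # A file that "reads" but gives character intensities is also suspect.
      is.numeric(obj$E)
    }, error = function(e) FALSE)
  }, logical(1))
  unname(files[!ok])
}
```

With limma installed and the raw files present, a call might look like `find_unreadable(SDRF[,"File"], function(f) limma::read.maimages(f, source="agilent", green.only=TRUE))`.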



  1.  I tried reading all the files at once, and it stopped at GSM233020.txt as mentioned above.

  2.  I can read the sample GSM233020.txt with x <- read.maimages(SDRF[,"File"][51], source="agilent", green.only=TRUE), but it generates the data shown above. Even if I start from that sample separately, it reads the files: x <- read.maimages(SDRF[,"File"][51:57], source="agilent", green.only=TRUE).


Reply written 8 months ago by Agaz Hussain Wani

No, you can't read GSM233020.txt. Sure, you get a result, but the result is wrong. As I told you in my answer, the end of that file is corrupted.

Reply written 8 months ago by Gordon Smyth
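
To see where a file that reads without error has still gone wrong, one can look for entries of the intensity matrix that fail to convert to numbers. This is a minimal sketch in base R: `non_numeric_rows` is a hypothetical helper, and the toy matrix only stands in for the `E` component of the object printed in the question.

```r
# Hypothetical helper: return the indices of rows that contain at least
# one entry that cannot be converted to a number.
non_numeric_rows <- function(E) {
  E <- as.matrix(E)
  num <- suppressWarnings(as.numeric(E))   # non-numeric strings become NA
  bad <- matrix(is.na(num) & !is.na(E), nrow = nrow(E))
  which(rowSums(bad) > 0)
}

# Toy stand-in for x$E; the last row mimics a damaged record (made-up data).
E <- cbind(GSM233020 = c("1452", "77", "oops"),
           GSM233021 = c("1373.5", "109.5", "131"))
non_numeric_rows(E)
```

A single non-numeric entry is enough for the whole column, and hence the whole matrix, to be stored as character, which is why `backgroundCorrect` then fails on `E - Eb`.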

Thank you for providing the useful information.

Reply written 8 months ago by Agaz Hussain Wani

Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote, 8 months ago:

The file GSM233020.txt on GEO is corrupted, and therefore can't be read correctly by limma or any other program. As James has said, all the other files for that GEO series are fine.

I suggest you write to the original authors of the series and ask for the correct raw data file. Alternatively, just omit that file and read all the others.
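
Omitting the corrupted file could look roughly like the sketch below. The small `SDRF` table here is a toy stand-in for the one read with `read.delim` in the question, and the guard around the limma calls is an addition of this sketch so that it runs even when limma or the raw files are not available.

```r
# Toy stand-in for the SDRF table; in practice use the one from read.delim().
SDRF <- data.frame(File = c("GSM233019.txt", "GSM233020.txt", "GSM233021.txt"),
                   stringsAsFactors = FALSE)

# Drop the row for the corrupted file before reading anything.
corrupted <- "GSM233020.txt"
keep <- SDRF[!(basename(SDRF[, "File"]) %in% corrupted), , drop = FALSE]

# Only attempt the read when limma is installed and the files are present.
if (requireNamespace("limma", quietly = TRUE) && all(file.exists(keep$File))) {
  x <- limma::read.maimages(keep$File, source = "agilent", green.only = TRUE)
  y <- limma::backgroundCorrect(x, method = "normexp")
}
```

Filtering the table rather than the bare file vector keeps any sample annotation in `SDRF` aligned with the files that are actually read.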
