Question: Single channel analysis of two-color agilent array using limma
pm2015 (United States) wrote:

Hello

I am fairly new to limma for analyzing two-color microarray data. I am trying to analyze a somewhat unusually designed experiment. Here's the design:

Filename   Cy3      Cy5
File1      M_cont   M_cont
File2      M_10hr   M_10hr
File1      M_14hr   M_14hr
File1      P_cont   P_cont
File1      M_10hr   M_10hr
File1      M_hr     M_14hr

I am following the instructions on limma userguide chapter 12. However, I keep getting the following error after this step:

corfit <- intraspotCorrelation(MA, design)

Warning messages:
1: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
2: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded


Please help!

written 4.0 years ago by pm2015, modified 4.0 years ago by Gordon Smyth

That's not an error, it's a warning. It's just telling you that for two genes, the remlscore function in the statmod package (that's called internally by intraspotCorrelation) wasn't able to converge to a solution. You should still be able to proceed with the rest of the analysis, using the corfit object.
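
The difference between a warning and an error can be sketched in base R, along with a check that the returned value is actually usable before it is passed on. This is only an illustration: `fit_with_warning` is a hypothetical stand-in for any function that warns but still returns, not a limma or statmod function.

```r
# Hypothetical stand-in: warns like remlscore() does, but still returns a value.
fit_with_warning <- function() {
  warning("reml: Max iterations exceeded")
  NaN  # a warning alone does not guarantee a usable result
}

# Collect the warning messages and keep the returned value.
msgs <- character(0)
value <- withCallingHandlers(
  fit_with_warning(),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# A correlation is only usable if it is finite and strictly inside (-1, 1).
usable <- is.finite(value) && abs(value) < 1
```

Execution continues after the warning, so `value` is assigned, but whether to carry on depends on `usable`, not on the absence of an error.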

written 4.0 years ago by Aaron Lun

Hi Aaron

Thanks for your reply. I thought so too but I get the following error in the next step:

 >fit <- lmscFit(MA, design, correlation=corfit$consensus)

Error in if (abs(correlation) >= 1) stop("correlation must be strictly between -1 and 1") :
  missing value where TRUE/FALSE needed

> corfit$consensus
[1] NaN

Any ideas why this might be?

written 4.0 years ago by pm2015

Well, for future reference, it would be better to post all the code up to the error, otherwise you'll just confuse people.

Now, I don't do a lot of two-color microarray analyses, but it seems to me that your two-color data set doesn't really use the two colors at all. Each file uses the same condition for both dyes, which defeats the purpose of a two-color setup. I can't imagine the design you got out of modelMatrix would look particularly healthy.

written 4.0 years ago by Aaron Lun

Thanks for replying. I am just analyzing a dataset from a microarray experiment designed by someone else in the lab. I have worked with a lot of Affymetrix datasets in the past, and I was also very surprised that this experiment was done this way. It doesn't make much sense to me either. Anyhow, I am trying to see if I can still get something meaningful out of it.

written 4.0 years ago by pm2015
Answer: Single channel analysis of two-color agilent array using limma
Gordon Smyth (Walter and Eliza Hall Institute of Medical Research, Melbourne, Australia) wrote:

First, can you confirm that you really do have RNA from the same treatment condition hybridized to both channels of each array? As Aaron has remarked, that defeats the purpose of two colour microarrays and drastically decreases the precision with which you can detect DE genes. Why would you design the hybridizations like that?

Anyway, for intraspotCorrelation() to return an NA consensus correlation, there must be a serious problem with either the data or the design matrix. You haven't given us enough information to diagnose any problem. If you want more help could you please:

1. Show the correct targets frame (the one in your original post looks to have a typo or two).

2. Show the code you used to construct the design matrix.

3. Show the output from summary(corfit$atanh.correlation)

4. Show the output from summary(MA$M) and summary(MA$A).

Please give the extra information as a comment on this answer (rather than as a new answer).

written 4.0 years ago by Gordon Smyth

Hi Gordon

Thanks for your reply. From the name of the files it does seem like RNA from the same condition has been hybridized to both channels. I do not have any further info on the hybridizations besides the names of the files themselves. I really don't understand why they were done this way.

Sorry about the typos in the targets frame. Here's my correct targets frame:

           SlideNumber
PANC1-0              1
PANC1-10             2
PANC1-14             3
MIAPAC2-0            4
MIAPAC2-10           5
MIAPAC2-14           6
                                                               FileName
PANC1-0      jhu_252665229289_S01_GE2_107_Sep09_1_3_POG-CY3_POR-CY5.txt
PANC1-10   jhu_252665229289_S01_GE2_107_Sep09_1_4_P10G-CY3_P10R-CY5.txt
PANC1-14   jhu_252665229290_S01_GE2_107_Sep09_1_1_P14G-CY3_P14R-CY5.txt
MIAPAC2-0    jhu_252665229290_S01_GE2_107_Sep09_1_2_MOG-CY3_MOR-CY5.txt
MIAPAC2-10 jhu_252665229290_S01_GE2_107_Sep09_1_3_M10G-CY3_M10R-CY5.txt
MIAPAC2-14 jhu_252665229290_S01_GE2_107_Sep09_1_4_M14G-CY3_M14R-CY5.txt
                 Name Cy3 Cy5
PANC1-0       PANC1-0  P0  P0
PANC1-10     PANC1-10 P10 P10
PANC1-14     PANC1-14 P14 P14
MIAPAC2-0   MIAPAC2-0  M0  M0
MIAPAC2-10 MIAPAC2-10 M10 M10
MIAPAC2-14 MIAPAC2-14 M14 M14

> RG <- backgroundCorrect(RG, method="normexp", offset=50)
> MA <- normalizeWithinArrays(RG, method="loess")
> MA <- normalizeBetweenArrays(MA, method="Aquantile")
> targets2 <- targetsA2C(targets)
> u <- unique(targets2$Target)
> f <- factor(targets2$Target, levels=u)
> design <- model.matrix(~0+f)
> colnames(design) <- u
> corfit <- intraspotCorrelation(MA, design)

Warning messages:
1: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
2: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
3: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
4: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
5: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
6: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
7: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
8: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded
9: In statmod::remlscore(y, X, Z) : reml: Max iterations exceeded

> fit <- lmscFit(MA, design, correlation=corfit$consensus)
Error in if (abs(correlation) >= 1) stop("correlation must be strictly between -1 and 1") : 
  missing value where TRUE/FALSE needed

>summary(corfit$atanh.correlation)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
     NA      NA      NA     NaN      NA      NA   44495 

>summary(MA$M)
 jhu_252665229289_S01_GE2_107_Sep09_1_3_POG-CY3_POR-CY5
 Min.   :-1.290555                                     
 1st Qu.:-0.077030                                     
 Median : 0.006482                                     
 Mean   : 0.005537                                     
 3rd Qu.: 0.087279                                     
 Max.   : 0.969519                                     
 jhu_252665229289_S01_GE2_107_Sep09_1_4_P10G-CY3_P10R-CY5
 Min.   :-1.672056                                       
 1st Qu.:-0.094154                                       
 Median : 0.006573                                       
 Mean   : 0.009162                                       
 3rd Qu.: 0.108370                                       
 Max.   : 2.211291                                       
 jhu_252665229290_S01_GE2_107_Sep09_1_1_P14G-CY3_P14R-CY5
 Min.   :-2.303812                                       
 1st Qu.:-0.124000                                       
 Median : 0.003292                                       
 Mean   :-0.002939                                       
 3rd Qu.: 0.125831                                       
 Max.   : 1.403780                                       
 jhu_252665229290_S01_GE2_107_Sep09_1_2_MOG-CY3_MOR-CY5
 Min.   :-1.566965                                     
 1st Qu.:-0.069998                                     
 Median : 0.003873                                     
 Mean   : 0.006056                                     
 3rd Qu.: 0.079320                                     
 Max.   : 1.666165                                     
 jhu_252665229290_S01_GE2_107_Sep09_1_3_M10G-CY3_M10R-CY5
 Min.   :-3.841923                                       
 1st Qu.:-0.108153                                       
 Median : 0.003021                                       
 Mean   :-0.009084                                       
 3rd Qu.: 0.109940                                       
 Max.   : 1.922013                                       
 jhu_252665229290_S01_GE2_107_Sep09_1_4_M14G-CY3_M14R-CY5
 Min.   :-6.689630                                       
 1st Qu.:-0.137185                                       
 Median : 0.005364                                       
 Mean   : 0.014088                                       
 3rd Qu.: 0.154397                                       
 Max.   : 6.796540                                       
> summary(MA$A)
 jhu_252665229289_S01_GE2_107_Sep09_1_3_POG-CY3_POR-CY5
 Min.   : 5.690                                        
 1st Qu.: 6.096                                        
 Median : 6.945                                        
 Mean   : 7.732                                        
 3rd Qu.: 8.858                                        
 Max.   :17.765                                        
 jhu_252665229289_S01_GE2_107_Sep09_1_4_P10G-CY3_P10R-CY5
 Min.   : 5.690                                          
 1st Qu.: 6.097                                          
 Median : 6.945                                          
 Mean   : 7.732                                          
 3rd Qu.: 8.858                                          
 Max.   :17.765                                          
 jhu_252665229290_S01_GE2_107_Sep09_1_1_P14G-CY3_P14R-CY5
 Min.   : 5.700                                          
 1st Qu.: 6.097                                          
 Median : 6.945                                          
 Mean   : 7.732                                          
 3rd Qu.: 8.858                                          
 Max.   :17.765                                          
 jhu_252665229290_S01_GE2_107_Sep09_1_2_MOG-CY3_MOR-CY5
 Min.   : 5.690                                        
 1st Qu.: 6.097                                        
 Median : 6.945                                        
 Mean   : 7.732                                        
 3rd Qu.: 8.858                                        
 Max.   :17.765                                        
 jhu_252665229290_S01_GE2_107_Sep09_1_3_M10G-CY3_M10R-CY5
 Min.   : 5.690                                          
 1st Qu.: 6.097                                          
 Median : 6.945                                          
 Mean   : 7.732                                          
 3rd Qu.: 8.858                                          
 Max.   :17.765                                          
 jhu_252665229290_S01_GE2_107_Sep09_1_4_M14G-CY3_M14R-CY5
 Min.   : 5.690                                          
 1st Qu.: 6.097                                          
 Median : 6.945                                          
 Mean   : 7.732                                          
 3rd Qu.: 8.858                                          
 Max.   :17.765          

Hope this helps in the diagnosis of the problem.

written 4.0 years ago by pm2015, modified 4.0 years ago by Gordon Smyth

Thanks for the corrected targets information. I see the problem now.

The experiment has a different treatment on each array, and no two arrays share the same treatment. This makes it impossible to distinguish inter-array variation from inter-treatment variation, making it also impossible to estimate the intraspot correlation. Hence the function just returns NAs.
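
The lack of replication can be checked with base R alone. The sketch below rebuilds a channel-level design from the posted targets and derives the designs implied for the log-ratios (M) and average log-intensities (A). It is an illustration of the reasoning, not limma's internal code, and it assumes the rows are ordered Cy3 then Cy5 within each array.

```r
# Targets as posted: the same RNA sample on both channels of every array.
targets <- data.frame(
  Cy3 = c("P0", "P10", "P14", "M0", "M10", "M14"),
  Cy5 = c("P0", "P10", "P14", "M0", "M10", "M14"),
  stringsAsFactors = FALSE
)
narrays <- nrow(targets)

# One row per channel, ordered Cy3 then Cy5 within each array (assumed layout).
target <- as.vector(t(as.matrix(targets)))
f <- factor(target, levels = unique(target))
design <- model.matrix(~0 + f)
colnames(design) <- levels(f)

cy3 <- seq(1, 2 * narrays, by = 2)
cy5 <- cy3 + 1

# Implied designs for the log-ratios (M) and the average log-intensities (A).
designM <- design[cy5, ] - design[cy3, ]
designA <- (design[cy5, ] + design[cy3, ]) / 2

# M carries no treatment information, because both channels share a sample.
m_is_empty <- all(designM == 0)

# A needs one coefficient per array, which leaves no residual degrees of freedom.
df_resid_A <- narrays - qr(designA)$rank
```

With zero residual degrees of freedom for the A-values there is nothing left from which to estimate the between-array variance, which is what the intraspot correlation depends on.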

I have to tell you that there is no statistically rigorous way to analyze this experiment. With a different treatment on every array, there is essentially no replication. The two channels on each array are wasted as replicates without knowing the intraspot correlation.

The only way to try to rescue this experiment is to put in a preset value for the intraspot correlation. The following article

  http://www.biomedcentral.com/1471-2105/14/165

shows that intraspot correlations tend to be between about 0.65 and 0.9 for two colour arrays. So I suggest you put in a value of about 0.8:

fit <- lmscFit(MA, design, correlation=0.8)

and proceed from there.
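
A rough base-R sketch of why the consensus comes out as NaN and where a preset value slots in. It assumes the consensus is a trimmed mean of the gene-wise correlations on the atanh scale, with the documented default trim of 0.15. `consensus_or_preset` is a hypothetical helper for illustration and not part of limma.

```r
# Hypothetical helper: use the estimated consensus if it is valid,
# otherwise fall back to a preset intraspot correlation.
consensus_or_preset <- function(atanh_cor, preset = 0.8, trim = 0.15) {
  # Trimmed mean on the atanh scale, back-transformed with tanh (assumed form).
  consensus <- tanh(mean(atanh_cor, trim = trim, na.rm = TRUE))
  if (is.finite(consensus) && abs(consensus) < 1) consensus else preset
}

# Every gene-wise value is NA, as in summary(corfit$atanh.correlation) above:
# nothing is left to average, so the preset is used.
all_na <- rep(NA_real_, 44495)
rho_fallback <- consensus_or_preset(all_na)

# With estimable gene-wise values (made-up numbers), the estimate is kept.
rho_estimated <- consensus_or_preset(atanh(c(0.7, 0.8, 0.85, 0.9)))
```

The value returned would then be passed as the `correlation` argument of lmscFit(), as in the call above.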

written 4.0 years ago by Gordon Smyth

Thanks a lot. I will proceed as you suggested.

written 4.0 years ago by pm2015

Hello,

if I try to put a preset value for the intraspot correlation using your command, I get the following error in R:

Fehler in fit$effects[(fit$rank + 1):ny, , drop = FALSE] :
  Indizierung außerhalb der Grenzen

It's in German: 'Fehler' means 'error', and 'Indizierung außerhalb der Grenzen' corresponds to R's English message 'subscript out of bounds'.

I'm not sure whether this indicates an R error or a limma error.

I would like to perform "Separate Channel Analysis of Two-Color Data" (Chap 12 in the limma manual) but don't have replicates in this experiment.

I would appreciate any suggestions.

Best regards

Alfons

 

written 3.1 years ago by a.weig