Limma import .gpr file error
@guillaume-robert-18902
Last seen 4.1 years ago
France/Nantes/Inovarion

Hi all,

Sorry if this is a naive question, but I have never analysed microarray data before.

I'm trying to import .gpr files from a public dataset (https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-104/) into limma using the function read.maimages().

As specified in the documentation, I've created a "targets" file with the file names and the RNA information for Cy3 and Cy5, and written a function to filter on flags. I then try to read the .gpr files that I've downloaded:

targets=readTargets("targets.csv",sep="\t", row.names="Name")

f <- function(x) as.numeric(x$Flags > -49)

RG <- read.maimages(files=targets$FileName,source="genepix",columns=list(R="GenePix:F635 Mean",G="GenePix:F532 Mean",Rb="GenePix:B635 Median",Gb="GenePix:B532 Median"), wt.fun=f)

Unfortunately I get the following error message :

Error in readGPRHeader(fullname) : 
  File is not in Axon Text File (ATF) format

After some research, I think the error occurs because the .gpr files have no headers. But I'm not sure that is the cause and, if it is, what "standard" header I could add so that the files can be imported.

If anyone has encountered the same issue and could advise me, it would be really helpful.

Thanks,

limma gpr microarray • 1.8k views
@james-w-macdonald-5106
Last seen 33 minutes ago
United States

Here are the headers:

> scan("25A_12477146.gpr", nlines = 1, what = "c", sep = "\t")
Read 54 items
 [1] " metaColumn"                        "metaRow"                           
 [3] "column"                             "row"                               
 [5] "Reporter identifier"                "GenePix:% > B532+1SD"              
 [7] "GenePix:% > B532+2SD"               "GenePix:% > B635+1SD"              
 [9] "GenePix:% > B635+2SD"               "GenePix:Autoflag"                  
[11] "GenePix:B Pixels"                   "GenePix:B532"                      
[13] "GenePix:B532 CV"                    "GenePix:B532 Mean"                 
[15] "GenePix:B532 Median"                "GenePix:B532 SD"                   
[17] "GenePix:B635"                       "GenePix:B635 CV"                   
[19] "GenePix:B635 Mean"                  "GenePix:B635 Median"               
[21] "GenePix:B635 SD"                    "GenePix:Circularity"               
[23] "GenePix:Dia."                       "GenePix:F Pixels"                  
[25] "GenePix:F532 % Sat."                "GenePix:F532 CV"                   
[27] "GenePix:F532 Mean"                  "GenePix:F532 Mean - B532"          
[29] "GenePix:F532 Median"                "GenePix:F532 Median - B532"        
[31] "GenePix:F532 SD"                    "GenePix:F532 Total Intensity"      
[33] "GenePix:F635 % Sat."                "GenePix:F635 CV"                   
[35] "GenePix:F635 Mean"                  "GenePix:F635 Mean - B635"          
[37] "GenePix:F635 Median"                "GenePix:F635 Median - B635"        
[39] "GenePix:F635 SD"                    "GenePix:F635 Total Intensity"      
[41] "GenePix:Flags"                      "GenePix:Log Ratio (635/532)"       
[43] "GenePix:Mean of Ratios (635/532)"   "GenePix:Median of Ratios (635/532)"
[45] "GenePix:Normalize"                  "GenePix:Ratio of Means (635/532)"  
[47] "GenePix:Ratio of Medians (635/532)" "GenePix:Ratios SD (635/532)"       
[49] "GenePix:Rgn R? (635/532)"           "GenePix:Rgn Ratio (635/532)"       
[51] "GenePix:SNR 532"                    "GenePix:SNR 635"                   
[53] "GenePix:Sum of Means (635/532)"     "GenePix:Sum of Medians (635/532)"

These are not the standard GenePix column names, so far as I know, so you have to specify the columns yourself:

> columns <- list(G = "GenePix:F532 Median", Gb = "GenePix:B532 Median" ,R = "GenePix:F635 Median" , Rb = "GenePix:B635 Median")
> z <- read.maimages("25A_12477146.gpr", columns = columns)
Read 25A_12477146.gpr 
>
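
Before calling read.maimages(), it can help to confirm that every name you pass in `columns` really appears in the file's header line. The following is only a minimal sketch in base R: `missing_columns` is a hypothetical helper, not a limma function, and the temporary file is a made-up stand-in for one of the downloaded .gpr files.

```r
# Hypothetical helper (not part of limma): return the requested column
# names that do not appear in the header line of a tab-delimited file.
missing_columns <- function(file, columns) {
  header <- scan(file, what = "character", sep = "\t", nlines = 1, quiet = TRUE)
  header <- trimws(header)  # the first name in these files has a leading space
  setdiff(unname(unlist(columns)), header)
}

# Made-up stand-in file so the sketch runs without the real data
tmp <- tempfile(fileext = ".gpr")
writeLines(paste(c(" metaColumn", "GenePix:F532 Median", "GenePix:B532 Median",
                   "GenePix:F635 Median", "GenePix:B635 Median"),
                 collapse = "\t"), tmp)

columns <- list(G = "GenePix:F532 Median", Gb = "GenePix:B532 Median",
                R = "GenePix:F635 Median", Rb = "GenePix:B635 Median")

missing_columns(tmp, columns)                  # prefixed names are all present
missing_columns(tmp, list(R = "F635 Median"))  # unprefixed name is reported as missing
```

An empty result means all requested columns were found; any names returned need correcting before the file is read.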

OK, thank you very much, it's working now.

@gordon-smyth
Last seen 5 hours ago
WEHI, Melbourne, Australia

To reinforce James MacDonald's answer, the raw data files for this experiment on ArrayExpress have been post-processed since being output by GenePix and are no longer in GenePix format. Hence you can't specify source="genepix". Instead you need

cols <- list(R  = "GenePix:F635 Mean",   G  = "GenePix:F532 Mean",
             Rb = "GenePix:B635 Median", Gb = "GenePix:B532 Median")
RG <- read.maimages(targets, annotation="Reporter identifier", columns=cols)

Personally I would skip the wt.fun argument, but it's up to you. The most common flags just mean that a spot has low intensity, but limma is perfectly able to deal with low intensity spots.
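
A quick way to tell whether a file is still raw GenePix output is to look at its first line. This sketch assumes that files written directly by GenePix begin with an "ATF" version line, which is what the error message above appears to refer to. `looks_like_atf` is a hypothetical helper, and both temporary files are made-up stand-ins.

```r
# Hypothetical helper (not part of limma): TRUE if the first line of the
# file starts with "ATF", as assumed here for raw GenePix output.
looks_like_atf <- function(file) {
  first <- readLines(file, n = 1, warn = FALSE)
  length(first) == 1 && startsWith(first, "ATF")
}

# Stand-in for a raw GenePix file (assumed to start with an ATF line)
atf <- tempfile(fileext = ".gpr")
writeLines(c("ATF\t1.0", "2\t3"), atf)

# Stand-in for a post-processed file that starts with column names
plain <- tempfile(fileext = ".gpr")
writeLines(" metaColumn\tmetaRow\tcolumn\trow", plain)

looks_like_atf(atf)    # file with an ATF first line
looks_like_atf(plain)  # file that starts directly with column names
```

If the check fails, leave out source="genepix" and give the column names explicitly, as above.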


Thank you for your response. Why would you skip the flag filtering? I've looked into the file and found several probes flagged with either -100, -75 or -50, but I haven't found the meaning of those flags.


Why would you use flag filtering if you don't know what the flags mean?

You shouldn't discard data unless there is strong reason to do so. These flags were designed by GenePix to protect some very simple types of analyses. limma however does a very robust analysis and usually doesn't need this sort of protection. I used to use the flags occasionally but gradually found it wasn't very important.
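
To see how much data a flag filter would actually discard, you can tabulate the flag values and count the spots that the weight function from the question would set to zero. This is only an illustrative sketch: the `flags` vector is made up, not taken from the E-TABM-104 files.

```r
# Made-up flag values for illustration only
flags <- c(0, 0, -50, -75, -100, 0, -50)

# How often each flag value occurs
table(flags)

# The weight function from the question: weight 1 if Flags > -49, else 0
f <- function(x) as.numeric(x$Flags > -49)
w <- f(data.frame(Flags = flags))

# Number of spots that would receive zero weight
sum(w == 0)
```

On real data, the same tabulation shows up front what proportion of spots a filter like this would remove.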


OK, I will keep the flagged rows then. Thanks for the information.
