Question

Limma import .gpr file error

Guillaume Robert • 0

@guillaume-robert-18902

Last seen 4.1 years ago

France/Nantes/Inovarion

Hi all,

Sorry if this is a naive question, but I have never analysed microarray data before.

I'm trying to import .gpr files from a public dataset (https://www.ebi.ac.uk/arrayexpress/experiments/E-TABM-104/) into limma using the function read.maimages().

As specified in the documentation, I've created a "targets" file with the file names and the RNA information for Cy3 and Cy5, and written a function to filter on flags. Then I try to read the .gpr files that I've downloaded:

targets=readTargets("targets.csv",sep="\t", row.names="Name")

f <- function(x) as.numeric(x$Flags > -49)

RG <- read.maimages(files=targets$FileName,source="genepix",columns=list(R="GenePix:F635 Mean",G="GenePix:F532 Mean",Rb="GenePix:B635 Median",Gb="GenePix:B532 Median"), wt.fun=f)

Unfortunately I get the following error message :

Error in readGPRHeader(fullname) : 
  File is not in Axon Text File (ATF) format

After some research I think it comes from the fact that the .gpr files have no headers. But I'm not really sure if it comes from that, and if it is the case, what "standard" header I could put to have the files imported.

If anyone has encountered the same issue, any advice would be really helpful.

Thanks,

limma gpr microarray • 1.8k views

ADD COMMENT • link updated 5.3 years ago by Gordon Smyth 50k • written 5.3 years ago by Guillaume Robert • 0

score 4 · Accepted Answer · 2019-01-18

Here are the headers:

> scan("25A_12477146.gpr", nlines = 1, what = "c", sep = "\t")
Read 54 items
 [1] " metaColumn"                        "metaRow"                           
 [3] "column"                             "row"                               
 [5] "Reporter identifier"                "GenePix:% > B532+1SD"              
 [7] "GenePix:% > B532+2SD"               "GenePix:% > B635+1SD"              
 [9] "GenePix:% > B635+2SD"               "GenePix:Autoflag"                  
[11] "GenePix:B Pixels"                   "GenePix:B532"                      
[13] "GenePix:B532 CV"                    "GenePix:B532 Mean"                 
[15] "GenePix:B532 Median"                "GenePix:B532 SD"                   
[17] "GenePix:B635"                       "GenePix:B635 CV"                   
[19] "GenePix:B635 Mean"                  "GenePix:B635 Median"               
[21] "GenePix:B635 SD"                    "GenePix:Circularity"               
[23] "GenePix:Dia."                       "GenePix:F Pixels"                  
[25] "GenePix:F532 % Sat."                "GenePix:F532 CV"                   
[27] "GenePix:F532 Mean"                  "GenePix:F532 Mean - B532"          
[29] "GenePix:F532 Median"                "GenePix:F532 Median - B532"        
[31] "GenePix:F532 SD"                    "GenePix:F532 Total Intensity"      
[33] "GenePix:F635 % Sat."                "GenePix:F635 CV"                   
[35] "GenePix:F635 Mean"                  "GenePix:F635 Mean - B635"          
[37] "GenePix:F635 Median"                "GenePix:F635 Median - B635"        
[39] "GenePix:F635 SD"                    "GenePix:F635 Total Intensity"      
[41] "GenePix:Flags"                      "GenePix:Log Ratio (635/532)"       
[43] "GenePix:Mean of Ratios (635/532)"   "GenePix:Median of Ratios (635/532)"
[45] "GenePix:Normalize"                  "GenePix:Ratio of Means (635/532)"  
[47] "GenePix:Ratio of Medians (635/532)" "GenePix:Ratios SD (635/532)"       
[49] "GenePix:Rgn R? (635/532)"           "GenePix:Rgn Ratio (635/532)"       
[51] "GenePix:SNR 532"                    "GenePix:SNR 635"                   
[53] "GenePix:Sum of Means (635/532)"     "GenePix:Sum of Medians (635/532)"

These are not standard GenePix column names, so far as I know, so you have to specify the columns yourself:

> columns <- list(G = "GenePix:F532 Median", Gb = "GenePix:B532 Median" ,R = "GenePix:F635 Median" , Rb = "GenePix:B635 Median")
> z <- read.maimages("25A_12477146.gpr", columns = columns)
Read 25A_12477146.gpr 
>
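
As a quick check before choosing arguments, you can look at the first line of a file yourself. Genuine GenePix output is an Axon Text File whose first line starts with "ATF", whereas the files here start straight away with the column names shown above. The sketch below uses only base R; `looks_like_atf` is a hypothetical helper name (not a limma function), the two temporary files are made-up stand-ins, and the assumption that the "ATF" prefix is what readGPRHeader() tests for is mine, not something confirmed in this thread.

```r
# Hypothetical helper (not part of limma): TRUE if the first line of the
# file starts with "ATF", as genuine GenePix .gpr output does.
looks_like_atf <- function(path) {
  first_line <- readLines(path, n = 1, warn = FALSE)
  length(first_line) == 1 && startsWith(first_line, "ATF")
}

# Two tiny made-up files for illustration only.
genuine <- tempfile(fileext = ".gpr")
writeLines(c("ATF\t1.0", "2\t5"), genuine)

processed <- tempfile(fileext = ".gpr")
writeLines("metaColumn\tmetaRow\tcolumn\trow\tReporter identifier", processed)

looks_like_atf(genuine)    # file with an ATF first line
looks_like_atf(processed)  # file that starts with column names
```

If the check comes back FALSE for your downloaded files, leaving out source="genepix" and naming the columns explicitly, as above, is the way to go.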

score 2 · Accepted Answer · 2019-01-19

Gordon Smyth 50k

@gordon-smyth

Last seen 5 hours ago

WEHI, Melbourne, Australia

To reinforce James MacDonald's answer, the raw data files for this experiment on ArrayExpress have been post-processed since being output by GenePix and are no longer in GenePix format. Hence you can't specify source="genepix". Instead you need

cols <- list(R  = "GenePix:F635 Mean",   G  = "GenePix:F532 Mean",
             Rb = "GenePix:B635 Median", Gb = "GenePix:B532 Median")
RG <- read.maimages(targets, annotation="Reporter identifier", columns=cols)

Personally I would skip the wt.fun argument, but it's up to you. The most common flags just mean that a spot has low intensity, but limma is perfectly able to deal with low-intensity spots.

ADD COMMENT • link 5.3 years ago Gordon Smyth 50k

Thank you for your response. Why would you skip the flag filtering? I've looked into the file and found several probes flagged with either -100, -75 or -50, but I haven't found the meaning of those flags.

ADD REPLY • link 5.3 years ago Guillaume Robert • 0

Why would you use flag filtering if you don't know what the flags mean?

You shouldn't discard data unless there is strong reason to do so. These flags were designed by GenePix to protect some very simple types of analyses. limma however does a very robust analysis and usually doesn't need this sort of protection. I used to use the flags occasionally but gradually found it wasn't very important.
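
If you do want to see what the filter from the question would do before deciding, a small sketch along these lines may help. It applies the same threshold (> -49) to a toy table of flags and counts how many spots would receive zero weight. The flag values are made up for illustration, and the column is accessed as "GenePix:Flags" because that is the name in these post-processed files (item 41 in the header listing), rather than the plain "Flags" used in the original weight function.

```r
# Toy spot table standing in for one array; the flag values are made up.
spots <- data.frame("GenePix:Flags" = c(0, -50, -75, -100, 0),
                    check.names = FALSE)

# Same rule as the weight function in the question, adapted to the
# column name used in these files: weight 1 if flag > -49, else 0.
f <- function(x) as.numeric(x[["GenePix:Flags"]] > -49)

w <- f(spots)
table(w)  # number of spots that would get weight 0 versus weight 1
```

With this threshold, the flags -50, -75 and -100 mentioned above would all get weight 0, so it is worth knowing how many spots that affects before discarding them.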