filtering agilent microarray data using limma in R
1
1
Entering edit mode
@maximiliancollatz-11373
Last seen 7.7 years ago

For the comparison of fold changes between a microarray and an RNA-seq experiment i need one fold change per gene from the microarray data. I'm working with agilent microarray data using the limma package. Question one is, is the normalisation afffected by the filtering and hence has to be done before? Question two, how to filter the data. The data is read and stored using the following command:

raw <- read.maimages(targets, source = 'agilent',green.only = TRUE, other.columns = c('gPValFeatEqBG'))

The raw object then looks like this:

> str(raw)
Formal class 'EListRaw' [package "limma"] with 1 slot
  ..@ .Data:List of 6
  .. ..$ : num [1:42303, 1:26] 49018 1472 1471 1440 1405 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : NULL
  .. .. .. ..$ : chr [1:26] "GSM1150975_48h_LDLDL_CT0_1" "GSM1150976_48h_LDLDL_CT1_1" "GSM1150977_48h_LDLDL_CT3_1" "GSM1150965_48h_LDLDL_CT5_1" ...
  .. ..$ : num [1:42303, 1:26] 25 26 26 26 26 26 26 26 26 27 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : NULL
  .. .. .. ..$ : chr [1:26] "GSM1150975_48h_LDLDL_CT0_1" "GSM1150976_48h_LDLDL_CT1_1" "GSM1150977_48h_LDLDL_CT3_1" "GSM1150965_48h_LDLDL_CT5_1" ...
  .. ..$ :'data.frame':	26 obs. of  3 variables:
  .. .. ..$ SampleNumber: int [1:26] 1 2 3 4 5 6 7 8 9 10 ...
  .. .. ..$ FileName    : chr [1:26] "GSM1150975_48h_LDLDL_CT0_1.txt" "GSM1150976_48h_LDLDL_CT1_1.txt" "GSM1150977_48h_LDLDL_CT3_1.txt" "GSM1150965_48h_LDLDL_CT5_1.txt" ...
  .. .. ..$ Condition   : chr [1:26] "CT0" "CT1" "CT3" "CT5" ...
  .. ..$ :'data.frame':	42303 obs. of  10 variables:
  .. .. ..$ Row           : int [1:42303] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..$ Col           : int [1:42303] 1 2 3 4 5 6 7 8 9 10 ...
  .. .. ..$ Start         : int [1:42303] 0 0 0 0 0 0 0 0 0 0 ...
  .. .. ..$ Sequence      : chr [1:42303] "" "" "" "" ...
  .. .. ..$ ProbeUID      : int [1:42303] 0 1 1 1 1 1 1 1 1 1 ...
  .. .. ..$ ControlType   : int [1:42303] 1 1 1 1 1 1 1 1 1 1 ...
  .. .. ..$ ProbeName     : chr [1:42303] "GE_BrightCorner" "DarkCorner" "DarkCorner" "DarkCorner" ...
  .. .. ..$ GeneName      : chr [1:42303] "GE_BrightCorner" "DarkCorner" "DarkCorner" "DarkCorner" ...
  .. .. ..$ SystematicName: chr [1:42303] "GE_BrightCorner" "DarkCorner" "DarkCorner" "DarkCorner" ...
  .. .. ..$ Description   : chr [1:42303] "" "" "" "" ...
  .. ..$ : chr "agilent"
  .. ..$ :List of 1
  .. .. ..$ gPValFeatEqBG: num [1:42303, 1:26] 0.00 6.52e-03 5.08e-02 1.76e-04 1.32e-08 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:26] "GSM1150975_48h_LDLDL_CT0_1" "GSM1150976_48h_LDLDL_CT1_1" "GSM1150977_48h_LDLDL_CT3_1" "GSM1150965_48h_LDLDL_CT5_1" ...

 

How can i filter this object to only keep the entry for each gene with the best pValue?(keep only the best probe)

Greetings Max

microarray limma • 1.1k views
ADD COMMENT
0
Entering edit mode

Maybe you need to explain more of the design and your strategy. e.g., is this single or dual channel data? What kind of normalization do you need to do? Loess? And why do you need to filter?

If you want fold changes, just compare the two groups with each other. If you want to use limma, these (log2) FC will apear in the toptables.

ADD REPLY
0
Entering edit mode
@gordon-smyth
Last seen 1 hour ago
WEHI, Melbourne, Australia

Hi Max,

First off, the customized show function for an EListRaw object would be much easier to read:

show(raw)

You should normalize the data before you start filter any probes, except possibly you might remove control probes. For example you might use:

x <- backgroundCorrect(raw, method="normexp")
y <- normalizeBetweenArrays(x, method="quantile")

Then you might have a look at your data:

plotMD(y, column=1, status=y$genes$ControlType)

You can't choose the probe with the best p-value for each gene until you compute a p-value. At the moment, you have read a detection p-value for each observation, but this gives a series of detection p-values for each probe instead of one summary p-value. What were you hoping to do?

ADD COMMENT

Login before adding your answer.

Traffic: 441 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6