Peter Davidsen
Dear List,
Although I do realise that my question has more to do with actual data
interpretation than coding using BioC packages, I'm hoping for some
input from other users with experience in microarray data analysis.
In order to support my explanation below, I have made a PDF with
diagnostic plots. I will refer to specific slides as I go along. The
presentation can be downloaded here: https://db.tt/jBqPNxIN
At the moment I'm analysing some microarray data as part of a
collaboration. Unfortunately, I have very little knowledge about the
actual generation/processing of these samples which could help address
my question.
By doing a boxplot on the raw Affymetrix chip data (from the U133plus2
platform), I noticed 2 'batches' based on differences in signal
intensities. Hierarchical clustering using all probesets on the array
supports this division (Pages 1 and 2). Notably, this separation
into batches (i.e. a high and a low intensity batch) can partially be
traced back to the ScanDate of the arrays. That is, the ~100 samples
were scanned over three consecutive days; all samples scanned on the
first day belong to the high intensity batch whereas all samples
scanned on day 3 belong to the low intensity batch. The samples
scanned on day 2 are split roughly evenly between the high and low
intensity batches.
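
The link between scan day and intensity batch can be checked with a simple cross-tabulation. The following is only a minimal sketch in plain Python under stated assumptions: the sample names, intensities, the 7.5 cutoff, and the helpers `assign_batch` and `crosstab` are all hypothetical, and the real arrays would be labelled from their actual raw log2 intensities and ScanDate annotation.

```python
from collections import Counter
from statistics import median

def assign_batch(log2_intensities, cutoff):
    """Label an array 'high' or 'low' by the median of its log2 raw intensities."""
    return "high" if median(log2_intensities) >= cutoff else "low"

def crosstab(scan_days, batches):
    """Count arrays for every (scan day, batch) combination."""
    return Counter(zip(scan_days, batches))

# Toy data (hypothetical): four arrays with a few log2 intensities each.
arrays = {
    "s1": [8.1, 9.0, 7.9],
    "s2": [8.0, 8.8, 8.2],
    "s3": [6.9, 7.4, 7.1],
    "s4": [7.0, 7.2, 6.8],
}
scan_day = {"s1": 1, "s2": 2, "s3": 2, "s4": 3}

# The cutoff of 7.5 is an arbitrary value chosen for the toy data.
batch = {name: assign_batch(values, cutoff=7.5) for name, values in arrays.items()}
table = crosstab([scan_day[n] for n in arrays], [batch[n] for n in arrays])

for (day, label), count in sorted(table.items()):
    print(f"day {day}: {label} = {count}")
```

A day that appears with both labels in the table (day 2 in the toy data) corresponds to the mixed scan day described above.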
When I do an RLE plot (Page 3 - top), the median value for most of the
samples from the low intensity batch is between 0.1 and 0.2 (and not
zero as expected). Further, whereas ~40% of the probesets are called
"present" in the high intensity batch using the simpleaffy package,
only ~30-35% are called present in the low intensity batch
(Page 3 - bottom).
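
For reference, the quantity behind an RLE plot can be written down in a few lines. This is a generic sketch in plain Python, not the affyPLM implementation: it assumes log2 expression values stored per sample in the same probeset order, and the function names `rle` and `rle_medians` and the toy values are hypothetical.

```python
from statistics import median

def rle(expr):
    """Relative log expression.

    expr maps sample name -> list of log2 expression values, one per
    probeset, in the same probeset order for every sample. The
    probeset-wise median across all samples is subtracted from each value.
    """
    samples = list(expr)
    n_probesets = len(expr[samples[0]])
    reference = [median(expr[s][i] for s in samples) for i in range(n_probesets)]
    return {s: [expr[s][i] - reference[i] for i in range(n_probesets)]
            for s in samples}

def rle_medians(expr):
    """Median RLE per sample; values near zero are what one hopes to see."""
    return {s: median(values) for s, values in rle(expr).items()}

# Toy data (hypothetical): sample "c" is shifted upwards by 0.5 on every probeset.
expr = {
    "a": [8.0, 6.0, 10.0],
    "b": [8.0, 6.0, 10.0],
    "c": [8.5, 6.5, 10.5],
}
print(rle_medians(expr))
```

A sample whose median RLE sits clearly away from zero, like "c" here, is shifted relative to the bulk of the arrays on most probesets.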
When I draw boxplots specifically for the AFFX control probesets, I
find that the intensity is in fact higher in all low intensity
samples (Page 4).
Furthermore, when I focus on the Affy hybridization controls (i.e.
bioB, bioC, bioD and creX), the line plot looks good and the signal
intensity is comparable between samples in the two batches (Page 5,
left side). If I instead plot the poly-A controls I again see a
significant difference in intensity between batches (with
low-intensity samples having a higher signal). In addition, the signal
values consistently follow the order phe < lys < thr < dap (Page 5,
right side). NB: I'm a bit unsure as to the importance of the latter
observation.
The QC plots presented above suggest to me that the RNA from the low
intensity samples could potentially suffer from an RNA degradation
issue. However, neither the 3'/5' ratios for beta-actin and GAPDH nor
the RNA degradation plots using affyPLM support my assumption
regarding degraded RNA (Pages 6 and 7). In fact, the ratios for GAPDH
indicate a higher signal intensity at the 5-prime end, which I find a
bit odd. However, when I instead take advantage of the recent
AffyRNADegradation package I do get a small yet significant difference
between batches in terms of the computed decay value (aka parameter d)
(Page 8).
I then tried to normalize the two batches of samples independently
(using RMA). This allowed me to compare the mean signal intensity for
each probeset across the chip, as the biological samples are indeed
comparable between batches. A scatterplot (Page 9) clearly
demonstrates that many probesets lie close to the diagonal line
despite the overall difference in intensity described on the first
page. Further, by correlating the expression of specific probesets to
an established physiological variable it is apparent that the slight
drop in signal intensity does not affect the strong association to the
physiological variable (Page 10). If I instead focus on another
representative probeset, one that lies farther from the diagonal line,
the correlation to the same physiological readout is clearly weaker in
the low intensity batch (Page 11).
Could it be that only a smaller subset of the transcripts is
significantly affected by RNA degradation in the low intensity
samples? And how could I potentially demonstrate this?
In relation to this question: when I do MA plots against a "pseudo
reference chip" representing the probeset-wise medians across all ~100
RMA normalized samples, it also becomes apparent that a fraction of
the probesets for most of the low intensity samples lie far below the
M=0 line (see Page 12 for a representative example). However, to my
surprise only a very small fraction of probesets are consistently
below M=-1.5. In other words, different low-intensity samples have
different "outlying" probesets compared to the overall median.
To summarize, I have now put forward various QC plots that show that
the low intensity samples are overall different. As I'm unsure which
way forward is the best (NB: my aim is to do a standard DEG analysis),
I would appreciate any thoughts or comments from members of this list.
Kind regards,
Peter
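
P.S. To make the pseudo-reference MA comparison concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the helper names (`pseudo_reference`, `ma_values`, `consistently_low`), the `min_fraction` parameter, and the toy values are hypothetical. It assumes log2 (e.g. RMA) values per sample in a common probeset order and counts how often each probeset falls below the M cutoff across a chosen set of samples.

```python
from statistics import median

def pseudo_reference(expr):
    """Probeset-wise median across all samples (the 'pseudo reference chip')."""
    samples = list(expr)
    n = len(expr[samples[0]])
    return [median(expr[s][i] for s in samples) for i in range(n)]

def ma_values(sample, reference):
    """M = sample - reference, A = mean of the two, per probeset (log2 scale)."""
    m = [x - r for x, r in zip(sample, reference)]
    a = [(x + r) / 2 for x, r in zip(sample, reference)]
    return m, a

def consistently_low(expr, samples, cutoff=-1.5, min_fraction=1.0):
    """Indices of probesets with M < cutoff in at least min_fraction of samples."""
    reference = pseudo_reference(expr)
    hits = [0] * len(reference)
    for s in samples:
        m, _ = ma_values(expr[s], reference)
        for i, value in enumerate(m):
            if value < cutoff:
                hits[i] += 1
    return [i for i, h in enumerate(hits) if h >= min_fraction * len(samples)]

# Toy data (hypothetical): three 'high' and two 'low' samples, four probesets.
expr = {
    "h1": [8.0, 8.0, 8.0, 8.0],
    "h2": [8.0, 8.0, 8.0, 8.0],
    "h3": [8.0, 8.0, 8.0, 8.0],
    "l1": [6.0, 6.0, 8.0, 8.0],
    "l2": [6.0, 8.0, 6.0, 8.0],
}
low = ["l1", "l2"]
print(consistently_low(expr, low))                    # low in every sample
print(consistently_low(expr, low, min_fraction=0.5))  # low in at least half
```

Comparing the strict list with the relaxed one gives a rough sense of whether the same probesets drop out in every low-intensity sample or whether each sample has its own set of outlying probesets.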