Question

What to do with this data? Question on deconfounding and GO analysis


January Weiner ▴ 370

@january-weiner-3999

Last seen 9.6 years ago

Hello,

I've been asked to analyze data from the following experiment.

Two types of cells were analyzed either separately (A, B) or in a mixture (AB). In each experiment, either the separated cell types or the mixture was subjected to a treatment. From each such experiment, a single Agilent two-color microarray was prepared, with untreated cells used as a control.

Of course, proper significance analysis cannot be done, and I can only use the technical p-values generated by the Agilent software. Due to the nature of the experiment, it is unlikely that another data set can be generated in the foreseeable future. However, the results in general show the expected response to treatment and activation of a number of genes that are supposed to be activated; thus, the technical p-values still give a meaningful "general picture".

Going through the data manually, it is obvious that in many cases the response in AB is a weighted average of the responses in A and B. I tried to estimate these global weights in a very naive manner: for different values of p, the proportion of cells of type A in the mixture AB, I looked at the correlation between the fold change observed in experiment AB and the fold change predicted from experiments A and B.

My first question is therefore -- is there a recommended solution within Bioconductor that I could apply in such a case?

Furthermore, I'd like to look for an interaction effect -- to predict genes, GO terms or pathways that behave "not according to predictions" in the mixture AB. For this, I assume that the technical p-values are meaningful (because I do not have another choice), and run a GO / SPIA analysis on the three microarrays separately. Then, I manually look through the results to find enriched terms which are different for the AB experiment.

I wonder whether there is a possibility to compare the results of two GO analyses. One could, for example, look for changes in the rank positions of different GO terms (since the p-values in such a setup would probably not be very meaningful).

Thanks in advance for any help, suggestions, material for further reading etc.,

j.

--
Dr. January Weiner 3
Max Planck Institute for Infection Biology
Charitéplatz 1
D-10117 Berlin, Germany
Web : www.mpiib-berlin.mpg.de
Tel : +49-30-28460514

Microarray Pathways GO • 799 views

updated 13.9 years ago by Wolfgang Huber ★ 13k • written 13.9 years ago by January Weiner ▴ 370

score 0 · Answer 1 · 2010-05-29

Dear January,

some suggestions below.

On 28/05/10 16:02, January Weiner wrote:
> Hello,
>
> I've been asked to analyze data from the following experiment.
>
> Two types of cells were analyzed either separately (A, B) or in a mixture (AB). In each experiment, either the separated cell types or the mixture was subjected to a treatment. From each such experiment, a single Agilent two-color microarray was prepared, with untreated cells used as a control.
>
> Of course, proper significance analysis cannot be done, and I can only use the technical p-values generated by the Agilent software. Due to the nature of the experiment, it is unlikely that another data set can be generated in a foreseeable future. However, the results in general show the expected response to treatment and activation of a number of genes that are supposed to be activated; thus, the technical p-values still give a meaningful "general picture".
>
> By manually going through the data it is obvious that in many cases, the response in AB is a weighted average of the responses A and B. I tried to estimate this global weights in a very naive manner, by looking at the correlation between the fold change in experiment AB, and the fold change estimated from experiments A and B for different values of p, the proportion of cells of type A in the mixture AB.
>
> My first question is therefore -- is there a recommended solution within Bioconductor that I could apply in such a case?

I am not sure there is, or there needs to be. It seems that your most basic model is

    AB = p*A + (1-p)*B

where AB, A and B are the fold changes observed in samples AB, A and B respectively. You can rearrange this to:

    p = (AB - B) / (A - B)

Hence I would do a scatterplot of (A-B) on the x-axis versus (AB-B) on the y-axis and see if you can reasonably fit a regression line.
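As a rough illustration of the regression idea above, here is a minimal Python sketch (standard library only, not a Bioconductor function). It assumes the line should pass through the origin, which follows from the model AB - B = p*(A - B); the function name `estimate_mixing_proportion` and the five fold-change values are made up for the example, and whether to feed in log ratios or raw ratios is a modelling choice the thread does not settle.

```python
def estimate_mixing_proportion(fc_a, fc_b, fc_ab):
    """Least-squares slope through the origin of (AB - B) on (A - B).

    Under the model AB = p*A + (1-p)*B the slope is an estimate of p.
    """
    x = [a - b for a, b in zip(fc_a, fc_b)]      # x-axis of the scatterplot
    y = [ab - b for ab, b in zip(fc_ab, fc_b)]   # y-axis of the scatterplot
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("A and B are identical for all genes; p is not identifiable")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx


# Hypothetical fold changes for five genes (illustration only).
fc_a = [2.0, 0.0, -1.0, 1.0, 3.0]
fc_b = [0.0, 1.0, 1.0, -1.0, 1.0]

# Simulate a noise-free mixture with 25% of cell type A.
p_true = 0.25
fc_ab = [p_true * a + (1 - p_true) * b for a, b in zip(fc_a, fc_b)]

p_hat = estimate_mixing_proportion(fc_a, fc_b, fc_ab)
print(p_hat)
```

With real data the points will scatter around the line; genes far from it are candidates for the "not according to predictions" behaviour, and an estimate well outside the range 0 to 1 would suggest the simple mixture model does not hold.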
> Furthermore, I'd like to look for an interaction effect -- to predict genes, GO terms or pathways that behave "not according to predictions" in the mixture AB. For this, I assume that the technical p-values are meaningful (because I do not have another choice),

Yes, you do: ignore the p-values, and work with the fold-changes.

> and run a GO / SPIA analysis on the three microarrays separately. Then, I manually look through the results to find enriched terms which are different for the AB experiment.
>
> I wonder whether there is a possibility to compare results of two GO-analyses. One could, for example, look for changes in rank positions of different GO terms (since the p-values in such a set up would probably be not very meaningful).

Have a look at the Category package, in particular its vignette, which takes a slightly more abstracted view of gene set enrichments than "sets of genes with low p-values" - i.e. you can look at enrichment of arbitrarily constructed comparison statistics.

Also, at this one, from your (and my) neighbours:

Bauer S, Gagneur J, Robinson PN. GOing Bayesian: model-based gene set analysis of genome-scale data. Nucleic Acids Res. 2010.

> Thanks in advance for any help, suggestions, material for further reading etc.,
>
> j.

--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
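To make the "work with the fold-changes" suggestion concrete, here is one possible sketch in plain Python. It is not the Category package's API (that is an R/Bioconductor package) and not a method proposed in this thread: it simply assumes a mixing proportion p has already been estimated, takes each gene's deviation from the predicted mixture as the comparison statistic, and ranks gene sets by their mean absolute deviation. The gene identifiers, set names and values are hypothetical.

```python
from statistics import mean


def mixture_residuals(fc_a, fc_b, fc_ab, p):
    """Per-gene deviation of the observed AB fold change from p*A + (1-p)*B."""
    return {g: fc_ab[g] - (p * fc_a[g] + (1 - p) * fc_b[g]) for g in fc_ab}


def gene_set_scores(residuals, gene_sets):
    """Mean absolute residual per gene set, sorted from largest to smallest."""
    scores = {}
    for name, genes in gene_sets.items():
        present = [abs(residuals[g]) for g in genes if g in residuals]
        if present:  # skip sets with no measured genes
            scores[name] = mean(present)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical fold changes keyed by gene identifier (illustration only).
fc_a = {"g1": 2.0, "g2": 0.0, "g3": -1.0, "g4": 1.0}
fc_b = {"g1": 0.0, "g2": 1.0, "g3": 1.0, "g4": -1.0}
fc_ab = {"g1": 1.0, "g2": 0.5, "g3": 2.0, "g4": 0.0}

# Hypothetical gene sets standing in for GO terms or pathways.
gene_sets = {"set_x": ["g1", "g2"], "set_y": ["g3", "g4"]}

residuals = mixture_residuals(fc_a, fc_b, fc_ab, p=0.5)
ranked = gene_set_scores(residuals, gene_sets)
for name, score in ranked:
    print(name, score)
```

Ranking sets on a statistic like this avoids relying on the single-array technical p-values; with one array per condition it remains a descriptive ranking, not a significance test.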