ANOVA on 16 CEL files and no annotated data frame(s)
@paul-cristina-5211
I am new to both R and Bioconductor (6 weeks in). I have previous SAS and statistics experience (way back when) and know other programming languages. I have googled and read books, PDFs, etc., but have hit a roadblock. I have the Bioconductor case studies and the other, older Springer books on R and gene analysis all open...

I have been given 16 CEL files (Affymetrix citrus chip, ~30K genes), 4 CEL files per treatment; one treatment is the healthy control. I simply want to perform ANOVA. I have no annotated data frame, and there are no subtreatments etc. It's a very simple comparison: subtract healthy and find the differences for each treatment (up- and down-regulated genes etc.). All the examples I have read are much more complex and have data frames pre-loaded.

I have normalized and summarized the data sets, both as a group of 16 and by treatment. Basic code below.

    > Citruseset <- expresso(Citrus, bgcorrect.method="rma", normalize.method="quantiles", pmcorrect.method="pmonly", summary.method="medianpolish")
    ## the citrus chip has NAs, so other methods used for this produced errors and I could find no workarounds.

The design model.matrix file looks like this (row number, then one indicator column per treatment):

     1   0 1 0 0
     2   0 1 0 0
     3   0 1 0 0
     4   0 1 0 0
     5   1 0 0 0
     6   1 0 0 0
     7   1 0 0 0
     8   1 0 0 0
     9   0 0 1 0
    10   0 0 1 0
    11   0 0 1 0
    12   0 0 1 0
    13   0 0 0 1
    14   0 0 0 1
    15   0 0 0 1
    16   0 0 0 1

Scatter, box, and volcano plots (which look like blobs) done on each treatment separately show some, but not a lot of, differentiation in expression. Venn diagrams comparing one treatment to the other three (4 total) produced the same number of differences for all but one comparison, which we don't believe is correct.

I can't seem to proceed any further without an annotated data.frame for each CEL file or for the batch of them, and I can't find information about how to create one. Any info, or a pointer to something to read, would be helpful.

Thank you,
Tina Paul

Cristina Paul
Citrus Quarantine Unit
USDA-ARS-BA-PSI-MPPL
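For readers wondering what the annotated data frame asked about here might look like, the following is a minimal sketch, not the poster's actual setup. The CEL file names and treatment labels are hypothetical (the post gives neither), and the choice of which four files are the healthy control is an assumption. It builds a plain data.frame with one row per CEL file in base R and, only if the Biobase package happens to be installed, wraps it with `AnnotatedDataFrame()`.

```r
# Hypothetical CEL file names; the real names are not given in the post.
cel_files <- sprintf("sample%02d.CEL", 1:16)

# Hypothetical treatment labels in file order (4 CEL files per treatment).
# Which block of four is the healthy control is an assumption.
pheno <- data.frame(
  treatment = factor(rep(c("trtA", "healthy", "trtB", "trtC"), each = 4),
                     levels = c("healthy", "trtA", "trtB", "trtC")),
  replicate = rep(1:4, times = 4),
  row.names = cel_files
)

# Optional: wrap the table for use as phenoData, if Biobase is available.
if (requireNamespace("Biobase", quietly = TRUE)) {
  pheno_adf <- Biobase::AnnotatedDataFrame(data = pheno)
}
```

For a table like this to describe an AffyBatch such as `Citrus`, its row names would need to match `sampleNames(Citrus)`, so in practice the file names would be taken from the object rather than typed in.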
@james-w-macdonald-5106
Hi Cristina,

On 4/10/2012 8:52 AM, Paul, Cristina wrote:

> I can't seem to proceed farther without an annotated data.frame for each CEL or for the batch of them... and can't find info about how to create one. Any info or to be pointed to something to read etc would be helpful.

It would be helpful to see your code rather than a description of what you have done. For instance, you have a perfectly viable design matrix, and have apparently fit a linear model and computed contrasts (else how did you get Venn diagrams?). So it isn't clear to me what else you are looking to do, or what you need an annotated data frame for.

Please give us your code, along with any errors you may have encountered, as well as an indication of how the results/output are not meeting your expectations.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
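To make the design matrix and the "treatment minus healthy" contrasts concrete, here is a minimal base R sketch under stated assumptions. The treatment labels are hypothetical, the healthy control is assumed to be samples 5-8 (the rows with a 1 in the first column of the posted matrix), and the expression values are simulated stand-ins for `exprs(Citruseset)`. For a real analysis, limma's `lmFit()`, `makeContrasts()`, `contrasts.fit()` and `eBayes()` are the more usual route; the sketch avoids them only so that it runs without extra packages.

```r
# Hypothetical treatment labels in CEL-file order; assigning the healthy
# control to samples 5-8 is an assumption chosen to match the first
# column of the design matrix shown in the question.
trt <- factor(rep(c("trtA", "healthy", "trtB", "trtC"), each = 4),
              levels = c("healthy", "trtA", "trtB", "trtC"))

# One indicator column per treatment (no intercept), as in the posted matrix.
design <- model.matrix(~ 0 + trt)
colnames(design) <- levels(trt)

# Contrasts: each treatment minus the healthy control.
contrast_mat <- cbind(
  trtA_vs_healthy = c(-1, 1, 0, 0),
  trtB_vs_healthy = c(-1, 0, 1, 0),
  trtC_vs_healthy = c(-1, 0, 0, 1)
)
rownames(contrast_mat) <- colnames(design)

# Simulated log2 expression (5 genes x 16 samples) standing in for
# exprs(Citruseset); these values are NOT real data.
set.seed(1)
expr <- matrix(rnorm(5 * 16, mean = 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))
expr[1, trt == "trtA"] <- expr[1, trt == "trtA"] + 2  # planted effect

# Least-squares group means per gene, then log2 fold changes vs healthy.
group_means <- expr %*% design %*% solve(crossprod(design))
log_fc <- group_means %*% contrast_mat

# Classical one-way ANOVA F test for a single gene.
gene1_anova <- anova(lm(expr[1, ] ~ trt))
```

Each column of `log_fc` is one treatment-versus-healthy comparison, so a different number of changed genes per comparison is what one would normally expect; identical counts across comparisons would be worth checking against the contrast definitions.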