Multifactorial affy data
@jonathan-arthur-1200
Hello all,

I have a data set of the form

Gene S1 S2 S3 .... Sn

where each column is the expression of the gene labelled by the first column in a different sample. The expression data come from Affymetrix arrays. I am new to Bioconductor (and microarray analysis in general), so I have a few questions I hope people may be able to help me with:

1) My expression data are the *aggregate* measure as put out by the Affymetrix software. The affy package appears to deal only with the lower-level .cel files. Is there a particular reason for this? Is there a package capable of working with the aggregate data?

2) The various samples divide into two sets (disease and control), but also have clinical covariates (e.g. male and female). I want to find the set of genes differentially expressed between disease and control while at the same time confirming that those differences are specifically due to disease status and not to any of the other covariates (gender, age, etc.). What package(s) should I start with to do this?

Thanks in advance.

Jonathan

--
Dr Jonathan Arthur
Sesqui Lecturer in Bioinformatics
Central Clinical School, Faculty of Medicine and SUBIT
Medical Foundation Building, K25
University of Sydney
Ph: +61 2 9036 3132
Email: jarthur@med.usyd.edu.au
Microarray affy
@sean-davis-490
On Apr 26, 2005, at 1:56 AM, Jonathan Arthur wrote:

> Hello all,
>
> I have a data set of the form
>
> Gene S1 S2 S3 .... Sn
>
> where each column is the expression of the gene labelled by the first
> column in a different sample. The expression data come from Affymetrix
> arrays. I am new to BioConductor (and microarray analysis in general),
> so I have a few questions I hope people may be able to help me with:

In order to use Bioconductor, you will need a basic understanding of R. I would strongly suggest you spend a couple of hours with the "Introduction to R" manual (available from the R website), learning the basics of data manipulation and how to find help (of which there is a HUGE amount in R).

> 1) My expression data is the *aggregate* measure as put out by the
> Affymetrix software. The affy package appears to only deal with the
> lower level .cel files. Is there a particular reason for this? Is
> there a package capable of working with the aggregate data?

The affy package (and its associates) deal with .CEL files because the .CEL file is pretty close to "raw" data (image extraction has, of course, already been done). Normalization is a particularly important aspect of dealing with microarray data and is best performed on raw data. The question is still open for discussion, but for many folks the normalization and summarization methods available via Bioconductor offer good alternatives to those offered by Affymetrix directly, so .CEL files are the best starting point for the normalization/summarization process. There are practical reasons to use .CEL files too: they are standard and available.

As for using aggregate data, most of the methods for microarray analysis work on a kind of "matrix" of values, which is exactly what you have with aggregate data.

> 2) The various samples divide into two sets (disease and control), but
> also have clinical co-variables (e.g. male and female). I want to find
> the set of genes differentially expressed between disease and control
> while at the same time confirming those differences are specifically
> due to disease status and not to any of the other co-variables
> (gender, age, etc.)

If you have covariates, I suggest looking into limma. It will work just fine with your aggregate data (although you will have to remove the "genes" column so that you have only numeric data). There is also an excellent user guide (>70 pages of how-to, examples, etc.).

The mail archives for R and Bioconductor can be quite helpful as well. Try searching them for answers, as folks have often put quite a lot of energy into answering beginners' questions.

1) Searchable Bioconductor archives
http://files.protsuggest.org/cgi-bin/biocond.cgi

2) R site search (and archive search)
http://finzi.psych.upenn.edu/search.html

Sean
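The limma suggestion above might look roughly like the sketch below. It is only an illustration under stated assumptions, not a recipe from the thread: the expression table and the sample annotation (`tab`, `pheno`, and the column names `status`, `sex`, `age`) are simulated and hypothetical, and the values are assumed to be on the log2 scale already. Signal values exported by the Affymetrix software are usually not log-transformed, so real data would probably need a log2 transform first.

```r
# Sketch only: simulated data stand in for the real expression table.
library(limma)

set.seed(1)

# Hypothetical aggregate table: first column holds gene IDs, the rest are samples.
n_genes <- 100
n_samples <- 12
tab <- data.frame(
  Gene = paste0("gene", seq_len(n_genes)),
  matrix(rnorm(n_genes * n_samples, mean = 8), nrow = n_genes,
         dimnames = list(NULL, paste0("S", seq_len(n_samples)))),
  stringsAsFactors = FALSE
)

# Remove the gene column so that only numeric data remain; keep IDs as row names.
expr <- as.matrix(tab[, -1])
rownames(expr) <- tab$Gene

# Hypothetical sample annotation, in the same order as the columns of expr.
pheno <- data.frame(
  status = factor(rep(c("control", "disease"), each = 6),
                  levels = c("control", "disease")),
  sex    = factor(rep(c("F", "M"), times = 6)),
  age    = round(runif(n_samples, 30, 70))
)

# Disease status and the covariates go into a single design matrix.
design <- model.matrix(~ status + sex + age, data = pheno)

# Fit one linear model per gene, then compute moderated statistics.
fit <- lmFit(expr, design)
fit <- eBayes(fit)

# Genes ranked by the disease-vs-control effect, adjusted for sex and age.
top <- topTable(fit, coef = "statusdisease", number = 10)
print(top)
```

Because sex and age sit in the same model as disease status, the `statusdisease` coefficient estimates the disease effect after accounting for them. With a real file, the simulated `tab` would be replaced by something like `read.delim()` on the exported table, and `pheno` by the actual clinical annotation.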