expression set and paired designs


David ▴ 860

@david-3335

Last seen 6.1 years ago

Hi,

Here is the experimental design (done by flow cytometry): collect a sample from each of a set of patients, then measure the expression of a set of genes in different cell types from the same sample.

The normalized data look like this:

              celltype (1 or 2)   geneA   geneB   geneC
    patient1  1                   40      20      40
    patient1  2                   37      18      41
    patient2  1                   40      19      38
    patient2  2                   38      17      39
    patient3  1                   10      19      38
    patient3  2                   20      17      39
    ....(n)

And then I have my pdata.txt:

    Sample      Disease_stage
    patient1    moderate_disease
    patient2    severe_disease
    patient3    normal

What I want to do is compare the different groups and identify the genes that are differentially expressed between the three groups. I guess that could be done by building a proper design and running a paired t-test.

But before that, I can't construct an eset object because the sample names are duplicated. I was wondering whether I need to construct two eset objects (one for celltype1 and one for celltype2)?

Any help would be appreciated.

Thanks

• 1.2k views

ADD COMMENT • link updated 14.5 years ago by Naomi Altman ★ 6.0k • written 14.4 years ago by David ▴ 860


Sean Davis 21k

@sean-davis-490

Last seen 4 months ago

United States

On Mon, Dec 7, 2009 at 8:35 AM, David martin wrote:
> But before that I can't construct an eset object as sample names are
> duplicates. I was wondering if i need to construct two eset objects (one for
> celltype1 and one for celltype2) ???

I would suggest thinking of a "patient" as a source for a sample, and not the sample itself, in the most general of terms. In other words, label your samples 1..2n (if you have two samples per patient) and then connect the sample IDs with the patient IDs in the phenoData of the ExpressionSet. Does that make sense?

Sean
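A minimal sketch of the relabelling suggested here, assuming the data arrive as a long table with one row per patient and cell type. The object names (`raw`, `stage`, `expr`, `pdata`) and the `ID001`-style labels are illustrative and not from the thread; the `ExpressionSet` step only runs if Biobase is installed.

```r
# Toy data in the layout from the question: one row per patient x cell type.
raw <- data.frame(
  patient  = rep(c("patient1", "patient2", "patient3"), each = 2),
  celltype = rep(c(1, 2), times = 3),
  geneA    = c(40, 37, 40, 38, 10, 20),
  geneB    = c(20, 18, 19, 17, 19, 17),
  geneC    = c(40, 41, 38, 39, 38, 39)
)
stage <- c(patient1 = "moderate_disease",
           patient2 = "severe_disease",
           patient3 = "normal")

# Give every row a unique sample ID (hypothetical naming scheme).
sample_id <- sprintf("ID%03d", seq_len(nrow(raw)))

# Expression matrix: genes in rows, samples in columns.
expr <- t(as.matrix(raw[, c("geneA", "geneB", "geneC")]))
colnames(expr) <- sample_id

# Sample annotation: the patient is now a column, not the row name.
pdata <- data.frame(
  patient       = as.character(raw$patient),
  celltype      = factor(raw$celltype),
  Disease_stage = unname(stage[as.character(raw$patient)]),
  row.names     = sample_id
)

# Optional: wrap both in an ExpressionSet when Biobase is available.
if (requireNamespace("Biobase", quietly = TRUE)) {
  eset <- Biobase::ExpressionSet(
    assayData = expr,
    phenoData = Biobase::AnnotatedDataFrame(pdata)
  )
}
```

Because the row names of `pdata` match the column names of `expr`, a single object can hold both cell types, and the patient ID remains available as a blocking variable.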

ADD COMMENT • link 14.5 years ago Sean Davis 21k


OK, so for example: patient1 would become ID001 and ID002 in my data matrix (two IDs for each patient), and then the pData file would be:

    Sample   Disease_stage   celltype
    ID001    normal          1
    ID002    normal          2

In the design I would have "Disease_stage" and "celltype", and my design would look something like:

    Pair <- factor(phenoData$celltype)
    disease <- factor(phenoData$Disease_stage)

I'll try to work it out.

Thanks,
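One point that may help when setting up these factors: in this design the pairing comes from the patient, so a pairing factor would normally be built from a patient column rather than from the cell type. The sketch below assumes a `patient` column has been added to the sample annotation (it is not in the pData shown in this post) and uses base R's `model.matrix()`; all values are toy data.

```r
# Hypothetical sample annotation: unique sample IDs, two cell types per patient.
pdata <- data.frame(
  patient       = rep(c("patient1", "patient2", "patient3"), each = 2),
  celltype      = rep(c("1", "2"), times = 3),
  Disease_stage = rep(c("moderate_disease", "severe_disease", "normal"), each = 2),
  row.names     = sprintf("ID%03d", 1:6)
)

patient  <- factor(pdata$patient)    # pairing / blocking factor
celltype <- factor(pdata$celltype)   # within-patient factor
disease  <- factor(pdata$Disease_stage,
                   levels = c("normal", "moderate_disease", "severe_disease"))

# Sanity checks: each patient has both cell types and exactly one disease stage.
table(patient, celltype)
table(patient, disease)

# Fixed-effects design with disease, cell type and their interaction.
design <- model.matrix(~ disease * celltype)
colnames(design)
```

With only one patient per stage, as in this toy table, the design is saturated; real data would need several patients per disease stage.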

ADD REPLY • link 14.4 years ago David ▴ 860


Naomi Altman ★ 6.0k

@naomi-altman-380

Last seen 3.1 years ago

United States

What you have is a split-plot design.

The whole-plot factor is disease severity. The blocks are patients. The subplot factor is cell type.

Since there are only 2 cell types, you can readily determine the disease-by-cell-type interaction by analyzing the differences between the cell types for each patient. However, if the main effects are also of interest, you need to run the split-plot design ANOVA for each gene.

I am not sure whether you can do this in Limma, using patient as block. If not, you should be able to do it in MAANOVA.

--Naomi

Naomi S. Altman                 814-865-3791 (voice)
Associate Professor
Dept. of Statistics             814-863-7114 (fax)
Penn State University           814-865-1348 (Statistics)
University Park, PA 16802-2111
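The two analyses described in this answer can be sketched for a single gene in base R. This is only an illustration on simulated data, assuming a balanced layout of 3 disease groups, 4 patients per group and 2 cell types per patient; the variable names are made up for the example.

```r
set.seed(1)

# Simulated data for ONE gene: 12 patients, each measured in 2 cell types.
d <- expand.grid(celltype = factor(c("1", "2")),
                 patient  = factor(sprintf("patient%02d", 1:12)))
d$disease <- factor(rep(c("normal", "moderate", "severe"), each = 8),
                    levels = c("normal", "moderate", "severe"))
patient_effect <- rnorm(12, sd = 2)   # patient-to-patient variation
d$expr <- 30 + patient_effect[as.integer(d$patient)] + rnorm(nrow(d))

# (a) Within-patient differences (cell type 2 minus cell type 1),
#     compared across disease groups with a one-way ANOVA.
ct1 <- d[d$celltype == "1", ]
ct2 <- d[d$celltype == "2", ]
stopifnot(identical(ct1$patient, ct2$patient))   # rows are aligned by patient
diffs <- data.frame(disease = ct1$disease, diff = ct2$expr - ct1$expr)
fit_diff <- aov(diff ~ disease, data = diffs)
summary(fit_diff)

# (b) Split-plot ANOVA: patient defines the whole-plot error stratum,
#     so disease is tested between patients and cell type within patients.
fit_sp <- aov(expr ~ disease * celltype + Error(patient), data = d)
summary(fit_sp)
```

With balanced data and two cell types, the disease test in (a) should agree with the interaction test in the within-patient stratum of (b). For many genes the same model would be fitted gene by gene; in limma, treating patient as a block via `duplicateCorrelation()` is one commonly used alternative.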

ADD COMMENT • link 14.5 years ago Naomi Altman ★ 6.0k


Is there a simple example to start with?

ADD REPLY • link 14.4 years ago David ▴ 860


I have modified my pData.txt so that it looks like this:

    Sample    Group              celltype
    sample1   moderate_disease   celltype1
    sample2   moderate_disease   celltype2
    sample3   severe_disease     celltype1
    sample4   severe_disease     celltype2
    sample5   normal             celltype1
    sample6   normal             celltype2

Following your suggestions, my block factor is the Group, created as follows:

    #
    # Creates design combinations
    # Got it from the forum. Very helpful
    design.list <- function(phenotype, cond) {
      n <- length(phenotype)
      k = length(cond)
      design <- matrix(1, 1, (n))
      for (i in 1:k) {
        #print(paste("Reading:", i))
        ind = (which(phenotype == cond[i]))
        design[, ind] = i
      }
      design
    }

    conditions = factor(c(design.list(phenotype$Group, groups)),
                        labels = c("normal", "moderate", "severe"))
    celltypes = factor(c(design.list(phenotype$Celltype, celltype)),
                       labels = c("celltype1", "celltype2"))

    data <- exprs(mydata.eset)
    fit <- manova(data ~ conditions * celltypes)

    Error in model.frame.default(formula = data ~ conditions * celltype, drop.unused.levels = TRUE) :
      variable lengths differ (found for 'conditions')

Since I have different levels I can't compute the manova. Can you help me with that?

The main question is: what is the effect of each cell type on disease, and what is the effect of both cell types on disease progression? The first question can be resolved using classical ANOVA, but what about the second, with two variables (celltype1 and celltype2)?

Thanks again for all the help I'm getting,
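A likely cause of the "variable lengths differ" error is orientation rather than the number of factor levels: `exprs()` returns a genes-by-samples matrix, while `manova()` expects one row per sample on the left-hand side. The sketch below reproduces the situation with a simulated matrix standing in for `exprs(mydata.eset)`; the dimensions and values are assumptions for illustration only.

```r
set.seed(2)

# Stand-in for exprs(mydata.eset): 3 genes (rows) x 12 samples (columns).
data <- matrix(rnorm(3 * 12, mean = 30, sd = 3), nrow = 3,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               sprintf("sample%d", 1:12)))

# One factor value per sample (12 each).
conditions <- factor(rep(c("normal", "moderate", "severe"), each = 4),
                     levels = c("normal", "moderate", "severe"))
celltypes  <- factor(rep(c("celltype1", "celltype2"), times = 6))

# Genes-by-samples: 3 response rows versus 12 factor values, so manova() stops.
err <- tryCatch(manova(data ~ conditions * celltypes),
                error = function(e) conditionMessage(e))
err

# Samples-by-genes: transpose so that each row is one sample.
fit <- manova(t(data) ~ conditions * celltypes)
summary(fit)
```

This only addresses the error message. A plain MANOVA treats the two samples from the same patient as independent, so it does not replace the split-plot analysis suggested earlier in the thread, and it becomes impractical once the number of genes approaches the number of samples.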

ADD REPLY • link 14.4 years ago David ▴ 860
