probe summarization

Bogdan ▴ 670

@bogdan-2367

Palo Alto, CA, USA

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070905/c3bb8669/attachment.pl

ADD COMMENT • link updated 18.5 years ago by James W. MacDonald 68k • written 18.5 years ago by Bogdan ▴ 670

James W. MacDonald 68k

@james-w-macdonald-5106

United States

Hi Bogdan,

Bogdan Tanasa wrote:
> Hi all,
>
> I would like to ask for an information: I carry the array analysis for a
> large dataset (40 samples * 2 replicates);
> the arrays are Affy U133A, and I use GCRMA and invariant set normalization.
> Please could you let me know
> the way I could do the probe summarization for these arrays. Thanks and best

GCRMA _is_ a method to do probe summarization. Maybe you are asking a different question?

Best,

Jim

> regards,
>
> Bogdan
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

--
James W. MacDonald
University of Michigan
Affymetrix and cDNA Microarray Core
1500 E Medical Center Drive
Ann Arbor MI 48109
734-647-5623
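To make "probe summarization" concrete, here is a minimal base R sketch of median polish, the summarization step that RMA and GCRMA apply per probeset. It is only an illustration under stated assumptions: the intensities, probe effects, and array levels are invented toy numbers, and it uses stats::medpolish() rather than the affy/gcrma internals.

```r
# Toy log2 PM intensities for ONE probeset: 5 probes (rows) x 4 arrays (columns).
# All numbers below are invented for illustration only.
set.seed(1)
probe_effect <- c(-1.0, -0.5, 0, 0.5, 1.0)   # hypothetical probe affinities
array_level  <- c(7, 8, 9, 10)               # hypothetical per-array expression
pm <- outer(probe_effect, array_level, "+") +
  matrix(rnorm(20, sd = 0.05), nrow = 5)
dimnames(pm) <- list(paste0("probe", 1:5), paste0("array", 1:4))

# Median polish fits pm[i, j] ~ overall + row[i] + col[j] + residual
fit <- medpolish(pm, trace.iter = FALSE)

# One summarized value per array for this probeset
probeset_expr <- fit$overall + fit$col
print(round(probeset_expr, 2))
```

The result has one value per array, not one per probe, which is why the output of gcrma() already has a row per probeset rather than per probe.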

ADD COMMENT • link 18.5 years ago James W. MacDonald 68k

An embedded and charset-unspecified text was scrubbed...
Name: not available
Url: https://stat.ethz.ch/pipermail/bioconductor/attachments/20070906/ac6a83fc/attachment.pl

ADD REPLY • link 18.5 years ago Bogdan ▴ 670

Hi Bogdan,

Bogdan Tanasa wrote:
> Hi James,
>
> I used the following instructions in R (mydata <- ReadAffy(), mycomp <-
> gcrma (mydata), write.table (mycomp, "mytext.txt", sep="\t")
> or I called "mydata<-expresso(...,methods.summarization="median.polish",
> ....)". In the results table, I obtained an expression value
> per PROBE, and I would like to have an expression value per GENE. I know
> that RMA/GCRMA could use median polish to summarize
> the probes for a gene and to ask the question more specifically: is there
> anything that the code I use is missing ? In the final results
> table, I would like to have the expression values for 10000-12000 genes
> instead of having expression values for 22000 probes. Thanks,

There is a bit of terminology here that is incorrect. You have expression values for 22283 _probesets_, which are based on ~250000 probes. You are correct, however, that there is some duplication: several probesets can map to the same gene. How you deal with that duplication is not a trivial question to answer.

I suppose the easiest thing to do would be to use the MBNI re-mapped cdfs that we supply. For instance, to use the Entrez Gene remapped cdf you would do something like this:

    dat <- ReadAffy(cdfname="hs133av2hsentrezgcdf")
    biocLite("hs133av2hsentrezgprobe")
    eset <- gcrma(dat)

As with all things, there are positive and negative aspects to using the MBNI cdfs. The bad is that the number of probes per probeset is now highly variable, so one would usually want standard errors that can be propagated through to any differential expression calculations. I think the puma package might be useful here, but I haven't tried it yet.

You could also make the assumption that the probeset that has the largest statistic in whatever comparison you are making is 'the right one', and simply use that. The findLargest() function in genefilter is useful in that respect.

Best,

Jim

> Bogdan
>
> # Read Affy CEL files
> data <- ReadAffy()
> # Normalize and do summarization using gcrma
> eset <- gcrma (data)
> #
> # Now eset contains all the information that you require
> #
> # to get the expression values, use the exprs command
> evals <- exprs (eset)
> #
> # The command below will tell you that it is a matrix
> class (evals)
> #
> # You can write out tab separated expression values to be used by other
> # programs using the command
> write.table (evals, "expressvals.txt", sep="\t")
> #
> #
> Send me questions if you have any
>
> On 9/6/07, James W. MacDonald <jmacdon at med.umich.edu> wrote:
>
>>GCRMA _is_ a method to do probe summarization. Maybe you are asking a
>>different question?
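The "keep the probeset with the largest statistic" idea can be sketched in base R as follows. This is a rough illustration under stated assumptions, not the genefilter implementation: the probeset IDs, gene labels, statistics, and expression values are all hypothetical, pick_largest() is a helper made up for this example, and ranking by absolute value is a choice made here.

```r
# Hypothetical toy inputs: one test statistic per probeset and a
# probeset -> gene mapping (NA means the probeset has no gene annotation).
stat <- c(p1_at = 2.1, p2_at = -4.0, p3_at = 0.3, p4_at = 1.5, p5_at = 0.9)
gene <- c(p1_at = "GENE_A", p2_at = "GENE_A", p3_at = "GENE_B",
          p4_at = "GENE_B", p5_at = NA)

# Hypothetical helper: for each gene, return the ID of the probeset whose
# statistic is largest in absolute value.
pick_largest <- function(stat, gene) {
  ids <- names(stat)[!is.na(gene[names(stat)])]   # drop unannotated probesets
  vapply(split(ids, gene[ids]),
         function(p) p[which.max(abs(stat[p]))],
         character(1))
}

chosen <- pick_largest(stat, gene)

# Toy expression matrix (probesets x arrays), reduced to one row per gene
expr <- matrix(seq_len(10), nrow = 5,
               dimnames = list(names(stat), c("array1", "array2")))
expr_gene <- expr[chosen, , drop = FALSE]
rownames(expr_gene) <- names(chosen)
print(expr_gene)
```

With real data, stat would come from whatever comparison is being made and the mapping from the chip's annotation package. Note that choosing probesets with the same statistic that is later tested is a selection step to keep in mind when interpreting p-values.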

ADD REPLY • link 18.5 years ago James W. MacDonald 68k
