Limma final gene expression report

Ankit Pal ▴ 230

@ankit-pal-1242

Last seen 11.3 years ago

Dear All,

While looking at the limma user guide, I came across the following example:

> targets <- readTargets("SwirlSample.txt")
> RG <- read.maimages(targets$FileName, source="spot")
> RG$genes <- readGAL()
> RG$printer <- getLayout(RG$genes)
> MA <- normalizeWithinArrays(RG)
> MA <- normalizeBetweenArrays(MA)
> fit <- lmFit(MA, design=c(-1,1,-1,1))
> fit <- eBayes(fit)
> options(digits=3)
> topTable(fit, n=30, adjust="fdr")
       ID   Name     M    A     t  P.Value    B
  control   BMP2 -2.21 12.1 -21.1 0.000357 7.96
  control   BMP2 -2.30 13.1 -20.3 0.000357 7.78
  control   Dlx3 -2.18 13.3 -20.0 0.000357 7.71
  control   Dlx3 -2.18 13.5 -19.6 0.000357 7.62
  fb94h06 20-L12  1.27 12.0  14.1 0.002067 5.78
  fb40h07  7-D14  1.35 13.8  13.5 0.002067 5.54

I have omitted a few rows and columns.

Here we see that, after all the data transformations, we get an output in which the probes on the array are ranked by the B value. Notice that gene names repeat: for a set of replicates, within and across arrays, each spot is reported separately as an individual entity.

In the case of BMP2 from the above example, which result do I consider? Is there a way in which I can get a single result for a set of replicates?

I am new to this package, so please do let me know if there is a problem in my understanding of the concept.

Thank you,
-Ankit

limma limma • 2.3k views

ADD COMMENT • link updated 20.6 years ago by Gordon Smyth 53k • written 20.6 years ago by Ankit Pal ▴ 230

michael watson IAH-C ★ 3.4k

@michael-watson-iah-c-378

Last seen 11.3 years ago

There are ways of combining replicate spots in limma, and it is all in the user guide :-)

However, many people, myself included, prefer things reported on a spot-by-spot basis. If all replicate spots for a particular gene are reported as significant, I take that as further proof that i) the gene is differentially expressed, ii) my arrays are of good quality, iii) my experimental procedure was of good quality. Think about the case where only one out of two spots is reported - is that because one of the spots was of poor quality? Or because the values for each spot differ by a lot? You would lose this valuable information if you just took the average between replicates.

If you *really* want an average value for each spot, simply take the average M value from the output of topTable.

Mick
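As a rough illustration of the "average M value from the output of topTable" suggestion, the sketch below uses base R only. The data frame `tab` is a hand-typed stand-in holding the six rows quoted in the question, not a real topTable result, and the object names (`tab`, `mean_M`, `n_spots`) are made up for the example. In current limma releases the log-ratio column is called logFC rather than M, so the column name would need adjusting.

```r
# Rows copied by hand from the topTable output quoted in this thread
# (only the ID, Name and M columns are kept).
tab <- data.frame(
  ID   = c("control", "control", "control", "control", "fb94h06", "fb40h07"),
  Name = c("BMP2", "BMP2", "Dlx3", "Dlx3", "20-L12", "7-D14"),
  M    = c(-2.21, -2.30, -2.18, -2.18, 1.27, 1.35),
  stringsAsFactors = FALSE
)

# One mean M-value per gene name; tapply() groups M by Name.
mean_M <- tapply(tab$M, tab$Name, mean)

# Number of spots behind each mean, worth reporting next to the average.
n_spots <- table(tab$Name)

print(round(mean_M, 3))
print(n_spots)
```

Keeping `n_spots` next to the means makes it visible which values come from a single spot and which from several, which matters for the caveats raised elsewhere in this thread.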

ADD COMMENT • link 20.6 years ago michael watson IAH-C ★ 3.4k

Ankit Pal ▴ 230

@ankit-pal-1242

Last seen 11.3 years ago

Dear Mick,

Thanks a lot for the reply. I am interested in the spots individually, but for further analysis I need a single representative value for each gene. I have looked through the manual and did not find a way to combine replicate spots into a single value. Could you tell me what the method is, or which section of the manual describes it? It will be of great help to me.

Thank you,
-Ankit

ADD COMMENT • link 20.6 years ago Ankit Pal ▴ 230

One other question--are these replicate spots (i.e., the same DNA) or different oligos/clones for the same gene?

Sean

ADD REPLY • link 20.6 years ago Sean Davis 21k

There are different oligos for the same gene, as well as replicates with the same sequence, present on the array. Splice variants have been mapped to the same accession number, but I'm working on separating them out by naming them differently.

-Ankit

ADD REPLY • link 20.6 years ago Ankit Pal ▴ 230

On May 10, 2005, at 6:54 AM, Ankit Pal wrote:
> There are different oligos of the same gene and replicates with the same sequence present on the array.

So, for the different oligos/splice variant spots, averaging is probably NOT a good idea in most situations--they could be measuring quite different things. When there is a discrepancy, it MAY be due to array issues (quality of one spot, for instance), but it could also be a true finding; determining which is which often requires some human intervention.

Perhaps others will weigh in on the issue, but I don't think ad-hoc averaging of truly duplicated spots really serves a purpose, either. Unless your array design allows you to treat the duplicated spots at the level of the analysis (lmFit, in limma) and not post-processing, I don't think you benefit much from averaging (i.e., you don't have more "confidence" in a gene that has been averaged).

Sean

ADD REPLY • link 20.6 years ago Sean Davis 21k

Dear Ankit,

In addition to Mick's arguments, you also need to be careful when averaging over spots if you use the empirical Bayes methods in limma and you have different numbers of replicates for different genes. If, for example, gene A has one spot but gene B has four replicated spots, then the variance of the mean of the replicates for gene B is very different from the variance for gene A, which would violate some of the assumptions behind the empirical Bayes procedure in limma. I think this is mentioned in the user guide (and/or Gordon Smyth's paper), and it has also been mentioned on this list.

If you really want an average, you can do as Mick suggests:

> If you *really* want an average value for each spot,
> simply take the average M value from the output of
> topTable.

You can use something like

tapply(the.top.table.M.values, the.top.table.gene.identifiers, mean)

Best,
R.

--
Ramón Díaz-Uriarte
Bioinformatics Unit
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
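To make the variance argument concrete, here is a small simulation sketch in base R. It assumes, purely for illustration, that replicate spots are independent with a common standard deviation `sigma`; real within-array replicates are usually correlated, so the factor of four below is an upper bound on the effect rather than a prediction. All object names are invented for the example.

```r
set.seed(1)

n_sim <- 10000   # number of simulated "experiments"
sigma <- 0.5     # assumed spot-level SD of M-values (illustrative only)

# Gene A: a single spot, so its reported value is one draw.
gene_A <- rnorm(n_sim, mean = 0, sd = sigma)

# Gene B: four replicate spots averaged into one value per experiment.
gene_B <- rowMeans(matrix(rnorm(4 * n_sim, mean = 0, sd = sigma), ncol = 4))

# Under independence, var(mean of n spots) = sigma^2 / n,
# so the ratio of the two variances should be close to 4.
var_ratio <- var(gene_A) / var(gene_B)
print(var_ratio)
```

Feeding such averaged values into a procedure that assumes all genes share a comparable variance structure would therefore mix genes whose variances differ by a known factor, which is the concern raised above.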

ADD REPLY • link 20.6 years ago Ramon Diaz ★ 1.1k

michael watson IAH-C ★ 3.4k

@michael-watson-iah-c-378

Last seen 11.3 years ago

I can see where it may be handy though - e.g. for input into clustering. Really, at that stage, most people would prefer one gene - one value. There's no way I would average over the values for different oligos though, even if they were for the same gene!

In limma, in "usersguide.pdf", section 14 "Within Array Replicate Spots" deals with the issue :-)

Mick

ADD COMMENT • link 20.6 years ago michael watson IAH-C ★ 3.4k

Ankit Pal ▴ 230

@ankit-pal-1242

Last seen 11.3 years ago

Dear Sean,

I agree averaging is not a good idea. So how do I get a single value for a set of replicate probes? In case of different values for the same gene, which result do I consider to be representative of the whole? It would definitely help at the clustering level.

-Ankit

ADD COMMENT • link 20.6 years ago Ankit Pal ▴ 230

On May 10, 2005, at 7:45 AM, Ankit Pal wrote:
> So how do I get a single value for a set of replicate probes?
> In case of different values for the same gene which result do I consider to be representative of the whole?

That is the problem, isn't it. If you have duplicate spots (same sequence), you will need to look at the quality, etc., to see which you believe. If you have oligos that map to the same gene but behave differently, you will need to look at spot quality as well as other issues like the cross-hybridization potential (which often requires blasting), location in the gene (3' bias?), and splice variants that may be tissue specific. As I said, all of these require a bit of human intervention.

In practice, though, you have to validate array results biologically--that is the real answer to your question. The "representative" spot is the one that validates; sometimes that will be the one that suggests differential expression, and sometimes not.

Sean

ADD REPLY • link 20.6 years ago Sean Davis 21k

Gordon Smyth 53k

@gordon-smyth

Last seen 1 hour ago

WEHI, Melbourne, Australia

On Mon, 9 May 2005, Ankit Pal wrote:
> In the case of BMP2 from the above example, which result do I consider?
> Is there a way in which I can get a single result for a set of replicates.

No, there isn't. The limma facility to handle duplicate spots applies only when every single probe on your array is replicated the same number of times in a regular pattern. (The intention is to accommodate repeated printing from the same DNA wells, not irregularly repeated occurrence of similar DNA in different wells of the DNA plates.)

For the Swirl dataset which you're using here, the only probes which are repeated are control probes. There seems to me to be no purpose in averaging results for repeated control probes because then they would be treated differently from library probes and hence would no longer be comparable to the library probes in the statistical analysis. Similar treatment is necessary for them to be truly control probes. The fact that you get both copies of the BMP2 control probe at the top in the above list is useful information -- it shows that the top ranking is no fluke.

Gordon
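One way to see whether an array even meets the "same number of times in a regular pattern" condition is to inspect the probe identifiers directly. The helper below, `has_regular_replication()`, is a hypothetical base-R function written for this illustration and is not part of limma; it only checks that every identifier occurs equally often (at least twice) and that copies of the same identifier are always the same distance apart. It is a rough screen, not a substitute for knowing how the array was printed.

```r
# Hypothetical helper (not a limma function): TRUE when every probe ID
# occurs the same number of times (>= 2) with a constant gap between copies.
has_regular_replication <- function(ids) {
  pos <- split(seq_along(ids), ids)      # positions of each ID on the array
  counts <- lengths(pos)                 # how many copies each ID has
  if (length(unique(counts)) != 1 || counts[1] < 2) {
    return(FALSE)
  }
  gaps <- unlist(lapply(pos, diff))      # spacing between successive copies
  length(unique(gaps)) == 1
}

# Side-by-side duplicates: every probe printed twice in adjacent spots.
side_by_side <- c("a", "a", "b", "b", "c", "c")

# Identifiers as in the table quoted in the question:
# only the controls repeat, the library probes do not.
swirl_like <- c("BMP2", "BMP2", "Dlx3", "Dlx3", "fb94h06", "fb40h07")

print(has_regular_replication(side_by_side))
print(has_regular_replication(swirl_like))
```

When a check like this passes, the within-array replicate facility described in the user guide (the ndups and spacing arguments of lmFit, together with duplicateCorrelation) is the place to look; when it fails, as for the Swirl-like identifiers here, the spot-by-spot report is what limma gives you.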

ADD COMMENT • link 20.6 years ago Gordon Smyth 53k
