GCRMA Fold Change

cap2018@columbia.edu ▴ 20

I have used GCRMA to process and normalize my chip results. I had sufficient N to use a two-way ANOVA to analyze my data, and I have used Bayesian statistics to determine significance levels.

I have quite a few probe sets that are considered statistically significant by my analysis but have a fold change close to 1, indicating that the level of transcript is not THAT different between the 2 groups. Most of the probe sets that fit this description have very low expression values from the GCRMA analysis, 1-6.

I did not filter my results as expressed/not expressed before I did the analysis because I have been told it is unnecessary, and I get many significant results that definitely are expressed, but I'd like to apply some kind of filter to my statistical results. Is there a way to determine which transcripts are unexpressed, a specific threshold for example? For instance, are values above 7 probing expressed transcripts?

Christine

probe gcrma PROcess • 1.3k views

updated 18.0 years ago by Jenny Drnevich ★ 2.2k • written 18.0 years ago by cap2018@columbia.edu ▴ 20

Jenny Drnevich ★ 2.2k

Hi Christine,

Instead of filtering on the expression values from GCRMA, I would suggest using Affymetrix's Present/Marginal/Absent calls. You can get these with the mas5calls() function in the affy package. I use a very conservative filter and only throw out genes that are "absent" on all arrays. I would suggest that you do this filtering before the statistical analysis, for two reasons:

1) the error variances of the filtered genes are affecting your Bayesian statistics, and
2) removing the genes will decrease the multiple test correction penalty.

However, even after filtering out these genes, you may still have many genes with low fold changes that are "significant". One can argue all day long about whether these low fold changers are "biologically" significant or not, but if you prefer to follow up first on genes with higher fold changes, then by all means pick these out of your significant gene list. Just make sure to document everything clearly!

Cheers,
Jenny

At 11:14 AM 5/2/2006, cap2018 at columbia.edu wrote:
>Is there a way to determine which transcripts are unexpressed, a
>specific threshold for example? For instance, values 7< are
>probing expressed transcripts?

Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois, Urbana-Champaign
330 ERML
1201 W. Gregory Dr.
Urbana, IL 61801 USA
ph: 217-244-7355
fax: 217-265-5066
e-mail: drnevich at uiuc.edu

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
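For readers who want to try the filter described in this answer, here is a minimal sketch in base R on a toy matrix of detection calls. The matrix `calls`, the toy expression matrix `expr`, and all probe set and array names are made up for illustration. With real data the calls would come from `exprs(mas5calls(abatch))`, where `abatch` is a hypothetical name for the AffyBatch holding your raw CEL data (mas5calls() works on the raw probe-level data, not on the GCRMA output).

```r
# Toy detection calls: rows are probe sets, columns are arrays.
# With real data (assumption: `abatch` is your AffyBatch) this would be
#   calls <- exprs(mas5calls(abatch))
calls <- matrix(
  c("P", "P", "P", "P",
    "A", "A", "A", "A",
    "A", "M", "A", "P",
    "A", "A", "A", "A"),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probeset_", 1:4), paste0("array_", 1:4))
)

# Conservative filter: drop a probe set only if it is "A" on every array.
absent_everywhere <- rowSums(calls == "A") == ncol(calls)
keep <- !absent_everywhere

# Toy expression matrix standing in for the GCRMA output (same row order).
set.seed(1)
expr <- matrix(rnorm(16, mean = 6), nrow = 4, dimnames = dimnames(calls))

# Subset before fitting the model, so the dropped probe sets influence
# neither the variance estimates nor the multiple-testing correction.
expr_filtered <- expr[keep, , drop = FALSE]
rownames(expr_filtered)  # probeset_1 and probeset_3 remain
```

With real data, it is safer to subset by name, for example `expr[rownames(calls)[keep], ]`, in case the call matrix and the GCRMA expression matrix are not in the same row order.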

written 18.0 years ago by Jenny Drnevich ★ 2.2k

On 5/2/06 12:42 PM, "Jenny Drnevich" <drnevich at uiuc.edu> wrote:

> I use a very conservative filter, and only throw out genes that are
> "absent" on all arrays.

Just to point out one detail--if you filter too stringently (which Jenny is careful not to do), you will potentially lose some of the most interesting genes, namely those that are expressed in one group and not in another, so there is potentially a fine line between too much filtering and not enough.

Sean
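To make the trade-off concrete, here is a small sketch in base R that applies three filters to toy detection calls. The data, the `group` factor, and the rule names are invented for this example, and the group-wise rule is only one possible middle ground, not a recommendation from this thread.

```r
# Toy detection calls for 6 arrays: 3 from group "ctrl", 3 from group "trt".
group <- factor(c("ctrl", "ctrl", "ctrl", "trt", "trt", "trt"))
calls <- rbind(
  always_on  = c("P", "P", "P", "P", "P", "P"),
  on_in_trt  = c("A", "A", "A", "P", "P", "P"),  # expressed in one group only
  always_off = c("A", "A", "A", "A", "A", "A")
)

present <- calls == "P"

# Stringent rule: require "P" on every array.
keep_stringent <- rowSums(present) == ncol(calls)

# Conservative rule: drop only probe sets that are "A" on every array.
keep_conservative <- rowSums(calls == "A") < ncol(calls)

# Group-wise rule (one possible choice): "P" on all arrays of at least one group.
present_by_group <- sapply(levels(group), function(g)
  rowSums(present[, group == g, drop = FALSE]) == sum(group == g))
keep_groupwise <- rowSums(present_by_group) >= 1

data.frame(keep_stringent, keep_conservative, keep_groupwise)
```

In this toy case the stringent rule discards `on_in_trt`, the probe set that is switched on in only one group, while the conservative and group-wise rules both keep it. All three rules drop `always_off`.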

written 18.0 years ago by Sean Davis 21k

On Tue, May 02, 2006 at 01:02:20PM -0400, Sean Davis wrote:

> Just to point out one detail--if you filter too stringently (which Jenny is
> careful not to do), you will potentially lose some of the most interesting
> genes, namely those that are expressed in one group and not in another, so
> there is potentially a fine line between too much filtering and not enough.

A recent paper (McClintick, Bioinformatics 7:49, 2006) talks about this. As already pointed out, filtering too much (e.g. requiring 100% Presence) doesn't seem a good idea.

Stefano

--
Stefano Calza, PhD
Researcher - Biostatistician
Sezione di Statistica Medica e Biometria
Dipartimento di Scienze Biomediche e Biotecnologie
Università degli Studi di Brescia - Italy
Viale Europa, 11
25123 Brescia
email: calza at med.unibs.it
Phone: +390303717653
Fax: +390303717488

written 18.0 years ago by stecalza@tiscali.it ▴ 290

The journal is BMC Bioinformatics, for those interested.

-Christos

-----Original Message-----
From: bioconductor-bounces@stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Stefano Calza
Sent: Tuesday, May 02, 2006 1:24 PM
To: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] GCRMA Fold Change

A recent paper (McClintick Bioinformatics 7:49 2006) talk about this.

written 18.0 years ago by Christos Hatzis ▴ 110

Oops! I forgot the BMC. The correct journal is BMC Bioinformatics. Sorry!

Stefano

written 18.0 years ago by stecalza@tiscali.it ▴ 290
