Ranked genes from learning datasets vs. differentially expressed genes from the original data
Guest User ★ 13k
@guest-user-4897
Last seen 10.3 years ago
Dear R helpers,

I'm confused about how to use two kinds of gene lists: the top-ranked genes obtained from multiple learning datasets (as normally used for supervised classification), and the genes obtained directly from a differential gene expression test on the original data.

With the same cut-off (e.g. FDR < 0.05) and a good classification result, is the ranked gene list a better candidate for further biological validation (PCR) and gene enrichment analysis?

With respects,
Kaj

-- output of sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats  graphics  grDevices  utils  datasets  methods
[8] base

other attached packages:
[1] plsgenomics_1.2-6   MASS_7.3-33    limma_3.20.8
[4] RankProd_2.36.0     CMA_1.22.0     Biobase_2.24.0
[7] BiocGenerics_0.10.0 e1071_1.6-3

loaded via a namespace (and not attached):
[1] class_7.3-10 tools_3.1.0

--
Sent via the guest posting facility at bioconductor.org.
Classification • 1.3k views
@sean-davis-490
Last seen 4 months ago
United States
Hi, Kaj.

I do not think there is a general answer to your question. However, assuming that a statistical test was applied in both cases, one would need to evaluate the statistical evidence to make an informed decision. I will say that while machine learning approaches often include feature selection, the most common way to get a set of genes that differentiates two phenotypes or groups of samples is differential gene expression hypothesis testing.

Sean

On Tue, Jul 29, 2014 at 11:00 AM, Kaj Chokeshaiusaha [guest] <guest@bioconductor.org> wrote:

> I'm confused about the applications of ranked top genes generated from
> multiple learning datasets normally used for supervised classification and
> those directly acquired from differential gene expression test from
> original data.
>
> With the same cut-off (like FDR<0.05) and nice classification result, are
> the ranked gene list better candidate for further biological validation
> (PCR) and gene enrichment analysis?
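To make the two kinds of gene list in this thread concrete, here is a minimal sketch in base R. It is an illustration under stated assumptions, not the poster's actual workflow: the data are simulated, a per-gene Welch t-test with Benjamini-Hochberg adjustment stands in for whatever differential expression test was used, and the "learning sets" are assumed to be random stratified subsamples in which genes are re-ranked by p-value. The names `gene_pvalues`, `n_sets`, `top_k`, and the 50% stability threshold are hypothetical choices made only for this example.

```r
# Simulated expression matrix: 200 genes x 20 samples, two groups of 10.
# ASSUMPTION: the first 10 genes are shifted in group B; real data will differ.
set.seed(1)
n_genes <- 200
n_per_group <- 10
group <- factor(rep(c("A", "B"), each = n_per_group))
expr <- matrix(rnorm(n_genes * 2 * n_per_group), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
is_b <- group == "B"
expr[1:10, is_b] <- expr[1:10, is_b] + 2

# Hypothetical helper: per-gene Welch t-test p-values for a two-level factor g.
gene_pvalues <- function(x, g) {
  a <- g == levels(g)[1]
  b <- g == levels(g)[2]
  apply(x, 1, function(v) t.test(v[a], v[b])$p.value)
}

# List 1: differential expression on the original (full) data,
# with Benjamini-Hochberg adjusted p-values and an FDR < 0.05 cut-off.
p_full <- gene_pvalues(expr, group)
fdr_full <- p.adjust(p_full, method = "BH")
de_genes <- names(fdr_full)[fdr_full < 0.05]

# List 2: rank genes separately within each learning set and record
# how often each gene lands in the top_k of the ranking.
# ASSUMPTION: learning sets are stratified subsamples of 7 samples per group.
n_sets <- 50
top_k <- 10
in_top <- matrix(FALSE, nrow = n_genes, ncol = n_sets,
                 dimnames = list(rownames(expr), NULL))
for (i in seq_len(n_sets)) {
  train <- c(sample(which(group == "A"), 7), sample(which(group == "B"), 7))
  p_train <- gene_pvalues(expr[, train], group[train])
  in_top[order(p_train)[seq_len(top_k)], i] <- TRUE
}
selection_freq <- rowMeans(in_top)
stable_genes <- names(selection_freq)[selection_freq >= 0.5]

# Compare the two lists.
cat("DE genes at FDR < 0.05:", length(de_genes), "\n")
cat("Genes in the top", top_k, "in at least half of the learning sets:",
    length(stable_genes), "\n")
cat("Overlap:", length(intersect(de_genes, stable_genes)), "\n")
```

In this sketch the first list comes with adjusted p-values, so an FDR cut-off has a direct meaning for it. The second list is summarised by a selection frequency, which describes how stable the ranking is across subsamples and is not itself a significance level. That may be one way to read the advice above to compare the statistical evidence behind each list.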
Thank you very much. I'll keep that in mind.

With respects,
Kaj