Single sample normalization of single-channel Agilent microarrays

Guest User ★ 13k

@guest-user-4897

Last seen 9.6 years ago

Dear Bioconductor community,

When faced with the task of building a microarray-based classifier for a clinical condition (say, responder vs. non-responder to a treatment), I have trouble understanding how, after a model is built from a training set, it can be applied to samples one at a time in a prospective trial. It is my understanding that most normalization methods depend, at some point, on information derived from the batch of microarrays with which a given sample is normalized. Only a few methods circumvent this issue, such as fRMA (if one can use Affy HGU133 Plus 2.0 arrays) or SCAN.UPC, which would be suitable for most Affy arrays and even dual-channel Agilent arrays.

What about single-channel Agilent arrays? And which methods were used in all the work published before those methods became available?

Thanks in advance, I hope this is not too general a question.

-- output of sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=de_BE.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=de_BE.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=de_BE.UTF-8 LC_NAME=de_BE.UTF-8 LC_ADDRESS=de_BE.UTF-8 LC_TELEPHONE=de_BE.UTF-8
[11] LC_MEASUREMENT=de_BE.UTF-8 LC_IDENTIFICATION=de_BE.UTF-8

attached base packages:
[1] parallel splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] frma_1.16.0 SCAN.UPC_2.6.3 sva_3.10.0 mgcv_1.8-1 nlme_3.1-117 corpcor_1.6.6 foreach_1.4.2
[8] affyio_1.32.0 affy_1.42.3 GEOquery_2.30.1 oligo_1.28.2 Biostrings_2.32.1 XVector_0.4.0 IRanges_1.22.9
[15] oligoClasses_1.26.0 Biobase_2.24.0 BiocGenerics_0.10.0 BiocInstaller_1.14.2 xlsx_0.5.5 xlsxjars_0.6.0 rJava_0.9-6
[22] ggplot2_1.0.0 aod_1.3 survcomp_1.14.0 prodlim_1.4.3 survival_2.37-7 limma_3.20.8

loaded via a namespace (and not attached):
[1] affxparser_1.36.0 bit_1.1-12 bootstrap_2014.4 codetools_0.2-8 colorspace_1.2-4 DBI_0.2-7 digest_0.6.4
[8] ff_2.2-13 GenomeInfoDb_1.0.2 GenomicRanges_1.16.3 grid_3.1.0 gtable_0.1.2 iterators_1.0.7 KernSmooth_2.23-12
[15] lattice_0.20-29 lava_1.2.6 MASS_7.3-33 Matrix_1.1-4 munsell_0.4.2 plyr_1.8.1 preprocessCore_1.26.1
[22] proto_0.3-10 Rcpp_0.11.2 RCurl_1.95-4.1 reshape2_1.4 rmeta_2.16 scales_0.2.4 stats4_3.1.0
[29] stringr_0.6.2 SuppDists_1.1-9.1 survivalROC_1.0.3 tools_3.1.0 XML_3.98-1.1 zlibbioc_1.10.0

--
Sent via the guest posting facility at bioconductor.org.

affy frma SCAN.UPC • 1.8k views

updated 9.7 years ago by Ryan C. Thompson ★ 7.9k • written 9.7 years ago by Guest User ★ 13k

Ryan C. Thompson ★ 7.9k

@ryan-c-thompson-5618

Last seen 8 months ago

Scripps Research, La Jolla, CA

Well, if you have a large training set, one option is to use frmaTools to generate an fRMA normalization for your dataset. You can then apply this frozen normalization to the individual samples in the test/validation set.

http://bioconductor.org/packages/release/bioc/html/frmaTools.html

Also, I know there is another, similar method for freezing normalization and other parameters based on a training set, but I can't remember its name at all, so I can't find it on Google.

(In reply to Gabriele Zoppoli [guest], message of Tue Aug 26 09:42:45 2014.)

written 9.7 years ago by Ryan C. Thompson ★ 7.9k

Dear Ryan,

I have thought about that. Conceptually, however, I'm afraid I could be criticized for using the training set itself as the reference. One would have to use an independent, high-quality data set to normalize both the samples in the training set and, later, the single samples for further analysis. Alas, this requires a lot of samples we don't have...

(In reply to Ryan's message of Tue, Aug 26, 2014, 7:25 PM.)

--
Gabriele Zoppoli, MD
Ph.D., Clinical and Experimental Oncology and Hematology
Internal Medicine Specialist
Research Assistant, DiMI, IRCCS AOU San Martino IST, Genova, IT
Research Fellow, BCTL J.C. Heuson, Institut J. Bordet, Brussels, BE
BIG Deputy for the Gruppo Oncologico Italiano di Ricerca Clinica (GOIRC)

Tel: +39 010 353 7968
Mobile: +32 478 24 03 11
Email: gabriele.zoppoli at unige.it
Alt. Email: zoppoli at gmail.com
Alt. Email 2: gabriele.zoppoli at bordet.be

Ζεῦ πάτερ, ἀλλὰ σὺ ῥῦσαι ὑπ' ἠέρος υἷας Ἀχαιῶν,
ποίησον δ' αἴθρην, δὸς δ' ὀφθαλμοῖσιν ἰδέσθαι·
ἐν δὲ φάει καὶ ὄλεσσον, ἐπεί νύ τοι εὔαδεν οὕτως.

*Father Zeus, at least deliver the sons of Achaeans from the gloom,*
*And make clear the air, and give it to our eyes to see.*
*In the light destroy us, since to do thus pleases you. (Il. 17, 645-7)*

written 9.7 years ago by Gabriele Zoppoli ▴ 50

Actually, I also don't think RMA-like normalization methods, including fRMA, can be used for Agilent single-channel arrays.

(In reply to Ryan's message of Tue, Aug 26, 2014, 7:25 PM.)

written 9.7 years ago by Gabriele Zoppoli ▴ 50

Yes, of course you are right about that. I should have spoken more generally about normalizing your training set and then normalizing each test/validation sample individually against the already-normalized training set. I think this approach is perfectly valid as long as you make no use of the clinical condition in the normalization step. For example, if you normalized your training samples to each other using quantile normalization, you could save the resulting quantiles as a reference and then normalize each test/validation sample to those exact reference quantiles.

(In reply to Gabriele Zoppoli's message of Tue Aug 26 13:51:58 2014.)
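
To make the frozen-reference idea concrete, here is a minimal base-R sketch, not a validated pipeline. It assumes a probes x samples matrix of background-corrected log2 intensities with no missing values, and that every new array has exactly the same probes in the same order as the training set. The data are simulated, and the names `train`, `reference`, `normalize_to_reference` and the file name are all made up for illustration. Ties are broken by position rather than averaged, which is a simplification of standard quantile normalization.

```r
# Frozen quantile normalization: learn reference quantiles once from the
# training set, then map each new array onto them independently.

set.seed(1)
n_probes <- 1000

# Simulated training set: probes in rows, 20 arrays in columns (assumption).
train <- matrix(rnorm(n_probes * 20, mean = 8, sd = 2), nrow = n_probes)

# 1. Reference distribution: sort each array, then average across arrays
#    at each rank. No clinical labels are used at any point.
reference <- rowMeans(apply(train, 2, sort))

# 2. Map one array onto the reference: the probe with rank k receives the
#    k-th reference quantile. Ties are broken by position (simplification).
normalize_to_reference <- function(x, reference) {
  stopifnot(length(x) == length(reference), !anyNA(x))
  r <- rank(x, ties.method = "first")
  out <- reference[r]
  names(out) <- names(x)
  out
}

# Normalize the training arrays themselves to the reference.
train_norm <- apply(train, 2, normalize_to_reference, reference = reference)

# 3. Freeze the reference so it can be reused unchanged later on.
ref_file <- file.path(tempdir(), "reference_quantiles.rds")
saveRDS(reference, ref_file)

# 4. Prospective use: a single new array, processed without any batch.
frozen <- readRDS(ref_file)
new_sample <- rnorm(n_probes, mean = 9, sd = 2.5)  # simulated new array
new_norm <- normalize_to_reference(new_sample, frozen)
```

After step 4 the new array has exactly the same intensity distribution as every normalized training array, while the ordering of its probes is preserved, so a classifier trained on `train_norm` sees values on the same scale. The preprocessCore package offers `normalize.quantiles.determine.target()` and `normalize.quantiles.use.target()` for the same purpose, and those are probably the better choice for real data.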

written 9.7 years ago by Ryan C. Thompson ★ 7.9k

That is cool advice! Thanks.

(In reply to Ryan's message of Wed, Aug 27, 2014, 1:46 AM.)

written 9.7 years ago by Gabriele Zoppoli ▴ 50
