microarray outlier detection
Guest User ★ 13k
@guest-user-4897
Last seen 9.6 years ago
Dear users,

I have Human Gene 2.0 ST array data: 12 samples in 4 groups, with 3 replicates per group. The lab person would like to remove one sample from each group as an outlier, but in the PCA plot the samples do not cluster, so it is hard to single out any sample as an outlier. Is there a package or function for outlier detection on microarray data?

Thanks,

-- output of sessionInfo():

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] pd.hugene.2.0.st_3.8.0 oligo_1.24.1 oligoClasses_1.22.0 hugene20sttranscriptcluster.db_2.12.1
[5] org.Hs.eg.db_2.9.0 RSQLite_0.11.4 DBI_0.2-7 AnnotationDbi_1.22.6
[9] Biobase_2.20.1 BiocGenerics_0.6.0 limma_3.16.6

loaded via a namespace (and not attached):
[1] affxparser_1.32.3 affyio_1.28.0 BiocInstaller_1.10.3 Biostrings_2.28.0 bit_1.1-10 codetools_0.2-8
[7] ff_2.2-11 foreach_1.4.1 GenomicRanges_1.12.4 IRanges_1.18.2 iterators_1.0.6 preprocessCore_1.22.0
[13] splines_3.0.1 stats4_3.0.1 zlibbioc_1.6.0

--
Sent via the guest posting facility at bioconductor.org.
Microarray Microarray • 2.0k views
@peter-langfelder-4469
Last seen 1 day ago
United States
First, you should only remove outliers if there is a clear indication that you actually have outliers and that they are technical, not biological. Especially with a small data set, removing any samples can be counterproductive.

Should you nevertheless want to go ahead and remove some of the samples, you may want to look at the SampleNetwork approach detailed at http://labs.genetics.ucla.edu/horvath/htdocs/CoexpressionNetwork/SampleNetwork/

The web site contains the necessary R code and a detailed tutorial on how to use the function.

HTH,
Peter

On Fri, Aug 30, 2013 at 1:32 PM, guest [guest] <guest at bioconductor.org> wrote:
> Dear users,
>
> I have human gene 2.0 st array, total 12 samples including 4 groups, each group has 3 replicates. The lab person would like to remove one from each of the group due to the outliers, but from PCA plot, the samples are not clustered, it is hard to remove any sample as an outlier. I wonder if we have the package or function to solve the outlier detection issue on microarray.
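For readers who want a feel for a connectivity-based check before working through that tutorial, here is a rough sketch of standardized sample connectivity. It is not the SampleNetwork function itself. The simulated matrix `expr`, the signed-adjacency formula, and the -2 cutoff are all assumptions made for illustration, and with only 12 samples any z-score of this kind is coarse.

```r
# Sketch: flag samples with unusually low inter-sample connectivity.
# All data here are simulated; replace `expr` with your own normalized
# log2 expression matrix (genes in rows, samples in columns).
set.seed(1)
n_genes <- 2000
n_samples <- 12
gene_mean <- rnorm(n_genes, mean = 8, sd = 2)
expr <- gene_mean + matrix(rnorm(n_genes * n_samples, sd = 0.3),
                           nrow = n_genes,
                           dimnames = list(NULL, paste0("S", 1:n_samples)))
# Make the last sample noisy on purpose so there is something to detect.
expr[, 12] <- expr[, 12] + rnorm(n_genes, sd = 1.5)

# Signed adjacency between samples, derived from Pearson correlation.
adj <- ((1 + cor(expr)) / 2)^2

# Connectivity: summed adjacency to all other samples
# (subtract the self-adjacency of 1 on the diagonal).
k <- rowSums(adj) - 1

# Standardize connectivity across samples.
z_k <- (k - mean(k)) / sd(k)

# Illustrative cutoff (an assumption, not a recommendation).
flagged <- names(z_k)[z_k < -2]
print(round(z_k, 2))
print(flagged)
```

A sample flagged this way is only a candidate for closer inspection. Whether the cause is technical or biological still has to be settled from lab records and QC metrics, not from the score.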
Devon Ryan ▴ 200
@devon-ryan-6054
Last seen 8.2 years ago
Germany
To expand on what Peter Langfelder wrote, some people fall into the unwise practice early in their careers of removing what they think are outlier data points or samples simply because it makes their data look cleaner. This is a really bad idea, because you end up chronically underestimating biological variability, which will inevitably come back to haunt you.

I would argue that, regardless of what some statistical test that the lab person likely doesn't understand might say, if you can't immediately eyeball a sample as an outlier in a PCA plot or via hierarchical clustering, you probably shouldn't remove it. Try discussing with this person his/her reasons for thinking that there are outliers; it's likely that he/she has simply fallen into this trap.

Good luck,
Devon

____________________________________________
Devon Ryan, Ph.D.
Email: dpryan at dpryan.com
Molecular and Cellular Cognition Lab
German Centre for Neurodegenerative Diseases (DZNE)
Ludwig-Erhard-Allee 2
53175 Bonn, Germany

On Aug 30, 2013, at 10:32 PM, guest [guest] wrote:
> Dear users,
>
> I have human gene 2.0 st array, total 12 samples including 4 groups, each group has 3 replicates. The lab person would like to remove one from each of the group due to the outliers, but from PCA plot, the samples are not clustered, it is hard to remove any sample as an outlier. I wonder if we have the package or function to solve the outlier detection issue on microarray.
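The two "eyeball" checks mentioned above (PCA and hierarchical clustering of samples) can be sketched in base R roughly as follows. This is a minimal illustration on simulated data, not a prescribed workflow. The matrix `expr`, the group labels, and the choice of 1 - correlation as the distance are assumptions. With real arrays, `expr` would be something like the `exprs()` matrix of an RMA-normalized ExpressionSet.

```r
# Sketch: sample-level PCA and hierarchical clustering for visual QC.
# Simulated data stand in for a normalized log2 matrix
# (genes in rows, samples in columns).
set.seed(42)
groups <- factor(rep(c("A", "B", "C", "D"), each = 3))
n_genes <- 1000
gene_mean <- rnorm(n_genes, mean = 8, sd = 2)
expr <- gene_mean + matrix(rnorm(n_genes * 12, sd = 0.3),
                           nrow = n_genes,
                           dimnames = list(NULL, paste0(groups, 1:3)))
# Add a group effect to the first 100 genes so the groups can separate.
group_shift <- matrix(rnorm(100 * 4, sd = 1), nrow = 100)
expr[1:100, ] <- expr[1:100, ] + group_shift[, as.integer(groups)]

# PCA on samples: prcomp() expects observations in rows, so transpose.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

plot(pca$x[, 1], pca$x[, 2], col = as.integer(groups), pch = 19,
     xlab = sprintf("PC1 (%.0f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.0f%%)", 100 * var_explained[2]),
     main = "Samples on the first two principal components")
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3)

# Hierarchical clustering of samples on 1 - Pearson correlation.
sample_dist <- as.dist(1 - cor(expr))
hc <- hclust(sample_dist, method = "average")
plot(hc, main = "Sample clustering", xlab = "", sub = "")
```

A technical outlier would typically sit far from every other sample in the PCA plot and join the dendrogram last, at a much greater height than the rest. If nothing stands out that way, that supports keeping all samples.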