ComBat: running with all samples but fail with sub-selection
0
0
Entering edit mode
@w-evan-johnson-5447
Last seen 10 hours ago
United States
Hi Ramzi, My first thought is that something is causing NaNs in the parameter estimates, i.e. maybe one of your genes has the same count for all your samples? Try the following: Apply ComBat using the first 20 genes in your your dataset. If ComBat works, add genes (not one at a time, but in sets) until you find the line in your data that makes ComBat fail. If it fails with 20 lines, send me your 20 genes data (offline--no need to send to the entire bioconductor group) and I will take a look. Hopefully this helps! Evan On Nov 14, 2012, at 4:48 PM, Ramzi TEMANNI wrote: > Hi, > I have RNA-Seq (host+pathogen) data with this design: > SampleID condition Seq_prot Exp Date RNA modif > 20 1 single-end 1 1 > 82 1 paired-end 2 1 > 86 1 paired-end 3 1 > 21 2 single-end 1 1 > 83 2 paired-end 2 1 > 87 2 paired-end 3 1 > 22 3 single-end 1 1 > 84 3 paired-end 2 1 > 88 3 paired-end 3 1 > 23 4 single-end 1 1 > 85 4 paired-end 2 1 > 89 4 paired-end 3 1 > 90 1 paired-end 4 1 > 91 2 paired-end 4 1 > 92 3 paired-end 4 1 > 93 4 paired-end 4 1 > 94 2 paired-end 4 2 > 95 2 paired-end 4 3 > > > > > > When I cluster the data, the samples are not clustering by condition and we notice batch effect > on the Host side: > > <image.png> > On the Pathogen side: > <image.png> > RNA-Seq data is preprocessed : (Log transformation, quantile normalization, zero variance filtering) > > I used Combat to correct for the batch effect (Exp Date), > > I run combat on host and pathogen using the following selection of samples: > > 1) All Samples > > 2) Samples with RNA modif 1 > > 3) Sample with exp_date 1/2/3 (without 4) > > 4) Sample with condition 2/3 > > 5) Sample with condition 2/3 AND exp_date 1/2/3 (without 4) > > with the following command line: > > ComBat(dat=CountTable.IF.LGT, batch=Design$batch, Design$condition, par.prior=T, prior.plots=F) > For pathogen(selection 4) and host (selection 3 and 5) I get the folowing error: > > Error in while (change > conv) { : missing value where TRUE/FALSE needed > The low count genes and zero variance genes were removed and there's no missing value in the data, any idea why this error is showing ? > > > > Thanks in advance, > > Best, > > Ramzi > > > > > > > > sessionInfo() > > R version 2.15.1 Patched (2012-09-16 r60732) > > Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) > > locale: > > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 > > attached base packages: > > [1] grid stats graphics grDevices utils datasets methods base > > other attached packages: > > [1] DESeq_1.8.3 locfit_1.5-8 Biobase_2.16.0 BiocGenerics_0.2.0 RColorBrewer_1.0-5 gplots_2.11.0 > > [7] MASS_7.3-22 KernSmooth_2.23-8 caTools_1.13 bitops_1.0-4.1 gdata_2.12.0 gtools_2.7.0 > > [13] sva_3.2.1 mgcv_1.7-21 corpcor_1.6.4 preprocessCore_1.18.0 > > loaded via a namespace (and not attached): > > [1] annotate_1.34.1 AnnotationDbi_1.18.4 DBI_0.2-5 genefilter_1.38.0 geneplotter_1.34.0 IRanges_1.14.4 > > [7] lattice_0.20-10 Matrix_1.0-9 nlme_3.1-105 RSQLite_0.11.2 splines_2.15.1 stats4_2.15.1 > > [13] survival_2.36-14 tools_2.15.1 XML_3.95-0.1 xtable_1.7-0 > > > > > > > > > > > Mohamed-Ramzi TEMANNI, Ph.D. > > Bioinformatics Research Associate > > Department of Cell Biology and Molecular Genetics > > Maryland Pathogen Research Institute (MPRI) > > 3225 Biosciences Research Bldg > > University of Maryland > > College Park, MD 20742 > [[alternative HTML version deleted]]
Clustering Clustering • 1.5k views
ADD COMMENT

Login before adding your answer.

Traffic: 668 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6