Entering edit mode
Hi Ramzi,
My first thought is that something is causing NaNs in the parameter
estimates, i.e. maybe one of your genes has the same count for all
your samples?
Try the following: Apply ComBat using the first 20 genes in your your
dataset. If ComBat works, add genes (not one at a time, but in sets)
until you find the line in your data that makes ComBat fail. If it
fails with 20 lines, send me your 20 genes data (offline--no need to
send to the entire bioconductor group) and I will take a look.
Hopefully this helps!
Evan
On Nov 14, 2012, at 4:48 PM, Ramzi TEMANNI wrote:
> Hi,
> I have RNA-Seq (host+pathogen) data with this design:
> SampleID condition Seq_prot Exp Date RNA
modif
> 20 1 single-end 1 1
> 82 1 paired-end 2 1
> 86 1 paired-end 3 1
> 21 2 single-end 1 1
> 83 2 paired-end 2 1
> 87 2 paired-end 3 1
> 22 3 single-end 1 1
> 84 3 paired-end 2 1
> 88 3 paired-end 3 1
> 23 4 single-end 1 1
> 85 4 paired-end 2 1
> 89 4 paired-end 3 1
> 90 1 paired-end 4 1
> 91 2 paired-end 4 1
> 92 3 paired-end 4 1
> 93 4 paired-end 4 1
> 94 2 paired-end 4 2
> 95 2 paired-end 4 3
>
>
>
>
>
> When I cluster the data, the samples are not clustering by condition
and we notice batch effect
> on the Host side:
>
> <image.png>
> On the Pathogen side:
> <image.png>
> RNA-Seq data is preprocessed : (Log transformation, quantile
normalization, zero variance filtering)
>
> I used Combat to correct for the batch effect (Exp Date),
>
> I run combat on host and pathogen using the following selection of
samples:
>
> 1) All Samples
>
> 2) Samples with RNA modif 1
>
> 3) Sample with exp_date 1/2/3 (without 4)
>
> 4) Sample with condition 2/3
>
> 5) Sample with condition 2/3 AND exp_date 1/2/3 (without 4)
>
> with the following command line:
>
> ComBat(dat=CountTable.IF.LGT, batch=Design$batch, Design$condition,
par.prior=T, prior.plots=F)
> For pathogen(selection 4) and host (selection 3 and 5) I get the
folowing error:
>
> Error in while (change > conv) { : missing value where TRUE/FALSE
needed
> The low count genes and zero variance genes were removed and there's
no missing value in the data, any idea why this error is showing ?
>
>
>
> Thanks in advance,
>
> Best,
>
> Ramzi
>
>
>
>
>
>
> > sessionInfo()
>
> R version 2.15.1 Patched (2012-09-16 r60732)
>
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
>
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
>
> [1] grid stats graphics grDevices utils datasets
methods base
>
> other attached packages:
>
> [1] DESeq_1.8.3 locfit_1.5-8 Biobase_2.16.0
BiocGenerics_0.2.0 RColorBrewer_1.0-5 gplots_2.11.0
>
> [7] MASS_7.3-22 KernSmooth_2.23-8 caTools_1.13
bitops_1.0-4.1 gdata_2.12.0 gtools_2.7.0
>
> [13] sva_3.2.1 mgcv_1.7-21 corpcor_1.6.4
preprocessCore_1.18.0
>
> loaded via a namespace (and not attached):
>
> [1] annotate_1.34.1 AnnotationDbi_1.18.4 DBI_0.2-5
genefilter_1.38.0 geneplotter_1.34.0 IRanges_1.14.4
>
> [7] lattice_0.20-10 Matrix_1.0-9 nlme_3.1-105
RSQLite_0.11.2 splines_2.15.1 stats4_2.15.1
>
> [13] survival_2.36-14 tools_2.15.1 XML_3.95-0.1
xtable_1.7-0
>
>
>
>
>
>
>
>
>
>
> Mohamed-Ramzi TEMANNI, Ph.D.
>
> Bioinformatics Research Associate
>
> Department of Cell Biology and Molecular Genetics
>
> Maryland Pathogen Research Institute (MPRI)
>
> 3225 Biosciences Research Bldg
>
> University of Maryland
>
> College Park, MD 20742
>
[[alternative HTML version deleted]]