Hello,
I am having problems analyzing an RNA-seq dataset using Ballgown. The RNA is from 3 patients and 3 gender- and age-matched controls. For each individual there are 3 technical replicates, except for one patient with 6 technical replicates, so there are 21 samples in total. I processed the data with HISAT and StringTie following the Nature Protocols paper and loaded it into R as a ballgown object without trouble.
I wanted to find differentially expressed genes between patients and controls, taking into account the information that each of the 3 patients is paired with a control. The phenotype data is as below:
sampleID  case_control  pair   replicate
S10       patient       pair2  p2
S11       patient       pair2  p2
S12       patient       pair2  p2
S13       control       pair3  c3
S14       control       pair3  c3
S15       control       pair3  c3
S16       patient       pair3  p3
S17       patient       pair3  p3
S18       patient       pair3  p3
S19       patient       pair3  p3
S1        control       pair1  c1
S20       patient       pair3  p3
S21       patient       pair3  p3
S2        control       pair1  c1
S3        control       pair1  c1
S4        patient       pair1  p1
S5        patient       pair1  p1
S6        patient       pair1  p1
S7        control       pair2  c2
S8        control       pair2  c2
S9        control       pair2  c2
For that I tried the following command but got an error:
> results_t <- stattest(bg_filt, feature="transcript", covariate="case_control", adjustvars = c("pair", "replicate"), getFC=TRUE, meas="FPKM")
Coefficients not estimable: replicatep1 replicatep2 replicatep3
Error in solve.default(t(mod) %*% mod) :
  system is computationally singular: reciprocal condition number = 4.81307e-27
In addition: Warning message:
Partial NA coefficients for 26438 probe(s)
I can run the above command successfully with adjustvars = c("pair"), but I am worried that the technical replicate information is then not used. I have read the other posts dealing with the same Ballgown error, but it is still not clear to me what to do in this case.
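To narrow down where the singularity comes from, the design can be checked outside of Ballgown. The following is only a rough sketch: `pheno` is a hand-typed copy of the phenotype table above (the name is made up for this example), and I am assuming, without having verified it in the source, that stattest builds its model matrix from the covariate and adjustvars in a comparable way.

```r
# Hand-typed copy of the phenotype table above, in the same row order.
# 'pheno' is a hypothetical name used only for this check.
pheno <- data.frame(
  sampleID     = paste0("S", c(10:19, 1, 20, 21, 2:9)),
  case_control = c(rep("patient", 3), rep("control", 3), rep("patient", 4),
                   "control", rep("patient", 2), rep("control", 2),
                   rep("patient", 3), rep("control", 3)),
  pair         = c(rep("pair2", 3), rep("pair3", 7), "pair1",
                   rep("pair3", 2), rep("pair1", 5), rep("pair2", 3)),
  replicate    = c(rep("p2", 3), rep("c3", 3), rep("p3", 4), "c1",
                   rep("p3", 2), rep("c1", 2), rep("p1", 3), rep("c2", 3)),
  stringsAsFactors = TRUE
)

# Each 'replicate' label belongs to exactly one group and one pair.
print(table(pheno$replicate, pheno$case_control))
print(table(pheno$replicate, pheno$pair))

# Design with both adjustment variables (the call that fails) and with
# 'pair' only (the call that runs).
full    <- model.matrix(~ case_control + pair + replicate, data = pheno)
reduced <- model.matrix(~ case_control + pair, data = pheno)

# A design matrix is full rank only if its rank equals its column count.
cat("full:    columns =", ncol(full), " rank =", qr(full)$rank, "\n")
cat("reduced: columns =", ncol(reduced), " rank =", qr(reduced)$rank, "\n")
```

With my data the full design has fewer independent columns than coefficients, which seems to match the three replicate coefficients reported as not estimable: every replicate label already determines both case_control and pair, while the design with pair alone is full rank.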
Many thanks for your help.
Yong Li
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] devtools_1.13.5    dplyr_0.7.4        genefilter_1.52.1  RSkittleBrewer_1.1
[5] ballgown_2.2.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.16               pillar_1.2.1
 [3] bindr_0.1.1                RColorBrewer_1.1-2
 [5] futile.logger_1.4.3        GenomeInfoDb_1.6.3
 [7] XVector_0.10.0             bitops_1.0-6
 [9] futile.options_1.0.0       tools_3.2.3
[11] zlibbioc_1.16.0            digest_0.6.15
[13] bit_1.1-12                 tibble_1.4.2
[15] annotate_1.48.0            RSQLite_2.1.0
[17] memoise_1.1.0              nlme_3.1-124
[19] lattice_0.20-35            mgcv_1.8-23
[21] pkgconfig_2.0.1            rlang_0.2.0
[23] Matrix_1.2-13              DBI_0.8
[25] parallel_3.2.3             bindrcpp_0.2.2
[27] withr_2.1.2                rtracklayer_1.30.4
[29] Biostrings_2.38.4          S4Vectors_0.8.11
[31] IRanges_2.4.8              stats4_3.2.3
[33] bit64_0.9-7                grid_3.2.3
[35] glue_1.2.0                 Biobase_2.30.0
[37] R6_2.2.2                   AnnotationDbi_1.32.3
[39] survival_2.41-3            XML_3.98-1.10
[41] BiocParallel_1.4.3         limma_3.26.9
[43] sva_3.18.0                 magrittr_1.5
[45] lambda.r_1.2               blob_1.1.1
[47] Rsamtools_1.22.0           GenomicAlignments_1.6.3
[49] splines_3.2.3              BiocGenerics_0.16.1
[51] GenomicRanges_1.22.4       assertthat_0.2.0
[53] SummarizedExperiment_1.0.2 xtable_1.8-2
[55] RCurl_1.95-4.10