I am trying to calculate enrichment over control (IgG) in a RIP-seq experiment. Since it was performed in triplicate, one approach I am exploring is using DESeq2 to take advantage of that replicate goodness. The goals:
- Determine enrichment of specific IPs over a background control.
- Possibly determine enrichment differences between IPs (as in the thread "DESeq2 testing ratio of ratios (RIP-Seq, CLIP-Seq, ribosomal profiling)").
These samples are small RNA RIP-seqs, and this is what the sample table looks like:
From above we can see that there is only one control condition (N2, assay = IgG) for all four experimental conditions. This is fine for the first goal (1), using the formula
`design = ~ condition` and then extracting pairwise results.
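To make concrete what I mean by "extracting pairwise results", here is a minimal sketch. It runs on simulated counts rather than my data, and the level names `mutC` and `mutD` are hypothetical placeholders for the two conditions not named in this post; only `N2`, `RFK679` and `Tyr101` come from the actual experiment.

```r
library(DESeq2)

set.seed(1)

# Hypothetical sample table: 5 conditions in triplicate, N2 is the IgG control.
# "mutC" and "mutD" are placeholder names, not the real condition labels.
conditions <- c("N2", "RFK679", "Tyr101", "mutC", "mutD")
coldata <- data.frame(
  condition = factor(rep(conditions, each = 3), levels = conditions),
  row.names = paste0(rep(conditions, each = 3), "_rep", 1:3)
)

# Simulated counts standing in for the real count matrix (assumption).
n_features <- 200
feature_means <- 2^runif(n_features, min = 4, max = 10)
counts <- matrix(
  rnbinom(n_features * nrow(coldata), mu = feature_means, size = 10),
  nrow = n_features,
  dimnames = list(paste0("feature", seq_len(n_features)), rownames(coldata))
)

dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = coldata,
                              design = ~ condition)
dds <- DESeq(dds)

# Goal (1): one pairwise contrast per IP, each against the N2 (IgG) background.
ip_levels <- setdiff(conditions, "N2")
res_vs_control <- lapply(ip_levels, function(ip) {
  results(dds, contrast = c("condition", ip, "N2"))
})
names(res_vs_control) <- ip_levels
```

Each element of `res_vs_control` is then a results table for one IP versus N2, e.g. `res_vs_control[["RFK679"]]`.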
Side note: the dispersion plot (default settings) seems to indicate a good fit of the data.
My main doubt is how to prepare the sample table for the second goal (2), following the thread "DESeq2 testing ratio of ratios (RIP-Seq, CLIP-Seq, ribosomal profiling)". I tried
`design = ~ assay + condition + assay:condition` but of course got the error:
Model matrix not full rank
Likely because assay and condition are confounded: IgG occurs only with N2, so the corresponding columns of the model matrix are linearly dependent. My question is:
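The rank problem can be checked without DESeq2 using base R's `model.matrix()`. This is a sketch on a hypothetical sample table: I assume a single assay label `IP` for all non-control samples, and `mutC`/`mutD` are placeholder condition names.

```r
# Hypothetical sample table: IgG is only ever paired with N2,
# every other condition has the (assumed) assay label "IP".
conditions <- c("N2", "RFK679", "Tyr101", "mutC", "mutD")
coldata <- data.frame(
  condition = factor(rep(conditions, each = 3), levels = conditions),
  assay = factor(rep(c("IgG", "IP", "IP", "IP", "IP"), each = 3),
                 levels = c("IgG", "IP"))
)

# Design with interaction: more columns than there are distinct groups.
mm_full <- model.matrix(~ assay + condition + assay:condition, data = coldata)
rank_full <- qr(mm_full)$rank
cat("full design:   columns =", ncol(mm_full), " rank =", rank_full, "\n")

# Design with condition only: one column per estimable group effect.
mm_simple <- model.matrix(~ condition, data = coldata)
rank_simple <- qr(mm_simple)$rank
cat("simple design: columns =", ncol(mm_simple), " rank =", rank_simple, "\n")
```

With this layout the interaction design has more columns than its rank, because there are only five distinct groups of samples to estimate from, whereas `~ condition` is full rank.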
Is it possible to change the sample table in a way that avoids this? How would I go about it? I basically want to use
N2 as a (background) control for every condition, e.g. `(RFK679 / N2) / (Tyr101 / N2)`, before comparing the conditions with each other (one-to-one comparisons).
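Writing that ratio of ratios out on the log2 scale, and assuming the very same N2 (IgG) samples serve as the denominator in both ratios, the N2 term cancels algebraically:

$$
\log_2\frac{\mathrm{RFK679}/\mathrm{N2}}{\mathrm{Tyr101}/\mathrm{N2}}
= \left(\log_2 \mathrm{RFK679} - \log_2 \mathrm{N2}\right) - \left(\log_2 \mathrm{Tyr101} - \log_2 \mathrm{N2}\right)
= \log_2\frac{\mathrm{RFK679}}{\mathrm{Tyr101}}
$$

If that reasoning holds, the comparison would seem to reduce to a direct contrast such as `c("condition", "RFK679", "Tyr101")` under `~ condition`. Whether that is a sensible interpretation with only a single shared background is part of what I am unsure about.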
I know this type of question is rather frequent, but even after reading the vignette and a number of threads on this, I am still at a loss.
This is a preliminary analysis, and the experimental design is already being changed to include input as background for each experimental condition. That said, I would still like to get a brief feel for what is being enriched in this experimental setting.
sessionInfo()

    R version 3.3.3 (2017-03-06)
    Platform: x86_64-pc-linux-gnu (64-bit)
    Running under: Debian GNU/Linux 8 (jessie)

    locale:
    LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
    LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
    LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
    LC_PAPER=en_US.UTF-8       LC_NAME=C
    LC_ADDRESS=C               LC_TELEPHONE=C
    LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    parallel stats4 stats graphics grDevices utils datasets methods base

    other attached packages:
    stringr_1.2.0 org.Ce.eg.db_3.4.0 AnnotationDbi_1.36.2 pheatmap_1.0.8
    scales_0.5.0 DESeq2_1.14.1 SummarizedExperiment_1.4.0 Biobase_2.34.0
    GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 IRanges_2.8.2 S4Vectors_0.12.2
    BiocGenerics_0.20.0 ggplot2_2.2.1 RColorBrewer_1.1-2 data.table_1.10.4
    fortunes_1.5-4

    loaded via a namespace (and not attached):
    genefilter_1.56.0 locfit_1.5-9.1 splines_3.3.3 lattice_0.20-35
    colorspace_1.3-2 htmltools_0.3.6 base64enc_0.1-3 blob_1.1.0
    survival_2.41-3 XML_3.98-1.9 rlang_0.1.2 foreign_0.8-69
    DBI_0.7 BiocParallel_1.8.2 bit64_0.9-7 plyr_1.8.4
    zlibbioc_1.20.0 munsell_0.4.3 gtable_0.2.0 htmlwidgets_0.9
    memoise_1.1.0 latticeExtra_0.6-28 knitr_1.17 geneplotter_1.52.0
    highr_0.6 htmlTable_1.9 Rcpp_0.12.12 acepack_1.4.1
    xtable_1.8-2 backports_1.1.0 checkmate_1.8.3 Hmisc_4.0-3
    annotate_1.52.1 XVector_0.14.1 bit_1.1-12 gridExtra_2.2.1
    digest_0.6.12 stringi_1.1.5 grid_3.3.3 bitops_1.0-6
    tools_3.3.3 magrittr_1.5 lazyeval_0.2.0 RCurl_1.95-4.8
    tibble_1.3.4 RSQLite_2.0 Formula_1.2-2 cluster_2.0.6
    pkgconfig_2.0.1 Matrix_1.2-11 rpart_4.1-11 nnet_7.3-12