I have an RNAseq dataset that is a developmental time series across 13 time points with 4 biological replicates. I'm interested in identifying differentially expressed genes through time and generate a variance stabilized dataset (vsd) for further analysis.
My counts file (countData) contains 43469 transcripts across 48 samples. There are some samples missing from the later time points. I set up this model:
>dds<-DESeqDataSetFromMatrix(countData=countData, colData=colData, design= ~replicate + day )
>ddsTC <- DESeq(dds, test="LRT", reduced = ~day )
>res<-results(ddsTC)
This model runs, however, once I call the 'res' function, 90% of the genes get filtered out, leaving 4369 transcripts. To me this seems like quite a lot.
>attr(res, "filterThreshold")
># 90%
>#37.87345
When I ran this dataset through the original DESeq, considering time points as independent factors which I know is incorrect, I got thousands of differentially expressed genes. I feel like there is a better way to run the model in DESeq2. There is no treatment between time series and running the model above in DESeq2 shows very few genes differentially expressed between biological replicates (<10) which is what I would expect
Thank you for any input.
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] RColorBrewer_1.1-2 ggplot2_1.0.0 gplots_2.15.0
[4] DESeq2_1.4.5 RcppArmadillo_0.4.500.0 Rcpp_0.11.3
[7] GenomicRanges_1.16.4 GenomeInfoDb_1.0.2 IRanges_1.22.10
[10] DESeq_1.16.0 lattice_0.20-29 locfit_1.5-9.1
[13] Biobase_2.24.0 BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] annotate_1.42.1 AnnotationDbi_1.26.1 bitops_1.0-6
[4] caTools_1.17.1 colorspace_1.2-4 DBI_0.3.1
[7] digest_0.6.4 gdata_2.13.3 genefilter_1.46.1
[10] geneplotter_1.42.0 grid_3.1.0 gtable_0.1.2
[13] gtools_3.4.1 KernSmooth_2.23-13 MASS_7.3-35
[16] munsell_0.4.2 plyr_1.8.1 proto_0.3-10
[19] reshape2_1.4.1 RSQLite_1.0.0 scales_0.2.4
[22] splines_3.1.0 stats4_3.1.0 stringr_0.6.2
[25] survival_2.37-7 XML_3.98-1.1 xtable_1.7-4
[28] XVector_0.4.0