Hello,
I am using DESeq2 in phyloseq to analyse my data, following the phyloseq tutorial.
Previously I used the following code and this seemed to work nicely, however when I am trying to repeat this I now get the following message and although the listed bacterial species are the same the log2fold changes are quite different. The difference is at the point when I try to convert the phyloseq format to DESeq2 I get the error message in bold below (previously I was not seeing this error message). I have also loaded the sessionInfo in case this helps shed any light on the problem.
otufile=("otu_table_m1000_minus_cyano_chloro_hdf5_consensuslineage.txt")
mapfile=("Mapping_file_LS.txt")
trefile=("rep_set_tree.tre")
qiimedata = import_qiime(otufile, mapfile, trefile)
#remove my unwanted groups
qiimedata = subset_samples(qiimedata, LesionScore != "6")
qiimedata = subset_samples(qiimedata, LesionScore != "7")
qiimedata = subset_samples(qiimedata, LesionScore != "0")
qiimedata = subset_samples(qiimedata, LesionScore != "1")
qiimedata = subset_samples(qiimedata, LesionScore != "3")
qiimedata = subset_samples(qiimedata, LesionScore != "2")
#left with the two groups I want to compare
head(sample_data(qiimedata)$LesionScore, 25)
[1] -5 -5 4 -5 -5 -5 4 -5 4 4 4 4 -5 -5 4 -5 -5 4
diagdds = phyloseq_to_deseq2(qiimedata, ~ LesionScore)
converting counts to integer mode
the design formula contains a numeric variable with integer values,
specifying a model with increasing fold change for higher values.
did you mean for this to be a factor? if so, first convert
this variable to a factor using the factor() function
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] BiocInstaller_1.24.0 DESeq2_1.14.1
[3] SummarizedExperiment_1.4.0 Biobase_2.34.0
[5] GenomicRanges_1.26.3 GenomeInfoDb_1.10.3
[7] IRanges_2.8.1 S4Vectors_0.12.1
[9] BiocGenerics_0.20.0 phyloseq_1.19.1
loaded via a namespace (and not attached):
[1] locfit_1.5-9.1 Rcpp_0.12.9 ape_4.1
[4] lattice_0.20-34 Biostrings_2.42.1 assertthat_0.1
[7] digest_0.6.12 foreach_1.4.3 plyr_1.8.4
[10] backports_1.0.5 acepack_1.4.1 RSQLite_1.1-2
[13] ggplot2_2.2.1 zlibbioc_1.20.0 lazyeval_0.2.0
[16] data.table_1.10.4 annotate_1.52.1 vegan_2.4-2
[19] rpart_4.1-10 Matrix_1.2-8 checkmate_1.8.2
[22] splines_3.3.2 BiocParallel_1.8.1 geneplotter_1.52.0
[25] stringr_1.2.0 foreign_0.8-67 htmlwidgets_0.8
[28] igraph_1.0.1 RCurl_1.95-4.8 munsell_0.4.3
[31] base64enc_0.1-3 multtest_2.30.0 mgcv_1.8-17
[34] htmltools_0.3.5 nnet_7.3-12 biomformat_1.2.0
[37] tibble_1.2 gridExtra_2.2.1 htmlTable_1.9
[40] Hmisc_4.0-2 codetools_0.2-15 XML_3.98-1.5
[43] permute_0.9-4 MASS_7.3-45 bitops_1.0-6
[46] grid_3.3.2 DBI_0.5-1 nlme_3.1-131
[49] jsonlite_1.3 xtable_1.8-2 gtable_0.2.0
[52] magrittr_1.5 scales_0.4.1 stringi_1.1.2
[55] XVector_0.14.0 reshape2_1.4.2 genefilter_1.56.0
[58] latticeExtra_0.6-28 Formula_1.2-1 RColorBrewer_1.1-2
[61] iterators_1.0.8 tools_3.3.2 ade4_1.7-5
[64] survival_2.40-1 AnnotationDbi_1.36.2 colorspace_1.3-2
[67] rhdf5_2.18.0 cluster_2.0.5 memoise_1.0.0
[70] knitr_1.15.1
Cheers,
Sarah

Hi Michael,
Thanks for the clarification.
Changing from using numerical values for LesionScore in my mapping file fixed the problem (eg from 4 to four etc), meaning I guess this is now coded as a factor.
Cheers,
Sarah