Hi,
I used Tax4Fun for functional community profiling based on 16S rRNA data. I would like to go further in my analyses using DESeq2. However, I got the following error message: "Error in DESeqDataSet(se, design = design, ignoreRank) : some values in assay are not integers". I read the details in vignette("DESeq2"), but I still have two questions:
1) Is it possible to overcome this issue by transforming Tax4Fun data?
2) If the answer is yes, then, what transformation would be most appropriate?
Here is the code that I used, and an overview of my Tax4Fun data:
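library(DESeq2)
# countData is the transposed Tax4Fun table (KEGG orthologs in rows, samples in columns)
ddsMT <- DESeqDataSetFromMatrix(countData = t(data.dd2biom.tax4fun.corr.deseq),
                                colData = sam2,
                                design = ~ treatment_N + treatment_H + treatment_N:treatment_H)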
> head(t(data.dd2biom.tax4fun.corr.deseq))
BER203 BER211
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001202 1.001175
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000021 1.000020
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000440 1.000513
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303. 1.000032 1.000039
K00005..glycerol.dehydrogenase..EC.1.1.1.6. 1.000039 1.000030
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11. 1.000043 1.000049
BER220 BER226
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001026 1.000934
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000024 1.000024
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000519 1.000343
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303. 1.000033 1.000029
K00005..glycerol.dehydrogenase..EC.1.1.1.6. 1.000052 1.000094
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11. 1.000058 1.000063
BER233 BER237
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001285 1.001052
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000011 1.000019
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000419 1.000393
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303. 1.000020 1.000032
K00005..glycerol.dehydrogenase..EC.1.1.1.6. 1.000031 1.000053
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11. 1.000028 1.000053
BER241 BER247
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001144 1.001156
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000023 1.000019
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000470 1.000469
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303. 1.000036 1.000039
K00005..glycerol.dehydrogenase..EC.1.1.1.6. 1.000026 1.000039
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11. 1.000051 1.000045
BER251 BER257
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001110 1.001266
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000017 1.000016
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000448 1.000466
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303. 1.000033 1.000031
K00005..glycerol.dehydrogenase..EC.1.1.1.6. 1.000036 1.000038
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11. 1.000056 1.000040
BER262 BER263
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001235 1.001239
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000018 1.000018
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000490 1.000487
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303. 1.000042 1.000043
K00005..glycerol.dehydrogenase..EC.1.1.1.6. 1.000019 1.000064
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11. 1.000058 1.000032
SIM403 SIM413
K00001..alcohol.dehydrogenase..EC.1.1.1.1. 1.001376 1.001229
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2. 1.000014 1.000019
K00003..homoserine.dehydrogenase..EC.1.1.1.3. 1.000455 1.000462

Thank you for your reply. The values that I'm trying to input to DESeq2 come from the Tax4Fun output, which is a list of enzymes with an abundance score for each sample (decimal numbers, not integers). So I think that DESeq2 is not designed for this kind of data, am I right? They are as follows:
Yes, these do not look like appropriate input if they aren’t counts / observations.
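As a rough illustration of why this input is rejected, here is a minimal base-R sketch. It assumes a small toy matrix built from the first three rows and two samples of the table posted above; `tax4fun_scores`, `toy_counts`, and the helper `is_count_matrix()` are hypothetical names made up for this example and are not part of DESeq2 or Tax4Fun.

```r
# Toy matrix using values copied from the posted table (KEGG orthologs x samples).
tax4fun_scores <- matrix(
  c(1.001202, 1.000021, 1.000440,   # BER203
    1.001175, 1.000020, 1.000513),  # BER211
  nrow = 3,
  dimnames = list(c("K00001", "K00002", "K00003"), c("BER203", "BER211"))
)

# Hypothetical helper: TRUE only if every entry is a non-negative whole number,
# which is the kind of input a count-based method expects.
is_count_matrix <- function(m) {
  is.numeric(m) && !anyNA(m) && all(m >= 0) && all(m == round(m))
}

# The Tax4Fun scores fail the check because they are decimals.
print(is_count_matrix(tax4fun_scores))

# A made-up table of real counts passes the same check.
toy_counts <- matrix(c(10L, 0L, 3L, 7L, 2L, 5L), nrow = 3)
print(is_count_matrix(toy_counts))

# Simply rounding would silence the error, but every value shown here becomes 1,
# so the differences between samples would disappear.
print(unique(as.vector(round(tax4fun_scores))))
```

Running a check like this before calling DESeqDataSetFromMatrix() shows whether a table holds counts at all. For the values shown in this thread, rounding to integers does not look like a useful workaround, since it would leave nothing to compare.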