Hi,
I used Tax4Fun for functional community profiling based on 16S rRNA data, and I would like to take the analysis further with DESeq2. However, I got the following error message: "Error in DESeqDataSet(se, design = design, ignoreRank) : some values in assay are not integers". I read the details in vignette("DESeq2"), but I still have two questions:
1) Is it possible to overcome this issue by transforming Tax4Fun data?
2) If the answer is yes, then, what transformation would be most appropriate?
Here is the code that I used, and an overview of my Tax4Fun data:
ddsMT <- DESeqDataSetFromMatrix(countData = t(data.dd2biom.tax4fun.corr.deseq),
                                colData = sam2,
                                design = ~ treatment_N + treatment_H + treatment_N:treatment_H)
> head(t(data.dd2biom.tax4fun.corr.deseq))
  BER203  BER211
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001202  1.001175
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000021  1.000020
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000440  1.000513
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303.  1.000032  1.000039
K00005..glycerol.dehydrogenase..EC.1.1.1.6.  1.000039  1.000030
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11.  1.000043  1.000049
  BER220  BER226
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001026  1.000934
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000024  1.000024
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000519  1.000343
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303.  1.000033  1.000029
K00005..glycerol.dehydrogenase..EC.1.1.1.6.  1.000052  1.000094
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11.  1.000058  1.000063
  BER233  BER237
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001285  1.001052
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000011  1.000019
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000419  1.000393
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303.  1.000020  1.000032
K00005..glycerol.dehydrogenase..EC.1.1.1.6.  1.000031  1.000053
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11.  1.000028  1.000053
  BER241  BER247
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001144  1.001156
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000023  1.000019
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000470  1.000469
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303.  1.000036  1.000039
K00005..glycerol.dehydrogenase..EC.1.1.1.6.  1.000026  1.000039
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11.  1.000051  1.000045
  BER251  BER257
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001110  1.001266
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000017  1.000016
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000448  1.000466
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303.  1.000033  1.000031
K00005..glycerol.dehydrogenase..EC.1.1.1.6.  1.000036  1.000038
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11.  1.000056  1.000040
  BER262  BER263
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001235  1.001239
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000018  1.000018
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000490  1.000487
K00004...R.R..butanediol.dehydrogenase...diacetyl.reductase..EC.1.1.1.4.1.1.1.303.  1.000042  1.000043
K00005..glycerol.dehydrogenase..EC.1.1.1.6.  1.000019  1.000064
K00007..D.arabinitol.4.dehydrogenase..EC.1.1.1.11.  1.000058  1.000032
  SIM403  SIM413
K00001..alcohol.dehydrogenase..EC.1.1.1.1.  1.001376  1.001229
K00002..alcohol.dehydrogenase..NADP....EC.1.1.1.2.  1.000014  1.000019
K00003..homoserine.dehydrogenase..EC.1.1.1.3.  1.000455  1.000462
Thank you for your reply. The values that I'm trying to input to DESeq2 come from the Tax4Fun output, which is a list of enzymes with an abundance score for each sample (i.e. decimal numbers rather than integers). So I think that DESeq2 is not designed for this kind of data, am I right? They are as follows:
Yes, these do not look like appropriate input if they aren’t counts / observations.
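To make the point concrete, here is a minimal base-R sketch using a few of the values posted above. The helper `looks_like_counts` is hypothetical (it is not a DESeq2 function); it only approximates the kind of whole-number check that produces the "some values in assay are not integers" error, so treat it as an illustration rather than DESeq2's actual code.

```r
# A few values copied from the head() output in the question
# (first three rows, samples BER203 and BER211); row names shortened.
x <- matrix(
  c(1.001202, 1.001175,
    1.000021, 1.000020,
    1.000440, 1.000513),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("K00001", "K00002", "K00003"), c("BER203", "BER211"))
)

# Hypothetical helper, not part of DESeq2: TRUE only if every value is a
# non-negative whole number, i.e. the matrix could plausibly hold raw counts.
looks_like_counts <- function(m, tol = 1e-8) {
  is.numeric(m) && !anyNA(m) && all(m >= 0) && all(abs(m - round(m)) < tol)
}

looks_like_counts(x)        # FALSE: the scores are decimals, not counts

# Rounding makes the check pass, but it discards the signal:
rounded <- round(x)
looks_like_counts(rounded)  # TRUE
unique(as.vector(rounded))  # 1 (every entry collapses to the same value)
```

For these particular rows, forcing the scores to integers would leave nothing to test, which is consistent with the scores not being count-like input in the first place.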