Question: phyloseq/DESeq gives negative transformed values
4.4 years ago by
Sophie Josephine Weiss130 wrote:
Hello, I have microbiome data with no replicates, from different conditions. I am trying to transform the data using the DESeq method, as described in McMurdie and Holmes 2014. The attached file is the definition I am using, as per the supplemental info in McMurdie and Holmes 2014, and the .biom file I am using. Thank you for your help, Sophie
modified 4.4 years ago by Michael Love19k • written 4.4 years ago by Sophie Josephine Weiss130
4.4 years ago by
Michael Love19k
United States
Michael Love19k wrote:
I tried poking around here http://joey711.github.io/phyloseq/distance but couldn't see if the authors did anything for distances requiring non-negative data. It appears http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003531 that VST was tested with Bray-Curtis distance. I think the distance is designed for counts, but you could always threshold at 0 to insist that the log2-like quantity act more like a count.

On Mon, Apr 14, 2014 at 12:23 PM, Sophie Josephine Weiss <Sophie.Weiss@colorado.edu> wrote:
> Hi Mike,
> Thanks for explaining more. I am used to working with rarefied microbial datasets, that is why. Instead of rarefying I would like to use the DESeq method.
>
> How would you then suggest going about calculating bray-curtis distance, or summarized taxa diagrams with these new transformed matrices with negative values?
> Thanks again,
> Sophie
>
> On Mon, Apr 14, 2014 at 7:17 AM, Michael Love <michaelisaiahlove@gmail.com> wrote:
>> hi Sophie,
>>
>> Can you explain why you don't want negative values in the transformed values? Adding one to the raw counts is not sufficient. I should have said in my previous email, "the expected counts on the common scale". If the size factor for a sample is 2, then an expected count of 1 leads to an expected count of 1/2 on the common scale (after accounting for size factors).
>>
>> On Sun, Apr 13, 2014 at 11:50 PM, Sophie Josephine Weiss <Sophie.Weiss@colorado.edu> wrote:
>>> Hi Mike,
>>> Thanks for your reply! Ok, makes sense, but I added 1 to all my matrix values, so the lowest value in the matrix is 1 - there are still negatives?
>>> Thanks again,
>>> Sophie
>>>
>>> On Sun, Apr 13, 2014 at 9:01 PM, Michael Love <michaelisaiahlove@gmail.com> wrote:
>>>> hi Sophie,
>>>>
>>>> The transformations in DESeq and DESeq2 are log2-like transformations. If the expected count is between 0 and 1, the values can be negative, this does not indicate a problem.
>>>>
>>>> Mike
written 4.4 years ago by Michael Love19k
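The size-factor arithmetic in the answer above can be checked with a small sketch. It is an illustration only: it uses a plain log2 as a stand-in for the "log2-like" transformations, not the actual DESeq/DESeq2 code, and the function name `log2_common_scale` is hypothetical.

```python
import math


def log2_common_scale(count, size_factor):
    """log2 of a count after dividing by its sample's size factor.

    Plain log2 is used as a stand-in for a log2-like transformation;
    this is not the DESeq/DESeq2 variance-stabilizing transformation.
    """
    return math.log2(count / size_factor)


# A count of 1 in a sample with size factor 2 is 1/2 on the common
# scale, so its log2 is negative even though the raw count is >= 1.
print(log2_common_scale(1, 2.0))  # -1.0
print(log2_common_scale(1, 1.0))  # 0.0
print(log2_common_scale(8, 2.0))  # 2.0
```

Under this simplification, adding 1 to every raw count only guarantees a non-negative result for samples whose size factor is at most 1.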
Hi Mike,
Thanks for the references. By "threshold at 0" do you mean set any negative values equal to 0?
Do you think this is the best approach?
Thanks again,
Sophie
written 4.4 years ago by Sophie Josephine Weiss130
hi Sophie,

On Mon, Apr 14, 2014 at 1:15 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
> By "threshold at 0" do you mean set any negative values equal to 0?

yes.

> Do you think this is the best approach?

I haven't explored this area, and would defer to the McMurdie and Holmes paper for the best combinations of distance and transformation.
written 4.4 years ago by Michael Love19k
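As a rough sketch of the "threshold at 0" idea discussed above: negative transformed values are set to 0 and the Bray-Curtis dissimilarity is then computed on the result. The helper names and the sample values are made up for illustration, and this is not the phyloseq or vegan implementation.

```python
def threshold_at_zero(values):
    """Replace every negative value with 0.0 and leave the rest unchanged."""
    return [max(v, 0.0) for v in values]


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum(|u_i - v_i|) / sum(u_i + v_i).

    Assumes non-negative inputs of equal length.
    """
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("Bray-Curtis is undefined when both samples sum to 0")
    return numerator / denominator


# Hypothetical log2-like values for three taxa in two samples.
sample_a = threshold_at_zero([-1.0, 1.0, 3.0])   # [0.0, 1.0, 3.0]
sample_b = threshold_at_zero([2.0, -0.5, 2.0])   # [2.0, 0.0, 2.0]

print(bray_curtis(sample_a, sample_b))  # 0.5
```

The denominator sums the abundances of both samples, so negative entries could make it zero or negative; thresholding keeps the result between 0 and 1. Whether this is the best choice is left open in the thread.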
Hi Mike,
The McMurdie and Holmes paper uses DESeq for matrix normalization - do you think that is ok, or would it be better to use DESeq2?
Thanks again,
Sophie
written 4.4 years ago by Sophie Josephine Weiss130
4.4 years ago by
Michael Love19k
United States
Michael Love19k wrote:
hi Sophie,

We recommend using the standard DESeq() function for differential expression. This is mentioned in the first line of the vignette section on transformations:

"In order to test for differential expression, we operate on raw counts and use discrete distributions as described in the previous section"

Also, in the McMurdie and Holmes paper, they are using the DESeq() function, as shown in their supplemental material: http://joey711.github.io/waste-not-supplemental/simulation-differential-abundance/simulation-differential-abundance-server.html

On Wed, Apr 16, 2014 at 3:22 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
> Please help with this? Thanks again.
>
> On Mon, Apr 14, 2014 at 6:02 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
>> Thanks again Mike - would it be ok to do chi-2 and other significance tests on the DESeq transformed datasets using independent code, or is it necessary to do the differential expression tests strictly within DESeq2?
>>
>> Sophie
>>
>> On Mon, Apr 14, 2014 at 5:41 PM, Michael Love <michaelisaiahlove at gmail.com> wrote:
>>> hi Sophie,
>>>
>>> The VST code is the same in DESeq and DESeq2. The estimation of dispersion is slightly different (details are in the vignette "Changes from DESeq to DESeq2"), but the fitted line (which is used by the VST) should be very similar.
>>>
>>> Mike
written 4.4 years ago by Michael Love19k
Thanks Mike, that is what I thought. What if we wanted to perform Kruskal-Wallis, or is it possible to perform ANOVA on the variance-stabilized matrix?
written 4.4 years ago by Sophie Josephine Weiss130
Hi Mike,
Could you please check whether I am running this correctly? I have double checked all the parameters, but for some reason, I am getting negatives using the R script on the attached .biom dataset. There are no replicates in this microbial dataset.
Thanks for your advice,
Sophie
>> >>> >> >>>>>>> >> >>> >> >>>>>>> Thank you for your help, >> >>> >> >>>>>>> Sophie >> >>> >> >>>>>>> >> >>> >> >>>>>>> _______________________________________________ >> >>> >> >>>>>>> Bioconductor mailing list >> >>> >> >>>>>>> Bioconductor at r-project.org >> >>> >> >>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor >> >>> >> >>>>>>> Search the archives: >> >>> >> >>>>>>> >> >>> >> >>>>>>> >> http://news.gmane.org/gmane.science.biology.informatics.conductor >> >>> >> >>>>>> >> >>> >> >>>>>> >> >>> >> >>>>> >> >>> >> >>>> >> >>> >> >>> >> >>> >> >> >> >>> >> > >> >>> > >> >>> > >> >> >> >> >> > >> > >
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
These are the commands I used to view the negatives for the above attached dataset/deseq_varstab definition:

    source("http://bioconductor.org/biocLite.R")
    biocLite("phyloseq")
    source("http://bioconductor.org/biocLite.R")
    biocLite("DESeq")
    library("phyloseq")
    library("DESeq")
    library("biom")
    file = "~/Downloads/study_1200_closed_reference_otu_table.biom"
    source("~/Downloads/deseq_varstab.R")
    DESeq_data = deseq_varstab(import_biom(file), method = "blind", sharingMode = "maximum", fitType = "local")
    sample_sums(DESeq_data)

      GN01P.484257 GN06P.o.484262   GN02P.484248   GN04P.484258
          1.154872     -47.946867     -79.674507     -50.421798
    GN05P.o.484260 GN07P.o.484246 GN04P.o.484251   GN07P.484259
        -61.353420      -8.377317     -59.895848      -5.939202
    GN02P.o.484250   GN06P.484247   GN09P.484254 GN01P.o.484256
        -68.087039     -48.542141     -56.946003       8.642230
    GN03P.o.484249 GN08P.o.484263   GN03P.484253   GN05P.484261
       -107.273428     -50.791391     -93.250525     -43.492646
    GN09P.o.484264   GN08P.484265
        -76.240559     -40.871040

The parameters I used seem reasonable given what I have read about data with no replicates (these samples are microbial, from a pH gradient). Thanks again, Sophie
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
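Sample sums alone do not show how many individual entries are negative. As a minimal sketch, assuming a small made-up matrix called `vst_like` (hypothetical values, not the attached dataset) with taxa in rows and samples in columns, base R can report the range of values and the number of negative entries per sample:

```r
# Hypothetical log2-like values: rows are taxa, columns are samples.
# These numbers are invented for illustration only.
vst_like <- matrix(c(-1.5, 0.2, 3.1,
                     -0.4, -2.0, 1.0),
                   nrow = 3,
                   dimnames = list(paste0("OTU", 1:3), c("S1", "S2")))

# Smallest and largest transformed value in the whole matrix.
value_range <- range(vst_like)

# Number of negative entries in each sample (column).
n_negative <- colSums(vst_like < 0)

# Per-sample totals, the same kind of quantity that sample_sums() reports
# (assumption: taxa are rows in this toy matrix).
col_totals <- colSums(vst_like)
```

A negative total only says that the negative entries outweigh the positive ones in that sample; it does not by itself indicate a problem with the transformation.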
hi Sophie, You are getting negative values from the transformation for the reasons I mentioned earlier: the transformation is log2-like. If you want to do something downstream of our software which requires non-negative values, below is some example code of how to threshold negative values for a matrix in R. The question of what is the best distance to use for taxa counts, or whether ANOVA on variance stabilized data is a good idea for taxa counts, depends on the properties of the data, and this is an area of active research. As I don't have experience analyzing this kind of data, I don't want to make any guesses.

    > m <- matrix(-2:5, ncol=2)
    > m
         [,1] [,2]
    [1,]   -2    2
    [2,]   -1    3
    [3,]    0    4
    [4,]    1    5
    > m[m < 0] <- 0
    > m
         [,1] [,2]
    [1,]    0    2
    [2,]    0    3
    [3,]    0    4
    [4,]    1    5
ADD REPLYlink written 4.4 years ago by Michael Love19k
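The earlier point about size factors, and why adding 1 to every raw count does not rule out negative values, can be sketched with a few made-up numbers. This uses plain `log2` as a simplified stand-in for the log2-like transformation (an assumption: the actual variance stabilizing transformation is not a plain log2), and the counts and size factor are hypothetical:

```r
# Made-up raw counts for one sample, all at least 1.
counts <- c(1, 2, 8, 64)

# Hypothetical size factor for that sample.
size_factor <- 2

# Counts on the common scale, after dividing by the size factor.
common_scale <- counts / size_factor   # 0.5 1 4 32

# Plain log2 as a simplified stand-in for the log2-like transformation.
log2_like <- log2(common_scale)        # -1 0 2 5

# Thresholding at zero, the same idea as m[m < 0] <- 0 in the matrix example.
thresholded <- pmax(log2_like, 0)      # 0 0 2 5
```

A raw count of 1 becomes 1/2 on the common scale, and its log2 is -1, so a minimum raw count of 1 still allows negative transformed values.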
Thanks Michael, The entire dataset (attached code and .biom) is negatives. There was an error of "out of vertex space" as described here <http://seqanswers.com/forums/showthread.php?p=18620>, so I tried setting maxk=300 as suggested. Commands are below. Thanks again! Sophie

    source("http://bioconductor.org/biocLite.R")
    biocLite("phyloseq")
    biocLite("DESeq")
    library("phyloseq")
    library("DESeq")
    library("biom")
    file = "~/Downloads/study_449_closed_reference_otu_table.biom"
    x = import_biom(file)
    source("~/Downloads/deseq_varstab.R")
    DESeq_data = deseq_varstab(x, method = "blind", sharingMode = "maximum", fitType = "local", locfit_extra_args=list(maxk=300))
    write_biom(make_biom(DESeq_data@otu_table), "~/Desktop/449_Costello_DESeq.biom.tsv")
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
Hi Sophie,

As this issue comes up periodically, let me point out that

log(cx) = log(c) + log(x)

That means that, if you think of "x" as your data matrix and "c" as a single positive number, you can always add or subtract a constant to your transformed data, for instance to make it more agreeable to you by having all positive signs, and all that amounts to is an overall scaling (multiplication) of the data on the untransformed scale. An analogous idea applies to the rlog or VST transformations of DESeq2.

A reasonable distance metric between samples or genes should probably not depend on such an overall constant c.

Best wishes
Wolfgang

On 23 Apr 2014, at 23:44, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:

> Thanks Michael,
> The entire dataset (attached code and .biom) is negatives - there was an error of "out of vertex space" as described here <http://seqanswers.com/forums/showthread.php?p=18620>, so I tried setting maxk=300 as suggested.
> Commands are below.
> Thanks again!
> Sophie
>
> source("http://bioconductor.org/biocLite.R")
> biocLite("phyloseq")
> biocLite("DESeq")
>
> library("phyloseq")
> library("DESeq")
> library("biom")
>
> file = "~/Downloads/study_449_closed_reference_otu_table.biom"
> x = import_biom(file)
> source("~/Downloads/deseq_varstab.R")
> DESeq_data = deseq_varstab(x, method = "blind", sharingMode = "maximum", fitType = "local", locfit_extra_args=list(maxk=300))
> write_biom(make_biom(DESeq_data@otu_table),"~/Desktop/449_Costello_DESeq.biom.tsv")
>
> On Sat, Apr 19, 2014 at 11:29 AM, Michael Love <michaelisaiahlove at gmail.com> wrote:
>
>> hi Sophie,
>>
>> You are getting negative values from the transformation for the reasons I mentioned earlier: the transformation is log2-like.
>>
>> If you want to do something downstream of our software which requires non-negative values, below is some example code of how to threshold negative values for a matrix in R.
>>
>> The question of what is the best distance to use for taxa counts, or whether ANOVA on variance stabilized data is a good idea for taxa counts, depends on the properties of the data, and this is an area of active research. As I don't have experience analyzing this kind of data, I don't want to make any guesses.
>>
>> > m <- matrix(-2:5, ncol=2)
>> > m
>>      [,1] [,2]
>> [1,]   -2    2
>> [2,]   -1    3
>> [3,]    0    4
>> [4,]    1    5
>> > m[m < 0] <- 0
>> > m
>>      [,1] [,2]
>> [1,]    0    2
>> [2,]    0    3
>> [3,]    0    4
>> [4,]    1    5
>>
>> On Fri, Apr 18, 2014 at 3:32 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
>>> Hi Mike,
>>> Could you please check whether I am running this correctly? I have double checked all the parameters, but for some reason, I am getting negatives using the R script on the attached .biom dataset. There are no replicates in this microbial dataset.
>>> Thanks for your advice,
>>> Sophie
>>>
>>> On Wed, Apr 16, 2014 at 4:02 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
>>>>
>>>> Thanks Mike, that is what I thought. What if we wanted to perform Kruskal-Wallis, or is it possible to perform ANOVA on the variance-stabilized matrix?
>>>>
>>>> On Wed, Apr 16, 2014 at 2:29 PM, Michael Love <michaelisaiahlove at gmail.com> wrote:
>>>>>
>>>>> hi Sophie,
>>>>>
>>>>> We recommend using the standard DESeq() function for differential expression.
>>>>>
>>>>> This is mentioned in the first line of the vignette section on transformations:
>>>>>
>>>>> "In order to test for differential expression, we operate on raw counts and use discrete distributions as described in the previous section"
>>>>>
>>>>> Also, in the McMurdie and Holmes paper, they are using the DESeq() function, as shown in their supplemental material:
>>>>>
>>>>> http://joey711.github.io/waste-not-supplemental/simulation-differential-abundance/simulation-differential-abundance-server.html
ADD REPLYlink written 4.4 years ago by Wolfgang Huber13k
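The identity log(cx) = log(c) + log(x) from the reply above can be checked numerically. The following is a minimal base-R sketch, not taken from the thread: the matrix `x`, the constant `c0`, and the helper `bray_curtis` are all made up for illustration. Under those assumptions it shows that scaling the data shifts the log2 values by a constant, that Euclidean distances between samples are unaffected by such a shift, and that a hand-written Bray-Curtis dissimilarity is not.

```r
# Toy data (hypothetical): 4 taxa (rows) x 3 samples (columns), all positive.
x <- matrix(c(1, 4, 16,  2,
              2, 8,  1,  4,
              8, 1,  2, 16), nrow = 4)
c0 <- 8  # an arbitrary positive scaling constant

# log2(c0 * x) equals log2(c0) + log2(x), element by element.
shift_ok <- isTRUE(all.equal(log2(c0 * x), log2(c0) + log2(x)))

# Euclidean distances between samples (columns) ignore an added constant.
lx <- log2(x)
d_plain   <- dist(t(lx))
d_shifted <- dist(t(lx + log2(c0)))
euclid_same <- isTRUE(all.equal(as.vector(d_plain), as.vector(d_shifted)))

# Bray-Curtis dissimilarity between two non-negative vectors
# (hand-written helper, not a phyloseq or DESeq function).
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

bc_plain   <- bray_curtis(lx[, 1], lx[, 2])
bc_shifted <- bray_curtis(lx[, 1] + log2(c0), lx[, 2] + log2(c0))
bc_same <- isTRUE(all.equal(bc_plain, bc_shifted))

print(c(shift_ok = shift_ok, euclid_same = euclid_same, bc_same = bc_same))
```

In this toy case the shift leaves the Euclidean distances unchanged but alters the Bray-Curtis value, because the added constant enters its denominator. Which behaviour is appropriate for taxa counts is left open in the thread.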
0
4.4 years ago by
Michael Love19k
United States
Michael Love19k wrote:
hi Sophie,

The transformations in DESeq and DESeq2 are log2-like transformations. If the expected count is between 0 and 1, the values can be negative; this does not indicate a problem.

Mike
ADD COMMENTlink written 4.4 years ago by Michael Love19k
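To see why a log2-like transformation yields negative values, here is a minimal base-R sketch. It uses a plain `log2` as a stand-in for the actual variance stabilizing transformation, which differs in detail, and the vector `expected` is made up for illustration.

```r
# Hypothetical expected counts on the common scale (not from the thread's data).
expected <- c(0.25, 0.5, 1, 2, 4)

# A plain log2 stands in for the "log2-like" transformation here.
transformed <- log2(expected)

# Values below 1 map to negative numbers, 1 maps to 0,
# and values above 1 map to positive numbers.
print(data.frame(expected, transformed))
```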
Hi Mike,

Thanks for your reply! Ok, makes sense, but I added 1 to all my matrix values, so the lowest value in the matrix is 1, yet there are still negatives?

Thanks again,
Sophie
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
hi Sophie,

Can you explain why you don't want negative values in the transformed values? Adding one to the raw counts is not sufficient. I should have said in my previous email, "the expected counts on the common scale". If the size factor for a sample is 2, then an expected count of 1 leads to an expected count of 1/2 on the common scale (after accounting for size factors).
ADD REPLYlink written 4.4 years ago by Michael Love19k
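The size-factor arithmetic in the reply above can be sketched in a few lines of base R. The counts and size factors below are hypothetical, raw counts stand in for the expected counts the reply refers to, and a plain `log2` again stands in for the log2-like transformation. The point is only that a count of 1 divided by a size factor of 2 gives 1/2 on the common scale, which is below 1 and therefore negative after taking the log.

```r
# Hypothetical counts for one taxon in three samples, after adding 1 to everything.
raw_plus_one <- c(s1 = 1, s2 = 1, s3 = 5)

# Hypothetical size factors (one per sample).
size_factor <- c(s1 = 1, s2 = 2, s3 = 2)

# Counts on the common scale: divide each sample's count by its size factor.
common_scale <- raw_plus_one / size_factor

# A plain log2 stands in for the log2-like transformation.
log2_common <- log2(common_scale)

print(rbind(raw_plus_one, size_factor, common_scale, log2_common))
```

Here sample `s2` has a smallest raw value of 1 but a common-scale value of 0.5, so adding 1 to the raw counts does not rule out negative transformed values.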
Hi Mike,

Thanks for explaining more. I am used to working with rarefied microbial datasets, that is why. Instead of rarefying I would like to use the DESeq method.

How would you then suggest going about calculating Bray-Curtis distance, or summarized taxa diagrams, with these new transformed matrices with negative values?

Thanks again,
Sophie
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
0
4.4 years ago by
Michael Love19k
United States
Michael Love19k wrote:
hi Sophie,

The VST code is the same in DESeq and DESeq2. The estimation of dispersion is slightly different (details are in the vignette "Changes from DESeq to DESeq2"), but the fitted line (which is used by the VST) should be very similar.

Mike

On Mon, Apr 14, 2014 at 6:27 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
> Hi Mike,
> The McMurdie and Holmes paper uses DESeq for matrix normalization - do you think that is ok, or would it be better to use DESeq2?
> Thanks again,
> Sophie
>
> On Mon, Apr 14, 2014 at 3:40 PM, Michael Love <michaelisaiahlove at gmail.com> wrote:
>>
>> hi Sophie,
>>
>> On Mon, Apr 14, 2014 at 1:15 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
>> >
>> > Hi Mike,
>> > Thanks for the references. By "threshold at 0" do you mean set any negative values equal to 0?
>>
>> yes.
>>
>> > Do you think this is the best approach?
>>
>> I haven't explored this area, and would defer to the McMurdie and Holmes paper for the best combinations of distance and transformation.
ADD COMMENTlink written 4.4 years ago by Michael Love19k
Thanks again Mike - would it be ok to do chi-squared and other significance tests on the DESeq-transformed datasets using independent code, or is it necessary to do the differential expression tests strictly within DESeq2?

Sophie
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
Please help with this? Thanks again.
ADD REPLYlink written 4.4 years ago by Sophie Josephine Weiss130
0
gravatar for Michael Love
4.4 years ago by
Michael Love19k
United States
Michael Love19k wrote:
Hi Sophie,

On Wed, Apr 23, 2014 at 5:44 PM, Sophie Josephine Weiss <sophie.weiss at colorado.edu> wrote:
>
> Thanks Michael,
> The entire dataset (attached code and .biom) is negatives

I don't see that the entire dataset is all negatives. I get the same percent of negatives as you had zeros in the original counts:

z <- otu_table(x)
zz <- otu_table(DESeq_data)
> table(as.vector(z) > 0) / prod(dim(z))

     FALSE       TRUE
0.98022416 0.01977584
> table(as.vector(zz) > 0) / prod(dim(zz))

     FALSE       TRUE
0.98022416 0.01977584
> summary(as.vector(zz))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 -1.200  -1.197  -1.197  -1.126  -1.197 280.300
> summary(as.vector(zz)[as.vector(zz) > 0])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.304   1.304   1.304   2.409   2.476 280.300

If you want to do something downstream which requires positive values, set all the negative values to 0 as I wrote previously. Or you can add the absolute value of the smallest value, so if the smallest value is -1.200, just add 1.2 to the matrix.

I don't have any recommendations though for what is a good idea here.

Mike

>
> - there was an error of "out of vertex space" as described here, so I tried setting maxk=300 as suggested.
> Commands are below.
> Thanks again!
> Sophie
>
> source("http://bioconductor.org/biocLite.R")
> biocLite("phyloseq")
> biocLite("DESeq")
>
> library("phyloseq")
> library("DESeq")
> library("biom")
>
> file = "~/Downloads/study_449_closed_reference_otu_table.biom"
> x = import_biom(file)
> source("~/Downloads/deseq_varstab.R")
> DESeq_data = deseq_varstab(x, method = "blind", sharingMode = "maximum", fitType = "local", locfit_extra_args=list(maxk=300))
> write_biom(make_biom(DESeq_data@otu_table),"~/Desktop/449_Costello_DESeq.biom.tsv")
>
>
> On Sat, Apr 19, 2014 at 11:29 AM, Michael Love <michaelisaiahlove at gmail.com> wrote:
>>
>> hi Sophie,
>>
>> You are getting negative values from the transformation for the
>> reasons I mentioned earlier, the transformation is log2-like.
>>
>> If you want to do something downstream of our software which requires
>> non-negative values, below is some example code of how to threshold
>> negative values for a matrix in R.
>>
>> The question of what is the best distance to use for taxa counts, or
>> whether ANOVA on variance stabilized data is a good idea for taxa
>> counts, depends on the properties of the data, and this is an area of
>> active research. As I don't have experience analyzing this kind of
>> data, I don't want to make any guesses.
>>
>> > m <- matrix(-2:5, ncol=2)
>> > m
>>      [,1] [,2]
>> [1,]   -2    2
>> [2,]   -1    3
>> [3,]    0    4
>> [4,]    1    5
>> > m[m < 0] <- 0
>> > m
>>      [,1] [,2]
>> [1,]    0    2
>> [2,]    0    3
>> [3,]    0    4
>> [4,]    1    5
>>
>> On Fri, Apr 18, 2014 at 3:32 PM, Sophie Josephine Weiss
>> <sophie.weiss at colorado.edu> wrote:
>> > Hi Mike,
>> > Could you please check whether I am running this correctly? I have double
>> > checked all the parameters, but for some reason, I am getting negatives
>> > using the R script on the attached .biom dataset. There are no replicates
>> > in this microbial dataset.
>> > Thanks for your advice,
>> > Sophie
>> >
>> >
>> > On Wed, Apr 16, 2014 at 4:02 PM, Sophie Josephine Weiss
>> > <sophie.weiss at colorado.edu> wrote:
>> >>
>> >> Thanks Mike, that is what I thought. What if we wanted to perform kruskal
>> >> wallis, or is it possible to perform anova on the variance-stabilized
>> >> matrix?
>> >>
>> >>
>> >> On Wed, Apr 16, 2014 at 2:29 PM, Michael Love
>> >> <michaelisaiahlove at gmail.com> wrote:
>> >>>
>> >>> hi Sophie,
>> >>>
>> >>> We recommend using the standard DESeq() function for differential
>> >>> expression.
>> >>>
>> >>> This is mentioned in the first line of the vignette section on
>> >>> transformations:
>> >>>
>> >>> "In order to test for differential expression, we operate on raw
>> >>> counts and use discrete distributions as
>> >>> described in the previous section"
>> >>>
>> >>> Also, in the McMurdie and Holmes, they are using the DESeq() function,
>> >>> as shown in their supplemental material:
>> >>>
>> >>> http://joey711.github.io/waste-not-supplemental/simulation-differential-abundance/simulation-differential-abundance-server.html
>> >>>
>> >>> On Wed, Apr 16, 2014 at 3:22 PM, Sophie Josephine Weiss
>> >>> <sophie.weiss at colorado.edu> wrote:
>> >>> > Please help with this? Thanks again.
>> >>> >
>> >>> >
>> >>> > On Mon, Apr 14, 2014 at 6:02 PM, Sophie Josephine Weiss
>> >>> > <sophie.weiss at colorado.edu> wrote:
>> >>> >>
>> >>> >> Thanks again Mike - would it be ok to do chi-2 and other significance
>> >>> >> tests on the DESeq transformed datasets using independent code, or is
>> >>> >> it necessary to do the differential expression tests strictly within
>> >>> >> DESeq2?
>> >>> >>
>> >>> >> Sophie
>> >>> >>
>> >>> >>
>> >>> >> On Mon, Apr 14, 2014 at 5:41 PM, Michael Love
>> >>> >> <michaelisaiahlove at gmail.com> wrote:
>> >>> >>>
>> >>> >>> hi Sophie,
>> >>> >>>
>> >>> >>> The VST code is the same in DESeq and DESeq2. The estimation of
>> >>> >>> dispersion is slightly different (details are in the vignette
>> >>> >>> "Changes from DESeq to DESeq2"), but the fitted line (which is used
>> >>> >>> by the VST) should be very similar.
>> >>> >>>
>> >>> >>> Mike
>> >>> >>>
>> >>> >>> On Mon, Apr 14, 2014 at 6:27 PM, Sophie Josephine Weiss
>> >>> >>> <sophie.weiss at colorado.edu> wrote:
>> >>> >>> > Hi Mike,
>> >>> >>> > The McMurdie and Holmes paper uses DESeq for matrix normalization -
>> >>> >>> > do you think that is ok, or would it be better to use DESeq 2?
>> >>> >>> > Thanks again,
>> >>> >>> > Sophie
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > On Mon, Apr 14, 2014 at 3:40 PM, Michael Love
>> >>> >>> > <michaelisaiahlove at gmail.com> wrote:
>> >>> >>> >>
>> >>> >>> >> hi Sophie,
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >> On Mon, Apr 14, 2014 at 1:15 PM, Sophie Josephine Weiss
>> >>> >>> >> <sophie.weiss at colorado.edu> wrote:
>> >>> >>> >> >
>> >>> >>> >> > Hi Mike,
>> >>> >>> >> > Thanks for the references. By "threshold at 0" do you mean set
>> >>> >>> >> > any negative values equal to 0?
>> >>> >>> >>
>> >>> >>> >> yes.
>> >>> >>> >>
>> >>> >>> >> >
>> >>> >>> >> > Do you think this is the best approach?
>> >>> >>> >>
>> >>> >>> >> I haven't explored this area, and would defer to the McMurdie and
>> >>> >>> >> Holmes paper for the best combinations of distance and
>> >>> >>> >> transformation.
>> >>> >>> >>
>> >>> >>> >> >
>> >>> >>> >> > Thanks again,
>> >>> >>> >> > Sophie
>> >>> >>> >> >
>> >>> >>> >> >
>> >>> >>> >> > On Mon, Apr 14, 2014 at 11:01 AM, Michael Love
>> >>> >>> >> > <michaelisaiahlove at gmail.com> wrote:
>> >>> >>> >> >>
>> >>> >>> >> >> I tried poking around here
>> >>> >>> >> >> http://joey711.github.io/phyloseq/distance
>> >>> >>> >> >> but couldn't see if the authors did anything for distances
>> >>> >>> >> >> requiring non-negative data. It appears
>> >>> >>> >> >> http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003531
>> >>> >>> >> >> that VST was tested with Bray-Curtis distance. I think the
>> >>> >>> >> >> distance is designed for counts, but you could always threshold
>> >>> >>> >> >> at 0 to insist that the log2-like quantity act more like a count.
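For reference, the two options mentioned in this reply (thresholding at 0, or adding the absolute value of the smallest value) can be sketched in base R as follows. This is only a minimal illustration: the matrix `vst` and its values are made up to resemble the summary above and are not taken from the .biom file, and the names `vst_thresholded` and `vst_shifted` are hypothetical. The thread does not recommend one option over the other.

```r
# Toy matrix standing in for variance-stabilized values
# (rows = taxa, columns = samples). The numbers are invented for illustration.
vst <- matrix(c(-1.200, -1.197, 1.304, 2.476,
                -1.197,  1.304, -1.197, 280.300),
              ncol = 2,
              dimnames = list(paste0("OTU", 1:4), c("S1", "S2")))

# Option 1: threshold at 0, i.e. set every negative value to 0.
vst_thresholded <- vst
vst_thresholded[vst_thresholded < 0] <- 0

# Option 2: add the absolute value of the smallest value to the whole matrix.
# max(0, ...) leaves the matrix unchanged if it has no negative values.
shift <- max(0, -min(vst))
vst_shifted <- vst + shift

# Both results are non-negative and keep the original dimensions.
min(vst_thresholded)
min(vst_shifted)
```

Note the difference between the two: thresholding maps all negative entries to the same value (0), while shifting preserves the differences between entries but moves every value, including the large ones, by the same amount.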
ADD COMMENTlink written 4.4 years ago by Michael Love19k