DESeq2 16S copy number correction

Manoeli Lupatini ▴ 10

@manoeli-lupatini-6514

Last seen 9.6 years ago

Hi,

I have 16S DNA counts with different library sizes and want to use DESeq2 to normalize them. However, I used PICRUSt to correct the OTUs for 16S copy number, and the values produced by this correction are not integers (they are decimals). Can I use DESeq2 to normalize the corrected data (using size factors), considering that DESeq2 was developed for counts and not for counts corrected by 16S copy number?

Thanks,
Manoeli

--
Manoeli Lupatini
PhD candidate
Netherlands Institute of Ecology (NIOO/KNAW)
Wageningen, The Netherlands
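To make the situation concrete, here is a minimal sketch of why corrected values stop being integers. It assumes the correction divides each OTU count by a predicted 16S copy number; the OTU names, counts, and copy numbers are hypothetical and chosen only for illustration.

```r
# Hypothetical numbers, for illustration only (not from any real dataset).
raw_counts  <- c(otu1 = 45, otu2 = 478, otu3 = 40)  # integer read counts in one sample
copy_number <- c(otu1 = 1,  otu2 = 7,   otu3 = 3)   # assumed predicted 16S copies per genome

# Assumed form of the correction: divide each count by its copy number.
corrected <- raw_counts / copy_number

# Flag which corrected values are still whole numbers.
is_whole <- corrected == round(corrected)
```

Only OTUs whose count happens to be a multiple of the copy number stay whole, which is why the corrected table as a whole is no longer integer count data.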

DESeq2

updated 10.0 years ago by Susan Holmes ▴ 10 • written 10.0 years ago by Manoeli Lupatini ▴ 10

Susan Holmes ▴ 10

@susan-holmes-6517

Last seen 9.6 years ago

Manoeli,

When you say you have decimals, are they numbers between 0 and 1, or are they larger?

You can see how to use DESeq2 on count data, which could be appropriate here since you have different library sizes, in the supplementary material of the paper by PJ McMurdie:
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003531
and here:
http://joey711.github.io/waste-not-supplemental/

Best,
Susan

Susan Holmes
Professor, Statistics and BioX
John Henry Samter University Fellow in Undergraduate Education
Director, Mathematical and Computational Sciences
Stanford
http://www-stat.stanford.edu/~susan/

written 10.0 years ago by Susan Holmes ▴ 10

Michael Love 41k

@mikelove

Last seen 4 hours ago

United States

hi Manoeli,

I think I follow your question. I've been meaning to put in a function to help in this case, but I didn't make it in time for the latest release. Below is some code for a toy example; tell me if this resembles your problem.

Suppose we want to estimate the size factors "sf". Here are the true values:

> sf <- c(.5,1,1,2)

Additionally, we have a matrix of factors which will contribute to the counts, so I am thinking this is analogous to your copy number information. The matrix is OTU x samples.

> m <- matrix(c(1,10,1,10,1,rep(1,3*5)),ncol=4)

Here we are encoding a copy number of x10 in the first sample for the 2nd and 4th OTU.

> m
     [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]   10    1    1    1
[3,]    1    1    1    1
[4,]   10    1    1    1
[5,]    1    1    1    1

I generate counts using these size factors and the matrix m:

> (k <- matrix(rpois(20,100*rep(sf,each=5)*m),ncol=4))
     [,1] [,2] [,3] [,4]
[1,]   45  106  112  221
[2,]  478  103   91  199
[3,]   40  116   89  190
[4,]  497   81  102  183
[5,]   55  112   79  192

We recover the size factor estimates from the counts after dividing out m:

> (sf.hat <- estimateSizeFactorsForMatrix(k/m))
[1] 0.4919127 1.0599792 0.9456373 2.0187763

Then we can build a matrix of normalization factors:

> (nf <- rep(sf.hat, each=5) * m)
          [,1]     [,2]      [,3]     [,4]
[1,] 0.4919127 1.059979 0.9456373 2.018776
[2,] 4.9191266 1.059979 0.9456373 2.018776
[3,] 0.4919127 1.059979 0.9456373 2.018776
[4,] 4.9191266 1.059979 0.9456373 2.018776
[5,] 0.4919127 1.059979 0.9456373 2.018776

The normalized counts are then k divided by the normalization factors:

> k / nf
          [,1]      [,2]      [,3]      [,4]
[1,]  91.47965 100.00197 118.43864 109.47226
[2,]  97.17172  97.17172  96.23140  98.57457
[3,]  81.31525 109.43611  94.11642  94.11642
[4,] 101.03420  76.41660 107.86376  90.64897
[5,] 111.80847 105.66245  83.54154  95.10712

This is what you would get by:

> normalizationFactors(dds) <- nf
> counts(dds, normalized=TRUE)

-Mike
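For readers who want to rerun the toy example without DESeq2 installed, here is a minimal base-R sketch of the same steps. The helper `median_ratio_size_factors` is a hypothetical name introduced here; it is a simplified re-implementation of the median-of-ratios idea behind `estimateSizeFactorsForMatrix`, not the package's own code, and the seed is arbitrary, so the numbers will differ from those printed above.

```r
# Simplified median-of-ratios size factors (hypothetical helper, base R only).
median_ratio_size_factors <- function(x) {
  log_geo_means <- rowMeans(log(x))  # log geometric mean of each OTU across samples
  apply(x, 2, function(col) {
    keep <- is.finite(log_geo_means) & col > 0  # skip OTUs containing a zero
    exp(median(log(col[keep]) - log_geo_means[keep]))
  })
}

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch

sf <- c(0.5, 1, 1, 2)                                    # true size factors, one per sample
m  <- matrix(c(1, 10, 1, 10, 1, rep(1, 15)), ncol = 4)   # OTU x sample copy-number factors
k  <- matrix(rpois(20, 100 * rep(sf, each = 5) * m), ncol = 4)  # simulated integer counts

sf_hat <- median_ratio_size_factors(k / m)  # estimate on copy-number-corrected values
nf     <- m * rep(sf_hat, each = nrow(m))   # per-OTU, per-sample normalization factors
normalized <- k / nf                        # raw counts divided by normalization factors
```

The point of this design is that the integer matrix `k` stays untouched: the copy-number information enters only through `nf`, so it is the raw counts, not the decimal-valued corrected table, that would be stored in the DESeq2 object before assigning `normalizationFactors(dds) <- nf` as in the answer above.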

written 10.0 years ago by Michael Love 41k