DESeq2 for downstream analysis

Fabrice Tourre ▴ 970

@fabrice-tourre-4394

Dear experts,

I've been using DESeq for my RNA-Seq differential expression analysis, and now I want to run GSEA. I have the following expression values. Which one should I use for the downstream analysis: rc, rld or vsd?

    rc <- counts(dds)
    rld <- rlog(dds)
    vsd <- varianceStabilizingTransformation(dds)
    rlogMat <- assay(rld)
    vstMat <- assay(vsd)

I then want to use the DESeq results to generate a ranked list, which will be used as the input to GSEA. My question is: should I rank the genes by fold change or by q-value?

Thank you very much in advance.

updated 11.5 years ago by Michael Love • written 11.5 years ago by Fabrice Tourre

Michael Love 43k

@mikelove

hi Fabrice,

On Sun, Aug 10, 2014 at 8:27 AM, Fabrice Tourre wrote:
> I've been using DESeq for my RNA-Seq differential expression analysis.
> Now I want to do GSEA. [...] Which one should I use for the
> downstream analysis?

Please provide more details about the downstream analysis. Do you need a matrix of values for each gene and sample, or just the test statistic for each gene?

> Then I want to use the DESeq result to generate a ranked list, which
> will be used as the input in GSEA. My question is: should I rank the
> genes using the fold changes or using the q-values?

You can use the shrunken fold changes or p-values for ranking. The fold change measures the effect itself, while the p-value is a function of how distinct the changes are, that is, the signal over the noise. For example, consider a comparison of two groups with three values each (continuous values here, just for demonstration): [3,4,5] vs [1,2,3] has a fold change of 2, whereas [11,11,11] vs [10,10,10] has a fold change of 1.1, but the second comparison will have a lower p-value because the variance within groups is so small.

Mike
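The contrast in the example above can be sketched in a few lines of base R. This is an illustration only, not DESeq2's own test: it uses a two-sample t-test as a stand-in, and it adds a small jitter of 0.1 to the second comparison, because a t-test is undefined when both groups have exactly zero variance. Both choices are assumptions made for the sketch.

```r
# Toy values from the reply above (continuous, for demonstration only).
a1 <- c(3, 4, 5)
b1 <- c(1, 2, 3)

# Assumption: +/- 0.1 jitter around 11 and 10, because t.test() refuses
# data that are exactly constant within both groups.
a2 <- c(10.9, 11.0, 11.1)
b2 <- c(9.9, 10.0, 10.1)

# Fold change as the ratio of group means.
fc1 <- mean(a1) / mean(b1)   # 2
fc2 <- mean(a2) / mean(b2)   # 1.1, up to floating-point rounding

# Student's two-sample t-test (equal variances), standing in for the DE test.
p1 <- t.test(a1, b1, var.equal = TRUE)$p.value
p2 <- t.test(a2, b2, var.equal = TRUE)$p.value

cat(sprintf("comparison 1: fold change %.2f, p = %.4f\n", fc1, p1))
cat(sprintf("comparison 2: fold change %.2f, p = %.4f\n", fc2, p2))
```

The first comparison has the larger fold change but the larger p-value; the second has the smaller fold change but the smaller p-value, which is why the two rankings can disagree.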

11.5 years ago by Michael Love

Dear Mike,

Thank you for your reply. I need a matrix of values for each gene and sample for gene set enrichment analysis.

In your example, what about this situation?

[0,0,0] vs [1,2,3]
[0,0,0] vs [10,10,10]

I have a lot of genes like this.

11.5 years ago by Fabrice Tourre

On Sun, Aug 10, 2014 at 9:41 AM, Fabrice Tourre wrote:
> In your example, what about this situation?
>
> [0,0,0] vs [1,2,3]
> [0,0,0] vs [10,10,10]

In my previous email, I was just trying to illustrate the concept. It is better to rank your results table by LFC and by p-value and see the difference for yourself on real data.

If your downstream method is designed to take as input expression matrices similar to normalized microarray datasets (log scale), then you can use rlog or VST, and use the matrix accessed with assay(object). If the downstream method is designed to take RNA-Seq counts as input, then you shouldn't use our transformations, as count-based methods typically have special requirements on the properties of the input data. It's up to you to read the documentation of the downstream method and figure out which should be the input or, if uncertain, to email the maintainers of that software.

Mike
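Comparing the two rankings might look like the sketch below. The table is a made-up stand-in for as.data.frame(results(dds)): the column names log2FoldChange and pvalue follow DESeq2 results tables, but the gene names and values are hypothetical and chosen only to show that the two orderings can differ.

```r
# Hypothetical stand-in for as.data.frame(results(dds)); values are made up.
res <- data.frame(
  log2FoldChange = c(1.00, 0.14, -2.50, 0.05, -0.80),
  pvalue         = c(0.07, 0.0003, 0.20, 0.90, 0.001),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Rank by effect size: largest absolute log2 fold change first.
byLFC <- res[order(-abs(res$log2FoldChange)), ]

# Rank by evidence: smallest p-value first.
byP <- res[order(res$pvalue), ]

rownames(byLFC)  # "geneC" "geneA" "geneE" "geneB" "geneD"
rownames(byP)    # "geneB" "geneE" "geneA" "geneC" "geneD"
```

Here geneC tops the fold-change ranking but sits near the bottom by p-value, while geneB does the opposite. On a real results table, rows with NA p-values are placed last by order().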

11.5 years ago by Michael Love

I need this matrix as the input for Gene Set Enrichment Analysis (GSEA, http://www.broadinstitute.org/gsea/).

11.5 years ago by Fabrice Tourre
