DESeq2 for downstream analysis
@fabrice-tourre-4394
Last seen 10.2 years ago
Dear experts,

I've been using DESeq for my RNA-Seq differential expression analysis, and now I want to run GSEA. I have the following expression values. Which one should I use for the downstream analysis: rc, rld or vsd?

    rc <- counts(dds)
    rld <- rlog(dds)
    vsd <- varianceStabilizingTransformation(dds)
    rlogMat <- assay(rld)
    vstMat <- assay(vsd)

I then want to use the DESeq results to generate a ranked list, which will be used as the input to GSEA. Should I rank the genes by fold change or by q-value?

Thank you very much in advance.
DESeq DESeq • 3.2k views
@mikelove
Last seen 2 days ago
United States
hi Fabrice,

On Sun, Aug 10, 2014 at 8:27 AM, Fabrice Tourre wrote:
> Which one should I use for the downstream analysis: rc, rld or vsd?

Please provide more details about the downstream analysis. Do you need a matrix of values for each gene and sample, or just the test statistic for each gene?

> Should I rank the genes by fold change or by q-value?

You can use the shrunken fold changes or the p-values for ranking. The fold change measures the effect itself, while the p-value is a function of how distinct the changes are, that is, the signal over the noise. For example, consider a comparison of two groups with three values each (continuous values here, just for demonstration): [3,4,5] vs [1,2,3] has a fold change of 2, whereas [11,11,11] vs [10,10,10] has a fold change of 1.1, but the second comparison will have a lower p-value because the variance within groups is so small.

Mike
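The toy comparison above can be checked with a minimal base R sketch. Two things here are assumptions that are not in the thread: the second pair of groups is jittered by ±0.1, because `t.test()` refuses data with no within-group variance at all, and a Welch t-test stands in for "signal over noise" (DESeq2 itself fits a negative binomial model, not a t-test).

```r
# Toy values from the example; the second comparison is jittered by +/- 0.1
# (an assumption) because t.test() rejects perfectly constant groups.
a1 <- c(3, 4, 5)
a2 <- c(1, 2, 3)
b1 <- c(11.0, 11.1, 10.9)
b2 <- c(10.0, 10.1, 9.9)

# Fold change as the ratio of group means.
fold_change <- function(x, y) mean(x) / mean(y)

fc_a <- fold_change(a1, a2)  # mean 4 over mean 2
fc_b <- fold_change(b1, b2)  # mean 11 over mean 10

# Welch two-sample t-test as a simple stand-in for a significance measure.
p_a <- t.test(a1, a2)$p.value
p_b <- t.test(b1, b2)$p.value

print(c(fc_a = fc_a, fc_b = fc_b))
print(c(p_a = p_a, p_b = p_b))
```

The first comparison has the larger fold change, while the second has the smaller p-value, so the two rankings can disagree.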
Dear Mike,

Thank you for your reply. I need a matrix of values for each gene and sample for gene set enrichment analysis.

In your example, what about these situations?

[0,0,0] vs [1,2,3]
[0,0,0] vs [10,10,10]

I have a lot of genes like this.
On Sun, Aug 10, 2014 at 9:41 AM, Fabrice Tourre wrote:
> I need a matrix of values for each gene and sample for gene set enrichment analysis.
>
> In your example, what about these situations?
>
> [0,0,0] vs [1,2,3]
> [0,0,0] vs [10,10,10]

In my previous email I was just trying to illustrate the concept. It is better to rank your results table by LFC and by p-value and see the difference for yourself on real data.

If your downstream method is designed to take expression matrices similar to normalized microarray datasets (log scale), then you can use rlog or VST, and use the matrix accessed with assay(object). If the downstream method is designed to take RNA-Seq counts, then you shouldn't use our transformations, as count-based methods typically have special requirements on the properties of the input data. It's up to you to read the documentation of the downstream method and figure out which should be the input or, if uncertain, to email the maintainers of that software.

Mike
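Ranking a results table both ways might look like the sketch below. The data frame `res` is a hypothetical stand-in for `as.data.frame(results(dds))`: the column names `log2FoldChange` and `pvalue` match a DESeq2 results table, but the gene names and numbers are invented. The signed score at the end is a common compromise that is not discussed in this thread, so treat it as an assumption.

```r
# Hypothetical stand-in for as.data.frame(results(dds)); values are invented.
res <- data.frame(
  log2FoldChange = c(2.5, 0.4, -1.8, 0.1),
  pvalue         = c(0.04, 1e-6, 0.20, 0.90),
  row.names      = c("geneA", "geneB", "geneC", "geneD")
)

# Ranking 1: by effect size, largest positive LFC first.
by_lfc <- res[order(res$log2FoldChange, decreasing = TRUE), ]

# Ranking 2: by significance, smallest p-value first (direction is lost).
by_p <- res[order(res$pvalue), ]

# Assumption, not from the thread: a signed score that keeps the direction
# of change and scales it by significance.
res$score <- sign(res$log2FoldChange) * (-log10(res$pvalue))
ranked <- res[order(res$score, decreasing = TRUE), "score", drop = FALSE]

print(rownames(by_lfc))
print(rownames(by_p))
print(ranked)
```

Comparing `rownames(by_lfc)` with `rownames(by_p)` on real data shows how much the choice of ranking statistic changes the list handed to the downstream tool.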
I need this matrix as the input for Gene Set Enrichment Analysis (GSEA, http://www.broadinstitute.org/gsea/).