number of genes for DESeq analysis
@vladimir-mashanov-5118
Dear All,

I have carried out an RNA-Seq experiment with 4 conditions and 2 biological replicates of each. At the moment, I am interested in how my conditions differ in the expression of a subset of 36 genes. The idea is to count only the reads that correspond to those 36 genes and to use this data to analyze their differential expression across the conditions. Will this approach be valid? What is the minimum number of genes required by the statistical model implemented in DESeq? I apologize if the question is too naive.

Thank you,
Vladimir
RNASeq • 1.6k views
@abhishek-pratap-5083
Hi Vladimir,

One way to do this without cutting down your gene set is to run the dispersion estimation and the negative binomial test on all the genes in your annotation, and then take the subset of genes you are interested in from the data frame returned by the test.

I am not sure what impact reducing the number of genes would have on the statistical model. I believe the estimates are made for each gene, but with so few genes the model may have trouble estimating the dispersion parameter. Simon or Wolfgang can best answer that.

HTH,
-Abhi

(Reply to vladimir mashanov's message of Tue, Feb 14, 2012, 6:37 AM.)
Simon Anders
@simon-anders-3855
Zentrum für Molekularbiologie, Universi…
Dear Vladimir,

What is wrong with doing the analysis for all genes and then looking only at those you are interested in? For the dispersion estimation, you should use all available genes.

However, if you really selected the list of 36 genes before your experiment, or at least independently of your RNA-Seq data, and you do not intend to look at any further genes to decide on the hypothesis you currently have in mind, you might be justified in performing the multiple testing adjustment on the raw p values of only those 36 genes, which would surely improve your power. To do so, subset them from the "pvalue" column of the final result and hand them to the 'p.adjust' function (with 'method="BH"').

Simon
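The subsetting and adjustment step described above could look roughly like the sketch below. It uses only base R and a mock result table so that it runs on its own; the data frame `res`, its `id` column, the gene names, and the vector `genes_of_interest` are all assumptions made for illustration. In a real analysis `res` would be the table returned by the test on all genes, and the p value column may be named differently (for example `pval`) depending on the function and package version.

```r
# Mock stand-in for the full result table. In a real analysis this would be
# the output of the differential expression test run on ALL genes.
set.seed(1)
res <- data.frame(
  id     = paste0("gene", 1:1000),   # hypothetical gene identifiers
  pvalue = runif(1000),              # placeholder raw p values
  stringsAsFactors = FALSE
)

# Hypothetical list of 36 genes, chosen independently of the RNA-Seq data.
genes_of_interest <- paste0("gene", 1:36)

# Keep only the pre-selected genes, then apply the Benjamini-Hochberg
# adjustment to their raw p values alone.
sub <- res[res$id %in% genes_of_interest, ]
sub$padj_subset <- p.adjust(sub$pvalue, method = "BH")

# Show the pre-selected genes ordered by raw p value.
print(head(sub[order(sub$pvalue), ]))
```

The adjustment here accounts for 36 tests rather than for every gene in the annotation, which is where the gain in power comes from. It is only appropriate under the condition stated above, namely that the gene list was fixed without looking at the data.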