weired bayseq results

0

Entering edit mode

Fatemehsadat Seyednasrollah ▴ 260

@fatemehsadat-seyednasrollah-5367

Last seen 9.6 years ago

Dear list, I have used BaySeq to my RNA-Seq data to extract DE genes and I have several biological replicates for each condition(20 replicates for each condition). The point is that if I rerun the BaySeq over my dataset then the number of detected genes with a specific common FDR will change. Can it be possible? For 10 times rerunning the script the number of detected genes with FDR < 0.05 are : 82, 85, 84, 87, 85, 85, 84, 86, 82, 83 > sessionInfo() R version 2.15.1 (2012-06-22) Platform: x86_64-redhat-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] baySeq_1.12.0 loaded via a namespace (and not attached): [1] tools_2.15.1 Thank you in advance

baySeq baySeq • 1.2k views

ADD COMMENT • link 11.4 years ago Fatemehsadat Seyednasrollah ▴ 260

0

Entering edit mode

Fatemehsadat Seyednasrollah ▴ 260

@fatemehsadat-seyednasrollah-5367

Last seen 9.6 years ago

Just I forgot to mention that the order of replicates in the count table has changed in each rerun process. And below is the my code: function (table, each, samplesize) { counttable <- read.delim(table, header = FALSE, row.names = 1) replicates <- c(rep("control", each), rep("treatment", each)) groups <- list(NDE = rep(1, (2 * each)), DE = c(rep(1, each), rep(2, each))) counttable <- as.matrix(counttable) CD <- new("countData", data = counttable, replicates = replicates, groups = groups) CD at libsizes <- getLibsizes(CD) cl <- NULL CD <- getPriors.NB(CD, samplesize = samplesize, estimation = "QL", cl = cl) CD <- getLikelihoods.NB(CD, pET = "BIC", cl = cl) print("estProps: ") print(CD at estProps) return(CD) } Fatemeh ________________________________________ From: bioconductor-bounces@r-project.org [bioconductor- bounces@r-project.org] on behalf of Fatemehsadat Seyednasrollah [fatsey@utu.fi] Sent: Monday, November 19, 2012 2:14 PM To: bioconductor at r-project.org Subject: [BioC] weired bayseq results Dear list, I have used BaySeq to my RNA-Seq data to extract DE genes and I have several biological replicates for each condition(20 replicates for each condition). The point is that if I rerun the BaySeq over my dataset then the number of detected genes with a specific common FDR will change. Can it be possible? For 10 times rerunning the script the number of detected genes with FDR < 0.05 are : 82, 85, 84, 87, 85, 85, 84, 86, 82, 83 > sessionInfo() R version 2.15.1 (2012-06-22) Platform: x86_64-redhat-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] baySeq_1.12.0 loaded via a namespace (and not attached): [1] tools_2.15.1 Thank you in advance _______________________________________________ Bioconductor mailing list Bioconductor at r-project.org https://stat.ethz.ch/mailman/listinfo/bioconductor Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

ADD COMMENT • link 11.4 years ago Fatemehsadat Seyednasrollah ▴ 260

0

Entering edit mode

You are randomly sampling 'samplesize' number of data points when you estimate the priors as I understand it. I've never actually used baySeq myself, but intuitively I would expect the results to stabilise if you increase the sample size (at the expense of computational time). I'm guessing you could record which data was used to estimate the priors the first time you run it and then restrict the sampling to that subset on subsequent runs to ensure your results are repeatable. Alex Gutteridge On 19.11.2012 12:45, Fatemehsadat Seyednasrollah wrote: > Just I forgot to mention that the order of replicates in the count > table has changed in each rerun process. > And below is the my code: > > function (table, each, samplesize) > { > counttable <- read.delim(table, header = FALSE, row.names = 1) > replicates <- c(rep("control", each), rep("treatment", each)) > groups <- list(NDE = rep(1, (2 * each)), DE = c(rep(1, each), > rep(2, each))) > counttable <- as.matrix(counttable) > CD <- new("countData", data = counttable, replicates = > replicates, > groups = groups) > CD at libsizes <- getLibsizes(CD) > cl <- NULL > CD <- getPriors.NB(CD, samplesize = samplesize, estimation = > "QL", > cl = cl) > CD <- getLikelihoods.NB(CD, pET = "BIC", cl = cl) > print("estProps: ") > print(CD at estProps) > return(CD) > } > > Fatemeh > ________________________________________ > From: bioconductor-bounces at r-project.org > [bioconductor-bounces at r-project.org] on behalf of Fatemehsadat > Seyednasrollah [fatsey at utu.fi] > Sent: Monday, November 19, 2012 2:14 PM > To: bioconductor at r-project.org > Subject: [BioC] weired bayseq results > > Dear list, > > I have used BaySeq to my RNA-Seq data to extract DE genes and I have > several biological replicates for each condition(20 replicates for > each condition). The point is that if I rerun the BaySeq over my > dataset then the number of detected genes with a specific common FDR > will change. Can it be possible? > For 10 times rerunning the script the number of detected genes with > FDR < 0.05 are : > 82, 85, 84, 87, 85, 85, 84, 86, 82, 83 > >> sessionInfo() > R version 2.15.1 (2012-06-22) > Platform: x86_64-redhat-linux-gnu (64-bit) > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=C LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] baySeq_1.12.0 > > loaded via a namespace (and not attached): > [1] tools_2.15.1 > > Thank you in advance > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: > http://news.gmane.org/gmane.science.biology.informatics.conductor -- Alex Gutteridge

ADD REPLY • link 11.4 years ago Alex Gutteridge ▴ 650

Login before adding your answer.