ReportingTools gene IDs


Assa Yeroslaviz ★ 1.5k

@assa-yeroslaviz-1597

Last seen 3 months ago

Germany

Hi,

Is it necessary to have Entrez gene IDs to work with this package? I am working on a dataset with Ensembl IDs. Do I need to convert them to Entrez?

When trying to create a report for a DESeqDataSet or DESeqResults object, I am getting the error message:

    Error: Ids do not appear to be Entrez Ids for the specified species.

Is there a way to work directly with the Ensembl IDs?

Thanks,
Assa

My script:

    head(Counts_set)
                       A_pKO_aV_FCS G_pKO_aV_FCS M_pKO_aV_FCS D_pKO_aV J_pKO_aV
    ENSMUSG00000000001         4744         4632         4535     4748     3736
    ENSMUSG00000000003            0            0            0        0        0
    ENSMUSG00000000028         1246         1420         1429     2304     1261
    ENSMUSG00000000031            3           25           65        0       50
    ENSMUSG00000000037            0            0            0        0        0
    ENSMUSG00000000049            0            0            3        1        3

    cds <- DESeqDataSetFromMatrix(
      countData = Counts_set,
      colData = colData,
      design = ~ condition
    )

    fit = DESeq(cds)
    des2Report <- HTMLReport(shortName = paste('RNAseq_analysis_', group1, "_", group2, sep=""),
                             title = 'RNA-seq analysis of differential expression using DESeq2',
                             reportDirectory = "./reports")
    publish(fit, des2Report, pvalueCutoff=0.05, annotation.db="org.Mm.eg.db",
            factor = colData(fit)$condition, reportDir="./reports")
    Error: Ids do not appear to be Entrez Ids for the specified species.
    finish(des2Report)

    > sessionInfo()
    R version 3.1.0 (2014-04-10)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] parallel  stats  graphics  grDevices  utils  datasets  methods
    [8] base

    other attached packages:
     [1] org.Mm.eg.db_2.14.0  ReportingTools_2.4.0  AnnotationDbi_1.26.0
     [4] Biobase_2.24.0       RSQLite_0.11.4        DBI_0.2-7
     [7] knitr_1.5            DESeq2_1.4.0          RcppArmadillo_0.4.200.0
    [10] Rcpp_0.11.1          GenomicRanges_1.16.2  GenomeInfoDb_1.0.2
    [13] IRanges_1.22.3       BiocGenerics_0.10.0

    loaded via a namespace (and not attached):
     [1] annotate_1.42.0     AnnotationForge_1.6.0    BatchJobs_1.2
     [4] BBmisc_1.5          BiocParallel_0.6.0       biomaRt_2.20.0
     [7] Biostrings_2.32.0   biovizBase_1.12.0        bitops_1.0-6
    [10] brew_1.0-6          BSgenome_1.32.0          Category_2.30.0
    [13] cluster_1.14.4      codetools_0.2-8          colorspace_1.2-4
    [16] dichromat_2.0-0     digest_0.6.4             edgeR_3.6.0
    [19] evaluate_0.5.3      fail_1.2                 foreach_1.4.2
    [22] formatR_0.10        Formula_1.1-1            genefilter_1.46.0
    [25] geneplotter_1.42.0  GenomicAlignments_1.0.0  GenomicFeatures_1.16.0
    [28] ggbio_1.12.0        ggplot2_0.9.3.1          GO.db_2.14.0
    [31] GOstats_2.30.0      graph_1.42.0             grid_3.1.0
    [34] gridExtra_0.9.1     GSEABase_1.26.0          gtable_0.1.2
    [37] Hmisc_3.14-4        hwriter_1.3              iterators_1.0.7
    [40] lattice_0.20-24     latticeExtra_0.6-26      limma_3.20.1
    [43] locfit_1.5-9.1      MASS_7.3-29              Matrix_1.1-2
    [46] munsell_0.4.2       PFAM.db_2.14.0           plyr_1.8.1
    [49] proto_0.3-10        RBGL_1.40.0              RColorBrewer_1.0-5
    [52] RCurl_1.95-4.1      reshape2_1.2.2           R.methodsS3_1.6.1
    [55] R.oo_1.18.0         Rsamtools_1.16.0         rtracklayer_1.24.0
    [58] R.utils_1.29.8      scales_0.2.4             sendmailR_1.1-2
    [61] splines_3.1.0       stats4_3.1.0             stringr_0.6.2
    [64] survival_2.37-7     tools_3.1.0              VariantAnnotation_1.10.0
    [67] XML_3.98-1.1        xtable_1.7-3             XVector_0.4.0
    [70] zlibbioc_1.10.0


ADD COMMENT • link updated 10.0 years ago by James W. MacDonald 65k • written 10.0 years ago by Assa Yeroslaviz ★ 1.5k


James W. MacDonald 65k

@james-w-macdonald-5106

Last seen 2 hours ago

United States

Hi Assa,

There may well be a way to work with Ensembl IDs, and you will likely get an answer to your direct question from one of the maintainers.

However, you should note that ReportingTools simply takes the input object and then coerces the data to a data.frame, which is then used to create the HTML table. You can always create the data.frame to your own liking up front, and then pass that to publish(). While this is more work than just passing in the DESeqDataSet, you do have complete control over the output.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
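The suggestion of building the data.frame up front and handing it to publish() could look roughly like the sketch below. It is only an illustration: `res_df` is a small made-up table standing in for real results (in practice it might come from something like `as.data.frame(results(fit))` in DESeq2), the column names, values, and report names are placeholders, and the ReportingTools calls run only if the package is installed.

```r
# Hypothetical stand-in for a results table; all values are made up and
# are not output from the analysis discussed in this thread.
res_df <- data.frame(
  EnsemblID      = c("ENSMUSG00000000001", "ENSMUSG00000000028", "ENSMUSG00000000031"),
  log2FoldChange = c(0.4, -1.8, 2.6),
  padj           = c(0.60, 0.01, 0.03),
  stringsAsFactors = FALSE
)

# Keep only the rows to report and order them by adjusted p-value.
report_df <- res_df[!is.na(res_df$padj) & res_df$padj < 0.05, ]
report_df <- report_df[order(report_df$padj), ]
rownames(report_df) <- NULL

# A plain data.frame is published as-is, so no Entrez ID check is involved.
if (requireNamespace("ReportingTools", quietly = TRUE)) {
  library(ReportingTools)
  htmlRep <- HTMLReport(shortName = "custom_table",
                        title = "Custom results table",
                        reportDirectory = "./reports")
  publish(report_df, htmlRep)
  finish(htmlRep)
}
```

Because the table is assembled by hand, any extra columns (gene symbols, descriptions, links) can be added to `report_df` with ordinary data.frame operations before it is published.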

ADD COMMENT • link 10.0 years ago James W. MacDonald 65k


Thanks Jim,

I have found in one of the forums a response from Jason (thanks again) describing the option to set annotation.db=NULL and thus force the publish command to work with the IDs I provide in the DESeqDataSet object.

So this is now working, but I would also like to have the option to add some annotations to the table.

Is this only possible when working directly with a data.frame?

Thanks again,
Assa

ADD REPLY • link 10.0 years ago Assa Yeroslaviz ★ 1.5k


Assa,

In general yes, if you want to add to the table you will be working with the data.frame.

You can do so after the initial conversion, though, so you don't have to reinvent the wheel to get from your object to an initial data.frame.

To modify the default table (data.frame) generated for an object, you can pass publish()'s .modifyDF parameter a function or list of functions, each of which should accept object (the data.frame) and "..." and return a data.frame.

These will be called in order, each accepting the output from the last. The output of the final function is what will be transformed into HTML and inserted into the report.

You'll probably want to add the default handling of your object type, which you can do by putting getMethod("modifyReportDF", "<your object's class>") at the beginning of the list.

See section 4 of the ReportingTools basics vignette for example code.

HTH,
~G

--
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis

ADD REPLY • link 10.0 years ago Gabriel Becker ▴ 70


I wrote my previous message too quickly. Apologies.

Your functions must have the signature

    function(df, object, ...)

where df is the current data.frame representation of the object, object is the *original* object (so that the class can be identified), and ... are the arguments passed in from the call to publish().

And you can just place the generic modifyReportDF function at the beginning of the list, rather than using getMethod. The getMethod thing I mentioned is for when you want to apply the default handling for a *different* class to your object. It is a rare use case, but it came up recently so it was on my mind.

That will teach me to respond quickly to emails early in the morning.

Sorry about that.

~G
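A function with the signature described above might be sketched as follows. Everything specific here is an assumption made for illustration: the lookup table `id2label` and its labels are placeholders, `addLabel` is a hypothetical helper, and the name of the ID column in the default table (`id_col = "ID"`) is a guess that should be checked against the data.frame ReportingTools actually produces for your object.

```r
# Hypothetical lookup from Ensembl ID to a label. The labels are placeholders;
# a real table could be built from an annotation package or a biomaRt query.
id2label <- c(ENSMUSG00000000001 = "labelA",
              ENSMUSG00000000028 = "labelB")

# df is the current data.frame, object is the original object, and ... receives
# whatever was passed to publish(). id_col is an assumed column name.
addLabel <- function(df, object, ..., id_col = "ID") {
  df$Label <- unname(id2label[as.character(df[[id_col]])])
  df
}

# Stand-alone check on a mock data.frame (made-up values).
mock_df <- data.frame(ID = c("ENSMUSG00000000028", "ENSMUSG00000000049"),
                      padj = c(0.01, 0.2),
                      stringsAsFactors = FALSE)
out <- addLabel(mock_df, object = NULL)

# With ReportingTools loaded and the DESeqDataSet `fit` and report `des2Report`
# from the question available, the call might then look like:
# publish(fit, des2Report, pvalueCutoff = 0.05, annotation.db = NULL,
#         factor = colData(fit)$condition,
#         .modifyDF = list(modifyReportDF, addLabel))
```

IDs missing from the lookup simply get NA in the new column, so rows without an annotation are kept rather than dropped.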

ADD REPLY • link 10.0 years ago Gabriel Becker ▴ 70


Hi Gabriel,

Thanks for the quick answer. I will look into that as soon as I have the time.

Another question was whether it is possible to work directly with the Ensembl IDs. I have a table of ~37K Ensembl IDs, of which almost 50% have no Entrez IDs, so I can't convert them. Is there a way to work directly with the Ensembl IDs and still benefit from the annotation.db possibilities?

Thanks,
Assa
PFAM.db_2.14.0 >>> >> plyr_1.8.1 >>> >> [49] proto_0.3-10 RBGL_1.40.0 >>> >> RColorBrewer_1.0-5 >>> >> [52] RCurl_1.95-4.1 reshape2_1.2.2 >>> >> R.methodsS3_1.6.1 >>> >> [55] R.oo_1.18.0 Rsamtools_1.16.0 >>> >> rtracklayer_1.24.0 >>> >> [58] R.utils_1.29.8 scales_0.2.4 >>> >> sendmailR_1.1-2 >>> >> [61] splines_3.1.0 stats4_3.1.0 >>> >> stringr_0.6.2 >>> >> [64] survival_2.37-7 tools_3.1.0 >>> >> VariantAnnotation_1.10.0 >>> >> [67] XML_3.98-1.1 xtable_1.7-3 >>> >> XVector_0.4.0 >>> >> [70] zlibbioc_1.10.0 >>> >> >>> >> [[alternative HTML version deleted]] >>> >> >>> >> _______________________________________________ >>> >> Bioconductor mailing list >>> >> Bioconductor@r-project.org >>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> >> Search the archives: http://news.gmane.org/gmane. >>> >> science.biology.informatics.conductor >>> >> >>> > >>> > -- >>> > James W. MacDonald, M.S. >>> > Biostatistician >>> > University of Washington >>> > Environmental and Occupational Health Sciences >>> > 4225 Roosevelt Way NE, # 100 >>> > Seattle WA 98105-6099 >>> > >>> > >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor@r-project.org >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics.conductor >>> >> >> >> >> -- >> Gabriel Becker >> Graduate Student >> Statistics Department >> University of California, Davis >> > > > > -- > Gabriel Becker > Graduate Student > Statistics Department > University of California, Davis > [[alternative HTML version deleted]]
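The .modifyDF mechanism Gabriel describes in the quoted messages (a list of functions with the signature function(df, object, ...), called in order, each receiving the output of the previous one) can be pictured with plain data.frames. What follows is only a minimal base R sketch of that idea, not ReportingTools' actual implementation: apply_modifiers, add_rank and round_pvalues are hypothetical names, and the toy data.frame stands in for the table that publish() would build.

```r
# Hypothetical helper: apply a function or list of functions in order.
# Each one receives the current data.frame, the original object, and "...".
apply_modifiers <- function(df, object, modifiers, ...) {
    if (is.function(modifiers)) modifiers <- list(modifiers)
    for (f in modifiers) {
        df <- f(df, object, ...)
        stopifnot(is.data.frame(df))  # every step must return a data.frame
    }
    df
}

# Hypothetical modifier 1: add a rank column based on adjusted p-values.
add_rank <- function(df, object, ...) {
    df$rank <- rank(df$padj, ties.method = "first")
    df
}

# Hypothetical modifier 2: shorten the p-values; 'digits' arrives via "...".
round_pvalues <- function(df, object, digits = 3, ...) {
    df$padj <- signif(df$padj, digits)
    df
}

# Toy table standing in for the coerced results (values are made up).
toy <- data.frame(padj = c(0.04123, 0.00051234, 0.2),
                  row.names = c("ENSMUSG00000000001",
                                "ENSMUSG00000000028",
                                "ENSMUSG00000000031"))

out <- apply_modifiers(toy, object = toy,
                       modifiers = list(add_rank, round_pvalues),
                       digits = 2)
print(out)
```

In the real call the equivalent list would be passed as .modifyDF, with modifyReportDF as the first element if the default handling should run before your own functions.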

10.0 years ago Assa Yeroslaviz ★ 1.5k


Hi Assa,

Gabriel actually already gave you the answer, and it is yes. You just have to add things to the .modifyDF argument. There are several examples in

http://www.bioconductor.org/packages/release/bioc/vignettes/ReportingTools/inst/doc/basicReportingTools.pdf

and here is one (untested) that should apply to your situation:

    fun <- function(df, object, ...){
        # the column name must be quoted when testing for its presence
        if(!"ENSEMBL" %in% names(df))
            stop("The column name for ensembl ids has to be 'ENSEMBL'!")
        ensids <- df$ENSEMBL
        whichcol <- which(names(df) == "ENSEMBL")
        # look up symbol and gene name for each Ensembl ID
        annot <- select(org.Mm.eg.db, ensids, c("SYMBOL","GENENAME"), "ENSEMBL")
        # naive handling of one-to-many mappings: keep the first hit per ID
        if(nrow(annot) > nrow(df)) annot <- annot[!duplicated(annot[,1]),]
        df <- data.frame(annot, df[, -whichcol, drop = FALSE])
        # turn each ID into a link to its Ensembl gene page
        df$ENSEMBL <- hwrite(as.character(df$ENSEMBL),
                             link = paste0("http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=",
                                           as.character(df$ENSEMBL)),
                             table = FALSE)
        df
    }

This function implicitly assumes (and checks) that there is an ENSEMBL column in your data.frame that it can use to extract the Ensembl IDs. The link URL also assumes that your species is human, and the function assumes that you have the org.Mm.eg.db package already loaded. It then gets the symbol and genename for those IDs, and does a really naive subsetting of the data if there are duplicates. Other more sophisticated things are possible, but I leave it to you to make any such modifications.

You would use this (as Gabriel already said) as part of an argument passed in via .modifyDF. You also need modifyReportDF as well. So your publish call would now look like

    publish(fit, des2Report, pvalueCutoff=0.05, annotation.db="org.Mm.eg.db",
            factor = colData(fit)$condition, reportDir="./reports",
            .modifyDF = list(modifyReportDF, fun))

That at least is the basic idea, and you might need to play around to make things work correctly.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
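Jim describes the duplicate handling in his function as really naive, and data.frame(annot, df) additionally relies on the annotation having one row per ID in the same order as the table. One possible refinement is to align the annotation with match(), so that IDs without a mapping get NA instead of shifting rows. The sketch below is base R only and makes several assumptions: annotate_by_match is a hypothetical helper, the annot data.frame is a hand-made stand-in for what select() would return (same column names as in the call above), and the symbols and gene names are toy placeholders, not real annotation.

```r
# Hypothetical helper: attach annotation columns to df, aligned by ID.
annotate_by_match <- function(df, annot, key = "ENSEMBL") {
    # keep the first annotation row per ID (same naive rule as !duplicated())
    annot <- annot[!duplicated(annot[[key]]), , drop = FALSE]
    # match() gives NA for IDs that are absent from annot, so rows stay aligned
    idx <- match(df[[key]], annot[[key]])
    extra <- annot[idx, setdiff(names(annot), key), drop = FALSE]
    rownames(extra) <- NULL
    out <- cbind(df[key], extra, df[setdiff(names(df), key)])
    rownames(out) <- NULL
    out
}

# Toy results table (values are made up).
df <- data.frame(ENSEMBL = c("ENSMUSG00000000001",
                             "ENSMUSG00000000003",
                             "ENSMUSG00000000028"),
                 log2FC = c(1.5, -0.2, 0.7),
                 stringsAsFactors = FALSE)

# Stand-in for select() output: different order, one duplicate, one ID missing.
annot <- data.frame(ENSEMBL = c("ENSMUSG00000000028",
                                "ENSMUSG00000000001",
                                "ENSMUSG00000000001"),
                    SYMBOL = c("geneB", "geneA", "geneA_alt"),
                    GENENAME = c("toy gene B", "toy gene A",
                                 "toy gene A, alternative"),
                    stringsAsFactors = FALSE)

res <- annotate_by_match(df, annot)
print(res)
```

With a table where roughly half of the Ensembl IDs have no counterpart in the annotation package, this keeps every row of the results and simply leaves the annotation columns empty for the unmapped genes.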

10.0 years ago James W. MacDonald 65k


Hi Jim,

Thanks for the tip. Unfortunately I am not sure I understand the idea behind it.

You say it is possible to work directly with the DESeqDataSet object, but then the function expects a data.frame to work with. If I understand the mechanism by which the publish function works, it takes the DESeqDataSet object and, using the results function, coerces it into a data.frame.

This is the function I ended up using:

    fun <- function(df, object, ...){
        df$ENSEMBL <- rownames(df)
        annot <- select(org.Mm.eg.db, df$ENSEMBL, c("SYMBOL","GENENAME"), "ENSEMBL")
        if(nrow(annot) > nrow(df)) annot <- annot[!duplicated(annot[,1]),]
        df <- data.frame(annot, df)
        df <- df[ , -which(names(df) %in% c("ENSEMBL.1"))]
        df$ENSEMBL <- hwrite(as.character(df$ENSEMBL),
                             link = paste0("http://www.ensembl.org/Mus_musculus/Gene/Summary?g=",
                                           as.character(df$ENSEMBL)),
                             table = FALSE)
        df
    }

As you can see, I changed the function to fill df$ENSEMBL from the rownames of the coerced df. This is because the fit object doesn't have a column named ENSEMBL.

Q. Is there a way to add columns to the object?

Am I doing it in the most efficient way?

Thanks for the help and the tip about the Ensembl links (mouse genome - Mm).

Assa
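The two steps Assa's function adds (copying the row names into an ENSEMBL column, and pointing the links at the mouse gene pages) can be tried out on their own without any Bioconductor packages. This is a minimal base R sketch under stated assumptions: ids_to_column and ensembl_link are hypothetical helper names, the anchor tag is built by hand with sprintf() rather than with hwrite(), and the toy table values are made up.

```r
# Hypothetical helper: copy the row names into a leading ID column.
ids_to_column <- function(df, column = "ENSEMBL") {
    ids <- setNames(data.frame(rownames(df), stringsAsFactors = FALSE), column)
    out <- cbind(ids, df)
    rownames(out) <- NULL
    out
}

# Hypothetical helper: build an HTML link to the Ensembl gene summary page.
# The species part of the URL is a parameter, so mouse and human differ
# only in this argument.
ensembl_link <- function(ids, species = "Mus_musculus") {
    url <- paste0("http://www.ensembl.org/", species, "/Gene/Summary?g=", ids)
    sprintf('<a href="%s">%s</a>', url, ids)
}

# Toy table with Ensembl IDs as row names (values are made up).
toy <- data.frame(log2FC = c(1.5, -0.2),
                  row.names = c("ENSMUSG00000000001", "ENSMUSG00000000028"))

tab <- ids_to_column(toy)
tab$LINK <- ensembl_link(tab$ENSEMBL)
print(tab)
```

Keeping the plain ID in one column and the link in another is a design choice of this sketch; it leaves the unmodified IDs available for any later lookup that needs them.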

10.0 years ago Assa Yeroslaviz ★ 1.5k


Hi Assa,

If you look up the help for ?"publish-methods", there is support for DESeqResults (the 4th data type listed). DESeqResults is the DataFrame produced by DESeq2::results(). The point of creating this class was to help simplify the hand-off to ReportingTools. Maybe this will help?

Mike
>>> > >>> > Best, >>> > >>> > Jim >>> > >>> > >>> > >>> > On 4/24/2014 8:50 AM, Assa Yeroslaviz wrote: >>> > >>> >> Hi, >>> >> >>> >> Is it neccessary to have entrez gene IDs to work with >>> this package? >>> >> >>> >> I am working on a dataset with Ensembl IDs. Do I need >>> to convert them to >>> >> Entrez? >>> >> >>> >> When trying to create a report for a DESeqDataSet or >>> DESeqResults objects >>> >> i >>> >> am getting the error messege: >>> >> >>> >> Error: Ids do not appear to be Entrez Ids for the >>> specified species. >>> >> >>> >> Is there a way to work straight with the ensembl IDs? >>> >> >>> >> Thanks >>> >> >>> >> Assa >>> >> >>> >> my script: >>> >> >>> >> head(Counts_set) >>> >> A_pKO_aV_FCS G_pKO_aV_FCS M_pKO_aV_FCS D_pKO_aV >>> >> J_pKO_aV >>> >> ENSMUSG00000000001 4744 4632 4535 4748 >>> >> 3736 >>> >> ENSMUSG00000000003 0 0 0 0 >>> >> 0 >>> >> ENSMUSG00000000028 1246 1420 1429 2304 >>> >> 1261 >>> >> ENSMUSG00000000031 3 25 65 0 >>> >> 50 >>> >> ENSMUSG00000000037 0 0 0 0 >>> >> 0 >>> >> ENSMUSG00000000049 0 0 3 1 >>> >> 3 >>> >> >>> >> cds <- DESeqDataSetFromMatrix ( >>> >> countData = Counts_set, >>> >> colData = colData, >>> >> design = ~ condition >>> >> ) >>> >> >>> >> fit = DESeq(cds) >>> >> des2Report <- HTMLReport(shortName >>> =paste('RNAseq_analysis_', group1, "_", >>> >> group2, sep=""),title ='RNA-seq analysis of >>> differential expression using >>> >> DESeq2',reportDirectory = "./reports") >>> >> publish(fit,des2Report, >>> pvalueCutoff=0.05,annotation.db="org.Mm.eg.db", >>> >> factor = colData(fit)$condition,reportDir="./reports") >>> >> Error: Ids do not appear to be Entrez Ids for the >>> specified species. 
>>> >> finish(des2Report) >>> >> >>> >> >>> >> sessionInfo() >>> >>> >>> >> R version 3.1.0 (2014-04-10) >>> >> Platform: x86_64-pc-linux-gnu (64-bit) >>> >> >>> >> locale: >>> >> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>> >> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>> >> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 >>> >> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>> >> [9] LC_ADDRESS=C LC_TELEPHONE=C >>> >> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>> >> >>> >> attached base packages: >>> >> [1] parallel stats graphics grDevices utils datasets >>> methods >>> >> [8] base >>> >> >>> >> other attached packages: >>> >> [1] org.Mm.eg.db_2.14.0 ReportingTools_2.4.0 >>> AnnotationDbi_1.26.0 >>> >> [4] Biobase_2.24.0 RSQLite_0.11.4 DBI_0.2-7 >>> >> [7] knitr_1.5 DESeq2_1.4.0 >>> >> RcppArmadillo_0.4.200.0 >>> >> [10] Rcpp_0.11.1 GenomicRanges_1.16.2 GenomeInfoDb_1.0.2 >>> >> [13] IRanges_1.22.3 BiocGenerics_0.10.0 >>> >> >>> >> loaded via a namespace (and not attached): >>> >> [1] annotate_1.42.0 AnnotationForge_1.6.0 >>> >> BatchJobs_1.2 >>> >> [4] BBmisc_1.5 BiocParallel_0.6.0 >>> >> biomaRt_2.20.0 >>> >> [7] Biostrings_2.32.0 biovizBase_1.12.0 >>> >> bitops_1.0-6 >>> >> [10] brew_1.0-6 BSgenome_1.32.0 >>> >> Category_2.30.0 >>> >> [13] cluster_1.14.4 codetools_0.2-8 >>> >> colorspace_1.2-4 >>> >> [16] dichromat_2.0-0 digest_0.6.4 >>> >> edgeR_3.6.0 >>> >> [19] evaluate_0.5.3 fail_1.2 >>> >> foreach_1.4.2 >>> >> [22] formatR_0.10 Formula_1.1-1 >>> >> genefilter_1.46.0 >>> >> [25] geneplotter_1.42.0 GenomicAlignments_1.0.0 >>> >> GenomicFeatures_1.16.0 >>> >> [28] ggbio_1.12.0 ggplot2_0.9.3.1 >>> >> GO.db_2.14.0 >>> >> [31] GOstats_2.30.0 graph_1.42.0 >>> >> grid_3.1.0 >>> >> [34] gridExtra_0.9.1 GSEABase_1.26.0 >>> >> gtable_0.1.2 >>> >> [37] Hmisc_3.14-4 hwriter_1.3 >>> >> iterators_1.0.7 >>> >> [40] lattice_0.20-24 latticeExtra_0.6-26 >>> >> limma_3.20.1 >>> >> [43] locfit_1.5-9.1 MASS_7.3-29 >>> >> Matrix_1.1-2 >>> >> [46] munsell_0.4.2 
PFAM.db_2.14.0 >>> >> plyr_1.8.1 >>> >> [49] proto_0.3-10 RBGL_1.40.0 >>> >> RColorBrewer_1.0-5 >>> >> [52] RCurl_1.95-4.1 reshape2_1.2.2 >>> >> R.methodsS3_1.6.1 >>> >> [55] R.oo_1.18.0 Rsamtools_1.16.0 >>> >> rtracklayer_1.24.0 >>> >> [58] R.utils_1.29.8 scales_0.2.4 >>> >> sendmailR_1.1-2 >>> >> [61] splines_3.1.0 stats4_3.1.0 >>> >> stringr_0.6.2 >>> >> [64] survival_2.37-7 tools_3.1.0 >>> >> VariantAnnotation_1.10.0 >>> >> [67] XML_3.98-1.1 xtable_1.7-3 >>> >> XVector_0.4.0 >>> >> [70] zlibbioc_1.10.0 >>> >> >>> >> [[alternative HTML version deleted]] >>> >> >>> >> _______________________________________________ >>> >> Bioconductor mailing list >>> >> Bioconductor at r-project.org >>> <mailto:bioconductor at="" r-project.org=""> >>> >>> >> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> >> Search the archives: http://news.gmane.org/gmane. >>> >> science.biology.informatics.conductor >>> >> >>> > >>> > -- >>> > James W. MacDonald, M.S. >>> > Biostatistician >>> > University of Washington >>> > Environmental and Occupational Health Sciences >>> > 4225 Roosevelt Way NE, # 100 >>> > Seattle WA 98105-6099 >>> > >>> > >>> >>> [[alternative HTML version deleted]] >>> >>> _______________________________________________ >>> Bioconductor mailing list >>> Bioconductor at r-project.org <mailto:bioconductor at="" r-project.org="">>> > >>> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor >>> Search the archives: >>> http://news.gmane.org/gmane.science.biology.informatics. >>> conductor >>> >>> >>> >>> >>> -- Gabriel Becker >>> Graduate Student >>> Statistics Department >>> University of California, Davis >>> >>> >>> >>> >>> -- Gabriel Becker >>> Graduate Student >>> Statistics Department >>> University of California, Davis >>> >>> >>> >> -- >> James W. MacDonald, M.S. 
>> Biostatistician >> University of Washington >> Environmental and Occupational Health Sciences >> 4225 Roosevelt Way NE, # 100 >> Seattle WA 98105-6099 >> >> > > [[alternative HTML version deleted]] > > _______________________________________________ > Bioconductor mailing list > Bioconductor at r-project.org > https://stat.ethz.ch/mailman/listinfo/bioconductor > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
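The .modifyDF convention described in the quoted messages can be illustrated without ReportingTools. The sketch below is base R only. `apply_modifiers` is a hypothetical helper written for this illustration (it is not ReportingTools' internal code), and the two modifier functions and the toy table are made up. It shows only the calling convention Gabriel describes: each function has the signature function(df, object, ...), the functions run in order, and each one receives the data.frame returned by the previous one.

```r
# Hypothetical helper (not part of ReportingTools): apply a list of
# modifier functions in order, feeding each one the previous result.
apply_modifiers <- function(df, object, modifiers, ...) {
  if (is.function(modifiers)) modifiers <- list(modifiers)
  for (f in modifiers) {
    df <- f(df, object, ...)
    stopifnot(is.data.frame(df))  # every modifier must return a data.frame
  }
  df
}

# Modifier 1: move the row names (the gene IDs) into a proper column.
add_id_column <- function(df, object, ...) {
  df$ENSEMBL <- rownames(df)
  df
}

# Modifier 2: relies on the ENSEMBL column created by modifier 1.
flag_mouse_ids <- function(df, object, ...) {
  df$is_mouse <- grepl("^ENSMUSG", df$ENSEMBL)
  df
}

# Toy table standing in for the coerced results object (values are made up).
toy <- data.frame(log2FoldChange = c(1.2, -0.7),
                  row.names = c("ENSMUSG00000000001", "ENSMUSG00000000028"))

out <- apply_modifiers(toy, object = NULL,
                       modifiers = list(add_id_column, flag_mouse_ids))
print(out)
```

Because the second modifier depends on a column added by the first, swapping the order in the list would fail, which is the same reason modifyReportDF is placed first in list(modifyReportDF, fun).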

ADD REPLY • link 10.0 years ago Michael Love 41k

Hi Assa,

Mike has answered one question. And this is the reason you have to use the argument

    .modifyDF = list(modifyReportDF, fun)

The first list item is the existing function in ReportingTools that knows what a DESeqResults object is and how to coerce it to a data.frame.

To answer the other question: yes! In fact you are already adding columns to the data.frame (you add the two columns from the 'annot' object). You can add other columns in a similar fashion.

Best,

Jim

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099
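Adding further columns follows the same pattern as the SYMBOL and GENENAME columns. The sketch below is base R only and rests on several assumptions: the lookup table `annot` is a made-up stand-in for what select() might return (placeholder symbols, one ID mapped to two rows, one ID with no annotation), `add_annotation` is a hypothetical name, and the link is built with paste0 rather than hwrite. Matching on the ID is one way to avoid relying on row order: the result keeps exactly one row per input row and leaves NA where no annotation exists.

```r
# Hypothetical modifier: add annotation columns by matching on the ID,
# so the result has one row per input row, in the same order.
# In a real .modifyDF function the lookup would be computed inside the
# function, as in the thread; here it is passed in to keep the example small.
add_annotation <- function(df, object, annot, ...) {
  ids <- rownames(df)
  idx <- match(ids, annot$ENSEMBL)   # first hit per ID; NA when absent
  df$ENSEMBL  <- ids
  df$SYMBOL   <- annot$SYMBOL[idx]
  df$GENENAME <- annot$GENENAME[idx]
  # Plain HTML link built with paste0 (the thread uses hwrite instead).
  df$LINK <- paste0('<a href="http://www.ensembl.org/Mus_musculus/Gene/Summary?g=',
                    ids, '">', ids, '</a>')
  df
}

# Made-up lookup table: placeholder symbols, one duplicated ID, one ID missing.
annot <- data.frame(
  ENSEMBL  = c("ENSMUSG00000000001", "ENSMUSG00000000001", "ENSMUSG00000000028"),
  SYMBOL   = c("symA", "symA_alt", "symB"),
  GENENAME = c("gene A", "gene A (alternative)", "gene B"),
  stringsAsFactors = FALSE
)

# Toy results table with Ensembl IDs as row names (values are made up).
toy <- data.frame(padj = c(0.01, 0.20, 0.03),
                  row.names = c("ENSMUSG00000000001",
                                "ENSMUSG00000000028",
                                "ENSMUSG00000000049"))

res <- add_annotation(toy, object = NULL, annot = annot)
print(res[, c("ENSEMBL", "SYMBOL", "GENENAME")])
```

This matters for data like Assa's, where a large share of the Ensembl IDs have no match: unmatched genes stay in the table with empty annotation instead of shifting or dropping rows.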

ADD REPLY • link 10.0 years ago James W. MacDonald 65k
