GO enrichment
Guest User ★ 13k
@guest-user-4897
Hello everyone,

I am very new to bioinformatics and hope someone can give me an answer and some suggestions. I am trying to use the goseq package to get GO enrichment for my data, which come from an organism that is not built in. The filtering step returns an empty result:

> enriched.GO = unsorted_L14.15_S_GO.wall$category[p.adjust(unsorted_L14.15_S_GO.wall$over_represented_pvalue, method="BH") < 0.05]
> head(enriched.GO)
character(0)

I prepared the data as shown below.

# create LengthData
> unsorted_L14.15_S_LengthData <- unsorted_L14.15_S_gene2length
> unsorted_L14.15_S_id <- as.vector(unsorted_L14.15_S_gene2length[,1])
> unsorted_L14.15_S_length <- as.numeric(unsorted_L14.15_S_gene2length[,2])
> unsorted_L14.15_S_LengthData <- structure(unsorted_L14.15_S_length, .names=unsorted_L14.15_S_id)

# PWF = fitting the probability weighting function
> unsorted_L14.15_S_pwf = nullp(unsorted_genesL14.15_S, bias.data=unsorted_L14.15_S_length, plot.fit=TRUE)
> unsorted_L14.15_S_pwf = nullp(unsorted_genesL14.15_S, bias.data=unsorted_L14.15_S_LengthData, plot.fit=TRUE)
> head(unsorted_L14.15_S_pwf)
             DEgenes bias.data       pwf
Cucsa.000210       0      1512 0.5013243
Cucsa.000250       0       405 0.5182944
Cucsa.000270       0       258 0.5205436

> unsorted_L14.15_S_GO.wall <- goseq(unsorted_L14.15_S_pwf, gene2cat=unsorted_L14.15_S_gene2go, test.cats=c("GO:CC", "GO:BP", "GO:MF"), method="Wallenius", repcnt=2000, use_genes_without_cat=TRUE)
Using manually entered categories.
Calculating the p-values...
> head(unsorted_L14.15_S_GO.wall)
      category over_represented_pvalue under_represented_pvalue numDEInCat numInCat
594 GO:0043565            0.0001000255                0.9999945         17       18
177 GO:0005515            0.0079055773                0.9933285        618     1162
380 GO:0008565            0.0088243286                1.0000000          7        7

Some categories, such as GO:0043565, look like candidates for significant enrichment, because the result reports 17 DE genes out of the 18 genes assigned to this category. However, when I went back to check my gene list table, I found only 7 DE genes out of 18 genes for this category, so I don't know what I have done wrong.
I hope someone can help me with this. Thank you so much.

Regards,
warin

-- output of sessionInfo():

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] GO.db_2.14.0          org.Hs.eg.db_2.14.0   RSQLite_0.11.4
 [4] DBI_0.2-7             AnnotationDbi_1.26.0  GenomeInfoDb_1.0.2
 [7] Biobase_2.24.0        BiocGenerics_0.10.0   edgeR_3.6.4
[10] limma_3.20.8          goseq_1.16.2          geneLenDataBase_1.0.0
[13] BiasedUrn_1.06.1

loaded via a namespace (and not attached):
 [1] BBmisc_1.7              BSgenome_1.32.0         BatchJobs_1.2
 [4] BiocParallel_0.6.1      Biostrings_2.32.0       GenomicAlignments_1.0.2
 [7] GenomicFeatures_1.16.2  GenomicRanges_1.16.3    IRanges_1.22.9
[10] Matrix_1.1-4            RCurl_1.95-4.1          Rcpp_0.11.2
[13] Rsamtools_1.16.1        XML_3.98-1.1            XVector_0.4.0
[16] biomaRt_2.20.0          bitops_1.0-6            brew_1.0-6
[19] checkmate_1.1           codetools_0.2-8         digest_0.6.4
[22] fail_1.2                foreach_1.4.2           grid_3.1.0
[25] iterators_1.0.7         lattice_0.20-29         mgcv_1.8-0
[28] nlme_3.1-117            plyr_1.8.1              rtracklayer_1.24.2
[31] sendmailR_1.1-2         stats4_3.1.0            stringr_0.6.2
[34] tools_3.1.0             zlibbioc_1.10.0

-- Sent via the guest posting facility at bioconductor.org.
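One possible reading of the empty `enriched.GO` is that the Benjamini-Hochberg adjustment pushes even the smallest p-value above 0.05 once many categories are tested. The sketch below illustrates this with the three p-values shown in the question. It assumes 594 tested categories (the largest row name visible in the output, so the real number may be higher) and uses placeholder p-values of 1 for every category not shown. Both are assumptions, not values taken from the actual results.

```r
# Top three p-values copied from the goseq output in the question.
top_p <- c("GO:0043565" = 0.0001000255,
           "GO:0005515" = 0.0079055773,
           "GO:0008565" = 0.0088243286)

# ASSUMPTION: 594 categories were tested in total (largest row name shown).
# ASSUMPTION: the remaining p-values are unknown; 1 is used as a placeholder.
n_categories <- 594
all_p <- c(top_p, rep(1, n_categories - length(top_p)))

# Benjamini-Hochberg adjustment, as in the question.
adj_p <- p.adjust(all_p, method = "BH")

# Categories among the top three that pass the 0.05 cutoff.
enriched <- names(top_p)[adj_p[seq_along(top_p)] < 0.05]

print(head(adj_p, 3))
print(enriched)
```

Under these assumptions the smallest adjusted p-value is about 0.0594 (0.0001000255 × 594), so nothing passes the cutoff and the filter returns `character(0)`. With the real p-value vector the numbers would differ, so inspecting `sort(p.adjust(...))` on the full result is the more reliable check.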
GO Category goseq
Guest User ★ 13k
@guest-user-4897
Hello all,

I have checked the number of DE genes assigned to the GO category again and found that the result from goseq is correct. I apologize to everyone who tried to help me.

Regards,
warin

-- output of sessionInfo():

sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
parallel stats graphics grDevices utils

other attached packages:
GO.db_2.14.0 org.Hs.eg.db_2.14.0 DBI_0.2-7 AnnotationDbi_1.26.0

-- Sent via the guest posting facility at bioconductor.org.
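For anyone who wants to repeat this kind of cross-check, the sketch below counts genes and DE genes per category in base R. It does not reproduce goseq's internals. The gene names, category names and objects (`de_flags`, `gene2go`) are hypothetical toy data standing in for the DE vector and the gene-to-GO table. A hand count may differ from `numDEInCat` when the mapping table has duplicated gene/category rows or contains gene IDs that are absent from the DE vector, so the sketch removes both before counting.

```r
# Hypothetical toy inputs; gene and category names are made up.
# Named 0/1 vector: 1 = differentially expressed, 0 = not.
de_flags <- c(geneA = 1, geneB = 0, geneC = 1, geneD = 1, geneE = 0)

# Two-column gene-to-category table with one duplicated row (geneA, GO:1)
# and one gene (geneX) that is not in the DE vector.
gene2go <- data.frame(
  gene     = c("geneA", "geneA", "geneB", "geneC", "geneD", "geneE", "geneX"),
  category = c("GO:1",  "GO:1",  "GO:1",  "GO:1",  "GO:2",  "GO:2",  "GO:1"),
  stringsAsFactors = FALSE
)

# Drop duplicated gene/category pairs and genes absent from the DE vector.
pairs <- unique(gene2go)
pairs <- pairs[pairs$gene %in% names(de_flags), ]

# Per category: number of genes and number of DE genes.
num_in_cat    <- tapply(pairs$gene, pairs$category, length)
num_de_in_cat <- tapply(de_flags[pairs$gene], pairs$category, sum)

counts <- data.frame(
  category   = names(num_in_cat),
  numDEInCat = as.vector(num_de_in_cat[names(num_in_cat)]),
  numInCat   = as.vector(num_in_cat)
)
print(counts)
```

With the toy data, `GO:1` ends up with 2 DE genes out of 3 and `GO:2` with 1 out of 2. Comparing such a table against the `numDEInCat` and `numInCat` columns of the goseq output is one way to locate where a manual count went wrong.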