Guest User
@guest-user-4897
Hello everyone,
I am very new to bioinformatics, and I hope someone can give me an
answer and some suggestions.
I am trying to use the goseq package to get GO enrichment for my data,
which come from an organism that is not built in. After BH adjustment,
no category passes the 0.05 cutoff:
> enriched.GO = unsorted_L14.15_S_GO.wall$category[p.adjust(unsorted_L14.15_S_GO.wall$over_represented_pvalue, method="BH") < 0.05]
> head(enriched.GO)
character(0)
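For illustration, here is a minimal sketch of how the BH adjustment can leave nothing below 0.05 even when the smallest raw p-value is about 0.0001. The total number of tested categories (700) and the filler p-values are assumptions made up for this example; only the three smallest p-values are taken from the goseq output in this post.

```r
# Three smallest raw p-values from the goseq output in this post
top_p <- c(0.0001000255, 0.0079055773, 0.0088243286)

# ASSUMPTION: 700 categories tested in total; the remaining p-values are
# hypothetical filler spread evenly between 0.01 and 1
n_cats <- 700
raw_p <- c(top_p, seq(0.01, 1, length.out = n_cats - length(top_p)))

adj_p <- p.adjust(raw_p, method = "BH")

# The smallest adjusted p-value is the smallest raw p-value times n_cats,
# which is already above 0.05 under this assumption
min(adj_p)
sum(adj_p < 0.05)
```

With a gap this large between the best p-value and the rest, the top category has to carry the whole correction on its own, so whether it passes depends mainly on how many categories are tested.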
I prepared the data as follows:
# create LengthData
> unsorted_L14.15_S_LengthData <- unsorted_L14.15_S_gene2length
> unsorted_L14.15_S_id <- as.vector(unsorted_L14.15_S_gene2length[,1])
> unsorted_L14.15_S_length <- as.numeric(unsorted_L14.15_S_gene2length[,2])
> unsorted_L14.15_S_LengthData <- structure(unsorted_L14.15_S_length, .names=unsorted_L14.15_S_id)
# PWF = fitting the probability weighting function
> unsorted_L14.15_S_pwf = nullp(unsorted_genesL14.15_S, bias.data=unsorted_L14.15_S_length, plot.fit=TRUE)
> unsorted_L14.15_S_pwf = nullp(unsorted_genesL14.15_S, bias.data=unsorted_L14.15_S_LengthData, plot.fit=TRUE)
> head(unsorted_L14.15_S_pwf)
             DEgenes bias.data       pwf
Cucsa.000210       0      1512 0.5013243
Cucsa.000250       0       405 0.5182944
Cucsa.000270       0       258 0.5205436
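As a sanity check on the inputs to nullp, the sketch below builds a named length vector and reorders it to follow the names of the DE indicator vector. The toy table and the object names (gene2length, de_genes, length_data) are hypothetical stand-ins, not the data from this post, and the sketch assumes that bias.data should line up element by element with the gene vector.

```r
# Hypothetical stand-in for a two-column gene-to-length table
gene2length <- data.frame(
  gene   = c("Cucsa.000250", "Cucsa.000210", "Cucsa.000270"),
  length = c(405, 1512, 258),
  stringsAsFactors = FALSE
)

# Hypothetical DE indicator vector (1 = DE, 0 = not DE), named by gene ID
de_genes <- c(Cucsa.000210 = 0, Cucsa.000250 = 0, Cucsa.000270 = 0)

# Named numeric vector of lengths, in the order of the table
length_data <- setNames(as.numeric(gene2length$length), gene2length$gene)

# Reorder the lengths to follow the order of the DE vector
length_data <- length_data[names(de_genes)]

# Both checks should pass before fitting the PWF
stopifnot(!anyNA(length_data))
stopifnot(identical(names(length_data), names(de_genes)))
```

Indexing by name rather than by position means a gene missing from the length table shows up as NA instead of silently shifting every later value.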
> unsorted_L14.15_S_GO.wall <- goseq(unsorted_L14.15_S_pwf, gene2cat=unsorted_L14.15_S_gene2go, test.cats=c("GO:CC", "GO:BP", "GO:MF"), method="Wallenius", repcnt=2000, use_genes_without_cat=TRUE)
Using manually entered categories.
Calculating the p-values...
> head(unsorted_L14.15_S_GO.wall)
      category over_represented_pvalue under_represented_pvalue numDEInCat numInCat
594 GO:0043565            0.0001000255                0.9999945         17       18
177 GO:0005515            0.0079055773                0.9933285        618     1162
380 GO:0008565            0.0088243286                1.0000000          7        7
I noticed that some categories, such as GO:0043565, look likely to be
significantly enriched, because the result reports 17 DE genes out of
the 18 genes assigned to this category. But when I went back to check
my gene list table, I found only 7 DE genes out of 18 for this
category. So I don't know what I have done wrong. I hope someone can
help me with this. Thank you so much.
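A manual count of this kind can be sketched in base R as follows, so that it can be compared against numDEInCat and numInCat. The toy mapping, gene IDs, and object names (gene2go, de_genes) are hypothetical and are not the data from this post; removing repeated gene/category pairs before counting is an assumption about how the tally should be done.

```r
# Hypothetical gene-to-GO mapping (two columns: gene ID, GO category)
gene2go <- data.frame(
  gene     = c("g1", "g2", "g3", "g3", "g4", "g5"),
  category = c("GO:0043565", "GO:0043565", "GO:0043565",
               "GO:0043565", "GO:0005515", "GO:0005515"),
  stringsAsFactors = FALSE
)

# Hypothetical DE indicator vector (1 = DE, 0 = not DE), named by gene ID
de_genes <- c(g1 = 1, g2 = 0, g3 = 1, g4 = 1, g5 = 0)

# Drop repeated gene/category pairs so each gene counts once per category
gene2go <- unique(gene2go)

# Keep only genes that are present in the DE vector
gene2go <- gene2go[gene2go$gene %in% names(de_genes), ]

# Look up DE status for each mapping row, then tally per category
gene2go$is_de <- de_genes[gene2go$gene]
num_in_cat    <- tapply(gene2go$is_de, gene2go$category, length)
num_de_in_cat <- tapply(gene2go$is_de, gene2go$category, sum)

data.frame(category   = names(num_in_cat),
           numDEInCat = as.vector(num_de_in_cat),
           numInCat   = as.vector(num_in_cat))
```

If the counts from a table like this differ from the goseq output, comparing the gene IDs behind each count is a natural next step.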
Regards,
warin
-- output of sessionInfo():
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base
other attached packages:
 [1] GO.db_2.14.0          org.Hs.eg.db_2.14.0   RSQLite_0.11.4
 [4] DBI_0.2-7             AnnotationDbi_1.26.0  GenomeInfoDb_1.0.2
 [7] Biobase_2.24.0        BiocGenerics_0.10.0   edgeR_3.6.4
[10] limma_3.20.8          goseq_1.16.2          geneLenDataBase_1.0.0
[13] BiasedUrn_1.06.1
loaded via a namespace (and not attached):
 [1] BBmisc_1.7              BSgenome_1.32.0         BatchJobs_1.2
 [4] BiocParallel_0.6.1      Biostrings_2.32.0       GenomicAlignments_1.0.2
 [7] GenomicFeatures_1.16.2  GenomicRanges_1.16.3    IRanges_1.22.9
[10] Matrix_1.1-4            RCurl_1.95-4.1          Rcpp_0.11.2
[13] Rsamtools_1.16.1        XML_3.98-1.1            XVector_0.4.0
[16] biomaRt_2.20.0          bitops_1.0-6            brew_1.0-6
[19] checkmate_1.1           codetools_0.2-8         digest_0.6.4
[22] fail_1.2                foreach_1.4.2           grid_3.1.0
[25] iterators_1.0.7         lattice_0.20-29         mgcv_1.8-0
[28] nlme_3.1-117            plyr_1.8.1              rtracklayer_1.24.2
[31] sendmailR_1.1-2         stats4_3.1.0            stringr_0.6.2
[34] tools_3.1.0             zlibbioc_1.10.0
--
Sent via the guest posting facility at bioconductor.org.