Question: DESeq2 followed by goseq GO-analysis: Extracting DE genes of enriched GO category
0
gravatar for josef.oeckl
19 months ago by
josef.oeckl0 wrote:

Hi everyone,

I am a PhD-student from Germany (Munich) and actually I am working on adipose tissue biology. As a side project I started working with RNA-seq data. I successfully managed to test for differential gene expression using "DESeq2" followed by GO-analysis with "goseq". So far, so good. However, I would like to know which genes (annotated to this pathway) are differentially expressed in my dataset.

 

library(DESeq2)
library(goseq)
library(GO.db)
library (org.Mm.eg.db)
library(annotate)
#"DESeq2": Pulling out the contrast of interest
res <- results(dds, contrast=c("genotype","iWATKO", "iWATWT"))

#Constructing a gene vector for "goseq"
assayed.genes <- rownames(res)
fdr.threshold <- 0.1
de.genes <- rownames(res)[ which(res$padj < fdr.threshold) ]
gene.vector=as.integer(assayed.genes%in%de.genes)
names(gene.vector)=assayed.genes

#Fitting PWF
pwf=nullp(gene.vector,"mm9","geneSymbol")

#From here on I used the code posted by Iain Gallagher goseq analysis - extracting list of my genes which are DE in the enriched GO category
goCats <- goseq(pwf, "mm9","geneSymbol", test.cats=("GO:BP"))
sigCats <- goCats[which(goCats[,2] < 0.05),]
cats <- sigCats$category
terms <- stack(lapply(mget(cats,GOTERM, ifnotfound=NA),Term))
sigCats$Term <- with(sigCats, terms$values[match(terms$ind, sigCats$category)] )

##Original code (Where does topGenes come from?)
allGos <- stack(getgo(rownames(topGenes$table), "mm9", "geneSymbol"))

##I replaced topGenes with res, not sure if this is correct?! However, it does not produce an error.
allGos <- stack(getgo(rownames(res), "mm9", "geneSymbol"))
onlySigCats <- allGos[allGos$values %in% sigCats$category,]
onlySigCats$Term <- with( onlySigCats, terms$value[match(onlySigCats$values, terms$ind)] )

##Next line is producing an error, what is annot2???
#add the gene symbol 
onlySigCats$Symbol <- with( onlySigCats, annot2[,2][match(onlySigCats$ind, rownames(annot2) )] )


Error in eval(substitute(expr), data, enclos = parent.frame()) :
  Object 'annot2' not found
> traceback ()
5: eval(substitute(expr), data, enclos = parent.frame())
4: eval(substitute(expr), data, enclos = parent.frame())
3: with.default(onlySigCats, annot2[, 2][match(onlySigCats$ind,
       rownames(annot2))])
2: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])
1: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])

 

##Again,where does topGenes come from???
#add the gene logFC 
onlySigCats$logFC <- with( onlySigCats, topGenes$table$logFC[match(onlySigCats$ind, rownames(topGenes$table) )] )

 

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10240)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] org.Mm.eg.db_3.5.0         goseq_1.30.0              
 [3] geneLenDataBase_1.14.0     BiasedUrn_1.07            
 [5] GO.db_3.5.0                annotate_1.56.1           
 [7] XML_3.98-1.9               AnnotationDbi_1.40.0      
 [9] DESeq2_1.18.1              SummarizedExperiment_1.8.1
[11] DelayedArray_0.4.1         matrixStats_0.53.1        
[13] Biobase_2.38.0             GenomicRanges_1.30.0      
[15] GenomeInfoDb_1.14.0        IRanges_2.12.0            
[17] S4Vectors_0.16.0           BiocGenerics_0.24.0       
[19] BiocInstaller_1.28.0      

loaded via a namespace (and not attached):
 [1] httr_1.3.1               RMySQL_0.10.13           bit64_0.9-7             
 [4] splines_3.4.3            assertthat_0.2.0         Formula_1.2-2           
 [7] latticeExtra_0.6-28      blob_1.1.0               GenomeInfoDbData_1.0.0  
[10] Rsamtools_1.30.0         progress_1.1.2           pillar_1.1.0            
[13] RSQLite_2.0              backports_1.1.2          lattice_0.20-35         
[16] digest_0.6.15            RColorBrewer_1.1-2       XVector_0.18.0          
[19] checkmate_1.8.5          colorspace_1.3-2         htmltools_0.3.6         
[22] Matrix_1.2-12            plyr_1.8.4               pkgconfig_2.0.1         
[25] biomaRt_2.34.2           genefilter_1.60.0        zlibbioc_1.24.0         
[28] xtable_1.8-2             scales_0.5.0             BiocParallel_1.12.0     
[31] htmlTable_1.11.2         tibble_1.4.2             mgcv_1.8-23             
[34] ggplot2_2.2.1            GenomicFeatures_1.30.3   nnet_7.3-12             
[37] lazyeval_0.2.1           survival_2.41-3          magrittr_1.5            
[40] memoise_1.1.0            nlme_3.1-131             foreign_0.8-69          
[43] prettyunits_1.0.2        tools_3.4.3              data.table_1.10.4-3     
[46] stringr_1.2.0            munsell_0.4.3            locfit_1.5-9.1          
[49] cluster_2.0.6            Biostrings_2.46.0        compiler_3.4.3          
[52] rlang_0.1.6              grid_3.4.3               RCurl_1.95-4.10         
[55] rstudioapi_0.7           htmlwidgets_1.0          bitops_1.0-6            
[58] base64enc_0.1-3          gtable_0.2.0             DBI_0.7                 
[61] R6_2.2.2                 GenomicAlignments_1.14.1 gridExtra_2.3           
[64] knitr_1.19               rtracklayer_1.38.3       bit_1.1-12              
[67] Hmisc_4.1-1              stringi_1.1.6            Rcpp_0.12.15            
[70] geneplotter_1.56.0       rpart_4.1-12             acepack_1.4.1           
>

 

I really appreciate your help :)

 

Kind regards,

 

Josef

 

ADD COMMENTlink modified 19 months ago by Michael Love25k • written 19 months ago by josef.oeckl0
Answer: DESeq2 followed by goseq GO-analysis: Extracting DE genes of enriched GO categor
0
gravatar for Michael Love
19 months ago by
Michael Love25k
United States
Michael Love25k wrote:

hi Josef,

Since you are not using vignette code from the packages, it's not really the software developers job to sift through an R script and see what's going on, or how it needs to be updated to work with current packages.

I'd suggest starting from the goseq vignette and using the code they have there, and if you have a question, check the manual page for the specific goseq function to see what input it is expecting and what output it gives you. All that you need from DESeq2 is the results table.

ADD COMMENTlink written 19 months ago by Michael Love25k

Thank you, I'll try!

ADD REPLYlink written 19 months ago by josef.oeckl0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 16.09
Traffic: 118 users visited in the last hour