Search
Question: DESeq2 followed by goseq GO-analysis: Extracting DE genes of enriched GO category
0
gravatar for josef.oeckl
6 months ago by
josef.oeckl0 wrote:

Hi everyone,

I am a PhD-student from Germany (Munich) and actually I am working on adipose tissue biology. As a side project I started working with RNA-seq data. I successfully managed to test for differential gene expression using "DESeq2" followed by GO-analysis with "goseq". So far, so good. However, I would like to know which genes (annotated to this pathway) are differentially expressed in my dataset.

 

library(DESeq2)
library(goseq)
library(GO.db)
library (org.Mm.eg.db)
library(annotate)
#"DESeq2": Pulling out the contrast of interest
res <- results(dds, contrast=c("genotype","iWATKO", "iWATWT"))

#Constructing a gene vector for "goseq"
assayed.genes <- rownames(res)
fdr.threshold <- 0.1
de.genes <- rownames(res)[ which(res$padj < fdr.threshold) ]
gene.vector=as.integer(assayed.genes%in%de.genes)
names(gene.vector)=assayed.genes

#Fitting PWF
pwf=nullp(gene.vector,"mm9","geneSymbol")

#From here on I used the code posted by Iain Gallagher goseq analysis - extracting list of my genes which are DE in the enriched GO category
goCats <- goseq(pwf, "mm9","geneSymbol", test.cats=("GO:BP"))
sigCats <- goCats[which(goCats[,2] < 0.05),]
cats <- sigCats$category
terms <- stack(lapply(mget(cats,GOTERM, ifnotfound=NA),Term))
sigCats$Term <- with(sigCats, terms$values[match(terms$ind, sigCats$category)] )

##Original code (Where does topGenes come from?)
allGos <- stack(getgo(rownames(topGenes$table), "mm9", "geneSymbol"))

##I replaced topGenes with res, not sure if this is correct?! However, it does not produce an error.
allGos <- stack(getgo(rownames(res), "mm9", "geneSymbol"))
onlySigCats <- allGos[allGos$values %in% sigCats$category,]
onlySigCats$Term <- with( onlySigCats, terms$value[match(onlySigCats$values, terms$ind)] )

##Next line is producing an error, what is annot2???
#add the gene symbol 
onlySigCats$Symbol <- with( onlySigCats, annot2[,2][match(onlySigCats$ind, rownames(annot2) )] )


Error in eval(substitute(expr), data, enclos = parent.frame()) :
  Object 'annot2' not found
> traceback ()
5: eval(substitute(expr), data, enclos = parent.frame())
4: eval(substitute(expr), data, enclos = parent.frame())
3: with.default(onlySigCats, annot2[, 2][match(onlySigCats$ind,
       rownames(annot2))])
2: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])
1: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])

 

##Again,where does topGenes come from???
#add the gene logFC 
onlySigCats$logFC <- with( onlySigCats, topGenes$table$logFC[match(onlySigCats$ind, rownames(topGenes$table) )] )

 

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10240)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] org.Mm.eg.db_3.5.0         goseq_1.30.0              
 [3] geneLenDataBase_1.14.0     BiasedUrn_1.07            
 [5] GO.db_3.5.0                annotate_1.56.1           
 [7] XML_3.98-1.9               AnnotationDbi_1.40.0      
 [9] DESeq2_1.18.1              SummarizedExperiment_1.8.1
[11] DelayedArray_0.4.1         matrixStats_0.53.1        
[13] Biobase_2.38.0             GenomicRanges_1.30.0      
[15] GenomeInfoDb_1.14.0        IRanges_2.12.0            
[17] S4Vectors_0.16.0           BiocGenerics_0.24.0       
[19] BiocInstaller_1.28.0      

loaded via a namespace (and not attached):
 [1] httr_1.3.1               RMySQL_0.10.13           bit64_0.9-7             
 [4] splines_3.4.3            assertthat_0.2.0         Formula_1.2-2           
 [7] latticeExtra_0.6-28      blob_1.1.0               GenomeInfoDbData_1.0.0  
[10] Rsamtools_1.30.0         progress_1.1.2           pillar_1.1.0            
[13] RSQLite_2.0              backports_1.1.2          lattice_0.20-35         
[16] digest_0.6.15            RColorBrewer_1.1-2       XVector_0.18.0          
[19] checkmate_1.8.5          colorspace_1.3-2         htmltools_0.3.6         
[22] Matrix_1.2-12            plyr_1.8.4               pkgconfig_2.0.1         
[25] biomaRt_2.34.2           genefilter_1.60.0        zlibbioc_1.24.0         
[28] xtable_1.8-2             scales_0.5.0             BiocParallel_1.12.0     
[31] htmlTable_1.11.2         tibble_1.4.2             mgcv_1.8-23             
[34] ggplot2_2.2.1            GenomicFeatures_1.30.3   nnet_7.3-12             
[37] lazyeval_0.2.1           survival_2.41-3          magrittr_1.5            
[40] memoise_1.1.0            nlme_3.1-131             foreign_0.8-69          
[43] prettyunits_1.0.2        tools_3.4.3              data.table_1.10.4-3     
[46] stringr_1.2.0            munsell_0.4.3            locfit_1.5-9.1          
[49] cluster_2.0.6            Biostrings_2.46.0        compiler_3.4.3          
[52] rlang_0.1.6              grid_3.4.3               RCurl_1.95-4.10         
[55] rstudioapi_0.7           htmlwidgets_1.0          bitops_1.0-6            
[58] base64enc_0.1-3          gtable_0.2.0             DBI_0.7                 
[61] R6_2.2.2                 GenomicAlignments_1.14.1 gridExtra_2.3           
[64] knitr_1.19               rtracklayer_1.38.3       bit_1.1-12              
[67] Hmisc_4.1-1              stringi_1.1.6            Rcpp_0.12.15            
[70] geneplotter_1.56.0       rpart_4.1-12             acepack_1.4.1           
>

 

I really appreciate your help :)

 

Kind regards,

 

Josef

 

ADD COMMENTlink modified 6 months ago by Michael Love19k • written 6 months ago by josef.oeckl0
0
gravatar for Michael Love
6 months ago by
Michael Love19k
United States
Michael Love19k wrote:

hi Josef,

Since you are not using vignette code from the packages, it's not really the software developers job to sift through an R script and see what's going on, or how it needs to be updated to work with current packages.

I'd suggest starting from the goseq vignette and using the code they have there, and if you have a question, check the manual page for the specific goseq function to see what input it is expecting and what output it gives you. All that you need from DESeq2 is the results table.

ADD COMMENTlink written 6 months ago by Michael Love19k

Thank you, I'll try!

ADD REPLYlink written 6 months ago by josef.oeckl0
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.2.0
Traffic: 294 users visited in the last hour