DESeq2 followed by goseq GO-analysis: Extracting DE genes of enriched GO category
1
0
Entering edit mode
@josefoeckl-15020
Last seen 6.8 years ago

Hi everyone,

I am a PhD-student from Germany (Munich) and actually I am working on adipose tissue biology. As a side project I started working with RNA-seq data. I successfully managed to test for differential gene expression using "DESeq2" followed by GO-analysis with "goseq". So far, so good. However, I would like to know which genes (annotated to this pathway) are differentially expressed in my dataset.

 

library(DESeq2)
library(goseq)
library(GO.db)
library (org.Mm.eg.db)
library(annotate)
#"DESeq2": Pulling out the contrast of interest
res <- results(dds, contrast=c("genotype","iWATKO", "iWATWT"))

#Constructing a gene vector for "goseq"
assayed.genes <- rownames(res)
fdr.threshold <- 0.1
de.genes <- rownames(res)[ which(res$padj < fdr.threshold) ]
gene.vector=as.integer(assayed.genes%in%de.genes)
names(gene.vector)=assayed.genes

#Fitting PWF
pwf=nullp(gene.vector,"mm9","geneSymbol")

#From here on I used the code posted by Iain Gallagher goseq analysis - extracting list of my genes which are DE in the enriched GO category
goCats <- goseq(pwf, "mm9","geneSymbol", test.cats=("GO:BP"))
sigCats <- goCats[which(goCats[,2] < 0.05),]
cats <- sigCats$category
terms <- stack(lapply(mget(cats,GOTERM, ifnotfound=NA),Term))
sigCats$Term <- with(sigCats, terms$values[match(terms$ind, sigCats$category)] )

##Original code (Where does topGenes come from?)
allGos <- stack(getgo(rownames(topGenes$table), "mm9", "geneSymbol"))

##I replaced topGenes with res, not sure if this is correct?! However, it does not produce an error.
allGos <- stack(getgo(rownames(res), "mm9", "geneSymbol"))
onlySigCats <- allGos[allGos$values %in% sigCats$category,]
onlySigCats$Term <- with( onlySigCats, terms$value[match(onlySigCats$values, terms$ind)] )

##Next line is producing an error, what is annot2???
#add the gene symbol 
onlySigCats$Symbol <- with( onlySigCats, annot2[,2][match(onlySigCats$ind, rownames(annot2) )] )


Error in eval(substitute(expr), data, enclos = parent.frame()) :
  Object 'annot2' not found
> traceback ()
5: eval(substitute(expr), data, enclos = parent.frame())
4: eval(substitute(expr), data, enclos = parent.frame())
3: with.default(onlySigCats, annot2[, 2][match(onlySigCats$ind,
       rownames(annot2))])
2: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])
1: with(onlySigCats, annot2[, 2][match(onlySigCats$ind, rownames(annot2))])

 

##Again,where does topGenes come from???
#add the gene logFC 
onlySigCats$logFC <- with( onlySigCats, topGenes$table$logFC[match(onlySigCats$ind, rownames(topGenes$table) )] )

 

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10240)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base     

other attached packages:
 [1] org.Mm.eg.db_3.5.0         goseq_1.30.0              
 [3] geneLenDataBase_1.14.0     BiasedUrn_1.07            
 [5] GO.db_3.5.0                annotate_1.56.1           
 [7] XML_3.98-1.9               AnnotationDbi_1.40.0      
 [9] DESeq2_1.18.1              SummarizedExperiment_1.8.1
[11] DelayedArray_0.4.1         matrixStats_0.53.1        
[13] Biobase_2.38.0             GenomicRanges_1.30.0      
[15] GenomeInfoDb_1.14.0        IRanges_2.12.0            
[17] S4Vectors_0.16.0           BiocGenerics_0.24.0       
[19] BiocInstaller_1.28.0      

loaded via a namespace (and not attached):
 [1] httr_1.3.1               RMySQL_0.10.13           bit64_0.9-7             
 [4] splines_3.4.3            assertthat_0.2.0         Formula_1.2-2           
 [7] latticeExtra_0.6-28      blob_1.1.0               GenomeInfoDbData_1.0.0  
[10] Rsamtools_1.30.0         progress_1.1.2           pillar_1.1.0            
[13] RSQLite_2.0              backports_1.1.2          lattice_0.20-35         
[16] digest_0.6.15            RColorBrewer_1.1-2       XVector_0.18.0          
[19] checkmate_1.8.5          colorspace_1.3-2         htmltools_0.3.6         
[22] Matrix_1.2-12            plyr_1.8.4               pkgconfig_2.0.1         
[25] biomaRt_2.34.2           genefilter_1.60.0        zlibbioc_1.24.0         
[28] xtable_1.8-2             scales_0.5.0             BiocParallel_1.12.0     
[31] htmlTable_1.11.2         tibble_1.4.2             mgcv_1.8-23             
[34] ggplot2_2.2.1            GenomicFeatures_1.30.3   nnet_7.3-12             
[37] lazyeval_0.2.1           survival_2.41-3          magrittr_1.5            
[40] memoise_1.1.0            nlme_3.1-131             foreign_0.8-69          
[43] prettyunits_1.0.2        tools_3.4.3              data.table_1.10.4-3     
[46] stringr_1.2.0            munsell_0.4.3            locfit_1.5-9.1          
[49] cluster_2.0.6            Biostrings_2.46.0        compiler_3.4.3          
[52] rlang_0.1.6              grid_3.4.3               RCurl_1.95-4.10         
[55] rstudioapi_0.7           htmlwidgets_1.0          bitops_1.0-6            
[58] base64enc_0.1-3          gtable_0.2.0             DBI_0.7                 
[61] R6_2.2.2                 GenomicAlignments_1.14.1 gridExtra_2.3           
[64] knitr_1.19               rtracklayer_1.38.3       bit_1.1-12              
[67] Hmisc_4.1-1              stringi_1.1.6            Rcpp_0.12.15            
[70] geneplotter_1.56.0       rpart_4.1-12             acepack_1.4.1           
>

 

I really appreciate your help :)

 

Kind regards,

 

Josef

 

deseq2 goseq rnaseq go enrichment differential gene expression • 4.0k views
ADD COMMENT
0
Entering edit mode
@mikelove
Last seen 3 hours ago
United States

hi Josef,

Since you are not using vignette code from the packages, it's not really the software developers job to sift through an R script and see what's going on, or how it needs to be updated to work with current packages.

I'd suggest starting from the goseq vignette and using the code they have there, and if you have a question, check the manual page for the specific goseq function to see what input it is expecting and what output it gives you. All that you need from DESeq2 is the results table.

ADD COMMENT
0
Entering edit mode

Thank you, I'll try!

ADD REPLY

Login before adding your answer.

Traffic: 941 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6