Question: Extract three types of intergenic regions
vinod.acear (India) wrote 2.1 years ago:

Hi, is there any package to extract the three types of intergenic regions (i.e. tandem, convergent, divergent) as a GRanges object from a genome and a GFF file?

Thomas Girke (United States) wrote 2.1 years ago:

The genFeatures() function in systemPipeR allows you to compute intergenic regions. The vignette for the Ribo-Seq workflow of the systemPipeR package gives some examples (and/or consult ?genFeatures). The sub-classification of the intergenic regions you mention could be obtained downstream by using the plus/minus strand orientation of the genes flanking each intergenic region. The naming scheme of the intergenic regions by their neighboring genes (e.g. geneID1__geneID2) could help here, but this would require some additional coding on your side. If this is a common use case, then I am happy to add this functionality to the to-do list for the next update. Alternatively, one could discuss whether the utility to extract intergenic regions from TxDbs could become part of the GenomicFeatures package. I believe there were some discussions about this in the past.
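
A minimal sketch of the downstream classification described above, written in base R so that it runs without Bioconductor. It assumes a plain data frame of non-overlapping genes with the columns chr, start, end, strand and id. The helper name classify_intergenic and the example coordinates are hypothetical and are not part of systemPipeR. A gap is called tandem when both flanking genes are on the same strand, convergent when the left gene is on "+" and the right gene on "-", and divergent in the opposite case.

```r
# Hypothetical helper (not a systemPipeR function): classify the gap between
# each pair of adjacent genes by the strands of the two flanking genes.
# Assumes genes do not overlap or nest and that strand is "+" or "-".
classify_intergenic <- function(genes) {
  genes <- genes[order(genes$chr, genes$start), ]
  n <- nrow(genes)
  left  <- genes[-n, , drop = FALSE]   # upstream neighbour of each gap
  right <- genes[-1, , drop = FALSE]   # downstream neighbour of each gap

  # Keep only neighbours on the same chromosome with a real gap in between
  keep  <- left$chr == right$chr & right$start > left$end + 1
  left  <- left[keep, , drop = FALSE]
  right <- right[keep, , drop = FALSE]

  ls <- as.character(left$strand)
  rs <- as.character(right$strand)
  type <- ifelse(ls == rs, "tandem",
                 ifelse(ls == "+", "convergent", "divergent"))

  data.frame(chr   = left$chr,
             start = left$end + 1,
             end   = right$start - 1,
             id    = paste(left$id, right$id, sep = "__"),
             type  = type,
             stringsAsFactors = FALSE)
}

# Made-up example: four genes on one chromosome
genes <- data.frame(chr    = "chrI",
                    start  = c(100, 500, 900, 1300),
                    end    = c(300, 700, 1100, 1500),
                    strand = c("+", "+", "-", "+"),
                    id     = c("g1", "g2", "g3", "g4"),
                    stringsAsFactors = FALSE)
ig <- classify_intergenic(genes)
print(ig)
```

The id column mirrors the geneID1__geneID2 naming scheme mentioned above, so the same strand comparison could be applied to GRanges objects by looking up the strand of the two genes named in each identifier.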

Thomas 


Hi Thomas, thanks for your suggestion. When I tried to install the above-mentioned package, it gave the following warnings and the library could not be loaded.

> source("http://bioconductor.org/biocLite.R") # Sources the biocLite.R installation script 
biocLite("systemPipeR") # Installs systemPipeR from Bioconductor 

installing to /home/vinod/R/x86_64-pc-linux-gnu-library/3.2/spatial/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (spatial)

The downloaded source packages are in
	‘/tmp/RtmpcmPuuP/downloaded_packages’
Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
  download had nonzero exit status
2: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘BBmisc’ had non-zero exit status
3: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘fail’ had non-zero exit status
4: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘BatchJobs’ had non-zero exit status
5: In install.packages(pkgs = doing, lib = lib, ...) :
  installation of package ‘systemPipeR’ had non-zero exit status
> library("systemPipeR") # Loads the package
Error in library("systemPipeR") : 
  there is no package called ‘systemPipeR’
(written 2.1 years ago by vinod.acear)

Hi Thomas, I managed to install the ‘systemPipeR’ package, but genFeatures() is not available. I am also sending you my session info.

 

> library("systemPipeR")
Loading required package: Rsamtools
Loading required package: ShortRead
Loading required package: BiocParallel
Loading required package: GenomicAlignments
Loading required package: DBI

> library("Rsamtools")
> library("ShortRead")
> library("BiocParallel")
> library("GenomicAlignments")
> library("DBI")
> library("systemPipeR")
> ?genFeatures
No documentation for ‘genFeatures’ in specified packages and libraries:
you could try ‘??genFeatures’
 
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C               LC_TIME=en_IN.UTF-8       
 [4] LC_COLLATE=en_IN.UTF-8     LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8   
 [7] LC_PAPER=en_IN.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] systemPipeR_1.2.23      RSQLite_1.0.0           DBI_0.3.1               ShortRead_1.26.0       
 [5] GenomicAlignments_1.4.2 BiocParallel_1.2.22     Rsamtools_1.20.5        BSgenome_1.36.3        
 [9] rtracklayer_1.28.10     Biostrings_2.36.4       XVector_0.8.0           GenomicRanges_1.20.8   
[13] GenomeInfoDb_1.4.3      IRanges_2.2.9           S4Vectors_0.6.6         BiocGenerics_0.14.0    

loaded via a namespace (and not attached):
 [1] genefilter_1.50.0      reshape2_1.4.1         splines_3.2.2          lattice_0.20-33       
 [5] colorspace_1.2-6       base64enc_0.1-3        Category_2.34.2        XML_3.98-1.3          
 [9] RBGL_1.44.0            survival_2.38-3        GOstats_2.34.0         RColorBrewer_1.1-2    
[13] lambda.r_1.1.7         plyr_1.8.3             stringr_1.0.0          zlibbioc_1.14.0       
[17] munsell_0.4.2          gtable_0.1.2           futile.logger_1.4.1    hwriter_1.3.2         
[21] latticeExtra_0.6-26    Biobase_2.28.0         AnnotationDbi_1.30.1   GSEABase_1.30.2       
[25] proto_0.3-10           Rcpp_0.12.1            xtable_1.7-4           edgeR_3.10.5          
[29] scales_0.3.0           checkmate_1.6.3        limma_3.24.15          graph_1.46.0          
[33] annotate_1.46.1        sendmailR_1.2-1        brew_1.0-6             BatchJobs_1.6         
[37] fail_1.3               rjson_0.2.15           ggplot2_1.0.1          digest_0.6.8          
[41] stringi_1.0-1          BBmisc_1.9             grid_3.2.2             tools_3.2.2           
[45] bitops_1.0-6           magrittr_1.5           RCurl_1.95-4.7         futile.options_1.0.0  
[49] GO.db_3.1.2            MASS_7.3-44            pheatmap_1.0.7         Matrix_1.2-2          
[53] AnnotationForge_1.10.1

 

(written 2.1 years ago by vinod.acear)

You are running an old version of Bioc. You need to upgrade to Bioc 3.2.

Thomas

(written 2.1 years ago by Thomas Girke)

Hi Thomas, as per your advice I successfully installed 'systemPipeR'. I am trying to find the intergenic regions from the GFF file at this link: http://downloads.yeastgenome.org/curation/chromosomal_feature/saccharomyces_cerevisiae.gff

When I tried to make a TxDb database from the GFF file, the txdb object was not created. The commands and sessionInfo() output are given below.

Can you suggest a way to get the intergenic regions of Saccharomyces cerevisiae?

 

> gffFile <- "/home/vinod/new_yeast/yeast_classfication/saccharomyces_cerevisiae3.gff"
> txdb <- makeTxDbFromGFF(file=gffFile, format="gff3", organism="Saccharomyces cerevisiae")
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... Error in .make_splicings(exons, cds, stop_codons) : 
  some CDS cannot be mapped to an exon
In addition: Warning message:
In .extract_exons_from_GRanges(cds_IDX, gr, ID, Name, Parent, feature = "cds",  :
  141 orphan CDSs were dropped
> feat <- genFeatures(txdb, featuretype="all", reduce_ranges=FALSE, upstream=1000, downstream=0)
Error in genFeatures(txdb, featuretype = "all", reduce_ranges = FALSE,  : 
  object 'txdb' not found
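
The messages above mention orphan CDSs and CDSs that cannot be mapped to an exon. One plausible first debugging step, sketched here in base R, is to list the CDS rows whose Parent attribute does not match any ID in the file. The helper names get_attr and find_orphan_cds and the example lines are hypothetical, and this check is an assumption about what to look for rather than a reimplementation of what makeTxDbFromGFF() does internally. The sketch also stops at a ##FASTA directive, if present, since GFF3 files may carry embedded sequence after it.

```r
# Hypothetical helpers for inspecting a GFF3 file without Bioconductor.

# Extract one attribute (e.g. "ID" or "Parent") from GFF3 column 9;
# returns NA where the attribute is absent.
get_attr <- function(attrs, key) {
  pattern <- paste0("^(?:.*;)?", key, "=([^;]*).*$")
  hit <- grepl(pattern, attrs, perl = TRUE)
  out <- rep(NA_character_, length(attrs))
  out[hit] <- sub(pattern, "\\1", attrs[hit], perl = TRUE)
  out
}

# Return the CDS rows whose Parent is missing or not defined as an ID.
find_orphan_cds <- function(lines) {
  fasta <- which(lines == "##FASTA")           # embedded sequence section
  if (length(fasta) > 0) lines <- lines[seq_len(fasta[1] - 1)]
  lines  <- lines[!grepl("^#", lines)]         # drop comments and directives
  fields <- strsplit(lines, "\t", fixed = TRUE)
  fields <- fields[lengths(fields) == 9]       # keep well-formed rows only
  col    <- function(i) vapply(fields, `[`, character(1), i)

  type   <- col(3)
  attrs  <- col(9)
  id     <- get_attr(attrs, "ID")
  parent <- get_attr(attrs, "Parent")
  known  <- id[!is.na(id)]

  # A CDS may list several comma-separated parents; require all to exist
  has_parent <- vapply(strsplit(parent, ",", fixed = TRUE),
                       function(p) all(p %in% known), logical(1))
  orphan <- type == "CDS" & !has_parent

  data.frame(seqid  = col(1)[orphan],
             start  = as.integer(col(4)[orphan]),
             end    = as.integer(col(5)[orphan]),
             parent = parent[orphan],
             stringsAsFactors = FALSE)
}

# Made-up example; for a real file use lines <- readLines(gffFile)
gff_lines <- c(
  "##gff-version 3",
  "chrI\tSGD\tgene\t100\t400\t.\t+\t.\tID=geneA",
  "chrI\tSGD\tmRNA\t100\t400\t.\t+\t.\tID=mrnaA;Parent=geneA",
  "chrI\tSGD\tCDS\t100\t400\t.\t+\t0\tParent=mrnaA",
  "chrI\tSGD\tCDS\t900\t1200\t.\t-\t0\tParent=mrnaMissing",
  "##FASTA",
  ">chrI",
  "ACGT"
)
orphans <- find_orphan_cds(gff_lines)
print(orphans)
```

Rows reported this way could then be repaired or removed before trying makeTxDbFromGFF() again.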

 

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu precise (12.04.5 LTS)

locale:
 [1] LC_CTYPE=en_IN.UTF-8       LC_NUMERIC=C               LC_TIME=en_IN.UTF-8        LC_COLLATE=en_IN.UTF-8    
 [5] LC_MONETARY=en_IN.UTF-8    LC_MESSAGES=en_IN.UTF-8    LC_PAPER=en_IN.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_IN.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] systemPipeR_1.4.2                           RSQLite_1.0.0                              
 [3] DBI_0.3.1                                   ShortRead_1.28.0                           
 [5] GenomicAlignments_1.6.1                     SummarizedExperiment_1.0.0                 
 [7] BiocParallel_1.4.0                          Rsamtools_1.22.0                           
 [9] BSgenome.Scerevisiae.UCSC.sacCer2_1.4.0     BSgenome_1.38.0                            
[11] rtracklayer_1.30.1                          Biostrings_2.38.0                          
[13] XVector_0.10.0                              TxDb.Scerevisiae.UCSC.sacCer2.sgdGene_3.2.2
[15] GenomicFeatures_1.22.0                      AnnotationDbi_1.32.0                       
[17] Biobase_2.30.0                              GenomicRanges_1.22.0                       
[19] GenomeInfoDb_1.6.0                          IRanges_2.4.1                              
[21] S4Vectors_0.8.0                             BiocGenerics_0.16.0                        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.1            lattice_0.20-33        GO.db_3.2.2            digest_0.6.8           plyr_1.8.3            
 [6] futile.options_1.0.0   BatchJobs_1.6          ggplot2_1.0.1          zlibbioc_1.16.0        annotate_1.48.0       
[11] Matrix_1.2-2           checkmate_1.6.3        proto_0.3-10           GOstats_2.36.0         splines_3.2.2         
[16] stringr_1.0.0          pheatmap_1.0.7         RCurl_1.95-4.7         biomaRt_2.26.0         munsell_0.4.2         
[21] sendmailR_1.2-1        base64enc_0.1-3        BBmisc_1.9             fail_1.3               edgeR_3.12.0          
[26] XML_3.98-1.3           AnnotationForge_1.12.0 MASS_7.3-44            bitops_1.0-6           grid_3.2.2            
[31] RBGL_1.46.0            xtable_1.7-4           GSEABase_1.32.0        gtable_0.1.2           magrittr_1.5          
[36] scales_0.3.0           graph_1.48.0           stringi_1.0-1          hwriter_1.3.2          reshape2_1.4.1        
[41] genefilter_1.52.0      limma_3.26.0           latticeExtra_0.6-26    futile.logger_1.4.1    brew_1.0-6            
[46] rjson_0.2.15           lambda.r_1.1.7         RColorBrewer_1.1-2     tools_3.2.2            Category_2.36.0       
[51] survival_2.38-3        colorspace_1.2-6      
 


(written 2.1 years ago by vinod.acear)
Thomas Girke (United States) wrote 2.1 years ago:

makeTxDbFromGFF() from GenomicFeatures fails to produce a TxDb, which means that something in your GFF does not meet the expected format. This is usually fixable by debugging the GFF. However, would you mind using BioMart as the source of your annotations instead of the GFF? If this is fine, then please try the following, which works for me:

> library(GenomicFeatures); library("biomaRt"); library(systemPipeR)
> txdb <- makeTxDbFromBiomart(biomart = "ensembl", dataset = "scerevisiae_gene_ensembl")
> myfeatures <- c("tx_type", "promoter", "intron", "exon", "cds", "intergenic")
> feat <- genFeatures(txdb, featuretype=myfeatures, reduce_ranges=FALSE, upstream=1000, downstream=0)

Created feature ranges: protein_coding, ncRNA, tRNA, snoRNA, pseudogene, snRNA, rRNA
Created feature ranges: promoter
Created feature ranges: intron
Created feature ranges: exon
Created feature ranges: cds
Created feature ranges: intergenic 

Now the intergenic ranges can be extracted with feat$intergenic. Note: using featuretype="all" will give you an error for fiveUTR/threeUTR since those return empty objects. I will change this to a warning in the next update of systemPipeR.

Thomas 

 

Hervé Pagès (United States) wrote 2.1 years ago:

Hi,

If your intergenic regions are annotated in your GFF file, then just import the file with the import() function from the rtracklayer package. This will return you a GRanges object with various metadata columns. One of them will be named type and it will tell you the type of feature for each range in the GRanges object. Your 3 types of intergenic regions should show up there.
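
As a quick, dependency-free way to see which feature types a GFF file annotates at all, the following base R sketch tallies column 3 of the file. It is an approximation of inspecting the type metadata column after import() (for example with table() on that column), not a replacement for it. The helper name count_feature_types and the example lines are hypothetical.

```r
# Hypothetical helper: count how often each feature type (GFF column 3)
# occurs, ignoring comment lines and anything that is not a 9-column row.
count_feature_types <- function(lines) {
  lines  <- lines[!grepl("^#", lines)]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  fields <- fields[lengths(fields) == 9]
  types  <- vapply(fields, `[`, character(1), 3)
  sort(table(types), decreasing = TRUE)
}

# Made-up example; for a real file use lines <- readLines("path/to/file.gff")
example_lines <- c(
  "##gff-version 3",
  "chrI\tSGD\tgene\t100\t400\t.\t+\t.\tID=geneA",
  "chrI\tSGD\tCDS\t100\t400\t.\t+\t0\tParent=geneA",
  "chrI\tSGD\tCDS\t900\t1200\t.\t-\t0\tParent=geneB"
)
counts <- count_feature_types(example_lines)
print(counts)

# TRUE if any annotated feature type mentions "intergenic"
has_intergenic <- any(grepl("intergenic", names(counts), ignore.case = TRUE))
print(has_intergenic)
```

If no intergenic type shows up in the tally, the regions are not annotated in the file and have to be computed from the gene coordinates instead.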

H.


Hi Hervé, the imported object did not show any intergenic features.

(written 2.0 years ago by vinod.acear)

If you used http://downloads.yeastgenome.org/curation/chromosomal_feature/saccharomyces_cerevisiae.gff then it doesn't seem to contain information about the intergenic regions, unfortunately. That's why the GRanges object you got with rtracklayer::import() doesn't contain these regions either. The next thing to try is what Thomas suggested. I assume it worked for you because you accepted his answer. If you're still struggling with this, you would need to provide some details about what you've done so far and what problems you ran into.

Cheers,

H.

(written 2.0 years ago by Hervé Pagès)

Hi Hervé,

The approach suggested by Thomas worked for me. Thanks for your support.

(written 2.0 years ago by vinod.acear)