Help with toGRanges
mhartman3 • 0


I'm new to R and am trying to analyze two ChIP-seq data sets. I have been following several ChIPpeakAnno guides and have run into a problem I can't figure out. I have managed to find overlapping peaks, but when I try to annotate them I get the following error:

> library(EnsDb.Hsapiens.v75)
Loading required package: ensembldb
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> annoData <- toGRanges(EnsDb.Hsapiens.v75, feature="gene")
Error in toGRanges(EnsDb.Hsapiens.v75, feature = "gene") :
  No valid data passed in. For example a data frame as BED format
             file with at least 3 fields in the order of: chromosome, start and end.
             Optional fields are name, score and strand etc.
             Please refer to for details.


I'd appreciate any help in fixing this error!


For extra info:

> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10586)

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnsDb.Hsapiens.v75_0.99.12 ensembldb_1.2.2            GenomicFeatures_1.22.13    AnnotationDbi_1.32.3      
 [5] Biobase_2.30.0             ChIPpeakAnno_3.4.6         BiocInstaller_1.20.1       RSQLite_1.0.0             
 [9] DBI_0.3.1                  VennDiagram_1.6.16         futile.logger_1.4.1        GenomicRanges_1.22.4      
[13] GenomeInfoDb_1.6.3         Biostrings_2.38.4          XVector_0.10.0             IRanges_2.4.8             
[17] S4Vectors_0.8.11           BiocGenerics_0.16.1       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4                  AnnotationHub_2.2.5          regioneR_1.2.3               bitops_1.0-6                
 [5] futile.options_1.0.0         tools_3.2.4                  zlibbioc_1.16.0              biomaRt_2.26.1              
 [9] digest_0.6.9                 memoise_1.0.0                BSgenome_1.38.0              graph_1.48.0                
[13] shiny_0.13.1                 httr_1.1.0                   rtracklayer_1.30.3           multtest_2.26.0             
[17] R6_2.1.2                     XML_3.98-1.4                 survival_2.38-3              RBGL_1.46.0                 
[21] BiocParallel_1.4.3           limma_3.26.9                 GO.db_3.2.2                  lambda.r_1.1.7              
[25] matrixStats_0.50.1           htmltools_0.3.5              Rsamtools_1.22.0             splines_3.2.4               
[29] MASS_7.3-45                  GenomicAlignments_1.6.3      SummarizedExperiment_1.0.2   xtable_1.8-2                
[33] mime_0.4                     interactiveDisplayBase_1.8.0 httpuv_1.3.3                 RCurl_1.95-4.8              


Dario Strbenac ★ 1.5k

toGRanges is for converting simple data structures, such as data frames, into a GRanges object. To get gene annotation from an EnsDb package, the correct approach is simply to call genes() on it:

> genes(EnsDb.Hsapiens.v75)
GRanges object with 64102 ranges and 5 metadata columns:
                  seqnames                 ranges strand   |         gene_id   gene_name    entrezid   gene_biotype seq_coord_system
                     <Rle>              <IRanges>  <Rle>   |     <character> <character> <character>    <character>      <character>
  ENSG00000000003        X [ 99883667,  99894988]      -   | ENSG00000000003      TSPAN6        7105 protein_coding       chromosome
  ENSG00000000005        X [ 99839799,  99854882]      +   | ENSG00000000005        TNMD       64102 protein_coding       chromosome
  ENSG00000000419       20 [ 49551404,  49575092]      -   | ENSG00000000419        DPM1        8813 protein_coding       chromosome
  ENSG00000000457        1 [169818772, 169863408]      -   | ENSG00000000457       SCYL3       57147 protein_coding       chromosome
  ENSG00000000460        1 [169631245, 169823221]      +   | ENSG00000000460    C1orf112       55732 protein_coding       chromosome
              ...      ...                    ...    ... ...             ...         ...         ...            ...              ...
           LRG_94       10   [72357104, 72362531]      -   |          LRG_94      LRG_94        5551       LRG_gene       chromosome
           LRG_96       15   [55495792, 55582001]      -   |          LRG_96      LRG_96        5873       LRG_gene       chromosome
           LRG_97       22   [37621310, 37640305]      -   |          LRG_97      LRG_97        5880       LRG_gene       chromosome
           LRG_98       11   [36589563, 36601312]      +   |          LRG_98      LRG_98        5896       LRG_gene       chromosome
           LRG_99       11   [36613493, 36619812]      -   |          LRG_99      LRG_99        5897       LRG_gene       chromosome
  seqinfo: 273 sequences from GRCh37 genome
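
To connect this back to the annotation step in the question, here is a minimal sketch of passing the genes() result to annotatePeakInBatch() from ChIPpeakAnno. The peaks object and its coordinates are made up for illustration; in practice you would use your own overlapping peaks. It also assumes your peaks use Ensembl-style chromosome names ("1", "X") like the EnsDb annotation does.

```r
library(ChIPpeakAnno)
library(EnsDb.Hsapiens.v75)

# Gene-level annotation as a GRanges object, as shown above.
annoData <- genes(EnsDb.Hsapiens.v75)

# Hypothetical peaks for illustration only; replace with your own peaks.
# The chromosome names here follow the Ensembl style used by the EnsDb.
peaks <- GRanges(
  seqnames = c("X", "20"),
  ranges   = IRanges(start = c(99895500, 49575500), width = 200),
  strand   = "*"
)
names(peaks) <- c("peak1", "peak2")

# If your peaks use UCSC-style names ("chr1", "chrX"), the two objects
# need matching styles first. One option (an assumption about your data):
# seqlevelsStyle(annoData) <- "UCSC"

# Annotate each peak with the nearest gene from annoData.
annotated <- annotatePeakInBatch(peaks, AnnotationData = annoData)
annotated
```

The result is a GRanges object with the original peak ranges plus metadata columns describing the matched gene, such as the feature ID and the distance to it.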

