Help with toGRanges
mhartman3 • 0
@mhartman3-9996

Hi,

I'm new to R and am trying to analyze two ChIP-seq data sets. I have been following several ChIPpeakAnno guides and have managed to find overlapping peaks, but when I try to annotate them I get the following error:

> library(EnsDb.Hsapiens.v75)
Loading required package: ensembldb
Loading required package: GenomicFeatures
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

> annoData <- toGRanges(EnsDb.Hsapiens.v75, feature="gene")
Error in toGRanges(EnsDb.Hsapiens.v75, feature = "gene") :
  No valid data passed in. For example a data frame as BED format
             file with at least 3 fields in the order of: chromosome, start and end.
             Optional fields are name, score and strand etc.
             Please refer to http://genome.ucsc.edu/FAQ/FAQformat#format1 for details.

 

I'd appreciate any help in fixing this error!

 


For extra info:

> sessionInfo()
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 10586)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
 [1] stats4    parallel  grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] EnsDb.Hsapiens.v75_0.99.12 ensembldb_1.2.2            GenomicFeatures_1.22.13    AnnotationDbi_1.32.3      
 [5] Biobase_2.30.0             ChIPpeakAnno_3.4.6         BiocInstaller_1.20.1       RSQLite_1.0.0             
 [9] DBI_0.3.1                  VennDiagram_1.6.16         futile.logger_1.4.1        GenomicRanges_1.22.4      
[13] GenomeInfoDb_1.6.3         Biostrings_2.38.4          XVector_0.10.0             IRanges_2.4.8             
[17] S4Vectors_0.8.11           BiocGenerics_0.16.1       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4                  AnnotationHub_2.2.5          regioneR_1.2.3               bitops_1.0-6                
 [5] futile.options_1.0.0         tools_3.2.4                  zlibbioc_1.16.0              biomaRt_2.26.1              
 [9] digest_0.6.9                 memoise_1.0.0                BSgenome_1.38.0              graph_1.48.0                
[13] shiny_0.13.1                 httr_1.1.0                   rtracklayer_1.30.3           multtest_2.26.0             
[17] R6_2.1.2                     XML_3.98-1.4                 survival_2.38-3              RBGL_1.46.0                 
[21] BiocParallel_1.4.3           limma_3.26.9                 GO.db_3.2.2                  lambda.r_1.1.7              
[25] matrixStats_0.50.1           htmltools_0.3.5              Rsamtools_1.22.0             splines_3.2.4               
[29] MASS_7.3-45                  GenomicAlignments_1.6.3      SummarizedExperiment_1.0.2   xtable_1.8-2                
[33] mime_0.4                     interactiveDisplayBase_1.8.0 httpuv_1.3.3                 RCurl_1.95-4.8              
>

 

Dario Strbenac ★ 1.5k
@dario-strbenac-5916
Australia

toGRanges is for converting simple data structures, such as data frames, into a GRanges object. An EnsDb annotation package already provides a genes() accessor that returns a GRanges object, so the correct approach is simply:

> genes(EnsDb.Hsapiens.v75)
GRanges object with 64102 ranges and 5 metadata columns:
                  seqnames                 ranges strand   |         gene_id   gene_name    entrezid   gene_biotype seq_coord_system
                     <Rle>              <IRanges>  <Rle>   |     <character> <character> <character>    <character>      <character>
  ENSG00000000003        X [ 99883667,  99894988]      -   | ENSG00000000003      TSPAN6        7105 protein_coding       chromosome
  ENSG00000000005        X [ 99839799,  99854882]      +   | ENSG00000000005        TNMD       64102 protein_coding       chromosome
  ENSG00000000419       20 [ 49551404,  49575092]      -   | ENSG00000000419        DPM1        8813 protein_coding       chromosome
  ENSG00000000457        1 [169818772, 169863408]      -   | ENSG00000000457       SCYL3       57147 protein_coding       chromosome
  ENSG00000000460        1 [169631245, 169823221]      +   | ENSG00000000460    C1orf112       55732 protein_coding       chromosome
              ...      ...                    ...    ... ...             ...         ...         ...            ...              ...
           LRG_94       10   [72357104, 72362531]      -   |          LRG_94      LRG_94        5551       LRG_gene       chromosome
           LRG_96       15   [55495792, 55582001]      -   |          LRG_96      LRG_96        5873       LRG_gene       chromosome
           LRG_97       22   [37621310, 37640305]      -   |          LRG_97      LRG_97        5880       LRG_gene       chromosome
           LRG_98       11   [36589563, 36601312]      +   |          LRG_98      LRG_98        5896       LRG_gene       chromosome
           LRG_99       11   [36613493, 36619812]      -   |          LRG_99      LRG_99        5897       LRG_gene       chromosome
  -------
  seqinfo: 273 sequences from GRCh37 genome
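
To connect this back to the annotation step in the question, here is a minimal sketch of passing the result of genes() to annotatePeakInBatch() as the annotation data. The two peaks are made up for illustration (they stand in for your overlapping peaks), and they use Ensembl-style chromosome names ("X", "1") so that they match the EnsDb annotation. If your own peaks use UCSC-style names such as "chrX", the names on one side would probably need to be converted first.

```r
library(ChIPpeakAnno)
library(EnsDb.Hsapiens.v75)

# Gene-level annotation as a GRanges object, as shown above
annoData <- genes(EnsDb.Hsapiens.v75)

# Hypothetical peaks for illustration only; replace with your overlapping peaks.
# Chromosome names are Ensembl style ("X", "1") to match annoData.
peaks <- GRanges(
  seqnames = c("X", "1"),
  ranges = IRanges(start = c(99890000, 169820000), width = 200)
)
names(peaks) <- c("peak1", "peak2")

# Annotate each peak with a nearby gene from annoData (default settings)
annotated <- annotatePeakInBatch(peaks, AnnotationData = annoData)
head(annotated)
```

The object passed as AnnotationData is the same GRanges that genes() returned, so no call to toGRanges is needed for this step.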