Annotation using QSEA package
1
0
Entering edit mode
rtrivedi1 ▴ 10
@4899789e
Last seen 3.4 years ago
United States

Hi everyone,

I am analyzing methylation data obtained by using enrichment based methods using QSEA package. While performing annotation, I am getting the following error.

result=makeTable(QSEAset, glm=qseaGLM,
+                  groupMeans=getSampleGroups(QSEAset), keep=sig,
+                  annotation=hg38_anno_final,
+                  norm_method="beta")

Run:
adding test results from TvN
obtaining raw values for 3 samples in 1353 windows
deriving beta values...
adding annotation
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': sequences chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22 have incompatible genomes:
  - in 'x': BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38
  - in 'y': BSgenome.Hsapiens.UCSC.NA, BSgenome.Hsapiens.UCSC.NA, BSgen

For qsea object creation, I have used 'BSgenome.Hsapiens.UCSC.hg38', and got the annotation file (hg38_anno_final) using 'TxDb.Hsapiens.UCSC.hg38.knownGene'. I also have changed all the seqinfo in the hg38_anno_final file using following

seqinfo(hg38_anno_final$hg38_genes_5UTRs)@genome<-paste0("BSgenome.Hsapiens.UCSC.",seqinfo(hg38_anno_final$hg38_genes_5UTRs)@genome)

I am unable to figure out the origin of problem. Any help will be highly appreciated.

sessionInfo( )

R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] shiny_1.6.0                             
 [2] AnnotationHub_2.22.1                    
 [3] BiocFileCache_1.14.0                    
 [4] dbplyr_2.1.1                            
 [5] BSgenome.Hsapiens.UCSC.hg19_1.4.3       
 [6] org.Hs.eg.db_3.12.0                     
 [7] TxDb.Hsapiens.UCSC.hg38.knownGene_3.10.0
 [8] GenomicFeatures_1.42.3                  
 [9] AnnotationDbi_1.52.0                    
[10] Biobase_2.50.0                          
[11] annotatr_1.16.0                         
[12] BSgenome.Hsapiens.UCSC.hg38_1.4.3       
[13] BSgenome_1.58.0                         
[14] rtracklayer_1.50.0                      
[15] Biostrings_2.58.0                       
[16] XVector_0.30.0                          
[17] GenomicRanges_1.42.0                    
[18] GenomeInfoDb_1.26.7                     
[19] IRanges_2.24.1                          
[20] S4Vectors_0.28.1                        
[21] BiocGenerics_0.36.1                     
[22] qsea_1.16.0                             

loaded via a namespace (and not attached):
 [1] bitops_1.0-7                           
 [2] matrixStats_0.59.0                     
 [3] bit64_4.0.5                            
 [4] progress_1.2.2                         
 [5] httr_1.4.2                             
 [6] bslib_0.2.5.1                          
 [7] tools_4.0.5                            
 [8] DT_0.18                                
 [9] utf8_1.2.1                             
[10] R6_2.5.0                               
[11] colorspace_2.0-1                       
[12] DBI_1.1.1                              
[13] withr_2.4.2                            
[14] tidyselect_1.1.1                       
[15] prettyunits_1.1.1                      
[16] bit_4.0.4                              
[17] curl_4.3.1                             
[18] compiler_4.0.5                         
[19] cli_2.5.0                              
[20] xml2_1.3.2                             
[21] DelayedArray_0.16.3                    
[22] sass_0.4.0                             
[23] scales_1.1.1                           
[24] readr_1.4.0                            
[25] askpass_1.1                            
[26] rappdirs_0.3.3                         
[27] stringr_1.4.0                          
[28] digest_0.6.27                          
[29] Rsamtools_2.6.0                        
[30] pkgconfig_2.0.3                        
[31] htmltools_0.5.1.1                      
[32] MatrixGenerics_1.2.1                   
[33] regioneR_1.22.0                        
[34] fastmap_1.1.0                          
[35] limma_3.46.0                           
[36] htmlwidgets_1.5.3                      
[37] rlang_0.4.11                           
[38] rstudioapi_0.13                        
[39] RSQLite_2.2.7                          
[40] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[41] jquerylib_0.1.4                        
[42] generics_0.1.0                         
[43] jsonlite_1.7.2                         
[44] zoo_1.8-9                              
[45] crosstalk_1.1.1                        
[46] BiocParallel_1.24.1                    
[47] gtools_3.9.2                           
[48] dplyr_1.0.6                            
[49] RCurl_1.98-1.3                         
[50] magrittr_2.0.1                         
[51] GenomeInfoDbData_1.2.4                 
[52] Matrix_1.3-4                           
[53] Rcpp_1.0.6                             
[54] munsell_0.5.0                          
[55] fansi_0.5.0                            
[56] lifecycle_1.0.0                        
[57] stringi_1.6.2                          
[58] yaml_2.2.1                             
[59] SummarizedExperiment_1.20.0            
[60] zlibbioc_1.36.0                        
[61] plyr_1.8.6                             
[62] HMMcopy_1.32.0                         
[63] grid_4.0.5                             
[64] blob_1.2.1                             
[65] promises_1.2.0.1                       
[66] crayon_1.4.1                           
[67] lattice_0.20-44                        
[68] hms_1.1.0                              
[69] knitr_1.33                             
[70] pillar_1.6.1                           
[71] reshape2_1.4.4                         
[72] biomaRt_2.46.3                         
[73] XML_3.99-0.6                           
[74] glue_1.4.2                             
[75] BiocVersion_3.12.0                     
[76] data.table_1.14.0                      
[77] BiocManager_1.30.16                    
[78] vctrs_0.3.8                            
[79] httpuv_1.6.1                           
[80] gtable_0.3.0                           
[81] openssl_1.4.4                          
[82] purrr_0.3.4                            
[83] assertthat_0.2.1                       
[84] cachem_1.0.5                           
[85] ggplot2_3.3.4                          
[86] xfun_0.24                              
[87] mime_0.10                              
[88] xtable_1.8-4                           
[89] later_1.2.0                            
[90] tibble_3.1.2                           
[91] GenomicAlignments_1.26.0               
[92] memoise_2.0.0                          
[93] ellipsis_0.3.2                         
[94] interactiveDisplayBase_1.28.0

Thanks

Rakesh

DNAMethylation QSEA MEDIP • 1.7k views
ADD COMMENT
1
Entering edit mode

What is class(hg38_anno_final) ? It should be a GRanges object to work correctly

ADD REPLY
0
Entering edit mode

It's a list of GRanges object.

> class(hg38_anno_final)
[1] "list"

> head(hg38_anno_final)
$hg38_genes_1to5kb
GRanges object with 219477 ranges and 5 metadata columns:
           seqnames            ranges strand |            id             tx_id
              <Rle>         <IRanges>  <Rle> |   <character>       <character>
       [1]     chr1        6869-10868      + |      1to5kb:1 ENST00000456328.2
       [2]     chr1        7010-11009      + |      1to5kb:2 ENST00000450305.2
       [3]     chr1       24554-28553      + |      1to5kb:3 ENST00000473358.1
       [4]     chr1       25267-29266      + |      1to5kb:4 ENST00000469289.1
       [5]     chr1       25366-29365      + |      1to5kb:5 ENST00000607096.1
       ...      ...               ...    ... .           ...               ...
  [219473]    chr22 50784287-50788286      - | 1to5kb:219473 ENST00000464678.5
  [219474]    chr22 50784631-50788630      - | 1to5kb:219474 ENST00000395590.5
  [219475]    chr22 50784601-50788600      - | 1to5kb:219475 ENST00000468451.1
  [219476]    chr22 50784631-50788630      - | 1to5kb:219476 ENST00000464740.1
  [219477]    chr22 50784046-50788045      - | 1to5kb:219477 ENST00000413505.1
               gene_id      symbol              type
           <character> <character>       <character>
       [1]   100287102     DDX11L1 hg38_genes_1to5kb
       [2]   100287102     DDX11L1 hg38_genes_1to5kb
       [3]        <NA>        <NA> hg38_genes_1to5kb
       [4]        <NA>        <NA> hg38_genes_1to5kb
       [5]   100302278   MIR1302-2 hg38_genes_1to5kb
       ...         ...         ...               ...
  [219473]       11158      RABL2B hg38_genes_1to5kb
  [219474]       11158      RABL2B hg38_genes_1to5kb
  [219475]       11158      RABL2B hg38_genes_1to5kb
  [219476]       11158      RABL2B hg38_genes_1to5kb
  [219477]       11158      RABL2B hg38_genes_1to5kb
  -------
  seqinfo: 22 sequences from BSgenome.Hsapiens.UCSC.hg38 genome

Thanks

Rakesh

ADD REPLY
0
Entering edit mode

What is the code you used to construct your hg38_anno_final ?

ADD REPLY
0
Entering edit mode

I used the following code to get hg38_anno_final:

library(annotatr)

library(TxDb.Hsapiens.UCSC.hg38.knownGene)

hg38_annot <- builtin_annotations()[grep("hg38",builtin_annotations())]

hg38_annot <- hg38_annot[hg38_annot!="hg38_lncrna_gencode"]

an_fun <- function(anno) build_annotations(genome='hg38', annotations = anno)

hg38_anno_final <- sapply(hg38_annot,an_fun)
ADD REPLY
1
Entering edit mode
@james-w-macdonald-5106
Last seen 2 days ago
United States

This sort of thing:

seqinfo(hg38_anno_final$hg38_genes_5UTRs)@genome<-paste0("BSgenome.Hsapiens.UCSC.",seqinfo(hg38_anno_final$hg38_genes_5UTRs)@genome)

is something that you should almost never (like 99% of the time) need to do, nor should you do it. In other words, mucking about in an S4 object instead of using the provided accessors is a suboptimal thing to do, unless you are a super l33t expert. Because what you ended up doing was borking all of the seqinfo for your hg38_anno_final object. Which is what the error says:

- in 'y': BSgenome.Hsapiens.UCSC.NA, BSgenome.Hsapiens.UCSC.NA, BSgen

You should probably show why you are trying to change the seqinfo slot like that, and perhaps people can then tell you the right way to do it.

ADD COMMENT
0
Entering edit mode

Hi,

Step 1: I used the following code to get hg38_anno_final:

library(annotatr)

library(TxDb.Hsapiens.UCSC.hg38.knownGene)

hg38_annot <- builtin_annotations()[grep("hg38",builtin_annotations())]

hg38_annot <- hg38_annot[hg38_annot!="hg38_lncrna_gencode"]

an_fun <- function(anno) build_annotations(genome='hg38', annotations = anno)

hg38_anno_final <- sapply(hg38_annot,an_fun)

Step 2 : I calculated regions of interest (ROIs) by using this code:

CGIprom = intersect(hg38_anno_final$hg38_cpg_islands, hg38_anno_final$hg38_genes_1to5kb,ignore.strand=TRUE)

Step 3: I calculated PCA using this:

pca_cgi = getPCA(QSEAset, norm_method="beta", ROIs=CGIprom)

Error I got:

Error in mergeNamedAtomicVectors(genome(x), genome(y), what = c("sequence",  : 
  sequences chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22 have incompatible genomes:
  - in 'x': BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38
  - in 'y': hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38, hg38

So I changed the all the seqinfo slot using similar type of code:

seqinfo(hg38_anno_final$hg38_genes_5UTRs)@genome<-paste0("BSgenome.Hsapiens.UCSC.",seqinfo(hg38_anno_final$hg38_genes_5UTRs)@genome)

Then Step 3 ran successfully. without any errors.

Step 4: I performed Differential Methylation Analysis using following code:

design=model.matrix(~group, getSampleTable(QSEAset) )

qseaGLM=fitNBglm(QSEAset, design, norm_method="beta")

qseaGLM=addContrast(QSEAset, qseaGLM, coef=2, name="TvN" )

Step 5 : I started annotating the results:

library(GenomicRanges) 

sig=isSignificant(qseaGLM, fdr_th=.01)             # Step executed successfully

result=makeTable(prad_QSEAset, glm=prad_qseaGLM,                               # Step giving Errors
                 groupMeans=getSampleGroups(prad_QSEAset), keep=sig,
                 annotation=hg38_anno_final,
                 norm_method="beta")

Here I got this error:

result=makeTable(QSEAset, glm=qseaGLM,
+                  groupMeans=getSampleGroups(QSEAset), keep=sig,
+                  annotation=hg38_anno_final,
+                  norm_method="beta")

Run:
adding test results from TvN
obtaining raw values for 3 samples in 1353 windows
deriving beta values...
adding annotation
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': sequences chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10, chr11, chr12, chr13, chr14, chr15, chr16, chr17, chr18, chr19, chr20, chr21, chr22 have incompatible genomes:
  - in 'x': BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38, BSgenome.Hsapiens.UCSC.hg38
  - in 'y': BSgenome.Hsapiens.UCSC.NA, BSgenome.Hsapiens.UCSC.NA, BSgen

That's how I got the error.

Thanks

Rakesh

ADD REPLY
1
Entering edit mode

OK. This is the part I meant about using an accessor. There is a function called genome that does what you have tried to do. It actually does genome(seqinfo(x)) where x is a GRanges object. So as an example

> library(TxDb.Hsapiens.UCSC.hg38.knownGene)
> z <- transcriptsBy(TxDb.Hsapiens.UCSC.hg38.knownGene)
> head(genome(z))
  chr1   chr2   chr3   chr4   chr5   chr6 
"hg38" "hg38" "hg38" "hg38" "hg38" "hg38" 
> genome(z) <- "BSgenome.Hsapiens.UCSC.hg38"
> head(genome(z))
                         chr1                          chr2 
"BSgenome.Hsapiens.UCSC.hg38" "BSgenome.Hsapiens.UCSC.hg38" 
                         chr3                          chr4 
"BSgenome.Hsapiens.UCSC.hg38" "BSgenome.Hsapiens.UCSC.hg38" 
                         chr5                          chr6 
"BSgenome.Hsapiens.UCSC.hg38" "BSgenome.Hsapiens.UCSC.hg38"

So doing something like

for(i in seq(along = hg38_anno_final))  genome(hg38_anno_final[[i]]) <- "BSgenome.Hsapiens.UCSC.hg38"

Should fix the issue.

ADD REPLY
0
Entering edit mode

Issue is fixed now !!

Thanks

Rakesh

ADD REPLY

Login before adding your answer.

Traffic: 524 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6