gwascat Liftover Fails Silently
Entering edit mode
anailis • 0
Last seen 7 weeks ago

I would like to obtain GWAS Catalog data on hg19 co-ordinates so I use makeCurrentGwascat(genome = "GRCh37"). Everything appears to go smoothly and the object says that it is on hg19 co-ordinates. But if you compare the co-ordinates to what they should be, you will quickly realise that they are still on hg38 co-ordinates despite the liftover appearing to complete successfully. For example, rs56106601 hg38's co-ordinates are chr9:128008205 and its hg19 co-ordinates are chr9:130770484 (, but attempt@elementMetadata[attempt@elementMetadata$SNPS=="rs56106601","CHR_POS"] yields 128008205 - hg38 co-ordinates.

> library(gwascat)
gwascat loaded.  Use makeCurrentGwascat() to extract current image.
 from EBI.  The data folder of this package has some legacy extracts.
> library(liftOver)
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append,, basename, cbind,
    colnames, dirname,, duplicated, eval, evalq, Filter,
    Find, get, grep, grepl, intersect, is.unsorted, lapply, Map,
    mapply, match, mget, order, paste, pmax,, pmin,, Position, rank, rbind, Reduce, rownames, sapply,
    setdiff, sort, table, tapply, union, unique, unsplit,
    which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:


Loading required package: IRanges

Attaching package: ‘IRanges’

The following object is masked from ‘package:grDevices’:


Loading required package: GenomeInfoDb
Loading required package: rtracklayer
Loading required package: Homo.sapiens
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages

Loading required package: OrganismDbi
Loading required package: GenomicFeatures
Loading required package: GO.db

Loading required package:

Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
> attempt <- makeCurrentGwascat(genome = "GRCh37", withOnt = F)
trying URL ''
Content type 'text/tsv' length unknown
ownloaded 125.9 MB

|============================================================| 100% 126 MB
Warning: 5303 parsing failures.
row     col               expected            actual                                                                      file
669 CHR_POS no trailing characters 27556782;27543283 'C:\Users\XXX\AppData\Local\Temp\RtmpO0PFpO\file3a1c3aa8659b'
670 CHR_POS no trailing characters 27556782;27543283 'C:\Users\XXX\AppData\Local\Temp\RtmpO0PFpO\file3a1c3aa8659b'
851 CHR_POS no trailing characters 34150806;34141638 'C:\Users\XXX\AppData\Local\Temp\RtmpO0PFpO\file3a1c3aa8659b'
852 CHR_POS no trailing characters 34150806;34141638 'C:\Users\XXX\AppData\Local\Temp\RtmpO0PFpO\file3a1c3aa8659b'
853 CHR_POS no trailing characters 34150806;34141638 'C:\Users\XXX\AppData\Local\Temp\RtmpO0PFpO\file3a1c3aa8659b'
... ....... ...................... ................. .........................................................................
See problems(...) for more details.

formatting gwaswloc instance...
NOTE: input data had non-ASCII characters replaced by '*'.
Loading required namespace: AnnotationHub
starting liftover from GRCh38 to GRCh37
snapshotDate(): 2020-10-27
loading from cache
liftover complete.
> attempt
gwasloc instance with 235055 records and 34 attributes per record.
Extracted:  Wed Feb 03 16:30:36 2021 
metadata()$badpos includes records for which no unique locus was given.
Genome:  GRCh37 
GRanges object with 5 ranges and 3 metadata columns:
      seqnames    ranges strand |   DISEASE/TRAIT        SNPS   P-VALUE
         <Rle> <IRanges>  <Rle> |     <character> <character> <numeric>
  [1]        8  19844222      * | HDL cholesterol  rs12678919     2e-34
  [2]       18  47167214      * | HDL cholesterol   rs4939883     7e-15
  [3]       11 116648917      * | HDL cholesterol    rs964184     1e-12
  [4]        9 107664301      * | HDL cholesterol   rs1883025     1e-09
  [5]        1 230295691      * | HDL cholesterol   rs4846914     4e-08
  seqinfo: 24 sequences from GRCh37 genome
> attempt@elementMetadata[attempt@elementMetadata$SNPS=="rs56106601","CHR_POS"]
[1] 128008205

Session info etc:

> BiocManager::valid()

* sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)

Matrix products: default

[1] LC_COLLATE=English_United Kingdom.1252 
[2] LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] liftOver_1.14.0                        
 [2] Homo.sapiens_1.3.1                     
 [3] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [5] GO.db_3.12.1                           
 [6] OrganismDbi_1.32.0                     
 [7] GenomicFeatures_1.42.1                 
 [8] AnnotationDbi_1.52.0                   
 [9] Biobase_2.50.0                         
[10] rtracklayer_1.49.5                     
[11] GenomicRanges_1.42.0                   
[12] GenomeInfoDb_1.26.2                    
[13] IRanges_2.24.1                         
[14] S4Vectors_0.28.1                       
[15] BiocGenerics_0.36.0                    
[16] gwascat_2.22.0                         

loaded via a namespace (and not attached):
 [1] bitops_1.0-6                  matrixStats_0.58.0           
 [3] bit64_4.0.5                   progress_1.2.2               
 [5] httr_1.4.2                    tools_4.0.3                  
 [7] R6_2.5.0                      DBI_1.1.1                    
 [9] withr_2.4.1                   tidyselect_1.1.0             
[11] prettyunits_1.1.1             bit_4.0.4                    
[13] curl_4.3                      compiler_4.0.3               
[15] graph_1.68.0                  pacman_0.5.1                 
[17] xml2_1.3.2                    DelayedArray_0.16.1          
[19] readr_1.4.0                   RBGL_1.66.0                  
[21] askpass_1.1                   rappdirs_0.3.3               
[23] stringr_1.4.0                 digest_0.6.27                
[25] Rsamtools_2.6.0               XVector_0.30.0               
[27] pkgconfig_2.0.3               htmltools_0.5.1.1            
[29] MatrixGenerics_1.2.1          dbplyr_2.0.0                 
[31] fastmap_1.1.0                 BSgenome_1.58.0              
[33] rlang_0.4.10                  rstudioapi_0.13              
[35] RSQLite_2.2.3                 shiny_1.6.0                  
[37] generics_0.1.0                BiocParallel_1.24.1          
[39] dplyr_1.0.4                   VariantAnnotation_1.36.0     
[41] RCurl_1.98-1.2                magrittr_2.0.1               
[43] GenomeInfoDbData_1.2.4        Matrix_1.3-2                 
[45] Rcpp_1.0.6                    lifecycle_0.2.0              
[47] stringi_1.5.3                 yaml_2.2.1                   
[49] SummarizedExperiment_1.20.0   zlibbioc_1.36.0              
[51] BiocFileCache_1.14.0          AnnotationHub_2.22.0         
[53] grid_4.0.3                    blob_1.2.1                   
[55] promises_1.1.1                snpStats_1.40.0              
[57] crayon_1.4.0                  lattice_0.20-41              
[59] Biostrings_2.58.0             splines_4.0.3                
[61] hms_1.0.0                     pillar_1.4.7                 
[63] biomaRt_2.46.2                XML_3.99-0.5                 
[65] glue_1.4.2                    BiocVersion_3.12.0           
[67] BiocManager_1.30.10           vctrs_0.3.6                  
[69] httpuv_1.5.5                  openssl_1.4.3                
[71] purrr_0.3.4                   assertthat_0.2.1             
[73] cachem_1.0.1                  mime_0.9                     
[75] xtable_1.8-4                  later_1.1.0.1                
[77] survival_3.2-7                tibble_3.0.6                 
[79] GenomicAlignments_1.26.0      memoise_2.0.0                
[81] ellipsis_0.3.1                interactiveDisplayBase_1.28.0

Bioconductor version '3.12'

  * 1 packages out-of-date
  * 0 packages too new

create a valid installation with

  BiocManager::install("rtracklayer", update = TRUE, ask = FALSE)

more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

Warning message:
1 packages out-of-date; 0 packages too new

Note that rtracklayer is still outdated because I'm a Windows user (see

gwascat • 52 views

Login before adding your answer.

Traffic: 365 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6