Hi all,
I'm sure it's just me missing something obvious... but I can't find PTEN in TxDb.Hsapiens.UCSC.hg38.knownGene.
If I look up two genes (NF1 and PTEN) in hg19, I get the positions for both:
> library(AnnotationDbi)
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> library(TxDb.Hsapiens.UCSC.hg38.knownGene)
> AnnotationDbi::select(org.Hs.eg.db::org.Hs.eg.db, keys=c("NF1", "PTEN"), keytype="SYMBOL", columns="ENTREZID")
> all.genes <- genes(TxDb.Hsapiens.UCSC.hg19.knownGene)
> all.genes[(all.genes$gene_id %in% c("4763", "5728"))]
GRanges object with 2 ranges and 1 metadata column:
       seqnames            ranges strand |     gene_id
          <Rle>         <IRanges>  <Rle> | <character>
  4763    chr17 29421945-29708905      + |        4763
  5728    chr10 89623195-89728532      + |        5728
  -------
  seqinfo: 93 sequences (1 circular) from hg19 genome
However, if I do the same with hg38, I only get the information for NF1, not for PTEN:
> all.genes <- genes(TxDb.Hsapiens.UCSC.hg38.knownGene)
> all.genes[(all.genes$gene_id %in% c("4763", "5728"))]
GRanges object with 1 range and 1 metadata column:
       seqnames            ranges strand |     gene_id
          <Rle>         <IRanges>  <Rle> | <character>
  4763    chr17 31094927-31381887      + |        4763
  -------
  seqinfo: 595 sequences (1 circular) from hg38 genome
Am I missing something? Or was PTEN not included in the TxDb?
Thanks a lot!
Bernat
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 8 (jessie)
Matrix products: default
BLAS:   /software/debian-8/general/R-3.6.1-bioc-3.10/lib/R/lib/libRblas.so
LAPACK: /software/debian-8/general/R-3.6.1-bioc-3.10/lib/R/lib/libRlapack.so
locale:
 [1] LC_CTYPE=en_US.UTF-8      LC_NUMERIC=C              LC_TIME=C                 LC_COLLATE=en_US.utf8    
 [5] LC_MONETARY=en_US.utf8    LC_MESSAGES=en_US.utf8    LC_PAPER=es_ES.UTF-8      LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C            LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C      
attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
 [1] TxDb.Hsapiens.UCSC.hg38.knownGene_3.10.0 TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2 
 [3] GenomicFeatures_1.38.0                   GenomicRanges_1.38.0                    
 [5] GenomeInfoDb_1.22.0                      AnnotationDbi_1.48.0                    
 [7] IRanges_2.20.0                           S4Vectors_0.24.0                        
 [9] Biobase_2.46.0                           BiocGenerics_0.32.0                     
loaded via a namespace (and not attached):
 [1] SummarizedExperiment_1.16.0 progress_1.2.2              tidyselect_0.2.5           
 [4] purrr_0.3.3                 lattice_0.20-38             vctrs_0.2.0                
 [7] BiocFileCache_1.10.2        rtracklayer_1.46.0          yaml_2.2.0                 
[10] blob_1.2.0                  XML_3.98-1.20               rlang_0.4.1                
[13] pillar_1.4.2                glue_1.3.1                  DBI_1.0.0                  
[16] BiocParallel_1.20.0         rappdirs_0.3.1              bit64_0.9-7                
[19] dbplyr_1.4.2                matrixStats_0.55.0          GenomeInfoDbData_1.2.2     
[22] stringr_1.4.0               zlibbioc_1.32.0             Biostrings_2.54.0          
[25] memoise_1.1.0               biomaRt_2.42.0              curl_4.2                   
[28] Rcpp_1.0.3                  openssl_1.4.1               backports_1.1.5            
[31] DelayedArray_0.12.0         org.Hs.eg.db_3.10.0         XVector_0.26.0             
[34] bit_1.1-14                  Rsamtools_2.2.1             hms_0.5.2                  
[37] askpass_1.1                 digest_0.6.22               stringi_1.4.3              
[40] dplyr_0.8.3                 grid_3.6.1                  tools_3.6.1                
[43] bitops_1.0-6                magrittr_1.5                RCurl_1.95-4.12            
[46] tibble_2.1.3                RSQLite_2.1.2               crayon_1.3.4               
[49] pkgconfig_2.0.3             zeallot_0.1.0               Matrix_1.2-17              
[52] prettyunits_1.0.2           assertthat_0.2.1            httr_1.4.1                 
[55] rstudioapi_0.10             R6_2.4.1                    GenomicAlignments_1.22.1   
[58] compiler_3.6.1
Thanks Lori!
It works perfectly with that. If I understand correctly, this is because PTEN maps both to the canonical chromosome and to a genome patch, isn't it?
In any case, it seems I'm not the only one wondering where my genes have gone! Maybe a note on the existence of this option could be added to the GenomicFeatures vignette? Thanks again!
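A minimal sketch of the kind of call that recovers PTEN, assuming the option in question is the `single.strand.genes.only` argument of `genes()` (the thread does not name it explicitly, so treat that as an assumption). The variable names `txdb` and `hits` are only for illustration.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# Assumption: single.strand.genes.only is the option discussed in this thread.
# With FALSE, genes() returns a GRangesList (one element per gene) instead of
# a GRanges, so genes located on more than one sequence are kept.
all.genes <- genes(txdb, single.strand.genes.only = FALSE)

# Entrez IDs for NF1 (4763) and PTEN (5728), as in the question
hits <- all.genes[names(all.genes) %in% c("4763", "5728")]
hits

# Sequences on which each gene has at least one range
lapply(hits, function(gr) unique(as.character(seqnames(gr))))
```

Because the result is a GRangesList rather than a GRanges, downstream code that expects one range per gene may need adjusting, for example by picking the range on the canonical chromosome.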
Thank you for the suggestion; we will look into it.
FWIW: I have submitted a pull request that will output a message when this filtering occurs - hopefully it will be helpful. https://github.com/Bioconductor/GenomicFeatures/pull/20
Great! That would be very helpful! Thanks!