How to specify transcript_ids in makeTxDbFromUCSC function in GenomicFeatures package?
rbacher ▴ 20
@rbacher-12895

I would like to create a TxDb object restricted to a specific set of transcripts. However, setting transcript_ids doesn't appear to have the effect I expected. Instead of containing information only for the given transcript, the resulting TxDb object appears to contain everything.

Am I specifying the transcript_ids incorrectly? Or is there a better way to do this?

Any advice is appreciated!

Example:

> transcript_ids <- c("uc001aaa.3")

> txdbTry <- makeTxDbFromUCSC(genome="hg19", tablename="knownGene", transcript_ids=transcript_ids)
> metadata(txdbTry)

   transcript_nrow                                        82960

> length(keys(txdbTry))
[1] 23459

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

attached base packages:

[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ChIPseeker_1.15.2                       devtools_1.13.4                         biomaRt_2.35.1                         
 [4] limma_3.34.3                            org.Hs.eg.db_3.5.0                      TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [7] GenomicFeatures_1.30.0                  AnnotationDbi_1.40.0                    Biobase_2.38.0                         
[10] GenomicRanges_1.30.0                    GenomeInfoDb_1.14.0                     IRanges_2.12.0                         
[13] S4Vectors_0.16.0                        BiocGenerics_0.24.0                     EBSeq_1.18.0                           
[16] testthat_1.0.2                          gplots_3.0.1                            blockmodeling_0.1.9                    
[19] BiocInstaller_1.28.0                   

loaded via a namespace (and not attached):
 [1] bitops_1.0-6               matrixStats_0.52.2         bit64_0.9-7                httr_1.3.1                 RColorBrewer_1.1-2        
 [6] progress_1.1.2             UpSetR_1.3.3               tools_3.4.3                R6_2.2.2                   KernSmooth_2.23-15        
[11] DBI_0.7                    lazyeval_0.2.1             colorspace_1.3-2           withr_2.1.0                gridExtra_2.3             
[16] prettyunits_1.0.2          RMySQL_0.10.13             curl_3.1                   git2r_0.19.0               bit_1.1-12                
[21] compiler_3.4.3             DelayedArray_0.4.1         rtracklayer_1.38.2         caTools_1.17.1             scales_0.5.0              
[26] stringr_1.2.0              digest_0.6.12              Rsamtools_1.30.0           DOSE_3.4.0                 XVector_0.18.0            
[31] pkgconfig_2.0.1            plotrix_3.7                rlang_0.1.4                rstudioapi_0.7             RSQLite_2.0               
[36] bindr_0.1                  BiocParallel_1.12.0        gtools_3.5.0               GOSemSim_2.4.0             dplyr_0.7.4               
[41] RCurl_1.95-4.8             magrittr_1.5               GO.db_3.5.0                GenomeInfoDbData_0.99.1    Matrix_1.2-12             
[46] Rcpp_0.12.14               munsell_0.4.3              stringi_1.1.6              yaml_2.1.16                SummarizedExperiment_1.8.0
[51] zlibbioc_1.24.0            plyr_1.8.4                 qvalue_2.10.0              grid_3.4.3                 blob_1.1.0                
[56] gdata_2.18.0               DO.db_2.9                  crayon_1.3.4               lattice_0.20-35            Biostrings_2.46.0         
[61] splines_3.4.3              knitr_1.17                 fgsea_1.4.0                igraph_1.1.2               boot_1.3-20               
[66] reshape2_1.4.3             fastmatch_1.1-0            XML_3.98-1.9               glue_1.2.0                 data.table_1.10.4-3       
[71] gtable_0.2.0               assertthat_0.2.0           ggplot2_2.2.1              gridBase_0.4-7             tibble_1.3.4              
[76] rvcheck_0.0.9              GenomicAlignments_1.14.1   memoise_1.1.0              bindrcpp_0.2              


 


Is there a particular reason that installing the TxDb.Hsapiens.UCSC.hg19.knownGene package and then using only the transcripts you care about isn't sufficient?
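For what it's worth, here is a minimal sketch of what I mean, assuming the TxDb.Hsapiens.UCSC.hg19.knownGene package is installed. It uses the `filter` argument of the GenomicFeatures extractors to pull out only the transcript named in the question. The variable names `wanted`, `tx` and `ex` are just for illustration.

```r
# Transcript of interest, taken from the question above.
wanted <- c("uc001aaa.3")

# Guarded so the snippet does nothing when the annotation package is missing.
if (requireNamespace("TxDb.Hsapiens.UCSC.hg19.knownGene", quietly = TRUE)) {
  library(GenomicFeatures)
  txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene

  # `filter` is a named list; "tx_name" restricts the result by transcript name.
  tx <- transcripts(txdb, filter = list(tx_name = wanted))
  ex <- exons(txdb, filter = list(tx_name = wanted))

  print(tx)
  print(ex)
}
```

Note that this returns GRanges objects rather than a TxDb, so it only helps if the downstream tool accepts ranges.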


I intend to use the reduced TxDb object as input to the ChIPseeker package via:

ChIPseeker::annotatePeak(peaks, tssRegion=c(-3000, 3000), assignGenomicAnnotation=TRUE, TxDb=newTxDb)

where newTxDb contains only my genes of interest. I know annotatePeak will accept a GRanges object as the TxDb argument, but then it cannot assign genomic annotations, and I would like the information about intron, exon, UTR, etc.
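One possible way to get a reduced TxDb is sketched below; treat it as an untested idea, not a confirmed fix. It relies on the documented round trip in GenomicFeatures: `as.list()` on a TxDb dumps it to a list of data frames, and `do.call(makeTxDb, dump)` rebuilds a TxDb from that list. The helper `subset_txdb_dump()` and the toy data are hypothetical, written only for illustration, and the sketch assumes the dump contains `transcripts`, `splicings` and `genes` data frames linked by a `tx_id` column. Some metadata may not survive the round trip, and I have not checked the result against annotatePeak.

```r
# Hypothetical helper: keep only the rows of a TxDb dump that belong to the
# requested transcripts. `dump` is assumed to look like as.list(<TxDb>).
subset_txdb_dump <- function(dump, tx_names) {
  keep_ids <- dump$transcripts$tx_id[dump$transcripts$tx_name %in% tx_names]
  if (length(keep_ids) == 0L) {
    stop("none of the requested transcripts were found")
  }
  # Every table except chrominfo is linked to transcripts through tx_id.
  for (tbl in c("transcripts", "splicings", "genes")) {
    if (!is.null(dump[[tbl]])) {
      rows <- dump[[tbl]]$tx_id %in% keep_ids
      dump[[tbl]] <- dump[[tbl]][rows, , drop = FALSE]
    }
  }
  dump
}

# Toy dump (made-up names and coordinates) to show what the helper does.
toy <- list(
  transcripts = data.frame(tx_id = 1:3, tx_name = c("txA", "txB", "txC"),
                           tx_chrom = "chr1", tx_strand = "+",
                           tx_start = c(1L, 50L, 200L),
                           tx_end = c(100L, 150L, 300L),
                           stringsAsFactors = FALSE),
  splicings = data.frame(tx_id = c(1L, 1L, 2L, 3L),
                         exon_rank = c(1L, 2L, 1L, 1L),
                         exon_start = c(1L, 60L, 50L, 200L),
                         exon_end = c(40L, 100L, 150L, 300L)),
  genes = data.frame(tx_id = 1:3, gene_id = c("g1", "g2", "g3"),
                     stringsAsFactors = FALSE),
  chrominfo = data.frame(chrom = "chr1", length = 1000L,
                         stringsAsFactors = FALSE)
)
small <- subset_txdb_dump(toy, "txA")

# Applying the same helper to the real annotation, if it is installed.
if (requireNamespace("TxDb.Hsapiens.UCSC.hg19.knownGene", quietly = TRUE)) {
  library(GenomicFeatures)
  full <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene
  dump <- subset_txdb_dump(as.list(full), c("uc001aaa.3"))
  newTxDb <- do.call(makeTxDb, dump)
}
```

If this works as intended, `length(transcripts(newTxDb))` should match the number of requested transcripts, which is a quick check before passing newTxDb on.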

 
