Question: How to specify transcript_ids in makeTxDbFromUCSC function in GenomicFeatures package?
Asked 18 months ago by rbacher

I would like to create a TxDb object restricted to a specific set of transcripts. However, when I set transcript_ids, it doesn't have the effect I assumed was intended. Instead of containing only the information for the given transcript, the TxDb object appears to contain everything.

Am I specifying the transcript_ids incorrectly? Or is there a better way to do this?

Any advice is appreciated!

Example:

> transcript_ids <- c("uc001aaa.3")

> txdbTry <- makeTxDbFromUCSC(genome="hg19", tablename="knownGene", transcript_ids=transcript_ids)
> metadata(txdbTry)

   transcript_nrow                                        82960

> length(keys(txdbTry))
[1] 23459

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.1

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

attached base packages:

[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ChIPseeker_1.15.2                       devtools_1.13.4                         biomaRt_2.35.1                         
 [4] limma_3.34.3                            org.Hs.eg.db_3.5.0                      TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
 [7] GenomicFeatures_1.30.0                  AnnotationDbi_1.40.0                    Biobase_2.38.0                         
[10] GenomicRanges_1.30.0                    GenomeInfoDb_1.14.0                     IRanges_2.12.0                         
[13] S4Vectors_0.16.0                        BiocGenerics_0.24.0                     EBSeq_1.18.0                           
[16] testthat_1.0.2                          gplots_3.0.1                            blockmodeling_0.1.9                    
[19] BiocInstaller_1.28.0                   

loaded via a namespace (and not attached):
 [1] bitops_1.0-6               matrixStats_0.52.2         bit64_0.9-7                httr_1.3.1                 RColorBrewer_1.1-2        
 [6] progress_1.1.2             UpSetR_1.3.3               tools_3.4.3                R6_2.2.2                   KernSmooth_2.23-15        
[11] DBI_0.7                    lazyeval_0.2.1             colorspace_1.3-2           withr_2.1.0                gridExtra_2.3             
[16] prettyunits_1.0.2          RMySQL_0.10.13             curl_3.1                   git2r_0.19.0               bit_1.1-12                
[21] compiler_3.4.3             DelayedArray_0.4.1         rtracklayer_1.38.2         caTools_1.17.1             scales_0.5.0              
[26] stringr_1.2.0              digest_0.6.12              Rsamtools_1.30.0           DOSE_3.4.0                 XVector_0.18.0            
[31] pkgconfig_2.0.1            plotrix_3.7                rlang_0.1.4                rstudioapi_0.7             RSQLite_2.0               
[36] bindr_0.1                  BiocParallel_1.12.0        gtools_3.5.0               GOSemSim_2.4.0             dplyr_0.7.4               
[41] RCurl_1.95-4.8             magrittr_1.5               GO.db_3.5.0                GenomeInfoDbData_0.99.1    Matrix_1.2-12             
[46] Rcpp_0.12.14               munsell_0.4.3              stringi_1.1.6              yaml_2.1.16                SummarizedExperiment_1.8.0
[51] zlibbioc_1.24.0            plyr_1.8.4                 qvalue_2.10.0              grid_3.4.3                 blob_1.1.0                
[56] gdata_2.18.0               DO.db_2.9                  crayon_1.3.4               lattice_0.20-35            Biostrings_2.46.0         
[61] splines_3.4.3              knitr_1.17                 fgsea_1.4.0                igraph_1.1.2               boot_1.3-20               
[66] reshape2_1.4.3             fastmatch_1.1-0            XML_3.98-1.9               glue_1.2.0                 data.table_1.10.4-3       
[71] gtable_0.2.0               assertthat_0.2.0           ggplot2_2.2.1              gridBase_0.4-7             tibble_1.3.4              
[76] rvcheck_0.0.9              GenomicAlignments_1.14.1   memoise_1.1.0              bindrcpp_0.2              


 

written 18 months ago by rbacher

Is there a particular reason that installing the TxDb.Hsapiens.UCSC.hg19.knownGene and then using just the transcripts you care about isn't sufficient?

written 18 months ago by James W. MacDonald
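
For reference, a minimal sketch of what that suggestion could look like: pulling only the transcripts of interest out of the installed annotation package with the `filter` argument of `transcripts()`. The variable names `tx_subset` and `exons_subset` are illustrative, and the results are GRanges/GRangesList objects, not a TxDb.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
transcript_ids <- c("uc001aaa.3")

# Restrict the query to the named transcripts; returns a GRanges object.
tx_subset <- transcripts(txdb, filter = list(tx_name = transcript_ids))

# Exons grouped by transcript (named by transcript name), then subset
# to the same transcripts.
exons_by_tx <- exonsBy(txdb, by = "tx", use.names = TRUE)
exons_subset <- exons_by_tx[names(exons_by_tx) %in% transcript_ids]
```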

I intend to use the reduced TxDb object as input to the ChIPseeker package via:

ChIPseeker::annotatePeak(peaks, tssRegion=c(-3000, 3000), assignGenomicAnnotation=TRUE, TxDb=newTxDb)

where newTxDb contains only my genes of interest. I know annotatePeak will accept a GRanges object as the TxDb argument, but then it cannot perform the assignGenomicAnnotation step, and I would like the information about introns, exons, UTRs, etc.

 

modified 18 months ago • written 18 months ago by rbacher
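
One possible way to get a reduced TxDb, sketched here under assumptions rather than as a confirmed fix for the `transcript_ids` behaviour above: dump the full TxDb to a list of data frames with `as.list()`, keep only the rows belonging to the transcripts of interest, and rebuild with `makeTxDb()`. The names `dump`, `keep_tx`, and `keep_ids` are illustrative. The rebuilt object gets fresh metadata, and in newer Bioconductor releases `makeTxDb()` may need to be loaded from the txdbmaker package instead of GenomicFeatures.

```r
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
transcript_ids <- c("uc001aaa.3")

# Dump the database into data frames: transcripts, splicings, genes, chrominfo.
dump <- as.list(txdb)

# Internal tx_id values of the transcripts we want to keep.
keep_tx  <- dump$transcripts$tx_name %in% transcript_ids
keep_ids <- dump$transcripts$tx_id[keep_tx]

# Subset every table that is keyed by tx_id; chrominfo is left untouched.
dump$transcripts <- dump$transcripts[keep_tx, , drop = FALSE]
dump$splicings   <- dump$splicings[dump$splicings$tx_id %in% keep_ids, , drop = FALSE]
dump$genes       <- dump$genes[dump$genes$tx_id %in% keep_ids, , drop = FALSE]

# Rebuild a TxDb from the reduced tables.
newTxDb <- do.call(makeTxDb, dump)
```

The resulting `newTxDb` could then be passed as the `TxDb` argument in the `annotatePeak` call above. Whether ChIPseeker annotates as intended against such a reduced object has not been checked here.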