The GFF can be downloa TAIR_download
I want to make my own Txdb from Araport11( Arabidopsis thaliana) using makeTxdbFromGFF. But I found a weird thing. Some miRNA appears in my gene result.
seq_length <- read.table("~/reference/genome/TAIR10/Athaliana.fa.fai")[,1:2]
Txdb_gff <- makeTxDbFromGFF("~/reference/annoation/Athaliana/Araport11/Araport11_GFF3_genes_transposons.201606.gff", chrominfo = Seqinfo(seqnames = seq_length$V1, seqlengths = seq_length$V2))
# Import genomic features from the file as a GRanges object ... OK
# Prepare the 'metadata' data frame ... OK
# Make the TxDb object ... OK
# Warning message:
# In .get_cds_IDX(mcols0$type, mcols0$phase) :
# The "phase" metadata column contains non-NA values for
# features of type exon. This information was ignored.
> genes(Txdb_gff)
GRanges object with 38194 ranges and 1 metadata column:
seqnames ranges strand | gene_id
<Rle> <IRanges> <Rle> | <character>
AT1G01010 Chr1 3631-5899 + | AT1G01010
AT1G01020 Chr1 6788-9130 - | AT1G01020
AT1G01030 Chr1 11649-13714 - | AT1G01030
AT1G01040 Chr1 23121-31227 + | AT1G01040
AT1G01046 Chr1 28500-28706 + | AT1G01046
... ... ... ... . ...
MIR854c Chr5 11838096-11838316 + | MIR854c
MIR854d Chr5 11689861-11690081 - | MIR854d
MIR854e Chr5 11942467-11942687 + | MIR854e
MIR855 Chr2 4674427-4674698 + | MIR855
MIR858b Chr1 26772633-26772935 + | MIR858b
-------
seqinfo: 7 sequences from an unspecified genome
But when I check the MIRNA, it did has its own gene ID.
$ grep MIR855 Araport11_GFF3_genes_transposons.201606.gff
Chr2 Araport11 gene 4674427 4674698 . + . ID=AT2G05605;Name=MIR855;locus_type=mirna
Chr2 Araport11 miRNA_primary_transcript 4674427 4674698 . + . ID=AT2G05605.1;Parent=AT2G05605;Alias=MI0005411;Name=ath-MIR855;symbol=ath-MIR855;Note=microRNA ath-MIR855 precursor
I am wondering why the makeTxdbFromGFF use the MIRNA name rather than AT2G05605.
But I noticed a info about that in GenomicFeatures/NEWS in line 241
makeTxDbFromGRanges() now recognizes features of type miRNA,
miRNA_primary_transcript, SRP_RNA, RNase_P_RNA, RNase_MRP_RNA, misc_RNA,
antisense_RNA, and antisense as transcripts. It also now recognizes
features of type transposable_element_gene as genes.
So I am wondering whether I can change the MIR855 into its original geneID(AT2G05605).
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS: /opt/sysoft/R-3.6.1/lib64/R/lib/libRblas.so
LAPACK: /opt/sysoft/R-3.6.1/lib64/R/lib/libRlapack.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 parallel stats graphics grDevices utils
[7] datasets methods base
other attached packages:
[1] GenomicFeatures_1.38.0 AnnotationDbi_1.48.0
[3] Biobase_2.46.0 GenomicRanges_1.38.0
[5] GenomeInfoDb_1.22.0 IRanges_2.20.0
[7] S4Vectors_0.24.0 BiocGenerics_0.32.0
loaded via a namespace (and not attached):
[1] SummarizedExperiment_1.16.0 progress_1.2.2
[3] tidyselect_0.2.5 purrr_0.3.3
[5] lattice_0.20-38 vctrs_0.2.1
[7] BiocFileCache_1.10.0 rtracklayer_1.46.0
[9] blob_1.2.0 XML_3.98-1.20
[11] rlang_0.4.2 pillar_1.4.3
[13] glue_1.3.1 DBI_1.1.0
[15] BiocParallel_1.19.6 rappdirs_0.3.1
[17] bit64_0.9-7 dbplyr_1.4.2
[19] matrixStats_0.55.0 GenomeInfoDbData_1.2.2
[21] stringr_1.4.0 zlibbioc_1.32.0
[23] Biostrings_2.54.0 memoise_1.1.0
[25] biomaRt_2.42.0 curl_4.3
[27] Rcpp_1.0.3 openssl_1.4.1
[29] backports_1.1.5 DelayedArray_0.12.0
[31] XVector_0.26.0 bit_1.1-14
[33] Rsamtools_2.2.0 hms_0.5.2
[35] askpass_1.1 packrat_0.5.0
[37] digest_0.6.23 stringi_1.4.3
[39] dplyr_0.8.3 grid_3.6.1
[41] tools_3.6.1 bitops_1.0-6
[43] magrittr_1.5 RCurl_1.95-4.12
[45] tibble_2.1.3 RSQLite_2.1.5
[47] crayon_1.3.4 pkgconfig_2.0.3
[49] zeallot_0.1.0 Matrix_1.2-18
[51] prettyunits_1.0.2 assertthat_0.2.1
[53] httr_1.4.1 rstudioapi_0.10
[55] R6_2.4.1 GenomicAlignments_1.22.0
[57] compiler_3.6.1
Can somebody help :)