You asked why the gene PLCXD1 can be found among the transcripts but not among the genes in the UCSC annotation.
My understanding is that you picked a very special gene. This gene has two locations: one on chrX and one on chrY.
When you call the genes() function from the GenomicFeatures package with its default parameters, every returned gene has a single, unique location, so genes with more than one location are left out.
However, the transcript IDs are unique to each transcript of this gene, so you have no problem using transcripts(txdb) to get the transcripts.
If you read the help page of the genes() function, there is a parameter, single.strand.genes.only, that deals with genes at multiple locations:
TRUE or FALSE. If TRUE (the default), then genes that have exons located on both strands of the same chromosome or on two different chromosomes are dropped. In that case, the genes are returned in a GRanges object. Otherwise, all genes are returned in a GRangesList object with the columns specified thru the columns argument set as top level metadata columns. (Please keep in mind that the top level metadata columns of a GRangesList object are not displayed by the show method.)
The following code is very instructive:
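library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
cols <- c("tx_id", "tx_chrom", "tx_strand",
          "exon_id", "exon_chrom", "exon_strand")
# Default: genes on both strands or on more than one chromosome are dropped
single_strand_genes <- genes(txdb, columns=cols)
# Keep every gene; the result is a GRangesList instead of a GRanges
all_genes <- genes(txdb, columns=cols, single.strand.genes.only=FALSE)
all_genes  # a GRangesList object
# Genes that have ranges at two or more locations
multiple_strand_genes <- all_genes[elementNROWS(all_genes) >= 2]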
55344 is the gene ID of PLCXD1.
You can run the code yourself and look up this ID in the results.
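To check this for PLCXD1 specifically, a minimal sketch along the following lines should work. It assumes the TxDb.Hsapiens.UCSC.hg19.knownGene package is installed and that the objects returned by genes() are named by gene ID. The variable names plcxd1_id, plcxd1_ranges and plcxd1_tx are illustrative only.

```r
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

plcxd1_id <- "55344"  # gene ID of PLCXD1, as stated above

# Default call: genes with more than one location are dropped,
# so this membership test is expected to come back FALSE
single_strand_genes <- genes(txdb)
plcxd1_id %in% names(single_strand_genes)

# Keep all genes; the result is a GRangesList named by gene ID (assumption)
all_genes <- genes(txdb, single.strand.genes.only=FALSE)
plcxd1_ranges <- all_genes[[plcxd1_id]]
plcxd1_ranges            # one range per location of the gene
seqnames(plcxd1_ranges)  # chromosomes the gene is found on

# Transcripts are not affected by the single-location restriction
plcxd1_tx <- transcripts(txdb, filter=list(gene_id=plcxd1_id))
plcxd1_tx
```

If the gene really sits on both chrX and chrY, plcxd1_ranges should contain one range for each chromosome, while the default genes() result should not contain the ID at all.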
As to the gene_bio_type, here is the explanation:
Thank you for using ChIPpeakAnno, and thanks for the suggestion. Let us know if you have more questions.
Posted for Jun Yu