I received the following question from Andrew Lamb, which I'm posting here since it might be of interest to other users. I'll answer it shortly. He is using this GTF file: ftp://ftp.ensembl.org/pub/release-81/gtf/homo_sapiens/Homo_sapiens.GRCh38.81.gtf.gz
The only issue is that we're still having trouble annotating it using the GRCh38 GTF file. Our RNA-seq data was aligned using this version, so we would like to be able to annotate with it as well. The error appears when calling the makeGenomicState function, although the underlying problem may be in the makeTxDbFromGFF step. Please see below:
library(GenomicFeatures)
library(derfinder)
gtf_file <- '/udd/reala/reference_files/DE/Homo_sapiens.GRCh38.81.gtf'
TXDB <- makeTxDbFromGFF(gtf_file, format = 'gtf')
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning messages:
1: RSQLite::dbGetPreparedQuery() is deprecated, please switch to DBI::dbGetQuery(params = bind.data).
2: Named parameters not used in query: internal_chrom_id, chrom, length, is_circular
3: Named parameters not used in query: internal_id, name, type, chrom, strand, start, end
4: Named parameters not used in query: internal_id, name, chrom, strand, start, end
5: Named parameters not used in query: internal_id, name, chrom, strand, start, end
6: Named parameters not used in query: internal_tx_id, exon_rank, internal_exon_id, internal_cds_id
7: Named parameters not used in query: gene_id, internal_tx_id
genomic_state <- makeGenomicState(TXDB)
extendedMapSeqlevels: sequence names mapped from NCBI to UCSC for species homo_sapiens
Error in .testForValidKeys(x, keys, keytype, fks) :
  None of the keys entered are valid keys for 'TXID'. Please use the keys method to see a listing of valid arguments.
I also tried:
genomic_state <- makeGenomicState(TXDB, style = "chrsStyle")
extendedMapSeqlevels: the 'style' chrsStyle is currently not supported for the 'species' homo_sapiens in GenomeInfoDb. Check valid naming styles by running GenomeInfoDb::genomeStyles(species). If it's not present, consider adding your genome by following the information at http://www.bioconductor.org/packages/release/bioc/vignettes/GenomeInfoDb/inst/doc/Accept-organism-for-GenomeInfoDb.pdf
'select()' returned 1:1 mapping between keys and columns
Error: logical subscript contains NAs
Any thoughts on what is happening here?