Greetings,
I'm working on RNA-Seq analysis of dog (Canis familiaris) samples, and I'm using the latest genome annotation data available (CanFam3.1) from NCBI (https://www.ncbi.nlm.nih.gov/genome?LinkName=assembly_genome&from_uid=317138).
When I use the following code:
txdb <- makeTxDbFromGFF(gtffile, format = "gff3", circ_seqs = DEFAULT_CIRC_SEQS)
A txdb object is created successfully and R returns the following:
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... The following orphan exon were dropped (showing only the 6 first):
        seqid   start     end strand       ID   Parent     Name
1 NC_006590.3 2233573 2233690      + id182056 id182055 id182056
2 NC_006590.3 2233883 2234287      + id182057 id182055 id182057
3 NC_006590.3 2235773 2235792      + id182058 id182055 id182058
4 NC_006590.3 2262018 2262086      + id182061 id182060 id182061
5 NC_006590.3 2264839 2264917      + id182062 id182060 id182062
6 NC_006590.3 2265047 2265381      + id182063 id182060 id182063
The following transcripts have exons that contain more than one CDS (only the first CDS was kept for each exon): rna27051, rna32073, rna36992, rna47992
Named parameters not used in query: internal_chrom_id, chrom, length, is_circular
Named parameters not used in query: internal_id, name, type, chrom, strand, start, end
Named parameters not used in query: internal_id, name, chrom, strand, start, end
Named parameters not used in query: internal_id, name, chrom, strand, start, end
Named parameters not used in query: internal_tx_id, exon_rank, internal_exon_id, internal_cds_id
Named parameters not used in query: gene_id, internal_tx_id
OK
Here is the problem:
When I ask for the seqlevels to check whether they match the chromosome names in my BAM files, I get NCBI chromosome accession numbers instead of chromosome names:
head(seqlevels(txdb))
[1] "NC_002008.4" "NC_006583.3" "NC_006584.3" "NC_006585.3" "NC_006586.3"
[6] "NC_006587.3"
Is there any way I can modify the GFF and remove the pertinent column? Or can I ask makeTxDbFromGFF to use different columns to make the TxDb object?
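To make clear what I am after, here is a minimal sketch of the kind of renaming I have in mind. It is only an illustration: rename_levels and the lookup vector are hypothetical, and the target names ("chrM", "chr1") are my assumption about what the BAM files use, not something taken from the annotation. Whether the resulting vector can simply be assigned back with the seqlevels setter on a TxDb is also an assumption on my part, so that line is left as a comment.

```r
# Hypothetical helper: rename sequence levels with a named lookup vector
# (names = current accession, values = desired chromosome name).
# Levels that are missing from the lookup are left unchanged.
rename_levels <- function(levels, lookup) {
  hit <- levels %in% names(lookup)
  levels[hit] <- unname(lookup[levels[hit]])
  levels
}

# First three accessions from the seqlevels() output above.
old <- c("NC_002008.4", "NC_006583.3", "NC_006584.3")

# Assumed mapping, filled in for two entries only; a complete table would
# have to be built from the NCBI assembly information.
lookup <- c("NC_002008.4" = "chrM", "NC_006583.3" = "chr1")

new <- rename_levels(old, lookup)
print(new)

# Assumption: with the TxDb loaded, the same vector might be applied as
# seqlevels(txdb) <- rename_levels(seqlevels(txdb), lookup)
```

The helper keeps the order and length of the input, so unmapped accessions stay visible rather than being silently dropped.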
Thank you!