DEXSeq hg19: create GFF from GTF
@efratdahan21-8696

hi all!!

How can I create a GFF file from a UCSC GTF annotation file for the human genome hg19?

I tried to use the Python script dexseq_prepare_annotation.py and got a lot of errors.

any ideas???

ty

efrat

gtf gff • 1.6k views

"I tried to use the Python script dexseq_prepare_annotation.py and got a lot of errors" - Can you be more specific?

Have you read the DEXSeq manual?

1. The gene_id attribute is used to see which exons belong to the same gene. It must be called gene_id (and not Parent as in GFF3 files, or GeneID as in some older GFF files), and it must give the same identifier to all exons from the same gene, even if they are from different transcripts of this gene. (This last requirement is not met by GTF files generated by the Table Browser function of the UCSC Genome Browser.)
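A quick way to see whether a GTF file has the problem described above is to check if gene_id simply repeats transcript_id on every exon line. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the usual tab-separated GTF layout with attributes written as key "value"; pairs.

```python
import re

# Matches attribute pairs such as: gene_id "CFB";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Return the GTF attribute column (column 9) as a dict."""
    return dict(ATTR_RE.findall(field))


def gene_id_mirrors_transcript_id(gtf_lines):
    """Return True if every exon line has gene_id equal to transcript_id.

    That pattern suggests the file does not group transcripts by gene,
    which is the situation the manual warns about.
    """
    seen_exon = False
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # only exon lines matter here
        attrs = parse_attributes(cols[8])
        seen_exon = True
        if attrs.get("gene_id") != attrs.get("transcript_id"):
            return False
    return seen_exon
```

You could call it as gene_id_mirrors_transcript_id(open("genes.gtf")). If it returns True, the GTF probably needs proper gene-level identifiers before dexseq_prepare_annotation.py can group exons by gene.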

This is the command: python dexseq_prepare_annotation.py genes.gtf genes.gff

where genes.gtf is the GTF file for hg19.

This is the error I got:

File "dexseq_prepare_annotation.py", line 127, in <module>
assert l[i].iv.end <= l[i+1].iv.start, str(l[i+1]) + " starts too early"
AssertionError: <GenomicFeature: exonic_part 'CFB' at chr6_dbb_hap3: 3199308 -> 3199650 (strand '+')> starts too early

What do I need to change in the GTF before converting it to GFF?

ty again!

efrat
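The error message names the contig chr6_dbb_hap3, so one possible cause (an assumption, not something confirmed in this thread) is that the same gene_id, here CFB, appears both on chr6 and on an alternate haplotype contig, so exonic parts from different sequences end up in the same gene. A rough sketch to list such gene IDs is below; the function name is hypothetical and the code assumes a standard tab-separated GTF.

```python
from collections import defaultdict


def genes_on_multiple_chromosomes(gtf_lines):
    """Map each gene_id found on more than one sequence to its sorted sequence names."""
    chroms = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        # Column 9 holds attributes such as: gene_id "CFB"; transcript_id "...";
        for item in cols[8].split(";"):
            item = item.strip()
            if item.startswith("gene_id "):
                gene = item[len("gene_id "):].strip('"')
                chroms[gene].add(cols[0])
                break
    return {g: sorted(c) for g, c in chroms.items() if len(c) > 1}
```

If genes_on_multiple_chromosomes(open("genes.gtf")) reports genes on contigs like chr6_dbb_hap3, dropping the lines for those contigs (or making their gene IDs unique) before running the script may be worth trying.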

Hi Efrat,

You could try this to convert from GTF to GFF3:

library(GenomicFeatures)  # makeTxDbFromGFF(), asGFF()
library(rtracklayer)      # export()
export(asGFF(makeTxDbFromGFF("path/to/my.gtf")), "path/to/my.gff3")

This should work provided that makeTxDbFromGFF() doesn't choke on the GTF file, which sometimes happens with exotic GTF files.

Note that not all the attributes from the original GTF file will necessarily propagate, but the core GTF attributes (gene_id, transcript_id, exon_id) will be used to generate the core GFF3 attributes (ID, Parent, Name), so the gene/transcript/exon hierarchy should be preserved. Hopefully that's all that matters from a DEXSeq point of view, but I can't tell for sure...

H.