Hi, I want to get the evolutionary conservation scores for lncRNAs using the phastCons100way.UCSC.hg38 (3.7.1) R package. I have already manually downloaded the lncRNA annotation GTF file from GENCODE (v34, GRCh38) and extracted the lncRNA exons.
> library(dplyr)  # provides %>% and filter()
> gtf_data_df <- as.data.frame(gtf_data)
# Filter for long non-coding RNA exons
> gtf_data_df_final <- gtf_data_df %>% filter(gene_type == "lncRNA" & type == "exon")
> head(gtf_data_df_final)
seqnames start end width strand source type score phase gene_id
1 chr1 29554 30039 486 + HAVANA exon NA NA ENSG00000243485.5
2 chr1 30564 30667 104 + HAVANA exon NA NA ENSG00000243485.5
3 chr1 30976 31097 122 + HAVANA exon NA NA ENSG00000243485.5
4 chr1 30267 30667 401 + HAVANA exon NA NA ENSG00000243485.5
5 chr1 30976 31109 134 + HAVANA exon NA NA ENSG00000243485.5
6 chr1 35721 36081 361 - HAVANA exon NA NA ENSG00000237613.2
gene_type gene_name level hgnc_id tag havana_gene
1 lncRNA MIR1302-2HG 2 HGNC:52482 basic OTTHUMG00000000959.1
2 lncRNA MIR1302-2HG 2 HGNC:52482 basic OTTHUMG00000000959.1
3 lncRNA MIR1302-2HG 2 HGNC:52482 basic OTTHUMG00000000959.1
4 lncRNA MIR1302-2HG 2 HGNC:52482 basic OTTHUMG00000000959.1
5 lncRNA MIR1302-2HG 2 HGNC:52482 basic OTTHUMG00000000959.1
6 lncRNA FAM138A 2 HGNC:32334 basic OTTHUMG00000000960.1
transcript_id transcript_type transcript_name transcript_support_level
1 ENST00000473358.1 lncRNA MIR1302-2HG-202 5
2 ENST00000473358.1 lncRNA MIR1302-2HG-202 5
3 ENST00000473358.1 lncRNA MIR1302-2HG-202 5
4 ENST00000469289.1 lncRNA MIR1302-2HG-201 5
5 ENST00000469289.1 lncRNA MIR1302-2HG-201 5
6 ENST00000417324.1 lncRNA FAM138A-201 1
havana_transcript exon_number exon_id ont
1 OTTHUMT00000002840.1 1 ENSE00001947070.1 <NA>
2 OTTHUMT00000002840.1 2 ENSE00001922571.1 <NA>
3 OTTHUMT00000002840.1 3 ENSE00001827679.1 <NA>
4 OTTHUMT00000002841.1 1 ENSE00001841699.1 <NA>
5 OTTHUMT00000002841.1 2 ENSE00001890064.1 <NA>
6 OTTHUMT00000002842.1 1 ENSE00001656588.1 <NA>
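A minimal sketch of how per-exon conservation scores could be retrieved from a table like the one above, assuming the `gscores()` interface from the GenomicScores package that backs phastCons100way.UCSC.hg38. The names `exons_df`, `exons_gr`, `exon_scores` and `tx_scores` are hypothetical, and `exons_df` is a toy stand-in for `gtf_data_df_final` built from the six rows printed above. The score column is assumed to be called `default`, which may differ between package versions, and the width-weighted mean per transcript is only one possible way to summarise exons.

```r
library(GenomicRanges)
library(GenomicScores)
library(phastCons100way.UCSC.hg38)

# Toy stand-in for gtf_data_df_final: coordinates copied from the head() output
exons_df <- data.frame(
  seqnames = "chr1",
  start = c(29554, 30564, 30976, 30267, 30976, 35721),
  end = c(30039, 30667, 31097, 30667, 31109, 36081),
  strand = c("+", "+", "+", "+", "+", "-"),
  transcript_id = c("ENST00000473358.1", "ENST00000473358.1",
                    "ENST00000473358.1", "ENST00000469289.1",
                    "ENST00000469289.1", "ENST00000417324.1")
)

# Convert the data frame to a GRanges object, keeping transcript_id as metadata
exons_gr <- makeGRangesFromDataFrame(exons_df, keep.extra.columns = TRUE)

# The annotation package exposes a GScores object under its own name
phast <- phastCons100way.UCSC.hg38

# gscores() returns the input ranges with an added score column
# (assumed here to be named 'default'; by default it holds the mean score per range)
exon_scores <- gscores(phast, exons_gr)

# Summarise per transcript as a width-weighted mean of the exon scores
# (an illustrative choice, not the only reasonable one)
scores_df <- as.data.frame(exon_scores)
tx_scores <- sapply(
  split(scores_df, scores_df$transcript_id),
  function(d) weighted.mean(d$default, d$width, na.rm = TRUE)
)
tx_scores
```

With the real data, `exons_df` would be replaced by `gtf_data_df_final`. The first call may download the score data, so it needs network access.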
> sessionInfo()