Question: UTRs position of genomic loci
2.6 years ago by
Jvais20000 wrote:

Dear all, 

I am trying to calculate the distance between genomic loci and the end / the start of the 3' UTR (for coding genes).

I've used "GenomicFeatures":



#1.UTRs size

library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

tx_lens <- transcriptLengths(txdb, with.utr5_len = F, with.utr3_len = T, with.cds_len=F)
coding.gene <- tx_lens[tx_lens$cds_len !=0,]

# Genomic loci in UTR 3

loci.GR <- makeGRangesFromDataFrame(loci, ignore.strand = F, seqnames.field = "chromosome",
    strand.field = "strand", start.field = "start", end.field = "stop", keep.extra.columns = T)

# Position in each transcript 

ucsc3UTRbytx <- threeUTRsByTranscript(txdb)

loci.t <- mapToTranscripts(loci.GR, ucsc3UTRbytx)

loci.df = data.frame(loci.t)

# retrieve transcript id

# identification of loci position in UTRs

loci.df.annotated <- merge(loci.df, coding.gene , by.x =c("tx_name"), by.y =c("tx_name"),all.x = F , all.y = F)


But when I look at my results, they are quite confusing. For example:

- UTRs present in "loci.df.annotated" are not always in "coding.gene"

- The UTR position in "loci.df.annotated" is sometimes outside the range of the 3' UTR size in "coding.gene"

- The UTR position in "loci.df.annotated" is sometimes outside the range of the transcript length in "coding.gene"


In your opinion, is this an issue in "GenomicFeatures", or an issue in my script?

threeutrsbytranscript utr • 352 views

I'm having trouble trying to reproduce the code you provided. 

#1.UTRs size,  

tx_lens$cds_len would be NULL unless, in the previous call

tx_lens <- transcriptLengths(txdb, with.utr5_len = F, with.utr3_len = T, with.cds_len=F)

the argument with.cds_len is changed to TRUE.
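
To illustrate why this matters, here is a minimal base R sketch. It uses a made-up toy data frame in place of the real transcriptLengths() output (the transcript names and lengths are assumptions for illustration only) and shows that filtering on a column that does not exist keeps no rows and raises no error.

```r
# Toy stand-in for the data frame returned by transcriptLengths();
# transcript names and lengths are made up for illustration.
tx_lens <- data.frame(
  tx_name  = c("txA", "txB", "txC"),
  tx_len   = c(1200L, 800L, 1500L),
  utr3_len = c(300L, 0L, 450L)
)

# The cds_len column was never requested, so `$` returns NULL.
missing_col <- is.null(tx_lens$cds_len)

# NULL != 0 evaluates to logical(0), so the filter keeps no rows
# and raises no error or warning.
coding.gene <- tx_lens[tx_lens$cds_len != 0, ]
n_without_cds <- nrow(coding.gene)

# Once the column exists (as it would with with.cds_len = TRUE),
# the same filter keeps only transcripts with a non-empty CDS.
tx_lens$cds_len <- c(800L, 0L, 900L)
coding.gene <- tx_lens[tx_lens$cds_len != 0, ]
n_with_cds <- nrow(coding.gene)
```

With the code as posted, "coding.gene" would therefore be empty, which does not seem to match the results you describe.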

#Genomic loci in UTR 3 

You have not defined loci before passing it as an argument to makeGRangesFromDataFrame.
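
For example, a small stand-in for loci could look like the sketch below. The identifiers and coordinates are entirely made up; only the column names (chromosome, start, stop, strand) are taken from your makeGRangesFromDataFrame call. A few rows of your real data in this shape would be enough.

```r
# Hypothetical loci table: identifiers and coordinates are made up and only
# illustrate the column layout expected by the call in the question.
loci <- data.frame(
  locus_id   = c("locus1", "locus2", "locus3"),
  chromosome = c("chr1", "chr1", "chr7"),
  start      = c(1000100L, 2500300L, 550000L),
  stop       = c(1000100L, 2500300L, 550000L),
  strand     = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# Optional check that the table converts cleanly (needs GenomicRanges).
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  loci.GR <- GenomicRanges::makeGRangesFromDataFrame(
    loci, keep.extra.columns = TRUE, ignore.strand = FALSE,
    seqnames.field = "chromosome", start.field = "start",
    end.field = "stop", strand.field = "strand"
  )
}
```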

Could you please update the code you provided so that I can reproduce what you are experiencing?


shepherl ♦♦ 1.4k
