anmej
Hello everyone.
I want to extract the annotation of the 5' UTR + CDS region of every transcript in the hg19 annotation, so that I can search for alternative ORFs. This is what I've managed to do so far:
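library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
# 5' UTR and CDS ranges, grouped by transcript and named by transcript
fiveUTRs = fiveUTRsByTranscript(txdb, use.names = TRUE)
names5UTR = names(fiveUTRs)
cds = cdsBy(txdb, "tx", use.names = TRUE)
namesCDS = names(cds)
# Keep only transcripts that have both a 5' UTR and a CDS, in the same order
names5UTRCDS = intersect(namesCDS, names5UTR)
fiveUTRs = fiveUTRs[names5UTRCDS]
cds = cds[names5UTRCDS]
# Append one transcript at a time (this is the slow part)
fiveUTRCDS = GRangesList()
for (i in seq_along(names5UTRCDS)) {
  x = GRangesList(c(unlist(fiveUTRs[i]), unlist(cds[i])))
  names(x) = names(fiveUTRs[i])
  fiveUTRCDS = c(fiveUTRCDS, x)
}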
I'm basically looping over both lists and concatenating each pair of elements. It works, but it is very slow and inelegant. Surely there must be a better, functional way to do it? Some way to "zip" the two lists?
Thanks.
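One possible way to "zip" the two lists without a loop is pc() ("parallel combine"), which concatenates the i-th element of one list with the i-th element of the other. The sketch below is only an illustration on toy data: utr_toy and cds_toy are made-up stand-ins for fiveUTRs and cds, and it assumes that pc() is available in the installed IRanges/S4Vectors version and that both lists already hold the same transcripts in the same order (as they do after subsetting by names5UTRCDS).

```r
library(GenomicRanges)  # also attaches IRanges and S4Vectors

# Hypothetical toy data standing in for fiveUTRs and cds;
# both lists have the same names in the same order.
utr_toy <- GRangesList(
  tx1 = GRanges("chr1", IRanges(c(1, 50), width = 10), strand = "+"),
  tx2 = GRanges("chr2", IRanges(200, width = 20), strand = "-")
)
cds_toy <- GRangesList(
  tx1 = GRanges("chr1", IRanges(c(55, 100), width = 30), strand = "+"),
  tx2 = GRanges("chr2", IRanges(c(100, 150), width = 25), strand = "-")
)

# Element-wise concatenation: element i of the result holds the ranges
# of utr_toy[[i]] followed by the ranges of cds_toy[[i]].
zipped <- pc(utr_toy, cds_toy)

# Set the names explicitly rather than relying on pc() to carry them over.
names(zipped) <- names(utr_toy)

# Number of ranges per transcript in the combined list
elementNROWS(zipped)
```

On the real data the equivalent call would presumably be pc(fiveUTRs, cds) after the subsetting step, replacing the whole loop.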