Tefina Paloma
@tefina-paloma-3676
Dear list,
I have a question regarding getSequence and the difference between the
seqType values "coding_gene_flank" and "5utr".
As far as I understand, "coding_gene_flank" should contain the 5utr.
Looking at an example:
library(biomaRt)
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# 1000 bases of upstream flank for the gene with Entrez ID 23704
flanking_seq <- getSequence(id = c(23704), type = "entrezgene",
                            seqType = "coding_gene_flank", upstream = 1000,
                            mart = ensembl)
# 5' UTR sequence for the same gene
utr5 <- getSequence(id = c(23704), type = "entrezgene", seqType = "5utr",
                    mart = ensembl)
So flanking_seq contains a sequence that is 1000 bases long, while the
5utr sequence contains 154 bases.
But:
The 5utr does not align perfectly with the flanking_seq (only 131 bases
align), and furthermore, the alignment starts at base 313 of the
flanking_seq.
I would assume that the 5utr is at the end of the flanking_seq and not in
the middle?!
And, of course, that the flanking_seq contains the 5utr entirely.
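To make the expectation concrete, here is a minimal base R sketch of the containment check I have in mind. The two strings are made-up toy sequences (assumptions for illustration only, not the real getSequence results), and the variable names are hypothetical:

```r
# Toy sequences, invented for illustration only (not real Ensembl output).
flank <- "GGGTTTACGTACGTAAACCC"
utr   <- "ACGTACGT"

# Position of the first exact match of utr within flank (-1 if none).
start <- as.integer(regexpr(utr, flank, fixed = TRUE))
end   <- if (start > 0) start + nchar(utr) - 1L else NA_integer_

# Is the UTR fully contained, and does it sit at the very end of the flank?
contained <- start > 0
at_end    <- contained && end == nchar(flank)
```

With the real data I would expect both contained and at_end to be TRUE; an exact-match test like this fails as soon as a single base differs, which is why I looked at an alignment instead.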
So, what am I missing here?
Thanks a lot in advance for any hints!
Tefina
> sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.12.8 IRanges_1.2.3 biomaRt_2.0.0
loaded via a namespace (and not attached):
[1] Biobase_2.4.1 RCurl_1.2-0 XML_2.6-0