Question

Biostrings translate() function with alternative genetic code is not working for first codon

atisou • 0

@atisou-7468

Last seen 8.6 years ago

Switzerland

Hello,

I would like to translate a bacterial DNA sequence into a protein using genetic code 11 (instead of the Standard code, id 1). These are the commands I used:

dna <- "TTGAGGATGACGAATCGTAACGTCGAATGGACTGATAATGCCTGGGATGAATATATCTATTGGCAGACACAGGATAAAAAGATACTTAAGCGTATTAATACCTTAATCAAAGAATGTCAGCGAACACCTTTTGAAGGAACAGGAAAACCAGAACCTTTAAAAGCTAATCTTTCAGGATTTTGGAGTCGTAGGATTGATGAAAAGCATAGATTAGTTTATGAAGTGACAGATGAACGAATCTCTATAATTCAATGTCGATTCCATTACTAA"

library(Biostrings)

dna_obj <- DNAString(dna, start = 1)

translate(dna_obj, genetic.code = getGeneticCode("11", full.search = FALSE))

  90-letter "AAString" instance
seq: LRMTNRNVEWTDNAWDEYIYWQTQDKKILKRINTLIKECQRTPFEGTGKPEPLKANLSGFWSRRIDEKHRLVYEVTDERISIIQCRFHY*

The first residue should be a methionine (M), not the leucine (L) returned by translate(). Genetic code 11 translates the TTG codon (and a few others) as M when it is the first codon in the sequence.

See the NCBI reference for the alternative initiation codons in genetic code 11:

https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG11

Am I missing a parameter or option that would make the selected genetic code work properly? Or is translate() not designed to take codon positions into account when assigning amino acids?

Hatice

ps: Thanks for making this great package available to the community!

biostrings genetic code translate • 4.3k views

Question written 8.6 years ago by atisou; updated 8.6 years ago by Hervé Pagès

score 2 · Accepted Answer · 2017-05-25

Hervé Pagès 16k

@herve-pages-1542

Last seen 2 days ago

Seattle, WA, United States

Hi Hatice,

Thanks for pointing out this problem. It's true that translate() was not designed to take codon positions into account when assigning amino acids, so the first codon in the sequence is not treated in any special way. Supporting this will require some changes to the translate() interface, and possibly to GENETIC_CODE_TABLE and getGeneticCode() as well. I'm going to look into this and will let you know.
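To make "position-dependent" translation concrete, here is a minimal base-R sketch that does not use Biostrings at all. It is only an illustration of the idea, not how translate() is implemented: translate_sketch(), code_table and init_codons_11 are hypothetical names. The codon table follows the NCBI base order (T, C, A, G), and the initiation codons are the ones NCBI lists for genetic code 11.

```r
# Build the 64-codon table in NCBI order (third base varies fastest).
bases <- c("T", "C", "A", "G")
g <- expand.grid(b3 = bases, b2 = bases, b1 = bases, stringsAsFactors = FALSE)
codons <- paste0(g$b1, g$b2, g$b3)
aa <- "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
code_table <- setNames(strsplit(aa, "")[[1]], codons)

# Alternative initiation codons of genetic code 11 (assumed from the NCBI table).
init_codons_11 <- c("TTG", "CTG", "ATT", "ATC", "ATA", "GTG")

# Hypothetical helper: translate codon by codon, but force the first residue
# to "M" when the first codon is one of the supplied initiation codons.
translate_sketch <- function(dna, code, init_codons = character(0)) {
  n <- nchar(dna) %/% 3
  starts <- seq(1, by = 3, length.out = n)
  cds <- substring(dna, starts, starts + 2)
  res <- unname(code[cds])
  if (n > 0 && cds[1] %in% init_codons) res[1] <- "M"
  paste(res, collapse = "")
}

translate_sketch("TTGAGGATGTAA", code_table)                  # "LRM*"
translate_sketch("TTGAGGATGTAA", code_table, init_codons_11)  # "MRM*"
```

Only the first codon is affected by init_codons; a TTG anywhere else in the sequence is still read as L.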

Best,

H.

Answer by Hervé Pagès, 8.6 years ago

This is now fixed in Biostrings 2.44.1. With this new version:

library(Biostrings)
getGeneticCode("11")
# TTT TTC TTA TTG TCT TCC TCA TCG TAT TAC TAA TAG TGT TGC TGA TGG CTT 
# "F" "F" "L" "L" "S" "S" "S" "S" "Y" "Y" "*" "*" "C" "C" "*" "W" "L" 
# CTC CTA CTG CCT CCC CCA CCG CAT CAC CAA CAG CGT CGC CGA CGG ATT ATC 
# "L" "L" "L" "P" "P" "P" "P" "H" "H" "Q" "Q" "R" "R" "R" "R" "I" "I" 
# ATA ATG ACT ACC ACA ACG AAT AAC AAA AAG AGT AGC AGA AGG GTT GTC GTA 
# "I" "M" "T" "T" "T" "T" "N" "N" "K" "K" "S" "S" "R" "R" "V" "V" "V" 
# GTG GCT GCC GCA GCG GAT GAC GAA GAG GGT GGC GGA GGG 
# "V" "A" "A" "A" "A" "D" "D" "E" "E" "G" "G" "G" "G" 
# attr(,"alt_init_codons")
# [1] "TTG" "CTG" "ATT" "ATC" "ATA" "GTG"

Note the new alt_init_codons attribute.

dna <- DNAString("TTGAGGATGACGAATCGTAACGTCGAATGGACTGATAATGCCTGGGATGAATATATCTATTGGCAGACACAGGATAAAAAGATACTTAAGCGTATTAATACCTTAATCAAAGAATGTCAGCGAACACCTTTTGAAGGAACAGGAAAACCAGAACCTTTAAAAGCTAATCTTTCAGGATTTTGGAGTCGTAGGATTGATGAAAAGCATAGATTAGTTTATGAAGTGACAGATGAACGAATCTCTATAATTCAATGTCGATTCCATTACTAA")
translate(dna, getGeneticCode("11"))
#   90-letter "AAString" instance
# seq: MRMTNRNVEWTDNAWDEYIYWQTQDKKILKRI...GFWSRRIDEKHRLVYEVTDERISIIQCRFHY*

The first codon (TTG) is an alternative initiation codon, so it is now translated to M (instead of L as before). See ?getGeneticCode for details about the new alt_init_codons attribute, which is set on all genetic codes.
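As a small follow-up sketch, assuming Biostrings 2.44.1 or later is installed, the attribute can be inspected directly, and a short made-up sequence shows that only the first codon is treated as an initiation codon. The object names code11 and short_dna are illustrative only.

```r
library(Biostrings)

code11 <- getGeneticCode("11")

# The alternative initiation codons travel with the genetic code as an attribute.
attr(code11, "alt_init_codons")

# Two consecutive TTG codons: the first is read as an initiation codon (M),
# the second as an ordinary leucine codon (L).
short_dna <- DNAString("TTGTTG")
as.character(translate(short_dna, genetic.code = code11))  # "ML"
```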

Please allow between 24h and 48h for this fix to become available via biocLite().

Cheers,

H.

Reply by Hervé Pagès, 8.6 years ago

super, thanks Herve!

H.

Reply by atisou, 8.6 years ago

Dear Herve,

I am translating a set of DNA sequences to peptides, so the alternative initiation codons don't apply. How can I reset the attribute? Thank you.

Xiaoyan

Hi, I tried attr(GENETIC_CODE, "alt_init_codons")=NULL, which didn't work, but attr(GENETIC_CODE, "alt_init_codons")=character(0) worked.
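A short sketch of that workaround, assuming Biostrings 2.44.1 or later. It edits a local copy of the standard code (here called code_no_init, an illustrative name) instead of modifying GENETIC_CODE itself. It also shows the no.init.codon argument documented in ?translate, which tells translate() to ignore the alt_init_codons attribute; the test sequence is made up.

```r
library(Biostrings)

peptide_dna <- DNAString("TTGAGGATGTAA")

# Option 1: empty the attribute on a local copy of the standard genetic code.
code_no_init <- GENETIC_CODE
attr(code_no_init, "alt_init_codons") <- character(0)
as.character(translate(peptide_dna, genetic.code = code_no_init))  # "LRM*"

# Option 2: keep the genetic code as is and ask translate() not to treat
# the first codon as an initiation codon.
as.character(translate(peptide_dna, no.init.codon = TRUE))         # "LRM*"
```

Both calls read the leading TTG as L. The second form avoids touching the genetic code object and works the same way with getGeneticCode("11").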

Thanks for your package.

Best,

Xiaoyan

Reply by XIA.PAN, 7.8 years ago