GeneGA organism abbreviations
@tomasbjorklund-7071
Last seen 9.7 years ago
Sweden

I'm trying to use GeneGA as a component in codon optimisation for expression of polypeptide sequences in mammalian cells. It appears to work well, but I have an annoying issue. The package is said to include a database for optimising against 200 organisms. However, the abbreviations used to specify the organism appear to be non-standard and undocumented. I have tried the codes in this list: http://www.genome.jp/kegg/catalog/org_list.html as well as many real-name alternatives. The one given as an example in the documentation is "ec" (I assume it stands for E. coli, but that is not specified either).

Can anyone please help me find the right abbreviations for human, rat and mouse for use in GeneGA?

Thanks

GeneGA • 2.2k views

The names used for the available organisms can be extracted as follows:

> data(wSet)
> ?wSet
> rownames(wSet)
  [1] "ec"
  [2] "bs"
  [3] "sc"
  [4] "Acinetobacter_baumannii_ATCC_17978"
  [5] "Acinetobacter_sp_ADP1"
  [6] "Actinobacillus_pleuropneumoniae_L20"
  [7] "Aeromonas_hydrophila_ATCC_7966"
  [8] "Agrobacterium_tumefaciens_C58_Cereon"
  [9] "Agrobacterium_tumefaciens_C58_UWash"
 [10] "Alcanivorax_borkumensis_SK2"
 [11] "Arthrobacter_aurescens_TC1"
 [12] "Arthrobacter_FB24"
 [13] "Bacillus_anthracis_Ames"
 [14] "Bacillus_anthracis_Ames_0581"  ...
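
With 200 entries, a pattern search over the row names may be quicker than reading the printed list. The following is a minimal sketch: `org_names` is a short stand-in vector so the snippet runs without GeneGA installed, and `find_org` is a hypothetical helper name, not part of the package.

```r
# Stand-in for the real names; with GeneGA loaded you would instead use:
#   data(wSet); org_names <- rownames(wSet)
org_names <- c("ec", "bs", "sc",
               "Acinetobacter_baumannii_ATCC_17978",
               "Bacillus_anthracis_Ames")

# Hypothetical helper: case-insensitive search over organism names
find_org <- function(pattern, names) {
  grep(pattern, names, ignore.case = TRUE, value = TRUE)
}

find_org("bacillus", org_names)         # names containing "bacillus"
find_org("homo|mus|rattus", org_names)  # empty result means no match
```

An empty result for a genus name is a quick way to confirm that an organism is not in the table.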

Thank you, Vincent. This is clearly one step forward and two steps back, as the list contains no mammalian species. However, I have the equivalent data for the species I need. Is there a way to inject this data into the wSet data table before execution, or to make the package always include it, as it does for the three species from the seqinr caitab data?

I apologise if this is an obvious question, but I'm still rather new to Bioconductor and R.

@tomasbjorklund-7071
Last seen 9.7 years ago
Sweden

For anyone else who is interested, I injected the human CAI information into the wSet data table, saved it to a new .rda file, and replaced the one in the GeneGA data folder. This now appears to work well, although it is not the most elegant long-term solution. In parallel, I have contacted the GeneGA package maintainer to request that some of the most commonly used mammals be included in the distributed set. The data I used was taken from here: http://www.genscript.com/cgi-bin/tools/codon_freq_table
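
For readers who want to try the same approach, here is a rough sketch of the injection step. It makes several assumptions: that wSet holds one row per organism and one column per codon (suggested by the `rownames(wSet)` output above, but not checked against the package), that the values are relative adaptiveness weights (each codon's frequency divided by the highest frequency among its synonymous codons), and that the numbers below are made-up placeholders rather than real codon usage data. `w_mock`, `freq`, `aa` and `w_ext` are illustrative names only, and the mock table has four codon columns where the real one would have many more.

```r
# Mock of the ASSUMED wSet layout: rows = organisms, columns = codons.
# All values are placeholders, not real data.
w_mock <- data.frame(ttt = c(0.3, 1.0), ttc = c(1.0, 0.8),
                     aaa = c(1.0, 1.0), aag = c(0.25, 0.5),
                     row.names = c("ec", "bs"))

# Hypothetical codon frequencies (per thousand) for the organism to add.
freq <- c(ttt = 15, ttc = 20, aaa = 25, aag = 30)

# Amino acid encoded by each codon, used to group synonymous codons.
aa <- c(ttt = "Phe", ttc = "Phe", aaa = "Lys", aag = "Lys")

# Relative adaptiveness: frequency divided by the maximum in its synonymous group.
w_new <- freq / ave(freq, aa, FUN = max)

# Build a one-row data frame with columns in the same order as the table.
new_row <- as.data.frame(t(w_new[colnames(w_mock)]))
rownames(new_row) <- "human"
w_ext <- rbind(w_mock, new_row)

# Save to a separate file first, rather than overwriting the packaged one.
out_file <- file.path(tempdir(), "wSet_extended.rda")
save(w_ext, file = out_file)
```

With the real table, the object would need to keep the name wSet when saved so that `data(wSet)` still finds it. Keeping a backup of the original .rda file is advisable, since reinstalling or updating the package will overwrite the modified copy.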

 
