How to translate a FASTA file of DNA sequences to protein sequences using the Biostrings package?
@mahasishshome-22684
Last seen 3.2 years ago

I have a gene list of 10,000 DNA sequences that I need to translate to protein sequences using the Biostrings package. I can translate a single sequence with:

translate(DNAString("ATG"))

The problem is that this handles only one sequence at a time. I want to process the entire FASTA file in one go and get a protein sequence file as output. Please help me find a high-throughput way of doing this.

Biostrings Bioconductor R • 2.0k views
Paul Harrison ▴ 100
@paul-harrison-5740
Last seen 5 months ago
Australia/Melbourne/Monash University B…

The Bioconductor type to hold a collection of DNA sequences is DNAStringSet. You can also use translate on this and it will translate each sequence.

translate(DNAStringSet(c("ATG","TAA")))


If you use rtracklayer::import to load a FASTA file, the result is a DNAStringSet, so you should be able to immediately use translate on it.
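
As a rough sketch of the whole round trip, the example below reads a FASTA file, translates every record, and writes the result as a protein FASTA file. It uses Biostrings' own readDNAStringSet and writeXStringSet rather than rtracklayer::import, which is a substitution on my part. The file paths and the two toy sequences are made up for illustration; replace dna_path with the path to your own file.

```r
library(Biostrings)

# Hypothetical input: a tiny two-record FASTA file written to a temp path.
dna_path <- tempfile(fileext = ".fasta")
writeLines(c(">gene1", "ATGGCCTAA", ">gene2", "ATGAAATGA"), dna_path)

# Read every record in the file into a single DNAStringSet.
dna <- readDNAStringSet(dna_path)

# Translate all sequences in one call; the result is an AAStringSet.
protein <- translate(dna)

# Write the translated sequences out as a FASTA file (hypothetical output path).
protein_path <- tempfile(fileext = ".fasta")
writeXStringSet(protein, protein_path)
```

With the default arguments, translate uses the standard genetic code and reads each sequence from its first base, so the input is assumed to be in-frame coding sequence. Stop codons appear as "*" in the output.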