Question: Reading fasta file with multiple sequences
Riot wrote, 2.4 years ago:

Hello all,

I'm trying to read a FASTA file that has over 5000 sequences. The plan is to create a vector holding all of the sequences, translate them into protein, and then carry the protein sequences over to Bio Linux. I've done this, but only with one sequence at a time (and I'm still new to RStudio). Please see below for the code I'm using. Can someone please tell me where I'm going wrong?

> contigs= read.fasta("contigs.fasta", seqtype = "DNA")

> contigsdnaseq= contigs[[1]]   # I think this is the part where things go wrong. I'm not sure what code to use so that the program recognizes all 5000+ sequences.

> getTrans(contigsdnaseq, sens = "F", NAstring = "X", ambiguous = FALSE, frame = 0, numcode = 1)

> contigs_aa= getTrans(contigsdnaseq,sens = "F")    

> write.fasta(contigs_aa,contigs_aa,file.out = "contigs_aa.fasta")

> contigsaafile = read.fasta("contigs_aa.fasta", seqtype = "AA")

> getAnnot(contigsaafile)


Answer: Reading fasta file with multiple sequences
Martin Morgan (United States) wrote, 2.4 years ago:

seqinr is a CRAN package so you'd have to ask elsewhere for help.

In Bioconductor, you'd use

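library(Biostrings)
dna = readDNAStringSet("your.fasta")   # read every record in the FASTA file
aa = translate(dna)                    # translate all sequences at once
writeXStringSet(aa, "aa.fasta")        # write the proteins back out as FASTA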

This would process all of your fasta sequences in one go, no need to iterate.
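To make the "one go" point concrete, here is a minimal sketch on two made-up toy sequences (the names contig1 and contig2 and the sequences themselves are invented for illustration; a real run would start from readDNAStringSet() on your own file). It only assumes the Biostrings package is installed.

```r
library(Biostrings)

# Two toy sequences standing in for the contigs (made-up data).
dna <- DNAStringSet(c(contig1 = "ATGGCCAAA", contig2 = "ATGTTTGGGTAA"))

# translate() is vectorised: one call translates every sequence in the set.
aa <- translate(dna)

# The FASTA identifiers carry over to the translated set.
print(names(aa))
print(as.character(aa))

# Write the proteins to a temporary FASTA file and read them back.
out <- tempfile(fileext = ".fasta")
writeXStringSet(aa, out)
aa_back <- readAAStringSet(out)
print(length(aa_back))
```

Real contigs often contain N or other ambiguity codes, and by default translate() stops with an error when it meets such a codon; its if.fuzzy.codon argument (for example "solve" or "X") controls that behaviour. Sequences whose length is not a multiple of three are translated with a warning that the trailing bases were ignored.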

I'm not really sure what getAnnot() retrieves for amino acid sequences; it seems like it's just the identifier, i.e., names(aa). If you need more than that, you would use one of the Bioconductor 'org' packages or biomaRt; see the vignette "AnnotationDbi: Introduction To Bioconductor Annotation Packages" in the AnnotationDbi package, or the vignette in the biomaRt package, for more.
