Question

convert sequences disambiguated with DECIPHER back to FASTA

joannew • 0

Hi folks, I have a set of short DNA sequences containing the R character (for A or G). I used readDNAStringSet to convert my fasta input file to a DNAStringSet, then the Disambiguate function from the DECIPHER package to expand the set into all possible unique sequences. Now I would like to get that DNAStringSet object back into a fasta format for downstream analysis, keeping the original identifiers, appended with a number/letter, to give a unique name to each disambiguated sequence. I have been playing with DECIPHER's DB2Seqs function, but I don't seem to understand the file handling for that function. Here are some things I have tried:

DB2Seqs(file = "", stringset_10k_disambig, tblName = "disambig10k.fasta")
Error in DB2Seqs(file = "", stringset_10k_disambig, tblName = "disambig10k.fasta") : 'dbFile' must be a character string or SQLiteConnection.

DB2Seqs(getwd(), stringset_10k_disambig, tblName = "disambig10k.fasta")
Error in DB2Seqs(getwd(), stringset_10k_disambig, tblName = "disambig10k.fasta") : 'dbFile' must be a character string or SQLiteConnection.

or from the DB2Seqs example:

tf <- tempfile()
DB2Seqs(tf, stringset_10k_disambig, tblName = "disambig10k.fasta")
Error in DB2Seqs(tf, stringset_10k_disambig, tblName = "disambig10k.fasta") : 'dbFile' must be a character string or SQLiteConnection.

Once I get the conversion figured out, I will work on how to make the identifiers unique! Thanks for any suggestions.

DECIPHER FASTA disambiguate • 1.2k views

Written 4.9 years ago by joannew

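This is not the correct usage of DB2Seqs(), which is why you are encountering errors: DB2Seqs() exports sequences stored in a DECIPHER sequence database to a file, so its second argument (dbFile) must be a database file path or connection, not a DNAStringSet. Since your sequences are already in memory, you probably want to use writeXStringSet() from Biostrings instead.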

Reply written 4.9 years ago by Erik Wright ▴ 150
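
A minimal sketch of the writeXStringSet() suggestion, which also handles the unique identifiers asked about in the question. It assumes Disambiguate() returns a DNAStringSetList with one element per input sequence, and that a numeric suffix such as "_1", "_2" is an acceptable naming scheme. The toy sequences and the names dna, expanded, flat, n, and out are illustrative only and not taken from the original post.

```r
library(Biostrings)
library(DECIPHER)

# Toy input standing in for the result of readDNAStringSet("input.fasta")
dna <- DNAStringSet(c(seq1 = "ACRT", seq2 = "RGGR"))

# Expand each ambiguous sequence into all of its possible sequences;
# the result is a list with one element per input sequence
expanded <- Disambiguate(dna)

# Number of expanded sequences produced by each input sequence
n <- lengths(expanded)

# Flatten the list into a single DNAStringSet
flat <- unlist(expanded)

# Rebuild names as <original identifier>_<running number within that identifier>
names(flat) <- paste0(rep(names(dna), n), "_", sequence(n))

# Write the flattened set to a FASTA file (a temporary file in this sketch)
out <- tempfile(fileext = ".fasta")
writeXStringSet(flat, out)
```

With real data, replace the toy dna object with the DNAStringSet read from the input file and out with the desired output path. Any suffix scheme works as long as the resulting names are unique.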