Hi folks, I have a set of short DNA sequences containing the R character (for A or G). I used readDNAStringSet to convert my fasta input file to a DNAStringSet, then the Disambiguate function from the DECIPHER package to expand the set into all possible unique sequences. Now I would like to get that DNAStringSet object back into a fasta format for downstream analysis, keeping the original identifiers, appended with a number/letter, to give a unique name to each disambiguated sequence. I have been playing with DECIPHER's DB2Seqs function, but I don't seem to understand the file handling for that function. Here are some things I have tried:
DB2Seqs(file = "", stringset_10k_disambig, tblName = "disambig10k.fasta")
Error in DB2Seqs(file = "", stringset_10k_disambig, tblName = "disambig10k.fasta") : 'dbFile' must be a character string or SQLiteConnection.
DB2Seqs(getwd(), stringset_10k_disambig, tblName = "disambig10k.fasta")
Error in DB2Seqs(getwd(), stringset_10k_disambig, tblName = "disambig10k.fasta") : 'dbFile' must be a character string or SQLiteConnection.
or from the DB2Seqs example:
tf <- tempfile()
DB2Seqs(tf, stringset_10k_disambig, tblName = "disambig10k.fasta")
Error in DB2Seqs(tf, stringset_10k_disambig, tblName = "disambig10k.fasta") : 'dbFile' must be a character string or SQLiteConnection.
Once I get the conversion figured out, I will work on how to make the identifiers unique! Thanks for any suggestions.
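This is not the correct usage of DB2Seqs(), which is why you are encountering errors. DB2Seqs() exports sequences from a DECIPHER sequence database, so its second argument (dbFile) has to be a database file path or an SQLiteConnection, not a DNAStringSet. In your case, you probably want to use writeXStringSet() from Biostrings instead.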
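Here is a minimal sketch of how that could look, including one way to make the identifiers unique. It assumes that Disambiguate() returns a list-like object with one element per input sequence, in the same order as the input. The toy sequences, the names seqA and seqB, the "_1", "_2" suffix scheme, and the output location are placeholders for illustration, not taken from your data.

```r
library(Biostrings)
library(DECIPHER)

# Toy input standing in for the result of readDNAStringSet() (hypothetical names)
seqs <- DNAStringSet(c(seqA = "ACRT", seqB = "RRGA"))

# Assumption: one list element per input sequence, in input order
expanded <- Disambiguate(seqs)

# Number of disambiguated sequences produced from each input sequence
n <- elementNROWS(expanded)

# Flatten to a single DNAStringSet and build names explicitly:
# original identifier plus a running index within each input sequence
flat <- unlist(expanded, use.names = FALSE)
names(flat) <- paste0(rep(names(seqs), n), "_", sequence(n))

# Write FASTA; the path is a placeholder, swap in your own file name
out_file <- file.path(tempdir(), "disambig10k.fasta")
writeXStringSet(flat, out_file)
```

Building the names by hand avoids depending on whatever names unlist() would generate by default. Reading the file back with readDNAStringSet(out_file) should return the same sequences with identifiers such as seqA_1 and seqA_2.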