Import sequences from MacClade 4.*
Brian ▴ 20
@brian-5105
Last seen 9.7 years ago
Hello List,

Sorry if this is a stupid question, but I am returning to some old sequences that I have lying around. They come from a Mac program called MacClade, and the sequence file looks like this:

    #NEXUS
    [MacClade 4.03 registered to University]
    BEGIN DATA;
    DIMENSIONS NTAX=53 NCHAR=673;
    FORMAT DATATYPE=DNA MISSING=? GAP=- INTERLEAVE ;
    MATRIX
    [ 10 20 30 40 50 60]
    [ . . . . . .]
    [Modal TGAACCTGCGGAAGGAAAATATTATTGAATATATTTTTTA]
    AI1A TGAACCTGCGGAACGAAAATATTATTGAATATATTTTTTA [60]
    ...

Does this look familiar to anyone? Did I overlook some function in the "seqinr" package? I would like to check before I write a function to extract the sequences myself.

Thanks for the help!

Cheers,
Brian
@jarno-tuimala-5112
Last seen 9.7 years ago
Brian,

Your sequence alignment is in NEXUS format. Most likely the function read.nexus.data() in the ape package (on CRAN) will be able to read it.

Cheers,
Jarno
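As a rough illustration of the suggestion above, the sketch below writes a tiny made-up NEXUS alignment to a temporary file and reads it back with read.nexus.data(). The taxon names and sequences are invented for the example. Whether the function copes with every MacClade-specific detail of the real file (for instance an interleaved matrix with bracketed [Modal ...] lines) is an assumption that should be checked against the actual data.

```r
# Minimal sketch: read a small NEXUS alignment with ape::read.nexus.data().
# The file contents below are invented for illustration only.
nexus_lines <- c(
  "#NEXUS",
  "BEGIN DATA;",
  "DIMENSIONS NTAX=2 NCHAR=8;",
  "FORMAT DATATYPE=DNA MISSING=? GAP=-;",
  "MATRIX",
  "taxA TGAACCTG",
  "taxB TGAAC-TG",
  ";",
  "END;"
)
nexus_file <- tempfile(fileext = ".nex")
writeLines(nexus_lines, nexus_file)

# Only run the import if the ape package is installed
if (requireNamespace("ape", quietly = TRUE)) {
  # Returns a named list with one character vector per taxon
  aln <- ape::read.nexus.data(nexus_file)
  # Collapse each vector of single characters into one string per taxon
  seqs <- vapply(aln, paste, character(1), collapse = "")
  print(seqs)
}
```

For the real file, the same call would simply take the path to the MacClade export in place of nexus_file.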
Brian ▴ 20
@brian-5105
Last seen 9.7 years ago
Hi again,

So after a bit of R-programming fun I wrote the following. I realize now that the file was actually a sequence alignment and not a typical sequence file. Nonetheless, should anyone be tasked with digging up old data from a MacClade file, here it is, as promised:

    read.macclade <- function(file, raw.sequences=TRUE, return.metadata=FALSE, ...){
      ## file - path to file
      ## raw.sequences - return the sequences with all punctuation
      ##                 (gaps, missing-data symbols) removed
      ## return.metadata - return the metadata from the MacClade file
      ## ... - Additional arguments to "readLines" function
      seqs <- readLines(file, ...)
      ## Header lines between BEGIN and MATRIX
      seqs.metadata.ind <- grep("^BEGIN|^MATRIX", seqs)
      seqs.metadata <- seqs[(seqs.metadata.ind[1]+1):(seqs.metadata.ind[2]-1)]
      ## Get the sequences
      seqs.modal_index <- grep("^\\[Modal", seqs)
      seq_end <- grep("END", seqs)[1]
      seqs_sub <- seqs[(seqs.modal_index[1]+2):(seq_end-2)]
      seqs_sub.modal_index <- grep("^\\[Modal", seqs_sub)
      seqs.modal <- seqs[seqs.modal_index]
      ## Get the lines (data rows start with the taxon name)
      seqs_sub <- grep("^[[:alnum:]]", seqs_sub, value=TRUE)
      ## Cut it up: row 1 = name, row 2 = sequence chunk, row 3 = position
      seqs_list_raw <- sapply(seq(seqs_sub), function(x)
        grep(".", unlist(strsplit(x=seqs_sub[x], split="[[:blank:]]")), value=TRUE))
      ## Put everything together
      seqs.unique <- unique(seqs_list_raw[1,])
      ## Exact match on the name, so that one taxon name being a
      ## substring of another does not mix up their rows
      seqs_index <- sapply(seqs.unique, function(seq)
        which(seqs_list_raw[1,] == seq))
      seqs_vect <- sapply(seqs.unique, function(seq)
        paste(seqs_list_raw[2, seqs_index[,seq]], collapse=""))
      seqs_lengths <- sapply(seqs.unique, function(seq)
        gsub(pattern="\\[|\\]", replacement="",
             x=seqs_list_raw[3, max(seqs_index[,seq])]))
      if(raw.sequences) {
        seqs_raw <- sapply(seqs_vect, function(seq)
          gsub("[[:punct:]]", replacement="", seq))
        return(seqs_raw)
      } else {
        if(return.metadata) {
          return(list(seqs.metadata, seqs_vect, seqs_lengths))
        } else {
          print(seqs_lengths)
          print(seqs.metadata)
          seqs_vect
        }
      }
    }

Sorry if it looks ugly, but it works.

Cheers,
Brian
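The core step in the function above is to group the interleaved rows by taxon name and paste the pieces together in file order. A simplified base R sketch of just that step follows. It assumes each data row has the form "name sequence [position]" and that comment lines start with "[". The helper name collapse_interleaved and the sample lines are invented for illustration, and this is not a replacement for the function posted above.

```r
# Hypothetical helper: join interleaved alignment rows per taxon.
collapse_interleaved <- function(lines) {
  # Keep data rows only: drop blank lines and bracketed comment lines
  rows <- lines[grepl("^[[:alnum:]]", lines)]
  # Split on whitespace: field 1 = taxon name, field 2 = sequence chunk
  fields <- strsplit(rows, "[[:blank:]]+")
  taxa   <- vapply(fields, `[`, character(1), 1)
  chunks <- vapply(fields, `[`, character(1), 2)
  # Group chunks by taxon, keeping the order in which taxa first appear
  pieces <- split(chunks, factor(taxa, levels = unique(taxa)))
  # Paste each taxon's chunks into one full-length sequence
  vapply(pieces, paste, character(1), collapse = "")
}

# Made-up interleaved input with two blocks of six characters each
example_lines <- c(
  "[Modal TGAACC]",
  "taxA TGAACC [6]",
  "taxB TGAAC- [6]",
  "",
  "[Modal TGCGGA]",
  "taxA TGCGGA [12]",
  "taxB TGCGGT [12]"
)
collapse_interleaved(example_lines)
```

Matching taxon names through a factor rather than a regular expression avoids partial matches between similar names, and it also works when the file has only a single block.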