I'm looking for a package that simulates RNA-Seq data and whose output includes, besides the sequences, the true number of mutations (SNPs, indels) and their positions. So far I have tried two tools, and both failed at this task.
Yes, you are correct: polyester does not simulate SNPs, indels, or other mutations. You would need to decide where the mutations go before giving the sequences to polyester. You could then introduce them into the fasta file and have polyester simulate reads from the mutated sequences.
(edited version of a response from Alyssa Frazee via email)
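To make the "introduce them into the fasta file" step concrete, here is a minimal sketch of one way to do it. It is not part of polyester; the function names (`read_fasta`, `add_snps`), the per-base mutation rate, and the truth-table layout are all assumptions made for illustration. It handles SNPs only; indels would additionally shift downstream coordinates and are left out.

```python
import random

def read_fasta(lines):
    """Parse FASTA lines into {name: sequence}.

    The name is taken as the first word of each header line (an assumption).
    """
    parts, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}

def add_snps(seq, rate, rng):
    """Substitute bases at random and record what was changed.

    Returns (mutated_seq, truth), where truth is a list of
    (0-based position, reference base, alternate base) tuples.
    Non-ACGT characters such as N are never mutated.
    """
    bases = "ACGT"
    out, truth = list(seq), []
    for i, ref in enumerate(seq):
        if ref in bases and rng.random() < rate:
            alt = rng.choice([b for b in bases if b != ref])
            out[i] = alt
            truth.append((i, ref, alt))
    return "".join(out), truth

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the truth table is reproducible
    # Hypothetical toy transcript; in practice, read your transcript FASTA instead.
    transcripts = read_fasta([">tx1 example", "ACGTACGTACGTACGTACGT"])
    for name, seq in transcripts.items():
        mutated, truth = add_snps(seq, rate=0.1, rng=rng)
        print(">" + name)
        print(mutated)
        for pos, ref, alt in truth:
            print(name, pos, ref, alt, sep="\t")
```

The mutated sequences would be written out as a new fasta file and passed to polyester, while the truth table is kept separately as the known set of variants.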
I'm trying to create a custom error model; however, GemErr is raising an error [1]. I have already tried to reach the author of the tool but got no answer.
The command line I'm using is:
$tools_path/GemErr.py -r 100 -f $anno_path/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa -s $GemErr/ToGemErr.sam -n $GemErr/illm -m 12
Thanks
[1]
Traceback (most recent call last):
  File "...../Me/Tools/GemSIM_v1.6//GemErr.py", line 758, in <module>
    main(sys.argv[1:])
  File "...../Me/Tools/GemSIM_v1.6//GemErr.py", line 755, in main
    mkMxSingle(readLen,reference,samfile,name,skip,circular,maxIndel,excl,minK)
  File "...../Me/Tools/GemSIM_v1.6//GemErr.py", line 530, in mkMxSingle
    updateM(ref[chr],pos,seq,qual,cigList,circular,0,maxIndel,'f',readLen,excl)
KeyError: 'chr1'
GemErr is a totally separate project from polyester (it's just one way to go if you want a custom error model), and the authors of polyester aren't involved in GemErr development at all -- so this is just a debugging suggestion and not actually "official" GemErr help :) But it seems to me from this error that the chromosome names in your alignment file (ToGemErr.sam) are different from the chromosome names in your reference .fa file. I'd double check to make sure the fasta file you're providing as your reference is the same one that the reads in the sam file were aligned to.
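One quick way to check for such a mismatch is to list the reference names used in the SAM records and see which of them are missing from the fasta headers. The sketch below is not part of GemErr; the function names are made up for illustration, and it assumes a sequence is identified by the first word of its header line, which may not be exactly how GemErr keys its reference. As far as I know, Ensembl fasta files name chromosomes "1", "2", and so on, without a "chr" prefix, which would be consistent with the KeyError on 'chr1'.

```python
import sys

def fasta_names(lines):
    """Collect sequence names: the first word after '>' on each header line."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].strip()
    }

def sam_reference_names(lines):
    """Collect RNAME (third column) from SAM alignment records.

    Header lines (starting with '@') and unmapped reads (RNAME '*') are skipped.
    """
    names = set()
    for line in lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names

def missing_from_fasta(sam_lines, fasta_lines):
    """Reference names used in the SAM records but absent from the FASTA."""
    return sorted(sam_reference_names(sam_lines) - fasta_names(fasta_lines))

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (script name is hypothetical): python check_names.py ToGemErr.sam reference.fa
    with open(sys.argv[1]) as sam, open(sys.argv[2]) as fasta:
        for name in missing_from_fasta(sam, fasta):
            print("in SAM but not in FASTA:", name)
```

If this prints names such as chr1, the reads were most likely aligned against a differently named reference, and either the alignment or the fasta headers would need to be made consistent before running GemErr.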
Thank you for answering!
Unfortunately, that did not work!