Hi,
I want to simulate RNA-seq read count data from a real data set for a publication. My problem is that I can't download large FASTA files, so I want to obtain the count matrix using Galaxy and then simulate data from that count matrix. There are R packages such as SimSeq and Polyester that simulate RNA-seq data. SimSeq takes a count matrix as input, which is what I'm looking for. However, I want to simulate a time course study, and SimSeq can't accommodate dependence over time: it treats each time point as an independent treatment. Polyester can handle time course studies, but can I give it a count matrix as input instead of large FASTA and GTF files? I'm a master's student in biostatistics and very new to simulation, so I don't know how to write the code on my own.
Thanks a lot.
Thank you very much. I will try to apply it. However, the manual says that the output is FASTA files ("create FASTA files containing RNA-seq reads simulated from provided transcripts, with optional differential expression between two groups (designated via read count matrix)"). Does that mean I have to obtain the count matrix from those FASTA files? Or can I get the count matrix directly by running some other code in the package?