I was walking through polyester with some students yesterday and remembered something which I seem to forget every now and then about the behavior of simulate_experiment. It's summarized in this closed issue:
https://github.com/alyssafrazee/polyester/issues/15
If you give simulate_experiment a transcript and set reads_per_transcript=0, it will turn that 0 into a 1 here:
https://github.com/mikelove/polyester/blob/master/R/NB.R#L18
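To make the effect concrete, here is a minimal base R sketch of what I understand that line to do. The function draw_counts is a hypothetical stand-in that I wrote for illustration, not polyester's actual code. It assumes counts are drawn from a negative binomial and that any zero draw is then overwritten with 1.

```r
# Hypothetical stand-in for the count-drawing step; not copied from polyester.
# Assumption: zero draws are overwritten with 1 after the negative binomial draw.
draw_counts <- function(basemeans, size) {
  numreads <- rnbinom(n = length(basemeans), mu = basemeans, size = size)
  numreads[numreads == 0] <- 1  # the step that turns a requested 0 into 1 read
  numreads
}

set.seed(1)
# Three transcripts; the second one is meant to be unexpressed (mean of 0).
counts <- draw_counts(basemeans = c(300, 0, 50), size = 10)
counts[2]  # 1, not 0
```

A mean of 0 always gives a draw of 0, so under this assumption the "unexpressed" transcript always ends up with exactly one read.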
I always seem to remember this after playing around a bit, so it's never ended up in one of my simulation datasets. I always remove the desired unexpressed transcripts from the FASTA instead of putting 0s, as recommended in that issue, but I noticed that this behavior is not mentioned in the vignette or man pages of polyester. I could see someone accidentally giving simulate_experiment e.g. all the reference transcripts and putting a 0 for a majority of them. They may be surprised to find they are expressing every transcript with one read.
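For anyone who wants the workaround spelled out, here is a rough base R sketch of dropping the unexpressed transcripts from the FASTA before simulating. The file contents, the transcript names (tx1, tx2, tx3) and the desired_reads vector are made up for illustration. The sketch assumes the file starts with a header line and that the transcript ID is the first whitespace-delimited token of each header.

```r
# Toy FASTA with three transcripts (made-up names and sequences).
fasta_in <- tempfile(fileext = ".fa")
writeLines(c(">tx1", "ACGTACGTAC",
             ">tx2", "GGGTTTAAAC", "CCCC",
             ">tx3", "TTGACCAGTA"), fasta_in)

# Desired reads per transcript; tx2 is meant to be unexpressed.
desired_reads <- c(tx1 = 300, tx2 = 0, tx3 = 50)

lines <- readLines(fasta_in)
is_header <- startsWith(lines, ">")
record_id <- cumsum(is_header)  # which record each line belongs to
ids <- sub("^>([^[:space:]]+).*", "\\1", lines[is_header])

# Keep only the records whose desired read count is above zero.
keep_record <- ids %in% names(desired_reads)[desired_reads > 0]
fasta_out <- tempfile(fileext = ".fa")
writeLines(lines[keep_record[record_id]], fasta_out)

# Read counts in the same order as the filtered FASTA.
reads_kept <- desired_reads[ids[keep_record]]

# fasta_out and reads_kept are what I would then hand to simulate_experiment
# (as the FASTA file and reads_per_transcript), with the remaining arguments
# set as described in the polyester documentation.
```

The filtered file contains only tx1 and tx3, so no transcript is left with a requested count of 0 that could be bumped up to 1.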