I am running some tests to compare trimLRpatterns vs other trimming tools (Skewer, cutadapt, AdapterRemoval).
I've generated simulated data using ART (https://www.niehs.nih.gov/research/resources/software/biostatistics/art/index.cfm). Specifically, I used a modified version of the program, written by the authors of Skewer, that can simulate adapter contamination (http://sourceforge.net/projects/skewer/files/Simulator/).
For my simulations, I created 150 bp reads at 20x coverage with a fragment size of 200 ± 50 bp, so that reads from fragments shorter than the read length are contaminated with adapter sequence. The quality profiles were taken from actual MiSeq E. coli FASTQ files.
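As a rough check on how many reads should contain adapter sequence, here is a small Python sketch. It assumes that the ± 50 bp is the standard deviation of a normal fragment-size distribution and that a read contains adapter whenever its fragment is shorter than the 150 bp read length. Both are assumptions for illustration, not something taken from the simulator's code.

```python
import math

def fraction_below(threshold, mean, sd):
    """Normal CDF: probability that a fragment is shorter than `threshold`."""
    return 0.5 * (1.0 + math.erf((threshold - mean) / (sd * math.sqrt(2.0))))

# Assumed setup: 150 bp reads, fragment size ~ Normal(mean=200, sd=50)
READ_LEN = 150
frac_with_adapter = fraction_below(READ_LEN, mean=200, sd=50)
print(f"Expected fraction of reads with adapter: {frac_with_adapter:.3f}")
```

Under these assumptions, only a minority of the reads carry adapter sequence, so sensitivity is estimated from far fewer reads than specificity.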
Most of the programs achieve a sensitivity and specificity of 99%. trimLRpatterns shows high specificity (99%) but very low sensitivity (16% at most), so it is failing to remove the adapter from most contaminated reads. I've tried different parameter settings, but I can't improve this value.
In this repository: https://github.com/leandroroser/Test_trimLRpatterns, you can find a test script for a portion of the simulated data (also included in the same folder), where I'm varying max.Rmismatch from 1 to 50.
I know the exact length each read should have after trimming; the genomic location of each read is in the BED file in the repository. The expected width can therefore be compared with the width in the program's output. I'm using the same statistics as the AdapterRemoval paper.
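To make the comparison concrete, here is a minimal Python sketch of how the widths can be turned into sensitivity and specificity. The classification rule is a simplification I am assuming for illustration (a read with adapter counts as a true positive only if it is trimmed to exactly the expected width; the AdapterRemoval paper may treat over- and under-trimmed reads differently), and the function names and the example records are hypothetical, not taken from the repository.

```python
from collections import Counter

def classify(read_len, true_len, trimmed_len):
    """Classify one read (assumed rule, see the text above).

    read_len    -- length of the untrimmed read
    true_len    -- expected length after perfect trimming
    trimmed_len -- length reported by the trimming tool
    """
    has_adapter = true_len < read_len
    if has_adapter:
        # Adapter present: correct only if trimmed to the exact expected width
        return "tp" if trimmed_len == true_len else "fn"
    # No adapter: correct only if the read was left untouched
    return "tn" if trimmed_len == read_len else "fp"

def sensitivity_specificity(records):
    """Return (sensitivity, specificity) for (read_len, true_len, trimmed_len) tuples."""
    c = Counter(classify(*r) for r in records)
    pos = c["tp"] + c["fn"]
    neg = c["tn"] + c["fp"]
    sens = c["tp"] / pos if pos else float("nan")
    spec = c["tn"] / neg if neg else float("nan")
    return sens, spec

# Hypothetical example records, not real data
records = [
    (150, 120, 120),  # adapter removed correctly -> tp
    (150, 100, 150),  # adapter missed            -> fn
    (150, 130, 130),  # adapter removed correctly -> tp
    (150, 150, 150),  # no adapter, untouched     -> tn
    (150, 150, 140),  # no adapter, trimmed       -> fp
]
sens, spec = sensitivity_specificity(records)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```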
Any advice in relation to this?