Paul Shannon
I wish to trim a variable-length sequence from the end of many
thousands of DNAStrings in a DNAStringSet.
The sequence to be trimmed is any recognizable chunk of a Solexa
short-read adapter, which ends up on the end of, for example, 22nt miRNAs.
The adapter chunk might be found in the middle of a 35 base read, or
it might be closer to the end. In every case, I want to delete every
base from the start of the adapter chunk to the end of the read.
I imagine there might be a BString operation equivalent to sed, which
could be used like this:
echo 'CGAAGCGGGATGATCTATCTCGTATGCCGTCTTCT' | sed 's/TCGTATGCCGTC.*$//'
--> CGAAGCGGGATGATCTATC
(where TCGTATGCCGTC is only part of the 21-base adapter, but is
probably a long enough portion to be representative)
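To make the desired behaviour concrete for more than one read, here is a minimal shell sketch of the same sed idea applied line by line. The helper name trim_adapter and the second and third reads are made up for illustration; only the first read and the adapter chunk come from the example above.

```shell
# Hypothetical helper: delete everything from the first occurrence of the
# adapter chunk ($1) to the end of each read. Reads arrive on stdin, one per line.
trim_adapter() {
  sed "s/$1.*\$//"
}

# Three 35-base reads: adapter chunk in the middle, adapter chunk near the
# end, and no adapter chunk at all (that read is left unchanged).
printf '%s\n' \
  'CGAAGCGGGATGATCTATCTCGTATGCCGTCTTCT' \
  'ACGTACGTACGTACGTACGTAGTCGTATGCCGTCA' \
  'ACGTACGTACGTACGTACGTACGTACGTACGTACG' |
  trim_adapter TCGTATGCCGTC
```

This only matches the chunk exactly, so reads that end in a shorter piece of the adapter, or that carry a sequencing error inside it, would pass through untrimmed.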
Any way to do this with BStrings and friends?
Thanks!
- Paul