Question

Running HTSFilter in parallel

Richard Yanicky ▴ 10

@richard-yanicky-3394

Last seen 8.2 years ago

Hello,

I am using the HTSFilter library to filter out low-count samples for some RNA data. It works, but it takes a while to run. Is there a way to run it using multiple cores/CPUs?

Regards,

Richard

htsfilter parallel • 1.3k views

ADD COMMENT • link updated 8.3 years ago by andrea.rau ▴ 80 • written 8.3 years ago by Richard Yanicky ▴ 10

score 1 · Answer 1 · 2016-01-07

Hi Richard,

If HTSFilter is taking a while to run, I'm guessing that it's because you have a fairly large number of samples -- right? The method in HTSFilter is extremely parallelizable since for a given filtering threshold, the Jaccard similarity index is calculated in a loop for all possible pairs of replicates and then averaged (which means calculations could be done in parallel both for different pairs of samples and for different filtering thresholds).

That being said, unfortunately I haven't yet included the ability to run HTSFilter over multiple cores/CPUs, since most of my use cases to date have had a limited number of replicate samples (say, fewer than 10 or so). However, if this is an option you're interested in, I could take a look at including it (although it may take me a bit of time, since I need to familiarize myself with the necessary packages). Let me know!
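To make the pairwise structure described above concrete, here is a minimal base-R sketch. It is not HTSFilter's internal code: the helper names `jaccard_pair` and `mean_jaccard`, the toy data, and the rule "a gene is expressed in a sample when its count exceeds the threshold" are all assumptions made for illustration. It shows only that each threshold (and each pair of replicates) can be evaluated independently, which is what makes the method easy to parallelize.

```r
# Simplified illustration only; names and toy data are hypothetical,
# and this is not the code used inside the HTSFilter package.
library(parallel)

# Jaccard index between two samples after thresholding at s
# (assumption: a gene counts as "expressed" when its count is > s).
jaccard_pair <- function(x, y, s) {
  a <- x > s
  b <- y > s
  union_size <- sum(a | b)
  if (union_size == 0) return(NA_real_)
  sum(a & b) / union_size
}

# Mean Jaccard index over all pairs of replicates within each condition.
mean_jaccard <- function(counts, conds, s) {
  conds <- as.character(conds)
  pair_list <- unlist(
    lapply(unique(conds), function(cond) {
      idx <- which(conds == cond)
      if (length(idx) < 2) return(list())
      combn(idx, 2, simplify = FALSE)
    }),
    recursive = FALSE
  )
  values <- vapply(
    pair_list,
    function(p) jaccard_pair(counts[, p[1]], counts[, p[2]], s),
    numeric(1)
  )
  mean(values, na.rm = TRUE)
}

# Toy data: 1000 genes, 2 conditions with 3 replicates each.
set.seed(1)
counts <- matrix(rnbinom(1000 * 6, mu = 20, size = 1), ncol = 6)
conds <- rep(c("A", "B"), each = 3)

# Each threshold is independent of the others, so they can be farmed out
# to separate cores. mclapply forks, so fall back to 1 core on Windows.
thresholds <- seq(1, 50, length.out = 20)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
index_values <- unlist(
  mclapply(thresholds, function(s) mean_jaccard(counts, conds, s),
           mc.cores = n_cores)
)

# Threshold with the highest average similarity between replicates.
best_s <- thresholds[which.max(index_values)]
```

The same pattern would apply one level down: the loop over pairs inside `mean_jaccard` could also be distributed, which matters more when the number of replicates (and hence pairs) is large.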

Regards,

Andrea

score 1 · Answer 2 · 2016-01-07

andrea.rau ▴ 80

@andrearau-7032

Last seen 2.0 years ago

INRAE / Jouy en Josas, France

Ok, I will work on adding the possibility of parallel calculations to HTSFilter. It may take me a couple of weeks to get around to it, but I will let you know when it is ready for testing in the development version. Thanks again for the feedback!

Best,

Andrea

ADD COMMENT • link 8.3 years ago andrea.rau ▴ 80

After a longer delay than expected (my apologies!), HTSFilter now implements (as of Bioconductor 3.4, version 1.14.0) the option for parallel calculations through the BiocParallel package. Calls to HTSFilter now accept two additional optional arguments: parallel (TRUE/FALSE), and BPPARAM to specify the backend for parallel execution. I hope this improves the execution time for your use case! Any feedback is welcome.
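As a rough sketch of how the `parallel` and `BPPARAM` arguments mentioned above might be used, assuming HTSFilter 1.14.0 or later and BiocParallel are installed: the toy count matrix, the condition labels, the choice of two workers, and the other argument values (`s.min`, `s.max`, `s.len`, `plot`) are illustrative assumptions, so please check `?HTSFilter` for your installed version.

```r
library(HTSFilter)
library(BiocParallel)

# Toy count matrix (genes x samples); replace with your own data.
set.seed(1)
counts <- matrix(
  rnbinom(2000 * 8, mu = 50, size = 0.5), ncol = 8,
  dimnames = list(paste0("gene", 1:2000), paste0("sample", 1:8))
)
conds <- rep(c("control", "treated"), each = 4)

# Backend choice is an assumption: MulticoreParam forks workers
# (Linux/macOS); SnowParam is the usual alternative on Windows.
bp <- if (.Platform$OS.type == "windows") {
  SnowParam(workers = 2)
} else {
  MulticoreParam(workers = 2)
}

# parallel = TRUE switches on the BiocParallel code path;
# BPPARAM tells it which backend to use.
res <- HTSFilter(counts, conds,
                 s.min = 1, s.max = 200, s.len = 25,
                 plot = FALSE,
                 parallel = TRUE, BPPARAM = bp)

dim(res$filteredData)  # genes retained after filtering
res$s                  # threshold selected by the filter
```

Whether this is faster than the serial version depends on the number of replicates and thresholds; with only a few samples, the overhead of starting workers may outweigh the gain.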

Best,

Andrea

ADD REPLY • link 7.5 years ago andrea.rau ▴ 80