Running HTSFilter in parallel
@richard-yanicky-3394

Hello,

 

I am using the HTSFilter package to filter out low-count samples from some RNA data. It works, but it takes a while to run. Is there a way to run it on multiple cores/CPUs?

 

Regards,

 

Richard 

htsfilter parallel • 1.5k views
andrea.rau ▴ 80
@andrearau-7032
INRAE / Jouy en Josas, France

Hi Richard,

If HTSFilter is taking a while to run, I'm guessing it's because you have a fairly large number of samples, right? The method in HTSFilter is highly parallelizable: for a given filtering threshold, the Jaccard similarity index is calculated in a loop over all possible pairs of replicates and then averaged. This means the calculations could be done in parallel both across pairs of samples and across filtering thresholds.

That being said, unfortunately I haven't yet included the ability to run HTSFilter over multiple cores/CPUs, since most of my use cases to date have had a limited number of replicate samples (say, fewer than 10 or so). However, if this is an option you're interested in, I could take a look at including it (although it may take me a bit of time, since I need to familiarize myself with the necessary packages). Let me know!
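To make the structure of that calculation concrete, here is a minimal Python sketch of the idea, not HTSFilter's actual R code. It assumes a gene counts as "expressed" when its count is strictly above the threshold, and that only pairs of replicates within the same condition are compared. All function names and the toy data are hypothetical. Each threshold is handled as an independent task, which is what makes the scan easy to distribute.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def jaccard(x, y, s):
    """Jaccard index of two replicates after binarizing counts at threshold s.

    Assumption for this sketch: a gene is 'expressed' when its count is > s.
    """
    both = sum(1 for a, b in zip(x, y) if a > s and b > s)
    either = sum(1 for a, b in zip(x, y) if a > s or b > s)
    return both / either if either else 0.0


def mean_jaccard(samples, conds, s):
    """Average the index over all pairs of replicates in the same condition."""
    pairs = [(i, j) for i, j in combinations(range(len(samples)), 2)
             if conds[i] == conds[j]]
    return sum(jaccard(samples[i], samples[j], s) for i, j in pairs) / len(pairs)


def scan_thresholds(samples, conds, thresholds, workers=4):
    """Evaluate every threshold as an independent task on a worker pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda s: mean_jaccard(samples, conds, s),
                               thresholds))
    return dict(zip(thresholds, scores))


# Toy data (hypothetical): 2 conditions x 2 replicates, 6 genes per sample.
samples = [
    [0, 3, 50, 120, 1, 9],
    [1, 2, 45, 130, 0, 12],
    [0, 8, 60, 100, 2, 7],
    [2, 1, 55, 110, 0, 15],
]
conds = ["A", "A", "B", "B"]

scores = scan_thresholds(samples, conds, thresholds=[1, 5, 10])
best = max(scores, key=scores.get)  # threshold with the highest mean index
print(best, scores)
```

A thread pool is used here only to keep the sketch portable. For CPU-bound work in pure Python, a process pool would be needed to see a real speedup. The same pattern could be applied one level down, with each pair of replicates as its own task.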

Regards,

Andrea


Hi Andrea,

 

Thanks for the response!

Yes, we do have a large number of samples and hope to set up a pipeline using HTSFilter. We have used a shorter s.len to speed it up, but we need to be sure the results are robust. If there were a multicore option, it would be a great help.

 

Thanks,

 

Richard 

 

andrea.rau ▴ 80
@andrearau-7032

Ok, I will work on adding the possibility of parallel calculations to HTSFilter. It may take me a couple of weeks to get around to it, but I will let you know when it is ready for testing in the development version. Thanks again for the feedback!

Best,

Andrea


After a longer delay than expected (my apologies!), HTSFilter now implements (as of Bioconductor 3.4, version 1.14.0) the option for parallel calculations through the BiocParallel package. There are now two additional optional arguments in calls to HTSFilter: parallel (TRUE/FALSE), and BPPARAM to specify the backend for parallel execution. I hope this improves the execution time for your use case! Any feedback is welcome.

Best,

Andrea
