Topic: interoperate with other schedulers, eg. LSF?

sdwilli3

  • Jr. Member
  • Posts: 3
interoperate with other schedulers, eg. LSF?
« on: September 13, 2012, 09:21:49 PM »
Hi Everyone,

Our centralized computing resources need to be split/shared between 2 purposes:

renderfarm managed by Qube
compute cluster managed by LSF

None of this is in production yet, though we have already purchased Qube. The compute nodes will be Linux-based and GPU/CUDA-capable.

One strategy is, of course, to assign a separate, fixed set of compute nodes to each manager. I'm wondering what options are available for assigning nodes more dynamically to increase overall utilization. An intermediate solution could use Qube's ability to block and unblock worker nodes on a schedule.


All suggestions are appreciated -- many thanks!

Scott



jburk

  • Administrator
  • Posts: 493
Re: interoperate with other schedulers, eg. LSF?
« Reply #1 on: September 17, 2012, 10:57:51 PM »
We have done a custom LSF/Qube integration as a solution for one customer on a paid consulting basis; they have over 3,000 LSF nodes, so it made sense for them to pay the consulting fees.

Essentially, they submit jobs to LSF that run on a node. Each LSF job starts an RPC agent and then starts the Qube worker with a cluster restriction. When the Qube jobs are done (an LSF node usually processes all jobs in a 'pgrp'), a signal is sent via RPC to the LSF job, instructing it to shut down the worker. The worker is then seen as "down" by Qube, and the LSF job exits.

As I said, a very custom solution...
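
For anyone trying to picture the lifecycle described above, here is a rough Python sketch of what the wrapper running inside the LSF job might look like. It is not the customer's actual integration, which the thread does not show. The class name WorkerLease, the RPC method shutdown_worker, and the placeholder worker command are all made up for illustration; the real worker start command and cluster restriction setting would have to come from your own Qube configuration. The RPC agent here is just the standard library's xmlrpc.server.

```python
import subprocess
import sys
import threading
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical stand-in for the real worker start command. The thread does not
# give the actual command or how the cluster restriction is passed.
PLACEHOLDER_WORKER_CMD = [sys.executable, "-c", "import time; time.sleep(3600)"]


class WorkerLease:
    """Runs inside the LSF job: owns the worker process and an RPC shutdown hook."""

    def __init__(self, worker_cmd, host="127.0.0.1", port=0):
        self.worker_cmd = worker_cmd
        self.worker = None
        self._done = threading.Event()
        # port=0 lets the OS pick a free port; a real setup would need to
        # publish the chosen port to whatever sends the shutdown signal.
        self.server = SimpleXMLRPCServer((host, port), logRequests=False)
        self.server.register_function(self.shutdown_worker)
        self.port = self.server.server_address[1]

    def shutdown_worker(self):
        """RPC entry point: called once the render jobs for this lease are done."""
        self._done.set()
        return True

    def run(self):
        """Start the worker, serve RPC until told to stop, then clean up."""
        self.worker = subprocess.Popen(self.worker_cmd)
        rpc_thread = threading.Thread(target=self.server.serve_forever, daemon=True)
        rpc_thread.start()
        self._done.wait()          # block until the shutdown RPC arrives
        self.worker.terminate()    # worker goes away, so the farm sees it as down
        self.worker.wait()
        self.server.shutdown()     # stop the RPC agent
        self.server.server_close()
        return self.worker.returncode  # returning lets the LSF job exit
```

In this sketch the LSF job script would call WorkerLease(cmd).run() and exit when it returns, which hands the node back to LSF. Whatever tracks job completion on the Qube side would call shutdown_worker over XML-RPC. The sketch does not handle the worker dying on its own or LSF killing the job early; a real wrapper would need both.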