Topic: multiproc management

cam

  • Jr. Member
« on: January 14, 2010, 12:17:22 PM »
Hi support,

What is the approach to managing a farm with multiproc machines and tasks with differing core requirements?

Let's consider the following farm:

100x 8-core machines = 800 procs

Software A runs optimally (license/performance) on 8 procs - on the same physical box
Software B runs optimally (license/performance) on 4 procs - on the same physical box
Software C runs optimally (license/performance) on 1 proc

What strategies does Qube take to tackle this issue?

If each of the 100 machines picks up a 1-proc job, no 8-proc jobs can run. Do you reserve certain boxes for 8-core jobs only? The downside is that these sit idle when there are no 8-proc jobs on the farm.

It is a tricky packing problem. Does Qube have a novel take on this?

Thanks,

cl


shinya

  • Administrator
« Reply #1 on: January 14, 2010, 08:50:08 PM »
Hi cam,

First, let me explain the concepts of "subjobs" and "jobslots" that Qube uses.

Subjobs are basically instances of your jobs, and are the things that actually run
on the farm.  For example, if you specify a renderman job to use 8 subjobs, you're
really telling the system to execute (up to) 8 instances of renderman to process the
job's frames.  For historical reasons, our submission dialogs refer to these as "CPUs"--
i.e., you specify 8 "CPUs" when submitting your job, which really means you're asking
for 8 "subjobs" or instances (that is, there's no specific connection with the physical
number of CPUs) to carry out the job.

Jobslots are the number of subjobs that a given worker (render node) can process at a
time.  By default, workers have as many jobslots as there are cores on the node, so
an 8-core machine will have 8 jobslots, and therefore be able to process 8 subjobs
at once.


In your scenario, you can leave the workers at the default number of jobslots,
which is 8, and when you submit your jobs, tell the system that each of your
subjobs requires N procs.  More specifically, each job you submit can have "resource
reservations" attached to it, and if you specify a "reservation" of
"host.processors=4", for example, you are saying that each of this job's subjobs
will occupy 4 jobslots when run.

So, for instance, if you're submitting a software B job, you'll want to specify
"host.processors=4".  That will allow the system to run at most 2 subjobs of this
"software B" type on each of your 8-core nodes at once.
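
To make the arithmetic concrete, here is a minimal sketch in Python. It only models
the jobslot accounting described above; it is not Qube's API, and the function name
subjobs_per_worker is made up for illustration. It assumes a subjob occupies exactly
as many jobslots as its "host.processors" reservation, and the reservation values for
software A (8) and software C (1) are assumptions extrapolated from the software B
example.

```python
# Hypothetical model of jobslot accounting; NOT part of Qube's API.

def subjobs_per_worker(jobslots: int, processors_per_subjob: int) -> int:
    """Return how many subjobs fit on one worker at the same time.

    jobslots: jobslots on the worker (by default, one per core).
    processors_per_subjob: the N in a "host.processors=N" reservation.
    """
    if jobslots < 0 or processors_per_subjob < 1:
        raise ValueError("jobslots must be >= 0 and processors_per_subjob >= 1")
    return jobslots // processors_per_subjob


WORKER_JOBSLOTS = 8   # default for an 8-core worker
WORKER_COUNT = 100    # the farm from the original question

# Assumed reservations: only the value for software B comes from the thread.
reservations = {"software A": 8, "software B": 4, "software C": 1}

for name, procs in reservations.items():
    per_worker = subjobs_per_worker(WORKER_JOBSLOTS, procs)
    print(f"{name}: host.processors={procs} -> "
          f"{per_worker} per worker, {per_worker * WORKER_COUNT} farm-wide")
```

Under these assumptions, a farm running only one kind of job would hold 100 software A
subjobs, 200 software B subjobs, or 800 software C subjobs at once.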

I hope that helps!

-shinya.


cam

  • Jr. Member
« Reply #2 on: January 14, 2010, 09:30:02 PM »
Hi shinya,

Sounds like you guys have put some thought into this one - which is great.

Reservations sound good. Are there any high-level tools/algorithms that are aware of all reservation requests and fill the farm accordingly?

Taking software A, B, and C from the example above, imagine I had 8 tasks of each to run (24 tasks total) on 13 8-core boxes. All tasks share the same priority. In an ideal world (optimistic, I know) it would look like this:

Box 1: 8x 1-core tasks
Boxes 2-5: 2x 4-core tasks each
Boxes 6-13: 1x 8-core task each
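
For illustration, this is the layout a simple first-fit-decreasing packing produces. The sketch below is Python, it is not Qube's scheduler (nothing in this thread says how Qube orders placements), and the name pack_first_fit_decreasing is hypothetical. It assumes every task is known up front, all tasks share one priority, and a task must fit entirely on one box.

```python
# Hypothetical first-fit-decreasing packer; NOT how Qube is known to schedule.

def pack_first_fit_decreasing(task_sizes, slots_per_box, box_count):
    """Place tasks (sizes in cores) onto boxes, largest tasks first.

    Returns (placement, unplaced): placement[i] lists the task sizes
    assigned to box i, and unplaced lists tasks that did not fit anywhere.
    """
    free = [slots_per_box] * box_count
    placement = [[] for _ in range(box_count)]
    unplaced = []
    for size in sorted(task_sizes, reverse=True):
        for i in range(box_count):
            if free[i] >= size:       # first box with enough free slots
                free[i] -= size
                placement[i].append(size)
                break
        else:                         # no box had room for this task
            unplaced.append(size)
    return placement, unplaced


# 8 tasks each of the 8-core, 4-core and 1-core software, on 13 8-core boxes.
tasks = [8] * 8 + [4] * 8 + [1] * 8
placement, unplaced = pack_first_fit_decreasing(tasks, slots_per_box=8, box_count=13)

for number, box in enumerate(placement, start=1):
    print(f"Box {number}: {box}")
print("Unplaced:", unplaced)
```

With these inputs, eight boxes take one 8-core task each, four boxes take two 4-core tasks each, and one box takes all eight 1-core tasks, so all 104 jobslots are filled. Which box number gets which group differs from my list above, but the shape is the same.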

Does this situation make sense? Sort of renderfarm Tetris.

cl