Author Topic: Job Potential Not Reaching Full  (Read 7758 times)

Yantor

  • Full Member
  • ***
  • Posts: 18
Job Potential Not Reaching Full
« on: November 20, 2009, 01:03:28 AM »
A mostly minor difficulty with our farm, but it bears some investigation:

We submit a job with X processes allowed and it starts running, but for some reason only a fraction of that many run at once. This doesn't happen on every job, but it does nerf the efficiency of the farm a little.

For example, right now the only job running in our farm has permission for 16 processes but is only running 6. Every node in the farm is unlocked with at least 1 available processor, there are no reservations or restrictions on the job and we are not exceeding license consumption for either Qube or our renderer (in this case Renderman).

What could be causing this, and how would we go about fixing it?

jburk

  • Administrator
  • *****
  • Posts: 493
Re: Job Potential Not Reaching Full
« Reply #1 on: November 20, 2009, 05:38:22 AM »
'qbhostorder <jobID>' will tell you why every machine in the farm is not picking up a subjob for a particular job.

Some of the reasons are so obvious it seems ridiculous; for example, if you run it against a job that's already completed, all the hosts will return "no pending subjobs".  So the output will need a bit of interpretation on your part.

One thing to look out for is a reason of 'none' when there are still pending subjobs. It means just what it seems: there's no good reason why that particular host hasn't had a subjob dispatched to it. That points to a definite issue with the scheduling mechanism on the supervisor, and at that point the next step is to dig into the supervisor logs.
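
If the farm is large, tallying the reasons by hand gets tedious, so here is a rough Python sketch of how the output could be summarized. Only the bare 'qbhostorder <jobID>' invocation comes from this thread. The 'hostname: reason' line layout and the helper names (parse_hostorder, summarize, run_hostorder) are assumptions made for illustration, so adjust the parsing to whatever your version actually prints.

```python
import subprocess
from collections import Counter


def parse_hostorder(text):
    """Split output into (host, reason) pairs.

    ASSUMPTION: each relevant line looks like 'hostname: reason'.
    The real qbhostorder layout may differ; adapt this function to it.
    """
    pairs = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything not in the assumed layout
        host, reason = line.split(":", 1)
        pairs.append((host.strip(), reason.strip()))
    return pairs


def summarize(pairs):
    """Count hosts per reason and list hosts whose reason is 'none'."""
    counts = Counter(reason for _, reason in pairs)
    unexplained = [host for host, reason in pairs if reason.lower() == "none"]
    return counts, unexplained


def run_hostorder(job_id):
    """Run 'qbhostorder <jobID>' and return its standard output as text."""
    result = subprocess.run(
        ["qbhostorder", str(job_id)],
        capture_output=True,
        text=True,
        check=True,  # raise if the command exits with an error
    )
    return result.stdout
```

Calling summarize(parse_hostorder(run_hostorder(job_id))) on the supervisor (or any machine with the Qube client tools on its PATH) gives a count per reason plus the list of hosts reporting 'none'. Those hosts only matter while the job still has pending subjobs, and they are the ones worth cross-checking against the supervisor logs.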