Author Topic: Efficiently Sharing Nodes between Projects (Read 12686 times)

jesse · « **on:** July 22, 2011, 05:49:13 PM »

Howdy,
I've been a qube user/admin for almost 4 years, but I have not yet arrived at a great scheme for dividing our render nodes between multiple projects. Our farm has ranged from 20-100 nodes during this time. Most often there are 2-6 projects competing for render power. Each project has a number of render jobs queued at any one time.

This is generally our farm layout:
05 nodes of /2D
10 nodes of /3D/General
10 nodes of /3D/ProjectA
10 nodes of /3D/ProjectB
...

We keep our /2D nodes in a separate group so that they are only available for 2D renders.
For our 3D renders, the user must specify a cluster before submitting. This is controlled by a custom in-app render submitter. Beyond assigning clusters, the only restriction option we allow is an exact cluster restriction such as "/3D/ProjectA". The docs describe the ability to specify multiple clusters in a restriction such as "/3D/ProjectA,/3D/General", but this has not yet worked for me. At the moment, I can have a job run on all the nodes in the group, or on a single cluster.

Beyond our clustering setup, I was hoping to play with preemption to achieve better queue processing. The only modes I see are 'passive', 'active', and 'disabled'. From my tests, 'active' will cause a loss of render time by killing a frame/task before completion. 'passive' should kill a subjob when "convenient", but it seems to me that the subjob will exhaust all tasks/frames from the current job before exiting. I would expect the subjob to exit when a task/frame is complete. From my standpoint, 'passive' seems to be equal to 'disabled'. It is quite possible that I have something misconfigured.

So if any of you are willing to share, I would love to know what mix of priorities, clusters, restrictions, preemption, etc. you have set up. I would be glad to change my game if I can get the farm working more efficiently.

Cheers,
jesse

BrianK · « **Reply #1 on:** July 26, 2011, 01:09:09 AM »

Hi Jesse, we spoke about this off-forum, but to keep the world up to speed, I'll summarize here:

Restrictions are evaluated as an expression, so to specify multiple restrictions, you "or" them:

"restriction1 || restriction2"
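
For anyone wiring this into a custom submitter, here is a minimal sketch of building that expression from a list of cluster paths. The function names are hypothetical and nothing here calls the Qube API; it only produces the string, which you would then pass as the job's restriction however your submitter already does.

```python
def build_restriction(clusters):
    """Join cluster paths into a single "or" expression.

    Blank entries and duplicates are dropped; order is preserved.
    """
    cleaned = []
    for cluster in clusters:
        cluster = cluster.strip()
        if cluster and cluster not in cleaned:
            cleaned.append(cluster)
    return " || ".join(cleaned)


def from_comma_list(text):
    """Convert a comma-separated cluster list into the "||" form."""
    return build_restriction(text.split(","))


if __name__ == "__main__":
    # Hypothetical cluster names taken from the farm layout in the first post.
    print(build_restriction(["/3D/ProjectA", "/3D/General"]))
    print(from_comma_list("/3D/ProjectA,/3D/General"))
```

The second helper is only there for the case where an existing tool hands you a comma-separated list and you want to convert it before submitting.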

As far as preemption goes, quite a lot of work has gone into it since version 5.5, and it now works as you would expect. "Active" kills any running jobs that are about to be preempted. "Passive" allows the currently running work/frame(s) to finish before they are preempted. "Disabled" does not interrupt a running job.

dmeyer · « **Reply #2 on:** July 31, 2011, 09:59:42 PM »

We have several clusters assigned to different groups, and a handful of workers in the base cluster available to all groups.

Groups then submit to their own cluster, and any other idle workers from other groups will help that job until something is submitted to their cluster.

Is this not working for you?

jesse · « **Reply #3 on:** September 02, 2011, 01:10:29 AM »

Brian, thanks for the response. John Burke mentioned the same thing about cluster restrictions. The GUI for 5.5 still generates a comma-separated list, which is where my confusion came from. Has this also been fixed in the newer GUI?

As for my solution, I reverted to my former scheme. Each project gets a cluster. All clusters are in the same group. The one change is that the custom submitter now supports the "||" joining for cluster restrictions.

Cheers
