PipelineFX Forum

Qube! => General => Topic started by: ttfRyan on August 13, 2012, 06:27:04 PM

Title: Kill frozen process
Post by: ttfRyan on August 13, 2012, 06:27:04 PM
This is my first post so hopefully this is in the right place.

Does Qube have a way to detect a frozen process? We are worried that our Maya sessions rendering our frames may hang up and stay idle on a machine until an admin comes along to clean up jobs.

We are aware of the job and frame timeout commands in Qube but feel they may not be the best option for us. At our studio we will not be submitting 1 frame per render node; instead we will submit 100+ frames per node. This may be a bit odd, but we do have our reasons for this. So, without taking advantage of the timeout functions, how can we catch a frozen job?

Title: Re: Kill frozen process
Post by: jburk on August 16, 2012, 11:41:12 PM
How would you differentiate between a frozen process and a long render? Simply no CPU utilization for an extended period of time?

Qube doesn't do this itself; you'd need to continuously record the CPU utilization rates over time and keep looking for extended periods of low utilization...
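
For anyone who wants to try the "watch for extended low utilization" idea, here is a rough sketch in Python. It is not part of Qube; the class name, the 2% threshold and the 10 minute window are all assumptions picked for illustration. It only tracks how long the CPU readings have stayed below the threshold, so where the samples come from and how the process gets killed are left to you.

```python
class IdleWatchdog:
    """Flags a process as possibly frozen after a sustained run of low CPU samples.

    Hypothetical helper, not a Qube API. Threshold and window are assumptions.
    """

    def __init__(self, idle_threshold_pct=2.0, idle_window_sec=600.0):
        self.idle_threshold_pct = idle_threshold_pct
        self.idle_window_sec = idle_window_sec
        self.idle_since = None  # timestamp of the first sample in the current idle run

    def feed(self, timestamp, cpu_pct):
        """Record one sample; return True once the idle run is at least idle_window_sec long."""
        if cpu_pct >= self.idle_threshold_pct:
            # Any busy sample resets the idle run.
            self.idle_since = None
            return False
        if self.idle_since is None:
            self.idle_since = timestamp
        return timestamp - self.idle_since >= self.idle_window_sec


# Usage with made-up samples: (seconds, CPU percent), 60 second window.
watchdog = IdleWatchdog(idle_threshold_pct=2.0, idle_window_sec=60.0)
samples = [(0, 50.0), (10, 1.0), (40, 0.5), (70, 0.0)]
flags = [watchdog.feed(t, pct) for t, pct in samples]
print(flags)
```

In practice the samples could come from something like psutil's Process.cpu_percent(), assuming psutil is installed on the worker. A long render that is waiting on disk or network can also look idle, so the window should be generous.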

I am curious why you're doing 100+ frame chunks.  Have you tried our "dynamic allocation" Maya jobtype, which holds the instance of Maya open for the duration of the subjob, and only dispatches frame numbers to the subjob?  This gives you the best of all worlds, where Maya only starts up and loads the scene once per machine, but you don't have to send large chunks of frames to a single worker.

Title: Re: Kill frozen process
Post by: ttfRyan on August 17, 2012, 12:41:15 AM
Yeah, basically when the memory or CPU usage bottoms out and stays idle for a period of time, you can pretty much guarantee Maya is no longer rendering frames.

We are doing Maya playblasts, which are fairly quick renders. So quick that it would make no sense to send 1 frame at a time to different nodes, as Maya simply starting up would take longer than the frame. Do you have any documentation on dynamic allocation? From what you are saying, it sounds like this will solve that problem. Here is my layman's version: we submit a job to Qube, Qube starts Maya on all workstations, Qube then starts sending a frame to each machine that has loaded the scene, and when all frames are complete Qube quits Maya and continues with any subjobs.
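
To make that layman's version concrete, here is a toy simulation in Python of the pattern as I understand it: each worker does its expensive startup once, then keeps pulling single frame numbers from a shared queue until none are left. This is only a sketch of the idea, not how Qube implements it; run_worker, dispatch and render_frame are hypothetical names, and threads stand in for render nodes.

```python
import queue
import threading


def run_worker(name, frames, render_frame, results):
    """Simulates one worker that stays open and renders frames as they are handed out."""
    # In a real setup, the Maya startup and scene load would happen here, once per worker.
    while True:
        try:
            frame = frames.get_nowait()
        except queue.Empty:
            return  # no frames left, so the worker would quit Maya here
        results.append((name, frame, render_frame(frame)))


def dispatch(frame_numbers, worker_names, render_frame):
    """Hands out frame numbers one at a time to whichever worker is free."""
    frames = queue.Queue()
    for number in frame_numbers:
        frames.put(number)
    results = []
    threads = [
        threading.Thread(target=run_worker, args=(name, frames, render_frame, results))
        for name in worker_names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Usage: 10 frames shared between 3 simulated workers.
done = dispatch(range(1, 11), ["node1", "node2", "node3"], lambda f: "frame%04d.png" % f)
print(sorted(frame for _, frame, _ in done))
```

Because each frame is its own unit of work here, a per-frame timeout or retry could be applied to a single frame without throwing away a whole 100+ frame chunk.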

I don't know if we would see a speed improvement by using this method, but we would be able to utilize the frame timeout and retry code that comes with Qube.

I couldn't find much on the forums regarding dynamic allocation...
http://www.pipelinefx.com/forum/index.php?topic=539.0 <- broken link