Workers don't pick up jobs


« on: May 01, 2009, 08:19:53 AM »
Recently jobs that have been submitted to our render farm have only been picked up by a few cores on one worker. The other 15 workers sit there idly.

To start with I tried restarting the supervisor service. I had no luck with that so rebooted the server. And when that had no effect I restarted the worker services on all workers. That didn't seem to help so I rebooted all worker servers. Again that had no effect. I then tried unlocking a couple of workers but that didn't seem to help either.

Does anyone have any ideas what the problem may be? Could the workers be tied up on a stalled job? What is the best way to recover the render farm to a state where all hosts are available and there is no trace of any residual jobs? Killing jobs doesn't seem to be completely effective.

We are running a Qube 5.3 farm on Windows Server 2003.

Thanks in advance.

BTW is there a decent troubleshooting guide around? I have found the documentation to be useful to a point, but it doesn't give you a good picture of what commands are appropriate to use in given situations.



Re: Workers don't pick up jobs
« Reply #1 on: June 11, 2009, 09:56:41 PM »
I know this is an older post, but I'm running into the same issue. I was rendering fine yesterday, but today only 1 computer is picking up a job, and it is only using 2 of 4 threads. Did anyone find anything to help with this? The supervisor is running and the workers were restarted. Also all the jobs that were previously in the log are all please!