Topic: Pending Jobs stalling forever

dh

Pending Jobs stalling forever
« on: September 22, 2008, 08:00:40 PM »
Hi,

We're seeing a fairly frequent issue where a job that depends on a previous job remains in the 'pending' state forever. It looks like the 'complete-job' callback gets executed, but the job never moves out of 'pending'. Is this a known issue? Have others run into this before? Are there any workarounds?

Any help would be greatly appreciated.

Thanks in advance,
-dh

eric

Re: Pending Jobs stalling forever
« Reply #1 on: September 24, 2008, 02:14:03 AM »
I've looked at the job histories, and it appears that the callbacks worked properly, so there's no race condition regarding the timing of the callbacks.
It is possible that there is an issue with the timing of the job getting queued after the unblock, and using qbshove was the correct action to take. I've passed your case along to the developers. Which version of the Supervisor are you running?

dh

Re: Pending Jobs stalling forever
« Reply #2 on: September 24, 2008, 04:35:02 PM »
We're using 5.3.0 for the supervisor. Regarding what you said about the job not getting queued properly after the callback triggers: should we look at switching to the strictly FIFO option for job processing on the server? I saw in the docs that this gives worse performance in general, but it might be better than having our jobs stalled until a user shoves the job.

Scot Brew

    • PipelineFX
Re: Pending Jobs stalling forever
« Reply #3 on: February 06, 2009, 01:37:49 AM »
Are you still seeing this issue with the latest Qube 5.4?

dh

Re: Pending Jobs stalling forever
« Reply #4 on: February 06, 2009, 06:12:19 PM »
We haven't upgraded to 5.4, so I can't comment. Ultimately, we were able to work around this issue by:

1. Upgrading the hardware on the Qube supervisor
2. Writing a program that periodically (every 5 minutes) queried the supervisor for 'pending' jobs and gave them a shove.
3. Switching to the simple FIFO priority scheme.

Once all three of these were in place, we were able to dependably process jobs without things getting stalled.
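
For anyone wanting to try step 2, here is a minimal sketch of that kind of watchdog in Python. It is not the program we actually ran, just an illustration of the idea. Assumptions: `qbshove` is on the PATH and accepts a job ID as its only argument, and `list_pending` is a hypothetical function you supply yourself that returns the IDs of jobs currently in 'pending' (how you query the supervisor for that is left open here).

```python
import subprocess
import time
from typing import Callable, Iterable, List

POLL_INTERVAL_SECONDS = 5 * 60  # "every 5 minutes", as described in the post


def shove_with_cli(job_id: int) -> bool:
    """Shove one job by running qbshove.

    Assumption: `qbshove <jobid>` is the right invocation on your install.
    Returns True when the command exits with status 0.
    """
    result = subprocess.run(
        ["qbshove", str(job_id)], capture_output=True, text=True
    )
    return result.returncode == 0


def shove_pending(
    list_pending: Callable[[], Iterable[int]],
    shove: Callable[[int], bool] = shove_with_cli,
) -> List[int]:
    """Shove every pending job once; return the IDs that were shoved successfully.

    `list_pending` is a hypothetical, user-supplied callable returning the
    IDs of jobs currently in the 'pending' state.
    """
    shoved = []
    for job_id in list_pending():
        try:
            if shove(job_id):
                shoved.append(job_id)
        except OSError:
            # e.g. qbshove is not installed; skip this job and keep going
            continue
    return shoved


def run_forever(
    list_pending: Callable[[], Iterable[int]],
    shove: Callable[[int], bool] = shove_with_cli,
    interval: float = POLL_INTERVAL_SECONDS,
) -> None:
    """Poll in a loop: shove all pending jobs, then sleep for `interval` seconds."""
    while True:
        shove_pending(list_pending, shove)
        time.sleep(interval)
```

Passing the listing and shoving steps in as functions keeps the loop independent of how you talk to the supervisor, and lets you dry-run it with stand-ins before pointing it at a live farm. Note that this shoves every pending job on each pass, so you may want `list_pending` to filter out jobs that are legitimately waiting.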

jasonnic

Re: Pending Jobs stalling forever
« Reply #5 on: April 06, 2009, 05:29:42 PM »
Hi, we are seeing the same issue, including with jobs that have no dependencies.

We are running 5.4 on the supervisor and render boxes, in a Windows XP 64-bit environment.

Has the development team had any joy finding out why this is happening?

Cheers

jason