Author Topic: QB_Boot Sucking Life out of my Qube Bliss *(Qube 5.50)  (Read 11000 times)

mdonovan

  • Full Member
  • ***
  • Posts: 19
QB_Boot Sucking Life out of my Qube Bliss *(Qube 5.50)
« on: November 23, 2009, 04:29:44 PM »
We sporadically get this error on our workers:

----------------------------------
-     Worker 484.10
----------------------------------
Can't locate qbboot.pm in @INC (@INC contains: C:\Program Files (x86)\pfx\jobtypes C:\Program Files\pfx\jobtypes C:\Program Files\pfx\qube\\types C:\Program Files\pfx\qube\api/perl/qb/blib/arch/auto/qb C:\Program Files\pfx\qube\api/perl C:/Perl64/site/lib C:/Perl64/lib .) at C:\Program Files\pfx\qube\api/perl/qb.pm line 66.
BEGIN failed--compilation aborted at C:\Program Files\pfx\qube\api/perl/qb.pm line 66.
Compilation failed in require at C:\WINDOWS\Temp\job\0\484\484_10.pm line 6.
BEGIN failed--compilation aborted at C:\WINDOWS\Temp\job\0\484\484_10.pm line 6.

 :-\ :-\ :-\ :-\

jburk

  • Administrator
  • *****
  • Posts: 493
« Reply #1 on: November 23, 2009, 04:35:06 PM »
Does this happen only on certain workers?

mdonovan

  • Full Member
  • ***
  • Posts: 19
« Reply #2 on: November 23, 2009, 04:35:40 PM »
It happens on random jobs, and when it does, it affects all the workers.

jburk

  • Administrator
  • *****
  • Posts: 493
« Reply #3 on: November 25, 2009, 09:13:58 PM »
Do the random jobs have the 'export_environment' flag set, and are they from a consistent set of users?

I'm wondering if something in one or more users' environments is mangling the runtime environment on the workers.

mdonovan

  • Full Member
  • ***
  • Posts: 19
« Reply #4 on: December 08, 2009, 04:50:00 PM »
We still have not solved this one ...

It seems to happen any time we submit a job while another job is already rendering.

Here are some excerpts from the logs ... help =/


PROPERTIES


Status
status      : failed
timesubmit  : 2009-12-08 07:44:57
timestart   : 2009-12-08 07:45:01
timeelapsed : 0:01:06
timecomplete: 2009-12-08 07:46:07

Basic Job Properties
id          : 573
name        : change_textures_heavy_b
prototype   : xsi
user        : Smoke
priority    : 50
cpus        : 13
tasks       : 100 (100 pending)

Advanced Properties
hosts       :
groups      :
omithosts   :
omitgroups  :
cluster     : /
requirements:
reservations: host.processors=1
restrictions:
dependency  :
dependsup   :
dependsdown :

Package
appversion      : 8.0.249.0
passes          : color
renderMode      : render
scenefile       : \\isilon.smoke.nyc\jobs\2009_Jobs\temp_development_sm99999\3D\PRODUCTION\xsi\xsi_development\Scenes\rendering\change_textures_heavy_b_qube.scn
verbose         : 1
xsibatchExecutable: C:\Softimage\Softimage_2010_SP1_x64\Application\bin\XSIBatch.bat

Notes


Details
account     :
agendastatus: pending
datetime_supervisorQuery: 2009-12-08 07:48:46
domain      : .
flags       : 2056
flagsstring : auto_mount,disable_windows_job_object
hostorder   :
kind        :
label       : qube1
lastupdate  : 2009-12-08 07:46:07
mailaddress :
pgrp        : 573
pid         : 1
retrysubjob : -1
retrywork   : -1
subjobstatus: complete
timeout     : -1

////// StdErr for one of the workers
----------------------------------
-     Worker 573.0
----------------------------------
Can't locate qbboot.pm in @INC (@INC contains: C:\Program Files (x86)\pfx\jobtypes C:\Program Files\pfx\jobtypes C:\Program Files\pfx\qube\\types C:\Program Files\pfx\qube\api/perl/qb/blib/arch/auto/qb C:\Program Files\pfx\qube\api/perl C:/Perl64/site/lib C:/Perl64/lib .) at C:\Program Files\pfx\qube\api/perl/qb.pm line 66.
BEGIN failed--compilation aborted at C:\Program Files\pfx\qube\api/perl/qb.pm line 66.
Compilation failed in require at C:\WINDOWS\Temp\job\0\573\573_0.pm line 6.
BEGIN failed--compilation aborted at C:\WINDOWS\Temp\job\0\573\573_0.pm line 6.

jburk

  • Administrator
  • *****
  • Posts: 493
« Reply #5 on: December 08, 2009, 08:03:26 PM »
This was tracked down to a single misconfigured host burning through all the subjobs.

Note to users: if you're getting frequent failures, it's often informative to sort the subjobs for a failed job by state and see if the hostname is common across all failed subjobs.

In this case, all subjobs for the bad worker were marked as complete, but processed no frames (visible in the 'Timeline' graph).
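
The check described above can also be scripted. The following is only a rough sketch in Python: it assumes you have already collected the subjob records into a list of dicts by some means, and the field names ("host", "status", "frames_done") and the host names in the sample data are hypothetical, not taken from the Qube API. It counts failed subjobs per host, and separately lists hosts whose subjobs were marked complete without processing any frames, which is the pattern seen in this case.

```python
from collections import Counter


def failures_by_host(subjobs):
    """Return (host, failed_count) pairs, most failures first.

    Each subjob is a dict with hypothetical keys "host" and "status".
    """
    counts = Counter(s["host"] for s in subjobs if s["status"] == "failed")
    return counts.most_common()


def empty_completions(subjobs):
    """Return hosts with subjobs marked complete that processed no frames.

    "frames_done" is a hypothetical field holding the number of frames
    the subjob actually processed.
    """
    hosts = {
        s["host"]
        for s in subjobs
        if s["status"] == "complete" and s["frames_done"] == 0
    }
    return sorted(hosts)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"id": 0, "host": "render01", "status": "complete", "frames_done": 0},
        {"id": 1, "host": "render01", "status": "complete", "frames_done": 0},
        {"id": 2, "host": "render02", "status": "failed", "frames_done": 0},
        {"id": 3, "host": "render03", "status": "complete", "frames_done": 8},
    ]
    print("Failed subjobs per host:", failures_by_host(sample))
    print("Complete with no frames:", empty_completions(sample))
```

If one hostname dominates either list, that worker is the first place to look.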