Author Topic: Qube! 6.4.2 Core/Supervisor/Worker maintenance release is available

jburk

A new 6.4.2 maintenance release is now available for the Qube 6.4 Core/Supervisor/Worker. This is a recommended release for all customers running Qube v6.4.

This is essentially a re-release of 6.4.1 with a patch involving fixes to the new run-time path translation.

=======================================================
    Highlights
=======================================================
* many fixes for out-of-order dispatch issues

* added V-Ray Distributed Rendering (DR) support to Maya jobtype.

* path translation for jobs can now be performed on the worker at run-time,
not when the job is submitted.  This allows for path translation maps to be
defined in the workers' qb.conf or the central qbwrk.conf, not in each user's
GUI preferences

* Desktop User mode workers can now perform auto-mounting of drive mappings
on Windows, but only when the worker is configured to run 1 job at a time
with "worker_cpus=1"

* addition of a job removal script, which can be run by the Qube administrator
or as an automated task

* add Mac OS X 10.8 Mountain Lion support

* add activeperl 5.16 support for Windows


Below is a detailed list of the fixes and enhancements since the last point-release.

===========================================
Core / Supervisor / Worker changes
===========================================
==== CL 10543 ====
@FIX: issue with worker_path_map not working when defined in qbwrk.conf and containing backslashes.

==== CL 10537 ====
@FIX: issue where qbconvertpath() can return an empty string when worker_path_map is undefined.

==== CL 10514 ====
@FIX: another patch for out-of-order issue. Fixed unexpected short-circuit evaluation that was happening in the startResources() routine

==== CL 10513 ====
@FIX: another patch for out-of-order issue. Fixed unexpected short-circuit evaluation that was happening in the startHost() routine
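
For readers unfamiliar with this class of bug, here is a general Python sketch of how short-circuit evaluation can silently skip a call that has side effects. It is not the supervisor's code, which these notes do not show; the function names and the "bad" resource are hypothetical.

```python
def start(name, started):
    """Pretend to start a resource; record the call and report success."""
    started.append(name)
    return name != "bad"


def start_all_short_circuit(names):
    started, ok = [], True
    for name in names:
        # Once ok is False, "and" never evaluates start(), so it is skipped.
        ok = ok and start(name, started)
    return ok, started


def start_all_eager(names):
    started, ok = [], True
    for name in names:
        result = start(name, started)  # always make the call first
        ok = ok and result             # then fold it into the overall status
    return ok, started


print(start_all_short_circuit(["a", "bad", "c"]))  # (False, ['a', 'bad'])
print(start_all_eager(["a", "bad", "c"]))          # (False, ['a', 'bad', 'c'])
```

Both versions report the same overall status, but only the second one attempts to start every item after the first failure.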

==== CL 10512 ====
@INTERNAL: QbJob object's _subjobswaiting data was not being initialized or copied correctly, causing some job comparisons based on subjobs waiting counts to unexpectedly fail.

==== CL 10504 ====
@INTERNAL: added more log output for debugging builds, added more comments while working on out-of-order issue.

ZD: 8198

==== CL 10477 ====
@FIX: Another out-of-order fix. Jobs at the same numerical and cluster priority should dispatch in the correct FIFO order now.

FIFO enforcement should work most of the time, but there will still be
occasional out-of-order behavior, due to the multi-threaded nature of the
supervisor. ("qbshove"-ing the older job should correct it, when it's seen)

ZD: 8198
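
The intended ordering can be pictured as a sort with a FIFO tie-break. The Python sketch below is only an illustration under stated assumptions: the `Job` record is hypothetical, a smaller priority number is assumed to dispatch first, job IDs are assumed to grow with submission time, and cluster priority is left out entirely.

```python
from collections import namedtuple

# Hypothetical job record; field names are illustrative, not Qube's schema.
Job = namedtuple("Job", "job_id priority")


def dispatch_order(jobs):
    """Sort by priority, then by job_id so equal priorities stay FIFO.

    Assumes a smaller priority number is dispatched first and that
    job_id increases with submission time.
    """
    return sorted(jobs, key=lambda j: (j.priority, j.job_id))


jobs = [Job(103, 5000), Job(101, 5000), Job(102, 100)]
print([j.job_id for j in dispatch_order(jobs)])  # [102, 101, 103]
```

A single-threaded sort like this is always deterministic; the occasional out-of-order behavior described above comes from concurrent dispatch, which this sketch does not model.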

==== CL 10462 ====
@FIX: yet yet another fix for out-of-order dispatch behavior-- eliminate race-condition that would allow lower priority jobs that were just preempted to get workers before higher-priority jobs.
See also CL10440 10452

ZD: 8198

==== CL 10461 ====
@CHANGE: modified/compacted the multi-line "found a duty to replace" logging to be a single line.

==== CL 10452 ====
@FIX: yet another fix for out-of-order dispatch behavior-- eliminate race-condition that would allow lower priority jobs that were just preempted to get workers before higher-priority jobs.
See also CL10440

ZD: 8198

==== CL 10441 ====
@FIX: killing an already finished (complete, failed, killed) job leaves the job in the "dying" state.

==== CL 10440 ====
@FIX: another fix for out-of-order dispatch behavior-- eliminate race-condition that would allow lower priority jobs that were just preempted to get workers before higher-priority jobs.

ZD: 8198

==== CL 10429 ====
@FIX: out-of-order job dispatching issue with jobs using the "+" sign with the "host.processors" reservations.

ZD: 8198 8261 8229 8233 8228

==== CL 10389 ====
@NEW: add new appFinder submission for C4D

==== CL 10323 ====
@NEW: add support to pyCmd* jobtypes for new "auto-pathing" feature; can now send jobs to a mixed set of workers and find the 3rd-party executable on all OS's, not pre-defined in the job's package

==== CL 10271 ====
@CHANGE: desktop user mode worker to only allow automount when "worker_cpus = 1" is set explicitly.

==== CL 10264 ====
@NEW: add automount support for desktop user mode on Windows

@CHANGE: db table change (additional column to the assignment table) required-- adding QbTableVersion7 definition.

@FIX: unmounting of "subst" style local mounts was broken

@INTERNAL: added a bunch of comments, and renamed some methods in the QbMission class, for readability.  

==== CL 10254 ====
@NEW: pyCmdline and pyCmdrange do run-time path translation

==== CL 10233 ====
@FIX: added qb::workerconfig() that was missing to the Perl API

==== CL 10228 ====
@FIX: missing "bin/qbhash" command on Linux

==== CL 10223 ====
@FIX: updated examples in the code to reflect the previous change to the command line options/args

==== CL 10216 ====
@NEW: job cleanup script in the utils directory.  This script is designed to be run by a user or by a user-created scheduled task.

==== CL 10191 ====
@FIX: the unneeded "install_worker" and "uninstall_worker" scripts are no longer installed on Mac OSX

==== CL 10189 ====
@FIX: timing issue where some worker resources (host.xyz) would disappear after the worker received a remote config.

@FIX: issue where supervisor tries to dispatch a subjob to a worker with
insufficient resources (reduced the likelihood of that happening)

@FIX: the above 2 fixes combined should now prevent some of the
out-of-priority-order dispatch issues, especially in environments where
worker resources are deployed.

ZD: 7885

==== CL 10149 ====
@CHANGE: modified so the worker_path_map mapping definition order is preserved when it is applied to paths via convertpath()
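
Why definition order matters can be seen in a minimal Python sketch. This is not Qube's implementation and does not use the real worker_path_map syntax: `PATH_MAP`, `convert_path`, and the example prefixes are hypothetical, and the sketch simply assumes that the first matching prefix wins.

```python
# Hypothetical ordered path map: (source prefix, destination prefix) pairs.
# Not the real worker_path_map syntax; for illustration only.
PATH_MAP = [
    ("X:/projects/shots", "/mnt/shots"),  # more specific rule listed first
    ("X:/projects", "/mnt/projects"),
]


def convert_path(path, path_map=PATH_MAP):
    """Translate path using the first matching prefix, in definition order."""
    normalized = path.replace("\\", "/")  # treat backslashes like slashes
    for src, dst in path_map:
        if normalized.lower().startswith(src.lower()):
            return dst + normalized[len(src):]
    return path  # no rule matched: return the input unchanged


print(convert_path("X:\\projects\\shots\\sc01\\a.ma"))  # /mnt/shots/sc01/a.ma
print(convert_path("/already/unix/path"))               # /already/unix/path
```

If the two rules were applied in the opposite order, the same input would become `/mnt/projects/shots/sc01/a.ma`, which is why an order-preserving map gives predictable results.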

==== CL 10144 ====
@FIX: bug with handling lone backslash in the worker_path_map
@CHANGE: modifying QbConfig class to maintain order of option (config parameter) addition

==== CL 10125 ====
@NEW: add automatic runtime path conversion to cmdline and cmdrange jobtypes
@NEW: jobs may have the "convert_path" flag set to tell the jobtype to do runtime path conversion.
@NEW: qbsub now has a "-convertpath" option to set the flag.
@NEW: qubegui simpleCmd interface has a new "convert path" checkbox

==== CL 10118 ====
@FIX: fixed issue where agenda timeouts don't work properly on the first agenda item processed by a subjob, on Unix (Linux/OSX) workers

==== CL 10117 ====
@FIX: fixed issue where agenda items that fail because of timeout don't get automatically retried via retrywork
ZD: 7763

==== CL 10097 ====
@NEW: add Mac OS X 10.8 Mountain Lion support

==== CL 10095 ====
@FIX: fixed newly introduced issue with errors reading licenses in dev/main branch supe

==== CL 10074 ====
@INTEG: main -> rel-6.4
-----
@FIX: data warehouse installation/upgrade scripts on linux/OSX now search /etc/qb.conf for database_user/_password/_port/_host values in order to support non-default values for these parameters

==== CL 10072 ====
@NEW: add activeperl 5.16 support for Windows

==== CL 10068 ====
@NEW: Add doc on QB_CONVERT_PATH(srcpath) in Use.doc and qbsub's online help

==== CL 10067 ====
@NEW: Add documentation on worker_path_map config parameter and the qbconvertpath() API routine.

==== CL 10062 ====
@FIX: fixed parsing code in QbConfigFile.cpp so that the "name" part of a name-value pair can contain special chars if double-quoted.
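
As a rough illustration of the idea (not the code in QbConfigFile.cpp, which these notes do not show), a parser can accept either a bare name or a double-quoted name that contains characters such as spaces or "=". The regular expression and the `parse_pair` helper below are hypothetical, and the "name = value" line layout is an assumption.

```python
import re

# The name is either double-quoted (anything except a quote) or bare
# (no whitespace, "=" or quote); the value is the rest of the line.
_PAIR = re.compile(
    r'^\s*(?:"(?P<quoted>[^"]*)"|(?P<bare>[^\s="]+))\s*=\s*(?P<value>.*?)\s*$'
)


def parse_pair(line):
    """Return (name, value) for a 'name = value' line, or None if malformed."""
    m = _PAIR.match(line)
    if m is None:
        return None
    quoted = m.group("quoted")
    name = quoted if quoted is not None else m.group("bare")
    return name, m.group("value")


print(parse_pair('worker_cpus = 1'))         # ('worker_cpus', '1')
print(parse_pair('"my name = x" = /mnt/x'))  # ('my name = x', '/mnt/x')
```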

==== CL 10048 ====
@FIX: reduce the number of times qb.supervisorconfig() and qb.getusers() are called during GUI startup and normal operation, pre-populate the qbCache with this data at startup

==== CL 10025 ====
@FIX: data warehouse installation/upgrade scripts on linux/OSX now search /etc/qb.conf for database_user/_password/_port/_host values in order to support non-default values for these parameters

==== CL 10022 ====
@FIX: modified the worker to report its host status to the supe only when subjobs are completely done and removed, and NOT when they are only marked/scheduled for removal.

This was causing jobs to sometimes run out-of-order, especially when there
are many subjobs to each job (such as one subjob per frame), since that
situation tends to increase the chance of the supervisor dispatching the
same subjob to the same worker. The subjob will be dispatched to the same
worker, but rejected since the worker thinks it's a duplicate assignment of
a subjob that's being removed (and consequently a lower priority job will
get the worker's slot, causing out-of-order job execution)
« Last Edit: January 04, 2013, 05:21:34 PM by jburk »