A new 6.0.3 maintenance release is now available for the Qube 6.0 Core/Supervisor/Worker. This is a recommended release for all customers running Qube v6.0.
Below is a list of the fixes and enhancements.
================================================
@CHANGE: Removing Mac OS X 10.4 support.
@FIX: Handle case where database_host defined in the supervisor as 127.0.0.1
@FIX: Python API issue where null, empty or incorrect input to commands such
as qb.block() will apply the function to ALL jobs/work items
Passing a wrong parameter or a null/empty to these routines will now raise
an exception.
BUGZID: 63565
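A minimal sketch of the kind of guard the fix above describes, for callers who want to validate their own input before handing it to a routine such as qb.block(). The helper name require_job_ids() and the accepted id forms (int or non-empty string) are assumptions made for illustration; they are not part of the Qube API.

```python
def require_job_ids(job_ids):
    """Return a validated list of job ids, or raise ValueError.

    Hypothetical helper: rejects None, empty, and malformed input so that
    a bad argument can never be interpreted as "all jobs".
    """
    if job_ids is None:
        raise ValueError("job ids must not be None")
    # Accept a single id as a convenience and wrap it in a list.
    if isinstance(job_ids, (int, str)):
        job_ids = [job_ids]
    ids = list(job_ids)
    if not ids:
        raise ValueError("job ids must not be empty")
    for job_id in ids:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(job_id, bool) or not isinstance(job_id, (int, str)):
            raise ValueError("invalid job id: %r" % (job_id,))
        if job_id == "":
            raise ValueError("job id must not be an empty string")
    return ids
```

Calling require_job_ids(None), require_job_ids([]) or require_job_ids("") raises ValueError instead of returning something that could be mistaken for "every job".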
@FIX: Windows-specific bug where PreForkDaemon-based daemons' (supe, worker)
background "kin" threads were not always running their intended routines.
@TWEAK: added more useful messages to print on sending host report to supe,
and when checking a job's resource requirements and reservations against the
worker's current resources.
@FIX: Worker memory/swap resource tracking (Linux)
* memory/swap resource tracking was broken, as the code was adding the
reserved amount on top of the actually used values. For example, if a
subjob has 'reservations="host.memory=1000"' and is actually running and
using 900 MB, the code was incorrectly subtracting 1900 MB from the
available host.memory for the worker.
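The arithmetic of the bug above can be sketched as follows. The function names, the 4000 MB host total, and the "count the larger of reserved and used once" rule are assumptions for illustration only; these notes do not state the exact corrected formula used by the worker.

```python
def available_memory_double_counted(total_mb, reserved_mb, used_mb):
    # The broken behavior: reserved and actually-used amounts are summed,
    # so the same subjob is charged twice.
    return total_mb - (reserved_mb + used_mb)


def available_memory_single_count(total_mb, reserved_mb, used_mb):
    # Assumed correction: charge the subjob once, using whichever of the
    # reservation or the real usage is larger.
    return total_mb - max(reserved_mb, used_mb)


total = 4000  # hypothetical host.memory total, in MB
print(available_memory_double_counted(total, 1000, 900))  # 1900 MB deducted
print(available_memory_single_count(total, 1000, 900))    # 1000 MB deducted
```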
@FIX: Python API routine qb.rangechunk() crash with Bus error on certain
input.
The Python API qb.genchunk() routine was reported to crash on input that
included whitespace. It turns out that the C++ _qb_rangechunk() routine
was crashing when it had an empty input sequence.
Also fixed regex in Python API routine qb.rangechunk() to be more
permissive about whitespace in sequence strings (i.e., " -1 - 10 x 1" is
equivalent to "-1-10x1").
BUGZID: 63559
ZD: 3401
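A minimal sketch of a whitespace-tolerant range parser in the spirit of the fix above. It is not the regex used inside qb.rangechunk(); the pattern, the parse_range() name, and the restriction to a single "start-end[xstep]" range are assumptions for illustration. Empty input is rejected up front rather than passed on.

```python
import re

# Optional whitespace is allowed around every token: start, "-", end, "x", step.
_RANGE_RE = re.compile(r"^\s*(-?\d+)\s*-\s*(-?\d+)(?:\s*x\s*(\d+))?\s*$")


def parse_range(text):
    """Parse 'start-end' or 'start-endxstep' into (start, end, step)."""
    if text is None or not text.strip():
        raise ValueError("empty range string")
    match = _RANGE_RE.match(text)
    if match is None:
        raise ValueError("unrecognized range string: %r" % (text,))
    start, end, step = match.groups()
    # A missing step defaults to 1.
    return (int(start), int(end), int(step) if step else 1)
```

With this sketch, parse_range(" -1 - 10 x 1") and parse_range("-1-10x1") produce the same tuple.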
@FIX: OSX: host swap "usage/total" collected and displayed accurately
@CHANGE: OSX: host memory usage excludes "inactive" memory
@FIX: fixed accuracy of the "host.memory" reports (used/total) of workers on
Linux.
BUGZID: ZD: 3308
@CHANGE: removed the not-too-useful tmp_used worker host property
@FIX: fixed accuracy of tmp_used.
@FIX: changed worker code that collects the /tmp usage to use statfs(),
instead of crawling the dir (recursively), for efficiency.
BUGZID: 63475
ZD: 3175 2586
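The statfs() approach above can be sketched in Python with os.statvfs(), which is available on Unix-like systems. The function name is hypothetical and this is not the worker's own code. Note that the call reports usage of the filesystem that contains the path, which is what makes it a constant-time query instead of a recursive directory walk.

```python
import os


def filesystem_usage_bytes(path="/tmp"):
    """Return (used_bytes, total_bytes) for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize  # total size of the filesystem
    free = st.f_bfree * st.f_frsize    # free space, including reserved blocks
    return total - free, total
```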
@CHANGE: Modified QbWorker::remoteConfig() routine to retry up to 10
times with random intervals, then give up.
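A minimal sketch of the retry pattern described above: a bounded number of attempts separated by random delays, then giving up. The helper name, the delay bounds, and the injectable sleep function are assumptions for illustration; the actual intervals used by QbWorker::remoteConfig() are not stated in these notes.

```python
import random
import time


def call_with_retries(func, max_attempts=10, min_delay=0.5, max_delay=3.0,
                      sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            last_error = exc
            # Sleep a random interval between attempts, but not after the last.
            if attempt < max_attempts:
                sleep(random.uniform(min_delay, max_delay))
    raise RuntimeError("gave up after %d attempts" % max_attempts) from last_error
```

Randomizing the interval keeps many workers that fail at the same moment from retrying in lockstep.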
@TWEAK: Added more useful error message to print on "tcp compressed
header" writing errors.
@FIX: fixed perl .xs code to accommodate interface change in perl 5.11
and above.
@FIX: disabled code (temporarily) that crawls /tmp to get its size, as
it was causing worker threads to choke on systems with large
amounts of data in /tmp.
Change affects Linux code only; other platforms (win, osx) don't have any
similar code that collects /tmp (or similar) size info.
ZD: 3175
@FIX: fixed issue where worker_max_threads specified in qbwrk.conf
wasn't taking effect.
ZD: 3175
@FIX: reduced number of retries when host is found "busy" (retry
busy), in effect reducing the blocking time of the thread down to
7 seconds max, and giving up after that.
@FIX: added code to do a final check before dispatching a subjob to a
worker, to see if there's already another subjob trying to start
on it. (Check is done in the duty table for a "ghost" entry for
the host)
ZD: 3175
@FIX: Fixed default value of worker_max_clients to actually be the
intended value, 256
(instead of picking up the default value for worker_max_threads,
which is
.
@FIX: reverted code that attempts to automatically back off and retry
in QbFarm::reserve(). It was causing many supe threads to stall
for a long time, eventually maxing out the max_threads and
resulting in "connection overflow".
ZD: 3164
@FIX: removed debugging log output of "IN: QbQueue::clearGhostDutys( "
@FIX: Adding code to show why getpwuid() call failed
@CHANGE: optimization. add code to prevent supe from dispatching
subjobs from the same job to a worker until the worker says "no".
Works for most simple cases where "host.processors" is an integer value.
This should prevent many of the brief "oversubscriptions" of workers seen
(where the GUI would show workers running more subjobs than they
have jobslots for) when jobs are first being dispatched.
Changes made to the startJob(), startHost(), and startQualified()
routines.
Note that even with these changes, timing issues can/will still cause
dispatch-a-subjob-then-get-rejected-by-worker scenarios (which is fine).
ZD: 2646
@FIX: reverting the supervisor python engine changes introduced
in 6.1, which were causing initializing threads to stall/crash.
BUGZID: 63502
ZD: 2894 3047
@FIX: don't create jobstatus_sk column in job_fact; it belongs in table
version 6.
move column creation commands into upgrade_v6 script
@CHANGE: Add section to Installation docs detailing
Win7/Vista/Server2008 considerations - re: disabling UAC and
Interactive Services Detection service
@NEW: datawarehouse: NEW FEATURE - add 'jobstatus' column to job_fact
table to store terminal state of job in datawarehouse - customer
request.
* also create pfx_stats.tableversion placeholder for consistency,
we already create a placeholder pfx_stats db if necessary. Set
the tableversion for the pfx_stats DB equal to 0.
@TWEAK: datawarehouse: TWEAK - allow the use of setting DATAWH_DIR
env_var to specify location of dataw/h sql scripts during
installation
@COSMETIC: datawarehouse: COSMETIC - update feedback printed during
initial job fact table creation
@FIX: updated supervisor_flags and supervisor_log_flags to show the
correct default values in the default qb.conf