Topic: Workers dropping their groups

ashok

Workers dropping their groups
« on: April 16, 2013, 04:52:29 PM »
Hello,

We've noticed an issue over the past week (it could have been going on longer but we only noticed a week ago) where random machines will suddenly drop their group assignments at some point during the day. Restarting the supervisor will reassign the correct groups to the machines. None of the machines have been moved or rebooted and there doesn't seem to be any consistency as to which machines decide to drop their groups.

Any ideas what could be causing this?

BrianK

Re: Workers dropping their groups
« Reply #1 on: April 16, 2013, 05:06:44 PM »
Groups are assigned either through the worker's local qb.conf or through the supervisor's central qbwrk.conf.

For a group to be "lost", either one of those files must change, or a signal must be sent announcing a system-wide change (which causes qbwrk.conf to be re-read). A supervisor restart is one such signal.

I recommend:

1. Check to make sure there are no other supervisors on your network. Go to one of the workers that seems to lose its group and run (from the command line/terminal): "qbadmin supervisor --list". (You will probably need to use the full path to qbadmin: c:\program files\pfx\qube\sbin\qbadmin, /Applications/pfx/qube/sbin/qbadmin, or /usr/local/pfx/qube/sbin/qbadmin.) It should report that there is only one supervisor. If there are more, that's the problem; uninstall the extras.

2. Check the supervisor's qbwrk.conf. It is located next to qb.conf and qb.lic, in either c:\programdata\pfx\qube (Windows) or /etc (Linux and OS X). In this file (assuming it exists; it may not), look for duplicate entries for the same machine, possibly with differing worker_groups specified. If there are duplicate entries, remove the duplicates, leaving only the entries with the correct worker_groups. If you make a change to qbwrk.conf, you'll need to push it out, either by restarting the supervisor service or by running, as a Qube admin, the command "qbadmin worker --reconfigure".

3. Check that worker_groups is set in *either* the worker's local qb.conf or the supervisor's central qbwrk.conf, but not both.  If it's set in both places, comment one of them out.  Personally, I prefer all my settings in the central qbwrk.conf as opposed to the local qb.conf.
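
If the qbwrk.conf is long, the duplicate check in step 2 can be scripted. The following is only a rough sketch, not a Qube tool: it assumes the file uses INI-style "[hostname]" section headers followed by "key = value" lines, with "#" starting a comment, and the function name find_duplicate_sections is made up for this example. Adjust it to match what your file actually looks like.

```python
import re
from collections import defaultdict

# Assumption: a section header is a line that starts with "[name]".
SECTION_RE = re.compile(r"^\s*\[(.*)\]")


def find_duplicate_sections(text):
    """Return {section_name: [worker_groups per occurrence]} for every
    section name that appears more than once in the given config text."""
    sections = defaultdict(list)
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and (assumed) "#" comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        header = SECTION_RE.match(stripped)
        if header:
            # Host names are compared case-insensitively.
            current = header.group(1).strip().lower()
            sections[current].append(None)  # no worker_groups seen yet
            continue
        if current is not None and "=" in stripped:
            key, _, value = stripped.partition("=")
            if key.strip() == "worker_groups":
                sections[current][-1] = value.strip().strip('"')
    return {name: vals for name, vals in sections.items() if len(vals) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python check_qbwrk.py /path/to/qbwrk.conf
    with open(sys.argv[1]) as handle:
        for name, groups in find_duplicate_sections(handle.read()).items():
            print(f"{name}: defined {len(groups)} times, worker_groups = {groups}")
```

Any host it prints is defined more than once; if the listed worker_groups values differ, that entry is a likely candidate for the behaviour you're seeing.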

If none of that helps, I recommend you open a support ticket (support@pipelinefx.com) so we can dig deeper into the problem.