Author Topic: Re: Is it possible to allow more than one machine to process a single job ??  (Read 4890 times)

jason

  • Jr. Member
  • **
  • Posts: 6
Anthony, I'm having the same exact problem and error output as upperstorey. Please let me know what your findings are. Thanks.

anthony

  • Senior Software Engineer
  • Hero Member
  • *****
  • Posts: 183
« Reply #1 on: February 07, 2006, 10:23:19 PM »
Hey Thr33TwoSe7en,

     I'm splitting this topic so that we can address it with a clean message chain.   Could I first ask you a few questions?

     What kind of environment are you using?  Windows? Linux? OSX? Mixed?

     Do you use a file server?

     Do you use an authentication server?  Primary Domain Controller? LDAP?

     Thanks,
        Anthony

jason

  • Jr. Member
  • **
  • Posts: 6
« Reply #2 on: February 07, 2006, 10:40:25 PM »
Right now, I'm trying to get it working on a Windows XP SP2 farm (12 dual-Opteron systems) and a Windows 2000 SP4 Supervisor (dual Xeon). In the future, I expect to have a mix of Linux/Windows servers for my farm.

Yes, I am using a file server, and I use Windows Active Directory for security.

anthony

  • Senior Software Engineer
  • Hero Member
  • *****
  • Posts: 183
« Reply #3 on: February 07, 2006, 11:09:48 PM »
Excellent.  Sounds like you've got a good network setup there.

First, let's address a few configuration recommendations.  Then we'll step through the tests you should run before attempting the actual render itself.

#1.  Since you are using a PDC, I'm guessing your file server uses it to authenticate users, both for file permissions and for ease of maintenance.  You can take full advantage of this in qube!.  Unfortunately, because not everyone has a good network, qube! is not configured to do so by default.  So I'll give you the configuration you'll want:

        I.   Set up your workers -
                  a.   Install the worker on a Windows host.
                  b.   Open the Configuration Dialog: Program Files->Pipelinefx->qube! 4.0
                  c.   Click on the worker settings tab, then select the proxy tab.
                  d.   Change the Execution mode from proxy to user.
                  e.   Click on OK.
                  f.   Allow the worker to restart.

        II.  Set up your clients -
                  a.   Install the core MSI.
                  b.   Open the Configuration Dialog: Program Files->Pipelinefx->qube! 4.0
                  c.   Select the "Client Settings" tab and check the "Auto-Mount (Windows Only)" box.
                  d.   Click on OK.

        III. Authenticate each user -
                  a.   Open the "Windows Login" dialog under: Program Files->Pipelinefx->qube! 4.0
                  b.   Type in your domain password twice.
                  c.   Click on OK.

       I'll run through a quick explanation of each step so that you can get a better understanding of what qube! is doing.

       Setting up the worker first requires that you change it from running a job under a precreated user called qubeproxy to running the job under the actual user's own authenticated credentials.

       Since you're probably going to want to run your jobs with that user's mapped drives premounted, you'll want to turn auto-mounting to "on".  The client host will duplicate any network and loopback drive maps on the worker host.

       Because qube! requires your real Windows password, each user has to go through the Windows login.  If you don't change your password, you don't have to log in more than once.  In terms of security, the login dialog encrypts the password using a 512-bit encryption algorithm, and the password is stored on the supervisor in encrypted form.

      To test these settings, first you'll want to submit a simple command line job. 

      Open Start->Run  (type: cmd)
      qbsub dir

      You can check on the status from the qubic! gui or from the command line using qbjobs.

      If the job fails, you'll want to check the permissions on the worker's log directory.  This is where file permissions start to become extremely important, as qube! itself may not be in a position to correct them for you.  By default, the worker's log directory is C:\windows\temp or C:\winnt\temp, depending on your OS.  You should open the directory to Everyone with full permissions.  If you prefer not to keep the log directory in the Windows directory (which is understandable), you can always change it in the Configuration GUI.
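
      A quick way to confirm that the log directory is writable is to try creating a file in it.  The sketch below is only an illustration and is not part of qube!; the function name can_write_to is made up, and the two paths are the default log locations mentioned above.  It only tells you about the account that runs it, so run it on the worker while logged in as the user who submits the jobs.

```python
import os
import tempfile

def can_write_to(directory):
    """Return True if the current user can create a file in `directory`."""
    try:
        # Create a scratch file; this needs the same "create file"
        # permission that writing a log file would need.
        fd, path = tempfile.mkstemp(dir=directory)
    except OSError:
        # Covers both "permission denied" and "directory does not exist".
        return False
    os.close(fd)
    os.remove(path)
    return True

if __name__ == "__main__":
    # Default worker log locations from the post; adjust if you changed them.
    for candidate in (r"C:\windows\temp", r"C:\winnt\temp"):
        state = "writable" if can_write_to(candidate) else "not writable (or missing)"
        print(candidate, state)
```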

     If the job completes, you'll want to double check the output.  It should look something like this:

     qbsub dir
     2916

     qbjobs 2916
total: 0/1 cpu(s)       0/0 work
%    id    pid  pgrp  label  status    user     type     name  cpus  priority  cluster  groups
n/a  2916  1    2916  qube1  complete  anthony  cmdline        0/1   9999      /

     qbout 2916
[Feb 7, 2006 11:53:41] anthonyws : job type version:
loading command line executor.
job id: 2916
COMMAND: "C:\WINNT\system32\cmd.exe" /C dir
 Volume in drive C has no label.
 Volume Serial Number is 38E7-6EBB

 Directory of C:\WINNT\system32

02/06/2006  07:16p      <DIR>          .
02/06/2006  07:16p      <DIR>          ..
03/03/2005  05:48a                 304 $winnt$.inf
03/29/2002  03:32p               2,151 12520437.cpx
03/29/2002  03:32p               2,233 12520850.cpx
09/19/2001  02:32p             720,896 a3d.dll
12/07/1999  02:00a              32,016 aaaamon.dll
04/03/2003  12:17a             172,032 ac3filter.ax
12/07/1999  02:00a              67,344 access.cpl
08/29/2002  07:06a              64,512 acctres.dll
06/19/2003  09:05a             150,800 accwiz.exe
12/07/1999  02:00a              61,952 acelpdec.ax
12/07/1999  02:00a             131,856 acledit.dll
06/19/2003  09:05a              78,096 aclui.dll
12/07/1999  02:00a               4,368 acsetupc.dll
12/07/1999  02:00a              17,168 acsetups.exe
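
     If you would rather script the status check than read the table by eye, the job's row can be picked apart as below.  This is a rough sketch that assumes qbjobs prints the column layout shown above, with the job id in the second column and the status in the sixth; job_status is a hypothetical helper, not a qube! API.

```python
def job_status(qbjobs_output, job_id):
    """Return the status column for `job_id`, or None if no row matches.

    Assumes the layout shown in the post:
    %  id  pid  pgrp  label  status  user  type  ...
    """
    for line in qbjobs_output.splitlines():
        fields = line.split()
        # A job row has the job id in the second column and at least
        # six columns up to and including "status".
        if len(fields) >= 6 and fields[1] == str(job_id):
            return fields[5]
    return None

# Sample text copied from the qbjobs output in the post.
sample = """\
total: 0/1 cpu(s)       0/0 work
%    id    pid  pgrp  label  status    user     type     name  cpus  priority  cluster  groups
n/a  2916  1    2916  qube1  complete  anthony  cmdline        0/1 9999      /
"""

print(job_status(sample, 2916))  # complete
```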
     
   This confirms that a job will run and that the logs are OK.  Now you'll want to check your mounted drives:

    qbsub dir G:\

    If the output does not look like the contents of your mapped drive (G: in this example), then check the workerlog, which is located here: C:\Program Files\pfx\qube\logs\workerlog.*

   The worker will mention that it is unable to map the drives and will also mention why.

   Once you have solved the file system issues, the renders will work properly.


       Thanks,
                Anthony

jason

  • Jr. Member
  • **
  • Posts: 6
« Reply #4 on: February 08, 2006, 12:49:57 AM »
Hi anthony,

Thanks for your help. I've actually gone through most of the provided manuals already and have set up the farm to those specifications. The only missing link was the "Auto-Mount (Windows Only)" option on the clients. After that switch, everything worked.

Upon testing Maya renders on qube!, I ran into another quirk. It appears that the workers are going through each frame several times when rendering it. I could tell this from watching the file size of each frame go from 0 KB to full size, over and over again. After about 5 passes, it completes. Do you have any ideas about this? If you need further information about my render submission process, please let me know.

Thanks

anthony

  • Senior Software Engineer
  • Hero Member
  • *****
  • Posts: 183
« Reply #5 on: February 08, 2006, 01:11:08 AM »
Hey Thr33TwoSe7en,

     As far as the Maya renders are concerned, I suspect you're trying to do something like this:

     qbsub --cpus 3 Render scenefile.ma

     In this case qube! will behave exactly as you described.  It will launch a copy of the "Render" command on each host.  This command doesn't take the frame-based nature of the render into consideration, so each copy renders every frame of the scene.

     There are 4 suggested ways to submit a maya render:
         #1.  The qubic gui contains a custom submission dialog for maya.

         #2.  From within maya itself.  Load the qb plugin and select the qube! Render option from the maya menu.

         #3.  Within the C:\Program Files\pfx\jobtypes\maya directory you'll find a perl script called qbMayaRender.pl.  This command is exactly like the "Render" command with the exception of the qube! options.

         #4.  Submit the qbsub command slightly differently:

                  qbsub --cpus 3 --range 1-100 Render -s QB_FRAME_NUMBER -e QB_FRAME_NUMBER myscenefile.ma

     The first 3 methods run the pfx maya job type.  This job type is pretty intelligent.  It is designed to launch Maya first, then load the scene file and render the frames.  That way a single instance of Maya can be used per host, and the cost of reloading the scene data for every frame is avoided.

     The fourth method takes advantage of the generic frame-based job.  This job is designed to execute your submitted command with QB_FRAME_NUMBER replaced by the frame number.  It's meant for kinds of jobs that qube! doesn't support yet, but mostly for things we haven't seen before, such as custom tools and scripts built for video game and production houses.
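
     To make the QB_FRAME_NUMBER substitution concrete, here is a rough illustration of what a frame-based submission amounts to: the command is expanded once per frame in the range, and each expanded command renders a single frame.  This is only a sketch of the idea, not how qube! actually implements it; expand_frames is a made-up name, and treating "1-100" as an inclusive range is an assumption.

```python
def expand_frames(template, frame_range):
    """Expand a command template into one command per frame.

    `frame_range` is a "first-last" string such as "1-100" (assumed
    inclusive). Every QB_FRAME_NUMBER token is replaced by the frame.
    """
    first, last = (int(part) for part in frame_range.split("-"))
    return [
        template.replace("QB_FRAME_NUMBER", str(frame))
        for frame in range(first, last + 1)
    ]

template = "Render -s QB_FRAME_NUMBER -e QB_FRAME_NUMBER myscenefile.ma"
commands = expand_frames(template, "1-100")

print(len(commands))   # 100
print(commands[0])     # Render -s 1 -e 1 myscenefile.ma
print(commands[-1])    # Render -s 100 -e 100 myscenefile.ma
```

     By contrast, a plain "Render scenefile.ma" has no frame token to replace, so every copy of it is the same whole-scene command.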

     Thanks,
        Anthony

« Last Edit: February 08, 2006, 01:12:58 AM by anthony »

jason

  • Jr. Member
  • **
  • Posts: 6
« Reply #6 on: February 08, 2006, 01:19:33 AM »
I have actually tried submitting both through Maya itself (qb plugin) and through the Qubic GUI Maya submission panel. It does the same. On that note, this only happens when I render mental ray for Maya. When I render using Maya Software, it seems to be fine.

anthony

  • Senior Software Engineer
  • Hero Member
  • *****
  • Posts: 183
« Reply #7 on: February 09, 2006, 01:06:37 AM »
Thr33TwoSe7en,

      Thanks for the clarification.  I'm logging this one as a bug to be investigated further.  We'll get back to you once we've determined why MR for Maya exhibits this behaviour.

       Anthony