Topic: Building a render farm without a shared filesystem - NOT recommended

jburk

  • Administrator
This question comes up from time to time, so I thought I'd post something here in the forum.

Potential customers call our Sales department and ask "Can I just install Qube on our workstations and turn them into a render farm?  We don't have a file server, we all work off of local disks."

If your site is running Linux or OS X and you have a limited number of machines (say, fewer than 10), then you could possibly have each workstation function as a file server.  Each machine would then have to mount every other machine's filesystem, and the path to any scene or scene component would have to include the machine name.  This gets a bit tricky when the file lives on your own machine; you have to treat it as if it lived on a remote filesystem.  Your machine would have to be configured to mount its own exported filesystem the same way that other machines would mount it.  This is known as a loopback filesystem, and setting it up is beyond the scope of this forum.
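To make the "path has to include the machine name" point concrete, here's a minimal sketch in Python.  The /net/<hostname> mount root, the host name ws07 and the helper name are all assumptions made up for illustration; your site's mount layout may look nothing like this, and none of it is Qube functionality.

```python
from pathlib import PurePosixPath


def host_qualified_path(host: str, local_path: str, mount_root: str = "/net") -> str:
    """Return the path under which every machine, including `host` itself,
    would refer to `local_path`.

    Assumes (hypothetically) that each machine's exported filesystem is
    mounted at <mount_root>/<host> on every machine, the owner included.
    """
    local = PurePosixPath(local_path)
    if not local.is_absolute():
        raise ValueError(f"expected an absolute path, got {local_path!r}")
    # Drop the leading "/" so the local path nests under <mount_root>/<host>.
    return str(PurePosixPath(mount_root) / host / local.relative_to("/"))


# The scene lives on ws07's local disk, but everyone (ws07 too) uses this path.
print(host_qualified_path("ws07", "/projects/shot01/scene.ma"))
# prints /net/ws07/projects/shot01/scene.ma
```

The thing to notice is that the artist sitting at ws07 has to save the scene with the /net/ws07/... path, not the local one, or the scene will reference files that no other machine can find.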

I strongly recommend that you don't try to run without a central file server if the job content resides on machines running a Windows desktop OS.

Qube doesn't have a file transfer agent that would copy files around.  It relies on the networking setup present at your site.  All workers have to have visibility into the shared filesystem where the files reside.
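A quick way to test that visibility is to run something like the following on each worker before submitting anything.  This is only a rough sketch: it is plain Python and not part of Qube, the function name is made up, and you pass it whatever scene and texture paths your job actually uses.

```python
import os
import sys


def unreadable_paths(paths):
    """Return the paths this machine cannot read, in the order given."""
    return [p for p in paths if not os.access(p, os.R_OK)]


if __name__ == "__main__":
    # Pass the job's file paths as command-line arguments.
    for path in unreadable_paths(sys.argv[1:]):
        print(f"cannot read: {path}")
```

If any worker reports a path it cannot read, jobs that land on that worker will fail no matter how the farm itself is configured.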

You could set all paths in the project to UNC and include the machine name of the host where the files reside (simulating a loopback filesystem), which would allow all machines to access files on that host's local disk.  However, Windows desktop operating systems do not support more than 5 connections from remote hosts doing CIFS access (remote machines accessing the local host as a file server).  They're only meant to share disks with a limited number of machines under very light load; if you're trying to act as a file server, Microsoft wants you to buy a server OS.
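For anyone unsure what "set all paths to UNC" means in practice, here's a small sketch of the rewrite.  The host name, share name and folder layout are invented for the example, and the helper is hypothetical; it only shows the shape of the path, it doesn't get you around the connection limit.

```python
from pathlib import PureWindowsPath


def to_unc(host: str, share: str, share_root: str, local_path: str) -> str:
    """Rewrite a local Windows path as a UNC path through a share on `host`.

    `share_root` is the local folder that `share` exposes (an assumption
    about how the share was set up on that machine).
    """
    # relative_to raises ValueError if local_path is not under share_root.
    relative = PureWindowsPath(local_path).relative_to(PureWindowsPath(share_root))
    return str(PureWindowsPath(f"\\\\{host}\\{share}\\") / relative)


print(to_unc("ws07", "projects", r"C:\projects", r"C:\projects\shot01\scene.ma"))
# prints \\ws07\projects\shot01\scene.ma
```

As with the loopback setup, the machine that owns the files has to use the UNC form as well, so that the paths saved in the scene are the same ones the workers will resolve.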

When more than 5 hosts try to access a Windows desktop OS as a file server, connections from the remote machines are dropped unpredictably and are not reinstated.  It will work if you only have 1 or 2 subjobs accessing the job content at the same time, but as soon as you try to run more than 5 or 6 subjobs at once, you'll get nothing but errors and hung jobs: the remote workers will either connect only intermittently to the machine where the files reside, or they'll get a connection that might be dropped in the middle of a read or write request.
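If you want to see how quickly you hit that ceiling, the arithmetic can be sketched like this.  The limit of 5 is the figure quoted above; counting one connection per distinct remote worker is a simplifying assumption, and the host names and function are made up for the example.

```python
from collections import defaultdict


def overloaded_file_hosts(subjobs, limit=5):
    """Return {file_host: remote_worker_count} for hosts over `limit`.

    `subjobs` is an iterable of (worker_host, file_host) pairs for
    subjobs that run at the same time.
    """
    remote_workers = defaultdict(set)
    for worker, file_host in subjobs:
        if worker != file_host:  # local access is not a remote connection
            remote_workers[file_host].add(worker)
    return {
        host: len(workers)
        for host, workers in remote_workers.items()
        if len(workers) > limit
    }


# Eight workstations (ws01..ws08) all rendering a scene stored on ws01.
running = [(f"ws{n:02d}", "ws01") for n in range(1, 9)]
print(overloaded_file_hosts(running))
# prints {'ws01': 7}
```

Even a farm of eight desktops puts seven remote workers on the one machine holding the scene, which is already past the limit.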

Very Bad Things will happen...
« Last Edit: July 02, 2010, 04:19:19 PM by jburk »