Author Topic: Dedicated supervisor as framestore? (Read 17665 times)

rhagen · « **on:** April 20, 2011, 09:34:35 PM »

Got another question for y'all -

I'm thinking of buying some very heavy (perhaps overspecced) hardware as a server, with lots of fast internal storage and lots of RAM, in order to simplify my setup here. I was wondering about the performance issues with running a supervisor that is itself the "framestore" (the shared file storage). I'd be running Server 2008 Standard on this new machine; the entire farm is Windows 7 based.

rhagen · « **Reply #1 on:** April 21, 2011, 03:23:46 PM »

Additionally, this server would be multihomed. It would run both fiber connectivity for the dedicated renderfarm and a teamed 4 Gb/s Ethernet connection so the university labs can render when they are idle. There would also be additional fiber-attached storage via a TerraBlock unit.

jburk · « **Reply #2 on:** April 21, 2011, 03:56:10 PM »

This is generally not recommended, but there's no hard and fast rule.

It all depends on how large your farm is and how fast the supervisor h/w is. If you're running fewer than 10 workers, and you usually run jobs made up of frames or tasks that each take at least several minutes and don't output much log data, then you might get away with this.

If your farm is larger, or you run a lot of composites or other types of jobs where the frames run on the order of 10s or less, or your jobs simply spew a lot of log data, then you may find that supervisor performance will suffer.
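The rule of thumb in the two paragraphs above can be sketched as a quick checklist. This is only an illustration: the function name and parameters are made up for the example, the thresholds (10 workers, 10-second frames) are the rough figures quoted above rather than hard limits, and anything between "10s or less" and "several minutes" per frame is a gray area the sketch does not flag.

```python
def colocation_risks(workers, typical_frame_seconds, heavy_logging):
    """Return the reasons a combined supervisor/fileserver may struggle.

    Hypothetical helper: thresholds are the rough figures from the post,
    not limits enforced by Qube.
    """
    reasons = []
    if workers >= 10:
        reasons.append("10 or more workers")
    if typical_frame_seconds <= 10:
        reasons.append("frames run on the order of 10s or less")
    if heavy_logging:
        reasons.append("jobs emit a lot of log data")
    return reasons


# A small farm with long frames and quiet logs raises no flags.
print(colocation_risks(workers=8, typical_frame_seconds=300, heavy_logging=False))
# A larger farm running fast composites with chatty logs raises all three.
print(colocation_risks(workers=75, typical_frame_seconds=5, heavy_logging=True))
```

An empty list only means none of the warning signs apply; it is not a guarantee that the setup will perform well.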

The first symptom of degraded performance will be a lot of running subjobs that have no work assigned to them. This will be visible in the "subjob timeline" pane in the QubeGUI. In the horizontal graphs, the skinny bar is the subjob itself and the fatter sections on it are the individual frames that the subjob is working on. If you see a lot of long skinny stretches between the fatter sections, it means the subjobs are running but don't have a frame to process.
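If you would rather put a number on those skinny stretches than eyeball them, the idea can be sketched as below. This does not use the Qube API; the function and its inputs (a subjob's start and end time plus a list of frame start/end times, all in seconds) are hypothetical, and it assumes a subjob works on one frame at a time so frame intervals never overlap.

```python
def idle_fraction(subjob_start, subjob_end, frames):
    """Fraction of a subjob's lifetime spent without a frame to process.

    frames is a list of (start, end) times. Assumption: frames on one
    subjob do not overlap, so their durations can simply be summed.
    """
    total = subjob_end - subjob_start
    if total <= 0:
        return 0.0
    busy = 0
    for start, end in frames:
        # Clip each frame to the subjob's lifetime and ignore empty spans.
        clipped = min(end, subjob_end) - max(start, subjob_start)
        busy += max(0, clipped)
    return (total - busy) / total


# Subjob alive for 100s, busy for two 20s frames: idle 60% of the time.
print(idle_fraction(0, 100, [(0, 20), (50, 70)]))
```

A consistently high idle fraction across many subjobs would correspond to the long skinny sections described above.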

In any case, I would strongly recommend that you configure your farm so that the supervisor processes don't have to handle the job log data. See the posting on this forum "Writing job logs directly to a network filesystem" http://www.pipelinefx.com/forum/index.php?topic=1137.0

Set the "shared location" mentioned in that thread to one of the fast external filesystems that this machine is serving out.

This way, it's only the file server portion of the machine that is handling the log data, and not the Qube supervisor processes themselves.

rhagen · « **Reply #3 on:** April 21, 2011, 08:05:05 PM »

Offloading the logs will probably help even in the current config - much appreciation for the tutorial!

I'll do some more tests with my idea to see if it is workable, and I'll also try out the MySQL optimization methods I read about in another thread to give the supervisor more threads.

Currently the supervisor we use only has about 12GB of RAM with 2 Xeon CPUs for anywhere from 20 to 75 nodes, which isn't enough RAM according to your recommended specs anyway.

The machine I want to buy would ideally have something on the order of 48GB of RAM or more to facilitate further optimization...

jburk · « **Reply #4 on:** May 08, 2011, 12:34:22 PM »

12GB of RAM is a ton for a supervisor that's running fewer than 100 workers; what specs are you referring to? (I may have a typo somewhere in a post or some documentation, and I'd like to fix it if so...)

I usually recommend 8GB of RAM for a supervisor for up to between 50 and 100 workers, depending on how many subjobs you're running on your farm concurrently (how many "job slots" are in use at any one time).

Some of our largest farms, with over 500 hosts, are running on a 16GB server.

rhagen · « **Reply #5 on:** May 18, 2011, 09:42:29 PM »

I got that info from Page 8 of the Installation.pdf

"50-100 node renderfarm

jburk · « **Reply #6 on:** May 26, 2011, 06:25:19 PM »

Thanks for the heads-up on the documentation error. Definitely something to fix.

With regard to running 1000+ subjobs, I'm guessing that you easily have over 100 active (running/pending/blocked) jobs in Qube at any one time. You probably need to tune the number of filehandles that your MySQL instance can open at any one time. We have a whitepaper for tuning MySQL on Linux for just this case.

http://www.pipelinefx.com/support/whitepapers/Qube_TuningForHighPerformance.pdf

On Linux, the default number of handles a non-root user can open is 1024. Since each job is contained in 5 tables in Qube, each job represents 10 filehandles when you include the table indices.
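To make the arithmetic concrete, here is a small sketch of how many jobs fit under a given handle limit. The function and parameter names are made up for illustration, and the split of 2 handles per table (one for the data, one for the index) is just one reading of the "10 filehandles" figure above. It also ignores any handles MySQL uses for other purposes, so the real ceiling will be somewhat lower.

```python
def max_jobs_for_handle_limit(handle_limit=1024, tables_per_job=5,
                              handles_per_table=2):
    """Rough upper bound on jobs whose tables can all be open at once.

    Defaults follow the figures in the post: 1024 handles, 5 tables per
    job, and (assumed) one data handle plus one index handle per table.
    """
    handles_per_job = tables_per_job * handles_per_table
    return handle_limit // handles_per_job


print(max_jobs_for_handle_limit())                   # 102
print(max_jobs_for_handle_limit(handle_limit=4096))  # 409
```

With the default limit, roughly 100 jobs is the ceiling, which is why a farm with over 100 active jobs tends to hit this wall.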

If your supervisor is not running Linux, these optimizations are not available to you.
