The first thing you should consider on the workers is the ratio of RAM to cpus (cores, not physical processor packages). The amount of RAM will determine how many different renders you can run at the same time on a host.
So how many renders do we want to run at once? Let's consider a Maya render running on an 8-core worker.
At one end of the spectrum, we could have a single render running 8 renderThreads. This provides the fastest turnaround for a single frame with the smallest memory footprint. But a 40-worker farm that has 320 cores will only be rendering 40 frames at one time. And the end-users will see that only a few frames are being worked on at the same time, so they'll perceive the farm as being slow. Never underestimate the power of someone else's perception to make your life miserable...
But we almost never render single frames. When rendering sequences, it's often advantageous to get more frames rendering at once, so that if there's something wrong in some of the later frames (missing geometry or textures, wrong/old camera or animation) it gets noticed quicker.
The other end of the spectrum would be to run 8 single-threaded renders at the same time on the host. This gets more frames in progress at the same time, but the memory footprint is almost 8 times larger than the single 8-threaded render. And running 8 simultaneous renders on a worker puts a large load on the network interface for the host, the network itself, and the file servers. The end-user sees more frames being worked on, so they perceive that the farm is working faster, but it may in actuality be killing the network or file server.
At the end of the day, in a perfect world (one with enough RAM and enough network and disk throughput), it would take close to the same amount of time to render 8 frames at once single-threaded as it would to render 8 frames one after the other with 8 renderThreads each. The 8 single-threaded renders would be ~slightly~ faster, since multithreaded renders never scale perfectly.
So the answer lies somewhere in the middle. 2 or 4 renders on a worker is usually a good number for most cases.
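The trade-off above can be sketched numerically. This is a toy model, not a benchmark: the single-thread frame time, the thread-scaling efficiency, and the ~3GB per-render footprint are all made-up assumptions for illustration.

```python
# Toy model of the renders-per-worker trade-off on an 8-core host.
# ONE_THREAD_MINUTES, EFFICIENCY, and the 3GB footprint are assumptions.

CORES = 8
ONE_THREAD_MINUTES = 80.0   # assumed: one frame rendered on a single thread
EFFICIENCY = 0.9            # assumed: extra threads don't scale perfectly

def worker_profile(renders):
    """Return (frames in flight, minutes per frame, GB of RAM needed)."""
    threads = CORES // renders
    speedup = 1.0 if threads == 1 else threads * EFFICIENCY
    minutes = ONE_THREAD_MINUTES / speedup
    ram_gb = renders * 3.0          # assumed ~3GB footprint per render
    return renders, minutes, ram_gb

for r in (1, 2, 4, 8):
    frames, minutes, ram = worker_profile(r)
    print(f"{frames} render(s): {minutes:5.1f} min/frame, {ram:4.1f} GB RAM")
```

Even with invented numbers, the shape of the curve is the point: each doubling of simultaneous renders roughly doubles the RAM bill and the per-frame turnaround, while the total frames-per-hour stays nearly flat.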
If we consider that renders usually use between 2GB and 4GB of RAM, then 16GB in an 8-core box (2GB/core) is always a good starting point. If your renders are expected to have a much larger footprint, then you would want to buy more RAM. But don't plan on running more than 4 renders at once on a host; you'll just end up pushing the system load onto the network or file servers, and that will kill the performance of the entire farm (and probably the workstations, too).
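That rule of thumb can be written down as a small helper. The function name and defaults here are hypothetical; the 4GB footprint and the cap of 4 concurrent renders come straight from the guidance above.

```python
# Rule-of-thumb sketch: how many renders a host can run at once.
# Concurrency is limited by RAM, by core count, and by the practical
# cap of 4 renders per host discussed above. Names are illustrative.

def max_concurrent_renders(ram_gb, cores, footprint_gb=4.0, cap=4):
    """Return the smallest of the RAM-bound, core-bound, and capped limits."""
    return min(int(ram_gb // footprint_gb), cores, cap)

print(max_concurrent_renders(16, 8))                    # 16GB / 4GB -> 4
print(max_concurrent_renders(16, 8, footprint_gb=8.0))  # heavier renders -> 2
```

Note that with heavier renders the RAM limit, not the core count, is what bites first: the fix is more RAM per host, not more hosts per job.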
As most jobs on a farm are cpu-bound, it makes sense to buy the fastest processors you can afford. But often the fastest processors are twice as expensive as the 2nd-fastest, so the operative word here becomes "afford". You're probably better off spec'ing the 2nd-fastest processors and being able to buy more blades for the same budget. It's always a trade-off, but you don't want to be the one who paid absolutely top dollar for processors that are only considered "adequate" in 18 months' time, when the farm could have been 15% larger during those 18 months.
Buy as fast a file server or Network Attached Storage as you can afford, and put as fast a network interface in it as you can. Rendering on a farm puts a heavy load on file servers, and Nuke jobs can crush them. Many Nuke jobs are entirely i/o bound; basic 'over' composites represent almost no cpu load to the worker, but they can easily saturate the network and file server.
You may want to have the file server/NAS be on 2 subnets; 1 for the workstations and 1 for the farm workers. This way, the network i/o from the workers will not impact the users' workstations.
More spindles in the file server or NAS are better than fewer; aggregate throughput scales with the number of disks that can service concurrent reads from the farm.