Topic: Establishing dependencies between jobs with differing chunk sizes

jburk

« on: September 02, 2011, 10:47:48 PM »
Occasionally it becomes necessary to define a dependency between two jobs that operate over the same frame range but, for efficiency's sake, use different chunk sizes.

A common example is a 2-stage job, where the 1st stage does an export of some sort (e.g. to .mi or .ifd) and the 2nd stage is the render.  If the export cannot be done with Qube's Dynamic Allocation technology, it's more efficient to export in chunks, whereas the render is almost always done as single frames.

Since single frames can also be considered a chunk size of 1, the following script can serve as an example of how to define dependencies between any two jobs whose chunk size differs, including a chunk <--> single frame job set.
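The mapping between the two chunk sizes can be previewed without a Qube install. The following is only a rough sketch in plain Python: make_chunks() and overlapping() are hypothetical helpers, not part of the qb module, and chunks are held as (start, end) tuples rather than Qube agenda items.

```python
def make_chunks(size, first, last):
    """Split the inclusive range first..last into (start, end) chunks."""
    return [(s, min(s + size - 1, last)) for s in range(first, last + 1, size)]


def overlapping(chunk, upstream):
    """Return the upstream chunks sharing at least one frame with chunk."""
    lo, hi = chunk
    return [(s, e) for (s, e) in upstream if s <= hi and e >= lo]


upstream = make_chunks(4, 1, 42)    # stands in for the first job's agenda
downstream = make_chunks(6, 1, 42)  # stands in for the second job's agenda

for chunk in downstream:
    deps = overlapping(chunk, upstream)
    print('%d-%d waits on %s' % (chunk[0], chunk[1],
                                 ', '.join('%d-%d' % d for d in deps)))
```

With these sizes, every 6-frame chunk overlaps two 4-frame chunks, so each downstream chunk has to wait on two upstream chunks.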

The script uses Qube's Python API, and builds the 2 jobs as simple Python dictionaries.

Since a Qube job's "dependency" attribute can only be used to describe simple dependency relationships, the per-chunk dependencies are instead defined in the downstream job's callbacks.
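To make the callback trigger concrete, here is a small sketch of how one trigger expression is assembled. The 'complete-work-<label>-<chunk>' form and the ' && ' separator are taken from the script in this post; build_trigger() is a hypothetical helper name, and nothing here calls into Qube.

```python
def build_trigger(upstream_label, chunk_names):
    """Join one 'complete-work' term per upstream chunk with ' && '."""
    terms = ['complete-work-%s-%s' % (upstream_label, name)
             for name in chunk_names]
    return ' && '.join(terms)


# 'first' is the label given to the upstream job in the script
trigger = build_trigger('first', ['1-4', '5-8'])
print(trigger)  # complete-work-first-1-4 && complete-work-first-5-8
```

The callback fires only once every listed upstream chunk has completed, which is what allows a downstream chunk to span several upstream chunks.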


Code:

import qb

jobList = []

fRange = '1-42'

jobA = {
    'prototype': 'cmdrange',
    'name': 'job A',
    'label': 'first',
    'package': {
        'cmdline': 'echo frame range:QB_FRAME_RANGE; sleep 5',
        'padding': 1,
    },
    'agenda': qb.genchunks(4, fRange),
}

jobList.append(jobA)


jobB = {
    'prototype': 'cmdrange',
    'name': 'job B',
    'package': {
        'cmdline': 'echo frame range:QB_FRAME_RANGE; sleep 2',
        'padding': 2,
    },
    'agenda': qb.genchunks(6, fRange),
    'callbacks': [],
    'status': 'blocked'
}

for work in jobB['agenda']:
    #  we need to set the status of each agenda item as blocked
    work['status'] = 'blocked'

    #  Now we need to figure out what upstream chunks contain the current chunk, we'll do this by
    #  looking for set intersections for each chunk in this job across all upstream chunks
    upstreamChunks = []
   
    currentFrames = set([x['name'] for x in qb.genframes(work['name']) ])
    for chunkName in [x['name'] for x in jobA['agenda']]:
        # if any of the frames comprising the current chunk in the downstream job are present in the
        # frames comprising the chunk from the upstream jobs, add that upstream chunk to the list of
        # chunks the downstream job is dependent upon
        if currentFrames.intersection([x['name'] for x in qb.genframes(chunkName)]):
            upstreamChunks.append(chunkName)

    triggerStr = 'complete-work-%s-%s' % (jobA['label'], upstreamChunks[0])
    for upstreamDep in upstreamChunks[1:]:
        triggerStr += ' && complete-work-%s-%s' % (jobA['label'], upstreamDep)

    cb = {
        'language': 'python',
        'code': '''
jobID = qb.jobid()
qb.workunblock('%%s:%s' %% jobID)
qb.unblock(jobID)''' % work['name'],
        'triggers': triggerStr
    }
    jobB['callbacks'].append(cb)

jobList.append(jobB)

for job in qb.submit(jobList):
    print('submitted %(name)s, id:%(id)s' % job)



For a single-frame second job, build jobB's agenda either by passing 1 as the chunk size to qb.genchunks(), or by calling qb.genframes() instead.
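In the single-frame case each frame falls inside exactly one upstream chunk, so every trigger has a single term. A rough sketch of that lookup in plain Python follows; upstream_chunk_name() is a hypothetical helper, and the 'start-end' naming (with a bare number for a one-frame chunk) is an assumption about how the agenda items are named, so check it against the real agenda before relying on it.

```python
def upstream_chunk_name(frame, size, first, last):
    """Name of the upstream chunk containing frame (assumed 'start-end' form)."""
    index = (frame - first) // size          # which chunk the frame lands in
    start = first + index * size
    end = min(start + size - 1, last)        # the last chunk may be short
    return str(start) if start == end else '%d-%d' % (start, end)


# frames 1-42, upstream chunk size 4
print(upstream_chunk_name(5, 4, 1, 42))
print(upstream_chunk_name(42, 4, 1, 42))
```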