Author Topic: autoflag as failed based on log data?

mpursley

  • Full Member
  • ***
  • Posts: 15
autoflag as failed based on log data?
« on: February 01, 2009, 03:03:18 AM »
Hello,

Does anyone know of a way to flag a job as failed based on some text in its error log?

I have a command-line job that spits out a lot of stderr messages.  Most of them are fine, but others indicate that something went wrong with the job and that it should be marked as "failed", so that I can retry (or auto-retry?) those specific frames.



Thanks,
Matt

mpursley

  • Full Member
  • ***
  • Posts: 15
Re: autoflag as failed based on log data?
« Reply #1 on: February 01, 2009, 03:53:12 AM »

Ok, so I'm playing with using qberr to give me the stderr of the job, and then searching for the error I'm interested in. If the error exists in the subjob's stderr, then I can use qbretry to retry the subjob...

But this is not really a subjob problem.  I would like to be able to use qberr to get the stderr for all of the _frames_ of the job, and then use qbretry to retry just the failed frames.  But qberr and qbretry only seem to work on subjobs, not frames...

Does anyone know if there's a way to get the stderr of, and/or retry, specific frames of a job?
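
A minimal sketch of that search step in Python, assuming the stderr text has already been captured as a string (for example from the output of qberr). The function name and the patterns are hypothetical placeholders, not messages from any particular application:

```python
import re

# Hypothetical patterns: replace these with the stderr messages that
# really mean the work failed in your own job.
FATAL_PATTERNS = [r"Segmentation fault", r"\bERROR\b.*license"]


def has_fatal_error(stderr_text, patterns=FATAL_PATTERNS):
    """Return True if any line of stderr_text matches a fatal pattern."""
    compiled = [re.compile(p) for p in patterns]
    return any(
        rx.search(line)
        for line in stderr_text.splitlines()
        for rx in compiled
    )
```

Harmless warnings simply don't match any pattern, so only the messages listed as fatal would trigger a retry.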


Thanks,
Matt


siyuan.pipelinefx

  • Sr. Member
  • ****
  • Posts: 32
Re: autoflag as failed based on log data?
« Reply #2 on: February 03, 2009, 04:24:57 AM »
You can retry a specific frame using these two steps:

1. "qbretry <jobid#>:<frame#>"
2. "qbretry <jobid#> <jobid#>:<frame#>"

So if you want to retry frame number 5 of job ID 300, it would look like this:

  qbretry 300:5

and then

  qbretry 300 300:5
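
For more than one frame, the two steps above could be generated with a small Python helper. This is only a sketch: retry_commands is a hypothetical name, and it only builds the argument lists in the form shown above without running anything.

```python
def retry_commands(job_id, frames):
    """Build the two qbretry invocations described above for each frame.

    Returns a list of argument lists; nothing is executed here.
    """
    commands = []
    for frame in frames:
        target = f"{job_id}:{frame}"
        # Step 1: qbretry <jobid#>:<frame#>
        commands.append(["qbretry", target])
        # Step 2: qbretry <jobid#> <jobid#>:<frame#>
        commands.append(["qbretry", str(job_id), target])
    return commands
```

Each returned list could then be passed to subprocess.run(cmd, check=True) on a machine where qbretry is on the PATH.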

mpursley

  • Full Member
  • ***
  • Posts: 15
Re: autoflag as failed based on log data?
« Reply #3 on: February 03, 2009, 04:47:47 AM »


Ok, I'll try that.  And should that syntax work for qberr too?

qberr 300:5




mpursley

  • Full Member
  • ***
  • Posts: 15
Re: autoflag as failed based on log data?
« Reply #4 on: February 03, 2009, 04:54:11 AM »

>> Ok, i'll try that. 
>> And should that syntax work for qberr too?
>> qberr 300:5


Yep... that worked for qbretry...

But I tried it with qberr, and it spewed out tons of lines... repeats of the stderr for the frame over and over again, I think...
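
If the output really is the same stderr repeated, one workaround is to collapse duplicate lines before searching it. A rough Python sketch, assuming the repeats are exact duplicate lines (unique_lines is a hypothetical helper name):

```python
def unique_lines(text):
    """Return the lines of text with exact duplicates dropped, in first-seen order."""
    seen = set()
    result = []
    for line in text.splitlines():
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result
```

This also drops lines that legitimately occur more than once, which is fine when the only goal is to check whether a given error message appears at all.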





