Hi there,
I've noticed that our system seems to miss rows every now and again. It seems like it may be linked to job cancellations. Does anyone understand exactly how the cancellation mechanism in TWS works?
If a DS job is currently executing a DS build and I cancel the job in TWS, is there any possibility that only the running build is killed but the job execution continues after the build to the next procedure node? This seems unlikely, but it would explain some of the problems I'm having.
Thanks,
It's possible. Builds and jobstreams run as two separate executables. If TWS is just killing the databuild.exe process and not the rundsjob.exe process, the jobstream will continue. The way to check is to modify the jobstream to check for failure of the build using a condition node, and only proceed to the procedure node if the build succeeded.
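To illustrate the general point, here is a minimal Python sketch of a wrapper process that outlives its killed child. It is only an analogy: the wrapper function stands in for rundsjob.exe and the sleeping child for databuild.exe, and the name run_wrapper is made up for this example. It says nothing about which process TWS actually kills.

```python
import subprocess
import sys


def run_wrapper(check_build_result):
    """Hypothetical wrapper: start a child 'build', have it killed, then decide what to do."""
    events = []
    # The child just sleeps; it stands in for a long-running build.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    events.append("build started")

    # Simulate a scheduler that kills only the child process.
    child.kill()
    exit_code = child.wait()
    events.append("build killed" if exit_code != 0 else "build ok")

    # The wrapper is still alive here. Whether it carries on depends
    # entirely on whether it looks at the child's exit status.
    if check_build_result and exit_code != 0:
        events.append("wrapper stopped")
        return events
    events.append("procedure ran")
    return events


if __name__ == "__main__":
    print(run_wrapper(check_build_result=False))
    print(run_wrapper(check_build_result=True))
```

The only difference between the two calls is whether the wrapper checks the build's exit status before moving on.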
Cheers!
MF.
Yeah ok thanks, that's kinda what I was thinking. Now I just need to find some Tivoli experts :D
Until you manage to find them, just add a condition node after the build node, and code the expression as return $RESULT; - link the True output to your procedure node, and link the False output to a separate procedure node which writes a message to the log and doesn't link on anywhere else. It's good practice to do this anyway, so that a failing build is trapped for any reason and that leg of processing is terminated. Oh, and don't forget to set the 'Action on Failure' setting of your build node to 'Continue' so that it moves along to perform the next check. I imagine it must be set to this anyway...
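As a rough picture of that routing, here is a toy Python model. It is not DecisionStream syntax: run_leg and the log messages are invented for illustration, and build_succeeded simply stands in for the value of $RESULT after the build node.

```python
def run_leg(build_succeeded):
    """Toy model: build node -> condition node -> procedure node or log-only node."""
    log = []
    result = build_succeeded  # stands in for $RESULT after the build node
    log.append("build succeeded" if result else "build failed")

    # The build node is assumed to be set to Continue, so the
    # condition node is always reached, even after a failed build.
    if result:
        # True output: carry on to the real procedure node.
        log.append("procedure node executed")
    else:
        # False output: write a message and link nowhere else.
        log.append("failure logged, leg ends")
    return log


if __name__ == "__main__":
    print(run_leg(True))
    print(run_leg(False))
```

Either way the log records which branch was taken, which is the main benefit over a silent stop.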
Cheers!
MF.
At the moment it seems like all our builds have their 'Action on failure' set to 'TERMINATE'. Do you know what the behaviour of that is?
I think I'm using a fairly old version of DS: version 7.1.778.0, in case that makes a difference.
That is the default setting, and it means that processing should stop in the current flow if an error occurs.
I don't recall any issues with this in either DecisionStream or Data Manager. Do you have parallel flows running in your jobstream? If so, TERMINATE will allow these to continue when the current flow stops because of the error. Setting the value to ABORT should stop all flows in the jobstream.
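To make the difference between the settings concrete, here is a small Python simulation of the behaviour as described in this thread. It is a sketch under assumptions, not the product's implementation: OnFailure, run_jobstream and FLOWS are hypothetical names, and parallel flows are imitated by advancing each flow one node at a time in turn.

```python
from enum import Enum


class OnFailure(Enum):
    CONTINUE = "continue"
    TERMINATE = "terminate"
    ABORT = "abort"


def run_jobstream(flows, on_failure):
    """flows maps a flow name to a list of (node_name, succeeds) pairs.

    Returns the names of the nodes that were executed, in order.
    """
    executed = []
    active = {name: list(nodes) for name, nodes in flows.items() if nodes}
    while active:
        for name in list(active):
            node, succeeds = active[name].pop(0)
            executed.append(node)
            if not succeeds:
                if on_failure is OnFailure.ABORT:
                    return executed  # stop every flow in the jobstream
                if on_failure is OnFailure.TERMINATE:
                    del active[name]  # stop this flow only
                    continue
            if not active[name]:
                del active[name]  # flow finished normally
    return executed


# Hypothetical jobstream: flow A has a failing build, flow B is healthy.
FLOWS = {
    "A": [("build_a", False), ("proc_a", True)],
    "B": [("build_b", True), ("proc_b", True)],
}

if __name__ == "__main__":
    for setting in OnFailure:
        print(setting.name, run_jobstream(FLOWS, setting))
```

In this model CONTINUE runs all four nodes, TERMINATE skips only proc_a, and ABORT stops after build_a.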
Cheers!
MF.
So if TWS kills databuild.exe instead of rundsjob.exe but the build has an 'Action on failure' set to 'TERMINATE', is there a chance that a procedure node directly after the build node is executed? The Job is a linear chain of builds + procedure nodes.
Personally I kind of like your suggestion of 'Action on failure' set to 'CONTINUE' followed by a condition node which checks $RESULT. At least then we'd have clearer logging.
My guess is that the procedure node can't receive execution time, which would mean my problem is in another castle (|:( <-- Sad Toad.
Quote from: eknight on 11 Dec 2012 08:09:16 AM
So if TWS kills databuild.exe instead of rundsjob.exe but the build has an 'Action on failure' set to 'TERMINATE', is there a chance that a procedure node directly after the build node is executed? The Job is a linear chain of builds + procedure nodes.
No - with Action on Failure set to Terminate, the jobstream should end processing when the build fails if there are no other parallel legs of processing. What does the log of the jobstream show? If you enable Detail logging, it should list each node and its success or failure.
Cheers!
MF.