Quick question: how does the “retry limit” on jobs work? I’m assuming there’s something simple I’m missing, but my goal is just to retry the same job when it fails or times out on the handshake. I’ve got a simple one-line test script that fails (as expected), but I can’t get the job to retry. Is there something else I need to do / set / update to actually get the retry to trigger?
Updating to 4.1.4 got the retries working. One odd thing I noticed is that only my first retry (the job is set to 3 retries) is nesting under the original “parent” failed job. Each subsequent retry seems to be nesting under its own new job.
Looking at all of the jobs, it looks like all of the retries were triggered by a manual run; however, the only job that I manually ran was 2329, the original job that triggered all 3 retries.
My use case here is an employee separation process. I have one parent script that invokes about 7 sub scripts (each of which handles separation processes on a different application). The parent script waits for all 7 sub scripts to finish, pulls any pipeline output, throws it all into JSON, and logs it. In the event that one of the sub scripts fails, I do want that sub script to retry (for as many retries as specified), but if those retries don’t all nest under the originally failed job, I won’t be able to track all of the retried jobs to find any available pipeline output.
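To make the tracking requirement concrete, here is a minimal Python sketch of the pattern described above: a parent runs several sub steps in parallel, retries a failed step up to a limit, and keeps every attempt grouped under the step that originally failed. This is only an illustration under stated assumptions, not the product's API or how its retry feature is implemented; `run_with_retries`, `run_parent`, and the step names are all hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(name, step, retry_limit):
    """Run one sub step and keep every attempt (original run plus retries) together."""
    attempts = []
    # One original run plus up to `retry_limit` retries.
    for attempt in range(1, retry_limit + 2):
        try:
            output = step()
            attempts.append({"attempt": attempt, "status": "succeeded", "output": output})
            break
        except Exception as exc:
            attempts.append({"attempt": attempt, "status": "failed", "error": str(exc)})
    succeeded = attempts[-1]["status"] == "succeeded"
    return {"step": name, "succeeded": succeeded, "attempts": attempts}


def run_parent(steps, retry_limit=3):
    """Run all sub steps in parallel, wait for them, and return one JSON report."""
    with ThreadPoolExecutor(max_workers=max(1, len(steps))) as pool:
        futures = [
            pool.submit(run_with_retries, name, step, retry_limit)
            for name, step in steps.items()
        ]
        # result() blocks, so the parent waits for every sub step to finish.
        results = [future.result() for future in futures]
    return json.dumps({"results": results}, indent=2)


if __name__ == "__main__":
    def always_fails():
        # Hypothetical sub step standing in for the one-line failing test script.
        raise RuntimeError("handshake timed out")

    # Hypothetical step names; real sub scripts would be invoked here instead.
    print(run_parent({"disable_account": lambda: "ok", "revoke_badge": always_fails}))
```

Because each attempt is appended to the list owned by the step that first failed, the parent can find every retry and its output in one place, which is the behavior the nesting is meant to provide.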