Run Job (Manifest) Configuration with a non-zero backoff limit does not show results of multiple failures

Issue

An organization may run into a known issue where, in a Run Manifest stage, the Run Job (Manifest) Configuration has a non-zero backoff limit, for example:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: reschedule-iop-tickets
      namespace: spinnaker-jobs
    spec:
      backoffLimit: 2
      template:

The console output shows the results of the first attempt only. There is no way to see whether the pods are retrying the job or failing somewhere else.

Cause

This is a known issue: when the pod crashes, logs are only created in the first instance where the crash occurs. The logs from the latest pod created are pulled. Sometimes there are multiple executions of a Run Job, but there is no place in the UI that lets users know what happened to each pod. This issue is known to Armory Engineering and the OSS community.
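Since the UI does not show what happened to each pod, one way to inspect the attempts is to query the cluster directly. The following is a minimal sketch, not an Armory-provided tool. It assumes kubectl is installed and configured for the target cluster, and it relies on the job-name label that the Kubernetes Job controller adds to the pods it creates. The function names summarize_job_pods and fetch_job_pods are hypothetical. Pods that have already been deleted will not appear in the output.

```python
from __future__ import annotations

import json
import subprocess


def summarize_job_pods(pod_list: dict) -> list[dict]:
    """Return one summary row per pod of a Job, oldest pod first.

    pod_list is the parsed output of `kubectl get pods ... -o json`.
    """
    rows = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        # Collect exit codes only for containers that have terminated.
        exit_codes = [
            cs["state"]["terminated"]["exitCode"]
            for cs in status.get("containerStatuses", [])
            if "terminated" in cs.get("state", {})
        ]
        rows.append({
            "name": pod["metadata"]["name"],
            "created": pod["metadata"]["creationTimestamp"],
            "phase": status.get("phase", "Unknown"),
            "exit_codes": exit_codes,
        })
    # creationTimestamp is an RFC 3339 UTC string, so sorting the
    # strings also sorts the pods chronologically.
    return sorted(rows, key=lambda row: row["created"])


def fetch_job_pods(namespace: str, job_name: str) -> dict:
    """Ask kubectl for every existing pod that belongs to the given Job."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace,
         "-l", f"job-name={job_name}", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

For the example manifest above, calling summarize_job_pods(fetch_job_pods("spinnaker-jobs", "reschedule-iop-tickets")) would list each pod that still exists for the Job, along with its phase and any container exit codes. The logs of an individual attempt can then be read with kubectl logs followed by the pod name and -n spinnaker-jobs.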