Below is my definition for a k8s Job (to convert a column of a MySQL table from INT to BIGINT using Percona's pt-online-schema-change):
apiVersion: batch/v1
kind: Job
metadata:
  name: bigint-tablename-columnname
  namespace: prod
spec:
  backoffLimit: 0
  template:
    metadata:
      name: convert-int-to-bigint-
    spec:
      containers:
        - name: percona
          image: perconalab/percona-toolkit:3.2.1
          command: [
            "/bin/bash",
            "-c",
            "pt-online-schema-change --host=dbhost --user=dbuser --password=dbpassword D=dbname,t=tablename --alter \"MODIFY COLUMN columnname BIGINT\" --alter-foreign-keys-method \"rebuild_constraints\" --nocheck-foreign-keys --execute"
          ]
          env:
            - name: SYMFONY__POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      restartPolicy: Never
The pod has failed for some reason: in kubectl describe job jobname I see Pods Statuses: 0 Active (0 Ready) / 0 Succeeded / 1 Failed. However, kubectl get pods shows no pod associated with the job, so I cannot view the pod logs to find out why it failed.
I thought using restartPolicy: Never would keep the pod around, as per 1 and 2, but clearly my understanding isn't correct. So how do I ensure that, if this process fails, the pod is kept around for me to inspect?
If the pod fails or terminates and is then removed, you won't be able to get its logs: logs can only be fetched for resources that still exist in the cluster.
One way to work around this is to continuously save the logs somewhere else while the pod is alive. There are different strategies for doing this described in the documentation; using a logging backend is one of them.
https://kubernetes.io/docs/concepts/cluster-administration/logging/
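As a minimal sketch of that "save the logs while the pod is alive" idea without a full logging backend, you could also write the tool's output onto a persistent volume, so it survives even if the pod object is deleted. Only the pod template spec portion of the Job is shown below; the volume name logs, the mount path /logs, and the PersistentVolumeClaim pt-osc-logs are illustrative assumptions, and the PVC would have to already exist in the prod namespace:

    spec:
      containers:
        - name: percona
          image: perconalab/percona-toolkit:3.2.1
          # Same pt-online-schema-change invocation as above, with output tee'd onto the mounted volume.
          # pipefail keeps the container's exit code equal to pt-online-schema-change's, not tee's,
          # so the Job still registers a failure when the tool fails.
          command: [
            "/bin/bash",
            "-c",
            "set -o pipefail; pt-online-schema-change <same arguments as above> 2>&1 | tee /logs/pt-osc.log"
          ]
          volumeMounts:
            - name: logs            # hypothetical volume that receives the tool's output
              mountPath: /logs
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: pt-osc-logs  # assumed pre-existing PVC
      restartPolicy: Never

Even if the failed pod has already been cleaned up, you can then mount the same claim from a throwaway pod and read /logs/pt-osc.log to see why pt-online-schema-change failed.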