Batch jobs
Running batch jobs
Kubernetes has support for running batch jobs. A Job is a controller that watches your pod and makes sure it exits with status 0. If it does not, for any reason, the pod is retried up to backoffLimit times.
Since jobs in Nautilus are not limited in runtime, you may only run jobs with a meaningful command field. Running in manual mode (a sleep infinity command followed by a manual start of the computation) is prohibited.
Let's run a simple job and get its result.
Create a job.yaml file and submit:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      restartPolicy: Never
  backoffLimit: 4
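
With the manifest saved as job.yaml, it can be submitted with kubectl. This is a minimal sketch, assuming kubectl is already configured for your Nautilus namespace; the JOB_FILE variable is only there for illustration:

```shell
# Path to the manifest created in the step above.
JOB_FILE=job.yaml

# Create the Job object in the current namespace.
kubectl create -f "$JOB_FILE"
```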
Explore what's running:
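
A minimal sketch of this step, assuming kubectl is configured for your namespace and the job is named pi as in the manifest. The job-name label used in the selector is set on the pods by the Job controller:

```shell
JOB_NAME=pi

# Show the job and its COMPLETIONS column.
kubectl get jobs "$JOB_NAME"

# List only the pods that belong to this job.
kubectl get pods --selector=job-name="$JOB_NAME"
```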
When the job is finished, your pod will stay in the Completed state, and the Job will show 1/1 in the COMPLETIONS field. For long jobs, the pods can show Error, Evicted, and other states until they finish properly or backoffLimit is exhausted.
Our job did not use any storage and wrote its result to STDOUT, which can be seen in the pod logs:
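
One way to read those logs, assuming the job is named pi, is to address the job directly and let kubectl pick one of its pods:

```shell
JOB_NAME=pi

# Print the STDOUT of a pod owned by the job (the digits of pi in this example).
kubectl logs "job/${JOB_NAME}"
```

Alternatively, pass a pod name taken from the kubectl get pods output to kubectl logs.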
By default, the pod and job remain available for inspection for ttlSecondsAfterFinished=604800 seconds (1 week) after the job finishes. You can adjust this value in your job definition if desired.
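
As an illustration of adjusting this value, the sketch below writes a variant of the pi job that is cleaned up one day after it finishes. The file name job-ttl.yaml and the one-day value are hypothetical choices for this example. The field sits directly under the Job's spec, next to backoffLimit:

```shell
# Hypothetical retention period: one day instead of the one-week default.
TTL_SECONDS=$((24 * 60 * 60))

# Write a copy of the pi job with an explicit ttlSecondsAfterFinished.
cat > job-ttl.yaml <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  ttlSecondsAfterFinished: ${TTL_SECONDS}
  backoffLimit: 4
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      restartPolicy: Never
EOF
```

The resulting file can be submitted the same way as job.yaml.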
You can use the more advanced example when ready.
The end
Please make sure you did not leave any pods and jobs behind.
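
A minimal cleanup sketch, assuming the job is named pi and kubectl points at your namespace. Deleting the Job also removes the pods it created:

```shell
JOB_NAME=pi

# Delete the job; its pods are removed along with it.
kubectl delete job "$JOB_NAME"

# Confirm that nothing is left behind in the namespace.
kubectl get jobs,pods
```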