I am installing Open Service Mesh (OSM) and my app via helmfile. I control the order of installation by using multiple helmfiles, and I proceed like so:
- I install OSM.
- I install my app namespace, along with the OSM label and annotations so that OSM knows to inject a sidecar into all pods joining the namespace (see below).
- I wait for OSM to recognize the namespace (how to do this elegantly is my question).
- I install the app in my namespace.
My issue is that OSM seems to take a few seconds to recognize the new namespace. If I don't insert a pause, then I end up with some pods that do not seem to have the sidecar injected into them. Is there some service/job in OSM that I can watch so that I know when the namespace registration is complete? I could not find this in the documentation.
Code:
My main helmfile is as follows:
helmfiles:
- apps/00-service-mesh/helmfile.yaml
- apps/10-namespace/helmfile.yaml
- apps/20-database/helmfile.yaml
Step 1) Service Mesh:
repositories:
  - name: osm
    url: https://openservicemesh.github.io/osm

releases:
  - name: osm
    namespace: osm-system
    chart: osm/osm
    version: '1.2.4'
    values:
      - ./values/osm.yaml
with values:
osm:
  enablePermissiveTrafficPolicy: true
  deployPrometheus: true
  deployGrafana: true
  osmNamespace: osm-system
contour:
  enabled: true
  configInline:
    tls:
      envoy-client-certificate:
        name: osm-contour-envoy-client-cert
        namespace: osm-system
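The OSM install itself looks healthy; this is roughly how I sanity-check the control plane after step 1 (the deployment names are simply what I see in my osm-system namespace, so treat them as an assumption):

# Wait for the OSM control-plane deployments to finish rolling out.
kubectl rollout status deployment/osm-controller -n osm-system --timeout=120s
kubectl rollout status deployment/osm-injector -n osm-system --timeout=120s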
Steps 2 and 3) Namespace & Sleep
I create a custom chart that defines the namespace with the OSM label and annotations:
apiVersion: v1
kind: Namespace
metadata:
  name: myNamespace
  labels:
    "openservicemesh.io/monitored-by": "osm"
  annotations:
    "openservicemesh.io/sidecar-injection": "enabled"
    "openservicemesh.io/metrics": "enabled"
While this works fine when I add the namespace with kubectl, it seems to "go too quick" when scripted with helmfile.
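For reference, this is roughly the manual flow that behaves correctly for me (myNamespace is a placeholder for my real, lowercase namespace name):

# Create the namespace, then add the OSM label and annotations
# before deploying anything into it.
kubectl create namespace myNamespace
kubectl label namespace myNamespace openservicemesh.io/monitored-by=osm
kubectl annotate namespace myNamespace openservicemesh.io/sidecar-injection=enabled openservicemesh.io/metrics=enabled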
As a workaround for now, I have an inelegant pause. This is what I want to replace with something more reliable:
apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job
  namespace: myNamespace
spec:
  template:
    spec:
      containers:
        - name: node-app-job
          image: alpine
          command: ["sleep", "20"]
      restartPolicy: Never
My helmfile for this chart is then:
releases:
  - name: app-namespace
    chart: ./app-namespace
    hooks:
      - events: ["postsync"]
        showlogs: true
        command: "kubectl"
        args: ["wait", "--for=condition=complete", "job/sleep-job", "-n", "myNamespace"]
With the pause this works okay. If I remove the pause, it seems that some pods (step 4) don't have the sidecar and are unreachable. How do I replace the pause with something more reliable?
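To be concrete, I imagine replacing the sleep job with a hook that polls something deterministic, along these lines (using osm namespace list here is just a guess on my part; I don't know whether it actually guarantees the injector webhook is ready for the namespace):

# Hypothetical replacement for the sleep: poll until the osm CLI reports the
# namespace as part of the mesh (assuming that is a reliable signal).
until osm namespace list | grep -q "myNamespace"; do
  sleep 1
done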