Why is skaffold failing when rebuilding containers in skaffold dev?


I'm trying to learn more about Kubernetes and related tools, and this time I'm learning how to set up a local dev environment using Skaffold. However, I'm running into errors during skaffold dev that happen only on rebuild, not on the initial build. I could use some tips on where to look and how to troubleshoot this.

I am using k3s for a local cluster.

Here are my Kubernetes files:

# deployments/dev/mepipe-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mepipe-videos-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mepipe-videos
  template:
    metadata:
      labels:
        app: mepipe-videos
    spec:
      containers:
        - name: mepipe
          image: registry.local:5000/mepipe-videos:v1
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8000
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
              scheme: HTTP
            initialDelaySeconds: 5
            periodSeconds: 15
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /readiness
              port: 8000
              scheme: HTTP
            initialDelaySeconds: 5
            timeoutSeconds: 1

# deployments/dev/mepipe-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: mepipe-videos-service
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 8000
  selector:
    app: mepipe-videos

# deployments/dev/traefik-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mepipe-ingress
  annotations:
    kubernetes.io/ingress.class: "traefik"
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mepipe-videos-service
                port:
                  number: 80

This is my skaffold.yaml.

apiVersion: skaffold/v2beta9
kind: Config
metadata:
  name: mepipe
build:
  artifacts:
  - image: registry.local:5000/mepipe-videos
    sync:
      infer:
        - 'cmd/**/*.go'
        - 'pkg/**/*.go'
        - 'go.mod'
        - 'go.sum'

deploy:
  kubectl:
    manifests:
    - deployments/dev/mepipe-deployment.yaml
    - deployments/dev/mepipe-service.yaml
    - deployments/dev/traefik-ingress.yaml
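
For context, the inferred sync rules above only take effect for files that the Dockerfile copies into the image; Skaffold works the destination paths out from the COPY/ADD instructions, and anything else triggers a full rebuild. A minimal sketch of a Dockerfile compatible with those rules (illustrative only, not the exact file from the repo; the ./cmd/mepipe path is a guess):

# Illustrative Dockerfile sketch -- not the repo's actual file.
FROM golang:1.15
WORKDIR /app
COPY go.mod go.sum ./   # matches the go.mod / go.sum sync rules
RUN go mod download
COPY cmd/ cmd/          # matches cmd/**/*.go
COPY pkg/ pkg/          # matches pkg/**/*.go
RUN go build -o /mepipe ./cmd/mepipe   # hypothetical main package path
EXPOSE 8000
CMD ["/mepipe"]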

When I run skaffold dev for the first time, this is the result.

[screenshot: skaffold dev builds the image, deploys, and tails the application logs, including "the health check changed"]

As you can see, it runs as it should ("the health check changed" is the expected log line). It logs a message whenever Kubernetes hits the /health endpoint (there are no logs for readiness). I can also hit /health, /readiness, and the other endpoints from my local machine.

However, if I change any part of my codebase (for example, changing the message to "health check really changed"), I get this.

[screenshot: skaffold dev rebuilds and redeploys after the change, then hangs with no further log output]

This hangs forever. If I run kubectl get pods, I can see that all of the pods are new and freshly rebuilt. If I grab the logs from one of them, I get the expected output. I would expect Skaffold to rebuild the images and keep tailing the logs, but it doesn't.
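
For reference, these are roughly the commands I'm using to check on the fresh pods (logging against the deployment just picks one of its pods):

kubectl get pods
kubectl logs -f deployment/mepipe-videos-deployment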

Any ideas why? Where would I look to find out what kind of error happened?

EDIT: Here is a sample repository that I'm using to test this: https://github.com/galkowskit/k8s-skaffold-example

When running skaffold dev on a freshly set-up k3s cluster, this is the output I get:

Starting deploy...
 - deployment.apps/mepipe-videos-deployment created
 - service/mepipe-videos-service created
 - ingress.networking.k8s.io/mepipe-ingress created
Waiting for deployments to stabilize...
 - deployment/mepipe-videos-deployment: FailedMount: MountVolume.SetUp failed for volume "default-token-r7bv7" : failed to sync secret cache: timed out waiting for the condition
    - pod/mepipe-videos-deployment-86c74fb58b-qkg6n: FailedMount: MountVolume.SetUp failed for volume "default-token-r7bv7" : failed to sync secret cache: timed out waiting for the condition
    - pod/mepipe-videos-deployment-86c74fb58b-k98gt: FailedMount: MountVolume.SetUp failed for volume "default-token-r7bv7" : failed to sync secret cache: timed out waiting for the condition
 - deployment/mepipe-videos-deployment: FailedMount: MountVolume.SetUp failed for volume "default-token-r7bv7" : failed to sync secret cache: timed out waiting for the condition
    - pod/mepipe-videos-deployment-86c74fb58b-qkg6n: FailedMount: MountVolume.SetUp failed for volume "default-token-r7bv7" : failed to sync secret cache: timed out waiting for the condition
    - pod/mepipe-videos-deployment-86c74fb58b-k98gt: FailedMount: MountVolume.SetUp failed for volume "default-token-r7bv7" : failed to sync secret cache: timed out waiting for the condition
 - deployment/mepipe-videos-deployment is ready.
Deployments stabilized in 16.200469338s
Press Ctrl+C to exit
Watching for changes...

It looks like there are some errors with secrets. I feel a bit lost about how to troubleshoot this. I was basing my setup on this blog post: https://devopsspiral.com/articles/k8s/k3d-skaffold/.
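
In case it's useful, the cluster events around the FailedMount messages can be pulled with the following (the pod name is taken from the output above):

kubectl get events --sort-by=.metadata.creationTimestamp
kubectl describe pod mepipe-videos-deployment-86c74fb58b-qkg6n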


1 Answer


Interesting! The fact that the container logs aren't available suggests that container creation is failing. I think you're able to subsequently use kubectl get pods because the deployment keeps trying to recreate the pods and eventually succeeds; because the initial deployment failed, Skaffold may not be looking for logs. I'm not sure how you get the logs out of k3s, but there may be more detail there. You can also try kubectl describe deployment/mepipe-videos-deployment.
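
Something along these lines should surface the underlying events (a sketch; pod names and label selectors depend on your manifests):

# Look for rollout errors recorded on the deployment
kubectl describe deployment/mepipe-videos-deployment

# Drill into the failing pods for events such as FailedMount
kubectl describe pod -l app=mepipe-videos

# Re-run Skaffold with debug-level logging to see what it is waiting on
skaffold dev -v debug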

Would it be possible to share this project? It would be useful for us to have an example for improving this diagnostic output.

One issue I see here is that the image reference image: registry.local:5000/mepipe-videos:v1 in your deployment.yaml won't work: it needs to match the image in your skaffold.yaml (registry.local:5000/mepipe-videos — note the absence of the :v1 tag) so that Skaffold can rewrite it to the image it just built. But Skaffold should have warned that this image was not referenced anywhere, so maybe you fixed that already.
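
Concretely, the container spec in your deployment would become something like this; at deploy time Skaffold then replaces the untagged reference with the freshly built, uniquely tagged image:

containers:
  - name: mepipe
    # no :v1 tag -- must match the artifact name in skaffold.yaml
    image: registry.local:5000/mepipe-videos
    imagePullPolicy: IfNotPresent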