Resource monitoring for Kubernetes Pods


I am using the kubernetes-client Java library for the K8s REST API. I want to explore the resource monitoring feature described here: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/

I set the resources for the Pods while creating a Deployment like this:

    // ******************* RESOURCES *********************

    // Note: a bare "400" is interpreted as 400 bytes; a unit suffix such as
    // "400Mi" (400 mebibytes) is usually what is intended.
    Quantity memLimit = new Quantity();
    memLimit.setAmount("400");
    Map<String, Quantity> memMap = new HashMap<String, Quantity>();
    memMap.put("memory", memLimit);
    // withRequests(...) sets the request; withLimits(...) would set a hard limit.
    ResourceRequirements resourceRequirements = new ResourceRequirementsBuilder()
        .withRequests(memMap)
        .build();

    // ******************* DEPLOYMENT *********************
    Deployment deployment = new DeploymentBuilder()
        .withNewMetadata()
        .withName("first-deployment")
        .endMetadata()
        .withNewSpec()
        .withReplicas(3)
        .withNewTemplate()
        .withNewMetadata()
        .addToLabels(namespaceID, "hello-world-example")
        .endMetadata()
        .withNewSpec()
        .addNewContainer()      
        .withName("nginx-one")
        .withImage("nginx")
        .addNewPort()
        .withContainerPort(80)
        .endPort()
        .withResources(resourceRequirements)
        .endContainer()
        .endSpec()
        .endTemplate()
        .endSpec()
        .build();
    deployment = client.extensions().deployments().inNamespace(namespace).create(deployment);
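
For reference, the allocated side (the requests set above) can be read back from the created pods; a minimal sketch, assuming the same client and namespace variables as above:

    import java.util.Map;

    import io.fabric8.kubernetes.api.model.Pod;
    import io.fabric8.kubernetes.api.model.Quantity;

    // Print the memory each container *requested* (this is the allocation,
    // not the live usage; usage is not part of the pod spec).
    for (Pod pod : client.pods().inNamespace(namespace).list().getItems()) {
        pod.getSpec().getContainers().forEach(c -> {
            Map<String, Quantity> requests = c.getResources().getRequests();
            System.out.println(pod.getMetadata().getName() + "/" + c.getName()
                + " requested memory: "
                + (requests == null ? "none" : requests.get("memory")));
        });
    }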

How do I now know how much memory is being used out of the memory allocated to the pods? The documentation says it's part of the pod status, but the pod status is of the form

    (conditions=[PodCondition(lastProbeTime=null, lastTransitionTime=2018-01-09T15:53:28Z,
        message=null, reason=null, status=True, type=PodScheduled, additionalProperties={})],
     containerStatuses=[], hostIP=null, initContainerStatuses=[], message=null,
     phase=Pending, podIP=null, qosClass=Burstable, reason=null, startTime=null,
     additionalProperties={})

and the container status is of the form

    (containerID=null, image=nginx, imageID=,
     lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}),
     name=nginx-one, ready=false, restartCount=0,
     state=ContainerState(running=null, terminated=null,
         waiting=ContainerStateWaiting(message=null, reason=ContainerCreating, additionalProperties={}),
         additionalProperties={}),
     additionalProperties={})

Is there an example for monitoring resources on Pods?

There are 3 answers below.

Answer 1:

I know this question is two years old, but the existing answers don't actually answer it.

To get CPU and memory utilization you need the Kubernetes Metrics Server installed in your cluster (see also the official Helm chart if you use Helm). Once the Metrics Server is installed, kubectl can report live usage: kubectl top pods -A lists every pod with its current CPU and memory usage, and kubectl top nodes does the same for each node. The Kubernetes Dashboard also shows CPU and memory utilization once the Metrics Server is running, and kubectl describe node summarizes the requests and limits scheduled onto each node.

To answer your specific question about fabric8: once the Metrics Server is running, you can obtain node-level CPU and memory usage with the following code:

import io.fabric8.kubernetes.api.model.metrics.v1beta1.NodeMetrics;
import io.fabric8.kubernetes.api.model.metrics.v1beta1.NodeMetricsList;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

KubernetesClient k8s = new KubernetesClientBuilder().build();
// One NodeMetrics item per node, served by the Metrics Server.
NodeMetricsList nodeMetricsList = k8s.top().nodes().metrics();
for (NodeMetrics nodeMetrics : nodeMetricsList.getItems()) {
    logger.info("{} {} {}",
        nodeMetrics.getMetadata().getName(),
        nodeMetrics.getUsage().get("cpu"),
        nodeMetrics.getUsage().get("memory")
    );
}
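
Since the question is about pods, the same metrics API also exposes per-pod usage. A minimal sketch along the same lines; the "default" namespace and the logger are just placeholders:

import io.fabric8.kubernetes.api.model.metrics.v1beta1.ContainerMetrics;
import io.fabric8.kubernetes.api.model.metrics.v1beta1.PodMetrics;
import io.fabric8.kubernetes.api.model.metrics.v1beta1.PodMetricsList;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;

KubernetesClient k8s = new KubernetesClientBuilder().build();
// Current usage for every pod in the "default" namespace.
PodMetricsList podMetricsList = k8s.top().pods().metrics("default");
for (PodMetrics podMetrics : podMetricsList.getItems()) {
    for (ContainerMetrics container : podMetrics.getContainers()) {
        logger.info("{}/{} cpu={} memory={}",
            podMetrics.getMetadata().getName(),
            container.getName(),
            container.getUsage().get("cpu"),
            container.getUsage().get("memory")
        );
    }
}

Comparing these usage figures against the requests you set on the Deployment gives you "used out of allocated".
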
Answer 2:

Take an hour and watch the video Load Testing Kubernetes: How to Optimize Your Cluster Resource Allocation in Production, which walks through several techniques and recommendations for sizing your resource configuration based on load testing. The example in the video leverages cAdvisor, so once your Pod/container is up and running you can use that mechanism to capture at least a basic view of how many resources your container is consuming.
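
cAdvisor is built into the kubelet, and its Prometheus-format metrics can be read through the API server's node proxy. A minimal sketch in plain Java; the API server address, node name, and bearer token are placeholders, and TLS trust setup is deliberately elided:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Placeholders: substitute your API server address, a node name, and a
// service-account token that is allowed to proxy to nodes.
String apiServer = "https://my-apiserver:6443";
String node = "my-node";
String token = System.getenv("K8S_BEARER_TOKEN");

// The kubelet's cAdvisor endpoint, reached via the API server proxy.
HttpRequest request = HttpRequest.newBuilder()
    .uri(URI.create(apiServer + "/api/v1/nodes/" + node + "/proxy/metrics/cadvisor"))
    .header("Authorization", "Bearer " + token)
    .build();

// A real client must trust the API server's CA certificate; that
// SSLContext setup is omitted here for brevity.
HttpResponse<String> response = HttpClient.newHttpClient()
    .send(request, HttpResponse.BodyHandlers.ofString());

// The body is Prometheus text format; pick out per-container memory lines.
response.body().lines()
    .filter(line -> line.startsWith("container_memory_working_set_bytes"))
    .forEach(System.out::println);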

Answer 3:

I am not sure whether the k8s API server provides an endpoint for performance-related metrics, but with fabric8 alone you will not be able to monitor resource consumption, even when the Pod is in the Running state.

Here is the Pod response JSON; note that the status section carries conditions and container state, but no usage numbers:

{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "nginx-41cbe3-10-json-9cc655bcc-w576m",
    "generateName": "nginx-41cbe3-10-json-9cc655bcc-",
    "namespace": "default",
    "selfLink": "/api/v1/namespaces/default/pods/nginx-41cbe3-10-json-9cc655bcc-w576m",
    "uid": "e14a955f-18b7-11e8-a642-42010a800090",
    "resourceVersion": "12765988",
    "creationTimestamp": "2018-02-23T16:37:47Z",
    "labels": {
      "app": "nginx",
      "cliqr": "99911519403865240",
      "pod-template-hash": "577211677"
    },
    "annotations": {
      "kubernetes.io/created-by": "{\"kind\":\"SerializedReference\",\"apiVersion\":\"v1\",\"reference\":{\"kind\":\"ReplicaSet\",\"namespace\":\"default\",\"name\":\"nginx-41cbe3-10-json-9cc655bcc\",\"uid\":\"e1493bd0-18b7-11e8-a642-42010a800090\",\"apiVersion\":\"extensions\",\"resourceVersion\":\"12765971\"}}\n",
      "kubernetes.io/limit-ranger": "LimitRanger plugin set: cpu request for container nginx"
    },
    "ownerReferences": [
      {
        "apiVersion": "extensions/v1beta1",
        "kind": "ReplicaSet",
        "name": "nginx-41cbe3-10-json-9cc655bcc",
        "uid": "e1493bd0-18b7-11e8-a642-42010a800090",
        "controller": true,
        "blockOwnerDeletion": true
      }
    ]
  },
  "spec": {
    "volumes": [
      {
        "name": "default-token-zrhj5",
        "secret": {
          "secretName": "default-token-zrhj5",
          "defaultMode": 420
        }
      }
    ],
    "containers": [
      {
        "name": "nginx",
        "image": "nginx:latest",
        "ports": [
          {
            "containerPort": 80,
            "protocol": "TCP"
          }
        ],
        "resources": {
          "requests": {
            "cpu": "100m"
          }
        },
        "volumeMounts": [
          {
            "name": "default-token-zrhj5",
            "readOnly": true,
            "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount"
          }
        ],
        "terminationMessagePath": "/dev/termination-log",
        "terminationMessagePolicy": "File",
        "imagePullPolicy": "Always"
      }
    ],
    "restartPolicy": "Always",
    "terminationGracePeriodSeconds": 30,
    "dnsPolicy": "ClusterFirst",
    "serviceAccountName": "default",
    "serviceAccount": "default",
    "nodeName": "gke-rishi-k8-cluster-default-pool-6ca1467e-xtmw",
    "securityContext": {},
    "schedulerName": "default-scheduler",
    "tolerations": [
      {
        "key": "node.alpha.kubernetes.io/notReady",
        "operator": "Exists",
        "effect": "NoExecute",
        "tolerationSeconds": 300
      },
      {
        "key": "node.alpha.kubernetes.io/unreachable",
        "operator": "Exists",
        "effect": "NoExecute",
        "tolerationSeconds": 300
      }
    ]
  },
  "status": {
    "phase": "Running",
    "conditions": [
      {
        "type": "Initialized",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:47Z"
      },
      {
        "type": "Ready",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:53Z"
      },
      {
        "type": "PodScheduled",
        "status": "True",
        "lastProbeTime": null,
        "lastTransitionTime": "2018-02-23T16:37:47Z"
      }
    ],
    "hostIP": "10.240.0.23",
    "podIP": "10.20.3.164",
    "startTime": "2018-02-23T16:37:47Z",
    "containerStatuses": [
      {
        "name": "nginx",
        "state": {
          "running": {
            "startedAt": "2018-02-23T16:37:52Z"
          }
        },
        "lastState": {},
        "ready": true,
        "restartCount": 0,
        "image": "nginx:latest",
        "imageID": "docker-pullable://nginx@sha256:600bff7fb36d7992512f8c07abd50aac08db8f17c94e3c83e47d53435a1a6f7c",
        "containerID": "docker://2c227a901bcde4705c5b79aedf1963079dfb345fae5849616d29e8cc7af0fd74"
      }
    ],
    "qosClass": "Burstable"
  }
}