Metric profile¶

The metric-collection feature is configured through a file pointed by the metrics-profile flag, which can point to a local path or URL of a YAML-formatted file containing a list of the Prometheus expressions. Kube-burner will perform those queries one by one, once all jobs are finished.

In a single job benchmark, the queries are executed using the benchmark start and end time as time range. In multiple job benchmarks, these queries are executed in a per job basis, and they take the different start and end times from the executed jobs.

The metrics profile file has the following structure:

- query: irate(process_cpu_seconds_total{job=~".*(crio|etcd|controller-manager|apiserver|scheduler).*"}[2m])
  metricName: controlPlaneCPU

- query: sum(irate(node_cpu_seconds_total[2m])) by (mode,instance)
  metricName: nodeCPU

The query field holds the Prometheus expression to evaluate, and metricName controls the value that kube-burner will set on the metricName field of the generated documents. This is useful to identify metrics from a specific query. More information is available in the metric format section.

Instant queries¶

In addition to the default range queries, kube-burner has the ability execute instant queries against the provided Prometheus API. This can be configured by enabling the field instant to the desired metric.

- query: kube_node_role
  metricName: nodeRoles
  instant: true

Info

When using instant queries, at least two documents are generated, one resulting from scraping the last timestamp of the job, which would have the configued metricName field and an another one resulting from scraping the first timestamp of the job, the metricName of document is appended the -start suffix.

Metric format¶

The collected metrics have the following shape:

[
  {
    "timestamp": "2021-06-23T11:50:15+02:00",
    "labels": {
      "instance": "ip-10-0-219-170.eu-west-3.compute.internal",
      "mode": "user"
    },
    "value": 0.3300880234732172,
    "uuid": "<UUID>",
    "query": "sum(irate(node_cpu_seconds_total[2m])) by (mode,instance) > 0",
    "metricName": "nodeCPU",
  },
  {
    "timestamp": "2021-06-23T11:50:45+02:00",
    "labels": {
      "instance": "ip-10-0-219-170.eu-west-3.compute.internal",
      "mode": "user"
    },
    "value": 0.31978102677038506,
    "uuid": "<UUID>",
    "query": "sum(irate(node_cpu_seconds_total[2m])) by (mode,instance) > 0",
    "metricName": "nodeCPU",
  }
]

Notice that kube-burner enriches the query results by adding some extra fields like uuid, query and metricName.

Info

These extra fields are especially useful at the time of identifying and representing the collected metrics.

Using the elapsed variable¶

There is a special go-template variable that can be used within the Prometheus expressions of a metric profile; the variable elapsed is automatically populated with the job duration, in seconds. This variable is especially useful in PromQL expressions using aggregations over time functions.

For example, the following expression gets the top 3 datapoints with the average CPU usage kubelets processes in the cluster.

- query: irate(process_cpu_seconds_total{service="kubelet",job="kubelet"}[2m]) * 100 and on (node) topk(3,avg_over_time(irate(process_cpu_seconds_total{service="kubelet",job="kubelet"}[2m])[{{ .elapsed }}:]))
  metricName: top3KubeletCPU
  instant: true

Info

Note that in the [time-range:] notation, the colon specifies to get the values for the given duration.

Examples of metrics profiles can be found in the examples directory. There are also Elasticsearch based Grafana dashboards available in the same examples directory.