Exploring long JSON files with jq

The JSON file format is used for marshalling data in lots of different applications. If you are new to an application and don’t know its data, it can be hard to visually parse the JSON and understand what you are seeing. The jq command-line utility makes it easier to narrow in on one section of the file. This is a starting point.

Kubelet, the daemon that runs on a Kubernetes node, has a web API for returning stats. To query it from that node:

curl -k https://localhost:10250/stats/

However, the response is several thousand lines long. The first few lines look like this:

$ curl -sk https://localhost:10250/stats/ | head 
{
 "name": "/",
 "subcontainers": [
 {
 "name": "/machine.slice"
 },
 {
 "name": "/system.slice"
 },
 {

Since the top-level JSON construct is a dictionary (an object, in JSON terms), we can use jq's keys function to enumerate just the keys.

$ curl -sk https://localhost:10250/stats/ | jq keys
[
 "name",
 "spec",
 "stats",
 "subcontainers"
]
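
Once the key names are known, it can help to see what kind of value sits behind each one before drilling in. The following is a minimal sketch that runs against a small, made-up stand-in for the kubelet response (the sample variable is an assumption, not real output); it pairs keys with jq's map_values and type built-ins.

```shell
# Hypothetical, trimmed-down stand-in for the kubelet response.
sample='{"name":"/","spec":{},"stats":[],"subcontainers":[{"name":"/machine.slice"}]}'

# keys lists the top-level names in sorted order.
top_keys=$(printf '%s' "$sample" | jq -c 'keys')

# map_values(type) replaces each top-level value with the name of its JSON type.
top_types=$(printf '%s' "$sample" | jq -c 'map_values(type)')

echo "$top_keys"
echo "$top_types"
```

Knowing that stats and subcontainers are arrays while spec is an object tells you which keys to index into and which to call keys on again.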

To view the subcontainers, use that key:

$ curl -sk https://localhost:10250/stats/ | jq .subcontainers
[
 {
 "name": "/machine.slice"
 },
 {
 "name": "/system.slice"
 },
 {
 "name": "/user.slice"
 }
]

The stats key returns an array:

$ curl -sk https://localhost:10250/stats/ | jq .stats | head
[
 {
 "timestamp": "2017-01-12T13:23:45.301168504Z",
 "cpu": {
 "usage": {
 "total": 420399104294,
 "per_cpu_usage": [
 202178115170,
 218220989124
 ],

How long is it? Use the length function. Note that jq functions are piped one into the next.

$ curl -sk https://localhost:10250/stats/ | jq ".stats | length"
9
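
The length function is not limited to arrays: on an object it counts keys, and on a string it counts characters. A small sketch, again using an invented sample document rather than real kubelet output:

```shell
# Hypothetical sample: four top-level keys, three (empty) stats entries.
sample='{"name":"/","spec":{},"stats":[{},{},{}],"subcontainers":[]}'

stats_len=$(printf '%s' "$sample" | jq '.stats | length')  # array: number of elements
top_len=$(printf '%s' "$sample" | jq 'length')             # object: number of keys
name_len=$(printf '%s' "$sample" | jq '.name | length')    # string: number of characters

echo "$stats_len $top_len $name_len"
```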

Want to see the keys of an element? Index into the array:

$ curl -sk https://localhost:10250/stats/ | jq ".stats[0] | keys"
[
 "cpu",
 "diskio",
 "filesystem",
 "memory",
 "network",
 "task_stats",
 "timestamp"
]

To see a subelement, use the pipe format. For example, to see the timestamp of the first element:

$ curl -sk https://localhost:10250/stats/ | jq ".stats[0] | .timestamp"
"2017-01-12T13:29:16.162797308Z"

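The pipe form is not the only way to reach a subelement: jq also accepts chained paths, which reads more compactly once the structure is known. This sketch compares the two on a made-up single-entry sample (the sample variable is an assumption; the field names follow the output shown earlier).

```shell
# Hypothetical one-element stats array, shaped like the output above.
sample='{"stats":[{"timestamp":"2017-01-12T13:29:16.162797308Z","cpu":{"usage":{"total":420399104294}}}]}'

# Pipe form and chained-path form select the same value.
piped=$(printf '%s' "$sample" | jq '.stats[0] | .timestamp')
dotted=$(printf '%s' "$sample" | jq '.stats[0].timestamp')

# Chained paths can go several levels deep.
cpu_total=$(printf '%s' "$sample" | jq '.stats[0].cpu.usage.total')

echo "$piped"
echo "$dotted"
echo "$cpu_total"
```

The pipe form stays useful while exploring, since you can swap the last stage for keys or length without rewriting the path.
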
To see a value for all elements, remove the index from the array. Again, use the pipe notation:

$ curl -sk https://localhost:10250/stats/ | jq ".stats[] | .timestamp"
"2017-01-12T13:32:13.732338602Z"
"2017-01-12T13:32:25.713656307Z"
"2017-01-12T13:32:43.443936137Z"
"2017-01-12T13:33:02.796007138Z"
"2017-01-12T13:33:14.53537449Z"
"2017-01-12T13:33:32.540031699Z"
"2017-01-12T13:33:42.732536856Z"
"2017-01-12T13:33:53.235774027Z"
"2017-01-12T13:34:10.351984713Z"

This shows that the last element of the array is the latest. Use an index of -1 to reference it:

$ curl -sk https://localhost:10250/stats/ | jq ".stats[-1] | .timestamp"
"2017-01-12T13:33:53.235774027Z"

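When the value is going to be used in a script, the surrounding quotes get in the way. A minimal sketch, assuming a two-entry sample in place of the real response: the -r flag prints strings raw, and the last built-in is an alternative to the -1 index.

```shell
# Hypothetical two-entry stats array; the newest entry is at the end.
sample='{"stats":[{"timestamp":"2017-01-12T13:33:53.235774027Z"},{"timestamp":"2017-01-12T13:34:10.351984713Z"}]}'

# -r strips the JSON quotes from string output.
by_index=$(printf '%s' "$sample" | jq -r '.stats[-1].timestamp')

# last picks the final element of an array, same as [-1].
by_last=$(printf '%s' "$sample" | jq -r '.stats | last | .timestamp')

echo "$by_index"
echo "$by_last"
```
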
Edit: added below.

To find an element of a list based on the value of a key, or the value of a subelement, use the pipe notation inside the argument to select. I use a slightly different curl query here; note the summary path at the end of the URL. I want to get the pod entry whose name contains part of a particular pod name.

curl -sk https://localhost:10250/stats/summary | jq '.pods[] | select(.podRef | .name | contains("virt-launcher-testvm"))'
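
To try the select pattern without a running kubelet, here is a sketch against an invented two-pod document. The pod names and the namespace field are hypothetical; only the pods, podRef and name keys come from the query above. It adds one more pipe stage so that only the matching name is printed instead of the whole pod entry.

```shell
# Hypothetical summary document with two pods.
sample='{"pods":[{"podRef":{"name":"virt-launcher-testvm-abc12","namespace":"default"}},{"podRef":{"name":"kube-proxy-xyz89","namespace":"kube-system"}}]}'

# select keeps only the pods whose name contains the substring;
# the final stage reduces each surviving entry to its name.
match=$(printf '%s' "$sample" |
  jq -r '.pods[] | select(.podRef.name | contains("virt-launcher-testvm")) | .podRef.name')

echo "$match"
```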
