Sunday, 20 August 2017

Using Sed and Aw to munge JSON

One of my friends had a requirement to pull apart from JSON data, stripping out specific "columns" of data, ideally using standard Unix/Linux commands, again ideally in one single command.

This is with what I came up :-)

So here's an example of the JSON data: -

davehay.json

{
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
}

and this is the desired output: -

Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com


and this is the script that I developed: -

sed -n '/familyName/,/eddress/p' davehay.json | sed 's/,//' | sed 's/"//g' | awk '{printf "%s\n", $2}'

The way this works is: -

  • The first sed command pulls out everything between the familyName and eddress columns
  • The second sed command strips out the comma ( , ) terminator
  • The third sed command strips out the double quotes ( " )
  • The awk command prints the second column, thus just the data, rather than the column header

If I wanted this: -

Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com


then I'd adapt the command: -

sed -n '/givenName/,/eddress/p' davehay.json | sed 's/,//' | sed 's/"//g' | awk '{printf "%s\n", $2}'

Like most things, I bet there're a million ways to achieve the same objective :-)

But this'll do for me :-)

No comments:

Note to self - use kubectl to query images in a pod or deployment

In both cases, we use JSON ... For a deployment, we can do this: - kubectl get deployment foobar --namespace snafu --output jsonpath="{...