Sunday, 20 August 2017

Using Sed and Aw to munge JSON

One of my friends had a requirement to pull apart from JSON data, stripping out specific "columns" of data, ideally using standard Unix/Linux commands, again ideally in one single command.

This is with what I came up :-)

So here's an example of the JSON data: -

davehay.json

{
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
"givenName": "Dave",
"familyName": "Hay",
"eddress": "david_hay@uk.ibm.com"
}

and this is the desired output: -

Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com
Hay
david_hay@uk.ibm.com


and this is the script that I developed: -

sed -n '/familyName/,/eddress/p' davehay.json | sed 's/,//' | sed 's/"//g' | awk '{printf "%s\n", $2}'

The way this works is: -

  • The first sed command pulls out everything between the familyName and eddress columns
  • The second sed command strips out the comma ( , ) terminator
  • The third sed command strips out the double quotes ( " )
  • The awk command prints the second column, thus just the data, rather than the column header

If I wanted this: -

Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com
Dave
Hay
david_hay@uk.ibm.com


then I'd adapt the command: -

sed -n '/givenName/,/eddress/p' davehay.json | sed 's/,//' | sed 's/"//g' | awk '{printf "%s\n", $2}'

Like most things, I bet there're a million ways to achieve the same objective :-)

But this'll do for me :-)

No comments:

Yay, VMware Fusion and macOS Big Sur - no longer "NAT good friends" - forgive the double negative and the terrible pun ...

After macOS 11 Big Sur was released in 2020, VMware updated their Fusion product to v12 and, sadly, managed to break Network Address Trans...