Monday, 11 February 2019

AWK - It's my "new" best friend forever ....

Nobody can accuse me of being late to the party ....

Well, OK, some *nix l33t can accuse me of being late to the party ....

So, whilst I've been tinkering with sed ( Stream Editor ) for years, albeit in a very very very minimal way, as per this example: -

sed -i'' "s/PidFile\ logs/PidFile\ ${Product}\/logs/g" /opt/ibm/HTTPServer/${Product}/conf/httpd.conf

I have mainly managed to avoid using AWK for no particular reason other than lack of need.
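As an aside on that sed example, the backslash-escaped slashes can be avoided by picking a different delimiter for the s command. This is only a minimal sketch: the value of Product and the sample input line are made up for illustration, and it writes to standard output rather than editing httpd.conf in place.

```shell
# Product is a hypothetical value; the real one depends on your installation.
Product="foo"

# Hypothetical line of the kind found in httpd.conf.
line='PidFile logs/httpd.pid'

# Using | as the delimiter for the s command means the / in the path
# needs no escaping, and neither do the spaces.
result=$(printf '%s\n' "$line" | sed "s|PidFile logs|PidFile ${Product}/logs|g")
printf '%s\n' "$result"
```

sed treats whichever character follows the s as the delimiter, so this should behave the same as the escaped version above.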

However, recently, I've wanted to grab specific columns from output such as: -

docker ps -a

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                  PORTS               NAMES
51e1af565b7e        ibmcom/ace:latest   "runaceserver"      5 days ago          Exited (0) 4 days ago                       ace
3d8899de8f32        hello-world         "/hello"            5 days ago          Exited (0) 5 days ago                       happy_colden

where I just want the CONTAINER ID column: -

docker ps -a | awk '{print $1}'

which gives me this: -

CONTAINER
51e1af565b7e
3d8899de8f32

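where awk is only printing the first column using print $1. Note that awk splits each line on whitespace by default, which is why the header comes out as CONTAINER rather than CONTAINER ID.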
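If the multi-word columns matter, one option is to have awk split on runs of two or more spaces instead, using the -F option. The sketch below runs against a copy of the docker ps -a output shown earlier, held in a variable so that it works without Docker; the variable names are just for illustration.

```shell
# Copy of the docker ps -a output from above, so the example runs anywhere.
sample='CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                  PORTS               NAMES
51e1af565b7e        ibmcom/ace:latest   "runaceserver"      5 days ago          Exited (0) 4 days ago                       ace
3d8899de8f32        hello-world         "/hello"            5 days ago          Exited (0) 5 days ago                       happy_colden'

# -F'  +' sets the field separator to the regular expression
# "two or more spaces", so single spaces stay inside a field.
header=$(printf '%s\n' "$sample" | awk -F'  +' 'NR == 1 {print $1}')
printf '%s\n' "$header"

# Print the container ID and the full STATUS text for each container.
status=$(printf '%s\n' "$sample" | awk -F'  +' 'NR > 1 {print $1 ": " $5}')
printf '%s\n' "$status"
```

One caveat: an empty column such as PORTS simply vanishes from the data rows, so NAMES is field 6 there even though it is field 7 in the header.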

Better still: -

docker ps -a | sed 1d | awk '{print $1}'

51e1af565b7e
3d8899de8f32

which uses sed 1d to delete the first row of the output from docker ps -a and then lets awk do its thing.
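For what it's worth, awk can also skip the header row by itself, using its built-in NR (record number) variable. A minimal sketch, again using a cut-down copy of the sample output rather than a live docker command:

```shell
# First two columns of the docker ps -a output from above.
sample='CONTAINER ID        IMAGE
51e1af565b7e        ibmcom/ace:latest
3d8899de8f32        hello-world'

# NR is the number of the line awk is currently reading,
# so NR > 1 runs the print for every line except the header.
ids=$(printf '%s\n' "$sample" | awk 'NR > 1 {print $1}')
printf '%s\n' "$ids"
```

Against a real Docker host, the equivalent would be docker ps -a | awk 'NR > 1 {print $1}'.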

Similarly, I wanted to grab a specific attribute ( cgroup ) from the output of the ps -elf process listing.

Ordinarily, this command would return a whole slew of columns: -

F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root         1     0  0  80   0 -  3158 ep_pol 10:44 ?        00:00:00 systemd --no-pager
...

So I want to lose the first row and only print column 4 ( PID ) to get a list of process IDs, and then run ps -o cgroup against the resulting list.

This is how I did it: -

ps -o cgroup `ps -elf | sed 1d | awk '{print $4}'`

which did the job: -

CGROUP
14:name=systemd:/init.scope,0::/init.scope
14:name=systemd:/system.slice/systemd-journald.service,12:pids:/system.slice/systemd-journald.service,6:devices:/system.slice/systemd-journald.service,5:memory:/system.slice/systemd-journald.service,4:blkio:/syst
...

I could've then put the resulting output through another sed filter to remove the CGROUP column header: -

ps -o cgroup `ps -elf | sed 1d | awk '{print $4}'` | sed 1d

which is nice.

Note that, in the above example, I'm running the output of ps -elf | sed 1d | awk '{print $4}' as input into ps -o cgroup using the back-tick symbol ( ` ), which normally gets garbled when one copies from browser to terminal session.
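One way around the back-tick problem is the $( ... ) form of command substitution, which does the same job and tends to survive copy and paste rather better. The sketch below uses sample ps -elf style text in a variable so that it runs anywhere; the second process row ( PID 42 ) is invented purely for illustration.

```shell
# Sample ps -elf style output; the PID 42 row is hypothetical.
sample='F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 S root         1     0  0  80   0 -  3158 ep_pol 10:44 ?        00:00:00 systemd --no-pager
4 S root        42     1  0  80   0 -  1234 ep_pol 10:44 ?        00:00:00 example'

# $( ... ) captures the output of the pipeline, just as back-ticks do.
pids=$(printf '%s\n' "$sample" | sed 1d | awk '{print $4}')

# Left unquoted on purpose, so the shell splits the list into
# separate arguments, which is what ps expects.
echo $pids

# On a real system the full command would be:
#   ps -o cgroup $(ps -elf | sed 1d | awk '{print $4}')
```

The $( ... ) form also nests without any extra escaping, which back-ticks do not.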

For reference, and I had to Google for this, here's the reason for the name AWK: -

...
Aho is also widely known for his co-authorship of the AWK programming language with Peter J. Weinberger and Brian Kernighan (the "A" stands for "Aho").
....


So, we have Aho + Weinberger + Kernighan which gives us AWK :-)

