Tuesday, 29 January 2019

Helm and "Error: trying to send message larger than max (23014173 vs. 20971520)"

I've been seeing this: -

Error: trying to send message larger than max (23014173 vs. 20971520)

when running: -

helm list --tls

against my IBM Cloud Private 3.1.1 environment.

This had been working until I rebuilt my IBM Cloud Automation Manager (CAM) 3.1.0 environment last week - which MAY be coincidence :-)

I'm using Helm / Tiller v2.9.1, as per this: -

helm version --tls

Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

This appears to be the subject of a large number of issues on GitHub, but with no obvious fix.

The best ( ! ) workaround is to limit the number of results that the command returns, using the --max XXX switch, as per this example: -

helm list --tls --max 100
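
As for why --max helps: the two numbers in the error look like byte counts, i.e. the size of the response that Tiller tried to send against a cap of 20 MiB. That reading is an assumption based on the arithmetic rather than anything the error itself spells out, but it would explain why returning fewer releases gets under the limit. A quick sketch of the sums: -

```shell
# Values copied from the error message; both are assumed to be byte counts.
attempted=23014173
limit=20971520

# The limit works out to exactly 20 MiB.
echo "limit in MiB: $(( limit / 1024 / 1024 ))"

# How far over the limit the response was, in bytes.
echo "over by: $(( attempted - limit )) bytes"
```

So the full listing was only about 2 MB over the cap, which fits with a modest clean-up being enough to bring it back under.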

Through trial and error, I realised that a number of my Helm releases were FAILED, as per this: -

NAME                   REVISION UPDATED                  STATUS   CHART                            NAMESPACE
agile-tiger            1        Fri Jan 25 14:25:24 2019 FAILED   ibm-cam-3.1.0                    cert-manager
audit-logging          1        Tue Jan  1 17:53:11 2019 DEPLOYED audit-logging-3.1.1              kube-system
auth-apikeys           1        Tue Jan  1 17:44:00 2019 DEPLOYED auth-apikeys-3.1.1               kube-system
auth-idp               1        Tue Jan  1 17:43:52 2019 DEPLOYED auth-idp-3.1.1                   kube-system
auth-pap               1        Tue Jan  1 17:44:09 2019 DEPLOYED auth-pap-3.1.1                   kube-system
auth-pdp               1        Tue Jan  1 17:44:17 2019 DEPLOYED auth-pdp-3.1.1                   kube-system
broken-narwhal         1        Fri Jan 25 14:12:45 2019 FAILED   ibm-cam-3.1.0                    cert-manager
calico                 1        Tue Jan  1 17:39:56 2019 DEPLOYED calico-3.1.1                     kube-system
catalog-ui             1        Tue Jan  1 17:50:55 2019 DEPLOYED icp-catalog-chart-3.1.1          kube-system
cert-manager           1        Tue Jan  1 17:41:17 2019 DEPLOYED ibm-cert-manager-3.1.1           cert-manager
custom-metrics-adapter 1        Tue Jan  1 17:52:12 2019 DEPLOYED ibm-custom-metrics-adapter-3.1.1 kube-system
handy-hummingbird      1        Fri Jan 25 13:46:22 2019 FAILED   ibm-cam-3.1.0                    cert-manager
harping-abalone        1        Fri Jan 25 14:08:24 2019 FAILED   ibm-cam-3.1.0                    cert-manager
heapster               1        Tue Jan  1 17:50:18 2019 DEPLOYED heapster-3.1.1                   kube-system
helm-api               1        Tue Jan  1 17:51:06 2019 DEPLOYED helm-api-3.1.1                   kube-system

Working on the assumption (!) that this may be part of the problem, i.e. that the helm list command was simply choking on the number of releases, I did a little bit of purging, ending up with a command like this: -

helm delete --purge --tls $(helm list --tls --max 100 | grep FAILED | awk '{print $1}')
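
Since helm delete --purge is destructive, it may be worth checking which names the pipeline selects before running it. The sketch below is one way to do that. Note that failed_releases is a hypothetical helper name, and it assumes the column layout shown in the listing above, where the UPDATED date splits into five whitespace-separated fields so that STATUS ends up as field 8: -

```shell
# Hypothetical helper: read `helm list` output on stdin and print the NAME of
# every release whose STATUS column is exactly FAILED.
# Assumes the layout shown in this post: NAME, REVISION, a five-field UPDATED
# date, then STATUS, so STATUS is whitespace-separated field 8.
failed_releases() {
  awk '$8 == "FAILED" { print $1 }'
}

# Dry run against two rows copied from the listing above (no cluster needed).
# Prints only: agile-tiger
failed_releases <<'EOF'
agile-tiger            1        Fri Jan 25 14:25:24 2019 FAILED   ibm-cam-3.1.0                    cert-manager
audit-logging          1        Tue Jan  1 17:53:11 2019 DEPLOYED audit-logging-3.1.1              kube-system
EOF

# Against a real cluster, review the names first, then purge them:
#   helm list --tls --max 100 | failed_releases
#   helm delete --purge --tls $(helm list --tls --max 100 | failed_releases)
```

Matching on the STATUS field, rather than grepping the whole line, avoids picking up a release whose name or chart happens to contain the string FAILED.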

Again, through trial and error, I ended up with NO failed releases and, even better, a working helm list --tls command :-)

I no longer seem to need to specify the --max switch ....

... which is nice

One other thing - the problem also seemed to affect the Workloads -> Helm Releases page of the ICP UI. Having got rid of the FAILED releases, that also now works.
