So this is definitely a work-in-progress but I may have resolved an issue that I was seeing with a newly created OpenShift Container Platform (OCP) cluster.
TL;DR; the command ic cs cluster ls showed my cluster state as warning and never as ready.
When I inspected the cluster using ic cs cluster get --cluster $cluster_name I saw: -
Ingress Subdomain: - †
Ingress Secret: - †
Ingress Status: -
Ingress Message: -
and: -
† Your Ingress subdomain and secret might not be ready yet. For more info by cluster type, see 'https://ibm.biz/ingress-sub' for Kubernetes or 'https://ibm.biz/ingress-sub-ocp' for OpenShift.
and, after a while, this: -
Ingress Message: Could not upload certificates to Certificate Manager instance. Ensure you have the correct IAM permissions. For more info, see http://ibm.biz/ingress-secret
I followed the suggested link and ended up on a page which said, in part: -
What’s happening
You create and delete a cluster multiple times, such as for automation purposes.
Every time that you create the cluster, you use either the same name or a name that is very similar to previous names that you used. When you run ibmcloud ks cluster get --cluster <cluster>, your cluster is in a normal state but no Ingress Subdomain or Ingress Secret are available.
Why it’s happening
When you create and delete a cluster that uses the same name multiple times, the Ingress subdomain for that cluster in the format <cluster_name>.<globally_unique_account_HASH>-0000.<region>.containers.appdomain.cloud is registered and unregistered each time.
The certificate for the subdomain is also generated and deleted each time. If you create and delete a cluster with the same name 5 times or more within 7 days, you might reach the Let's Encrypt Duplicate Certificate rate limit, because the same Ingress subdomain and certificate are registered every time that you create the cluster. Because very long cluster names are truncated to 24 characters in the Ingress subdomain for the cluster, you can also reach the rate limit if you use multiple cluster names that have the same first 24 characters.
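As an aside from me rather than from the quoted page: the 24-character truncation rule is easy to check for a pair of candidate names. This is only a rough sketch. The function names `subdomain_prefix` and `names_collide` are my own invention and are not part of the IBM Cloud CLI, and the example cluster names are made up: -

```shell
# Hypothetical helper (not an IBM Cloud command): return the part of a
# cluster name that survives truncation to 24 characters.
subdomain_prefix() {
  printf '%s' "$1" | cut -c1-24
}

# Hypothetical helper: succeed if two names share the same first 24
# characters, i.e. they would count as the same name for the rate limit.
names_collide() {
  [ "$(subdomain_prefix "$1")" = "$(subdomain_prefix "$2")" ]
}

# Made-up example names that differ only after the 24th character
if names_collide "my-very-long-automation-cluster-a" "my-very-long-automation-cluster-b"; then
  echo "collide"
else
  echo "distinct"
fi
```

The two example names differ only in their final character, which falls after the truncation point, so they would be treated as the same name.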
Given that I'm writing a document guiding one through the process of deploying OCP on IBM Cloud, I have been re-using the same cluster name, e.g. roks-oct2021, over and over for the past few days.
Working on the hypothesis that that's the root cause, I've changed the way that I generate the cluster name for my document: -
export cluster_name="roks_`date +%s`"
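which appends the current time in Unix epoch format (seconds since 1 January 1970 UTC), so that each new cluster gets a different name. For example, run the command three times in sequence: -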
date +%s
1635176609
date +%s
1635176610
date +%s
1635176611
and note the difference.
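The same idea can be wrapped in a small function, which makes it easier to reuse in a script. This is just a sketch under my own assumptions: `make_cluster_name` is a hypothetical helper of mine, not an IBM Cloud command, and the default prefix of roks simply mirrors the name I use above: -

```shell
# Hypothetical helper: build a cluster name from a prefix plus the current
# Unix epoch time in seconds, mirroring the export shown earlier.
make_cluster_name() {
  prefix="${1:-roks}"   # assumed default prefix, matching my own naming
  printf '%s_%s\n' "$prefix" "$(date +%s)"
}

cluster_name="$(make_cluster_name roks)"
export cluster_name
echo "$cluster_name"
```

Note that the timestamp only has one-second resolution, so two names generated within the same second will be identical. With a short prefix such as roks the whole name also stays well under the 24 characters that survive in the Ingress subdomain, so the unique part is not truncated away.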
I've just deleted and recreated my cluster, and it's looking good thus far: -
ic cs cluster ls
OK
Name ID State Created Workers Location Version Resource Group Name Provider
roks_1635176017 c5rcsngf0kf7u096q2e0 deploying 10 minutes ago 2 Frankfurt 4.8.11_1526_openshift default vpc-gen2
The state shows as deploying rather than warning and, even more promisingly, the number of Workers (Compute Nodes) shows as 2 rather than 0.
We'll see ....