The LastUpdateTime is no longer updated on every resync. It only changes if the underlying status has changed, so that it effectively shows when the transition happened.
This change happened in ab750f48aa, but we missed this test. It only fails when it has been more than ten minutes since the CredentialIssuer transitioned into a healthy state, but that can happen in our long-running CI environments.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Previously, our controllers would automatically create a CredentialIssuer with a singleton name. The helpers we had for this also used "raw" client access and did not take advantage of the informer cache pattern.
With this change, the CredentialIssuer is always created at install time in the ytt YAML. The controllers now only update the existing CredentialIssuer status, and they do so using the informer cache as much as possible.
This change is targeted at only the kubecertagent controller to start. The impersonatorconfig controller will be updated in a following PR along with other changes.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This is a relatively large rewrite of much of the kube-cert-agent controllers. Instead of managing raw Pod objects, they now create a single Deployment and let the builtin k8s controller handle it from there.
This reduces the amount of code we need and should handle a number of edge cases better, especially those where a Pod becomes "wedged" and needs to be recreated.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
All controller unit tests were accidentally using a timeout context
for the informers, instead of a cancel context which stays alive until
each test is completely finished. There is no reason to risk
unpredictable behavior of a timeout being reached during an individual
test, even though with the previous 3 second timeout it could only be
reached on a machine which is running orders of magnitude slower than
usual, since each test usually runs in about 100-300 ms. Unfortunately,
sometimes our CI workers might get that slow.
This sparked a review of other usages of timeout contexts in other
tests, and all of them were increased to a minimum value of 1 minute,
under the rule of thumb that our tests will be more reliable on slow
machines if they "pass fast and fail slow".