This is a relatively large rewrite of much of the kube-cert-agent controllers. Instead of managing raw Pod objects, they now create a single Deployment and let the builtin k8s controller handle it from there.
This reduces the amount of code we need and should handle a number of edge cases better, especially those where a Pod becomes "wedged" and needs to be recreated.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
All controller unit tests were accidentally using a timeout context
for the informers, instead of a cancel context which stays alive until
each test is completely finished. There is no reason to risk
unpredictable behavior of a timeout being reached during an individual
test, even though with the previous 3 second timeout it could only be
reached on a machine which is running orders of magnitude slower than
usual, since each test usually runs in about 100-300 ms. Unfortunately,
sometimes our CI workers might get that slow.
This sparked a review of other usages of timeout contexts in other
tests, and all of them were increased to a minimum value of 1 minute,
under the rule of thumb that our tests will be more reliable on slow
machines if they "pass fast and fail slow".