Matt Moyer 24c8bdef44
Add a test to verify that the kube-cert-agent recovers when a pod becomes unhealthy.
This required some small adjustments to the produciton code to make it more feasible to test.

The new test takes an existing agent pod and terminates the `sleep` process, causing the pod to go into an `Error` status.
The agent controllers _should_ respond to this by deleting and recreating that failed pod, but the current code just gets stuck.

This is meant to replicate the situation when a cluster is suspended and resumed, which also causes the agent pod to be in this terminal error state.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-04-21 16:48:00 -05:00
..
2020-12-16 14:27:09 -08:00