This required some small adjustments to the produciton code to make it more feasible to test.
The new test takes an existing agent pod and terminates the `sleep` process, causing the pod to go into an `Error` status.
The agent controllers _should_ respond to this by deleting and recreating that failed pod, but the current code just gets stuck.
This is meant to replicate the situation when a cluster is suspended and resumed, which also causes the agent pod to be in this terminal error state.
Signed-off-by: Matt Moyer <moyerm@vmware.com>