This test could flake in some rare scenarios. This change adds a bunch of retries, improves the debugging output if the tests fail, and puts all of the subtests in parallel which saves ~10s on my local machine.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This test has occasionally flaked because it only waited for the APIService GET to finish, but did not wait for the controller to successfully update the target object.
The new code should be more patient and allow the controller up to 10s to perform the expected action.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This new capability describes whether a cluster is expected to allow anonymous requests (most do since k8s 1.6.x, but AKS has it disabled).
This commit also contains new capability YAML files for AKS and EKS, mostly to document publicly how we expect our tests to function in those environments.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
These tests occasionally flake because of a conflict error such as:
```
supervisor_discovery_test.go:105:
Error Trace: supervisor_discovery_test.go:587
supervisor_discovery_test.go:105
Error: Received unexpected error:
Operation cannot be fulfilled on federationdomains.config.supervisor.pinniped.dev "test-oidc-provider-lvjfw": the object has been modified; please apply your changes to the latest version and try again
Test: TestSupervisorOIDCDiscovery
```
These retries should improve the reliability of the tests.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
I think this is another aspect of the test flakes we're trying to fix. This matters especially for the "Multiple Pinnipeds" test environment where two copies of the test suite are running concurrently.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
If the test is run immediately after the Concierge is installed, the API server can still have broken discovery data and return an error on the first call.
This commit adds a retry loop to attempt this first kubectl command for up to 60s before declaring failure.
The subsequent tests should be covered by this as well since they are not run in parallel.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This change adds a new virtual aggregated API that can be used by
any user to echo back who they are currently authenticated as. This
has general utility to end users and can be used in tests to
validate if authentication was successful.
Signed-off-by: Monis Khan <mok@vmware.com>
This is a partial revert of 288d9c999e. For some reason it didn't occur to me
that we could do it this way earlier. Whoops.
This also contains a middleware update: mutation funcs can return an error now
and short-circuit the rest of the request/response flow. The idea here is that
if someone is configuring their kubeclient to use middleware, they are agreeing
to a narrow-er client contract by doing so (e.g., their TokenCredentialRequest's
must have an Spec.Authenticator.APIGroup set).
I also updated some internal/groupsuffix tests to be more realistic.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I think the reason we were seeing flakes here is because the kube cert agent
pods had not reached a steady state even though our test assertions passed, so
the test would proceed immediately and run more assertions on top of a weird
state of the kube cert agent pods.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This allows us to keep all of our resources in the pinniped category
while not having kubectl return errors for calls such as:
kubectl get pinniped -A
Signed-off-by: Monis Khan <mok@vmware.com>
Because it is a test of the conciergeclient package, and the naming
convention for integration test files is supervisor_*_test.go,
concierge_*_test.go, or cli_*_test.go to identify which component
the test is primarily covering.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Yes, this is a huge commit.
The middleware allows you to customize the API groups of all of the
*.pinniped.dev API groups.
Some notes about other small things in this commit:
- We removed the internal/client package in favor of pkg/conciergeclient. The
two packages do basically the same thing. I don't think we use the former
anymore.
- We re-enabled cluster-scoped owner assertions in the integration tests.
This code was added in internal/ownerref. See a0546942 for when this
assertion was removed.
- Note: the middlware code is in charge of restoring the GV of a request object,
so we should never need to write mutations that do that.
- We updated the supervisor secret generation to no longer manually set an owner
reference to the deployment since the middleware code now does this. I think we
still need some way to make an initial event for the secret generator
controller, which involves knowing the namespace and the name of the generated
secret, so I still wired the deployment through. We could use a namespace/name
tuple here, but I was lazy.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
The group claims read from the session cache file are loaded as `[]interface{}` (slice of empty interfaces) so when we previously did a `groups, _ := idTokenClaims[oidc.DownstreamGroupsClaim].([]string)`, then `groups` would always end up nil.
The solution I tried here was to convert the expected value to also be `[]interface{}` so that `require.Equal(t, ...)` does the right thing.
This bug only showed up in our acceptance environnment against Okta, since we don't have any other integration test coverage with IDPs that pass a groups claim.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We were seeing a race in this test code since the require.NoError() and
require.Eventually() would write to the same testing.T state on separate
goroutines. Hopefully this helper function should cover the cases when we want
to require.NoError() inside a require.Eventually() without causing a race.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Margo Crawford <margaretc@vmware.com>
Co-authored-by: Monis Khan <i@monis.app>
This change updates our clients to always set an owner ref when:
1. The operation is a create
2. The object does not already have an owner ref set
Signed-off-by: Monis Khan <mok@vmware.com>
See comment. This is at least a first step to make our GKE acceptance
environment greener. Previously, this test assumed that the Pinniped-under-test
had been deployed in (roughly) the last 10 minutes, which is not an assumption
that we make anywhere else in the integration test suite.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Only sync on add/update of secrets in the same namespace which
have the "storage.pinniped.dev/garbage-collect-after" annotation, and
also during a full resync of the informer whenever secrets in the
same namespace with that annotation exist.
- Ignore deleted secrets to avoid having this controller trigger itself
unnecessarily when it deletes a secret. This controller is never
interested in deleted secrets, since its only job is to delete
existing secrets.
- No change to the self-imposed rate limit logic. That still applies
because secrets with this annotation will be created and updated
regularly while the system is running (not just during rare system
configuration steps).
This is a bit more clear. We're changing this now because it is a non-backwards-compatible change that we can make now since none of this RFC8693 token exchange stuff has been released yet.
There is also a small typo fix in some flag usages (s/RF8693/RFC8693/)
Signed-off-by: Matt Moyer <moyerm@vmware.com>
It would be great to do this for the supervisor's callback endpoint as well, but it's difficult to get at those since the request happens inside the spawned browser.
Signed-off-by: Matt Moyer <moyerm@vmware.com>