- Only inject things through the constructor that the controller
will need
- Use pkg private constants when possible for things that are not
actually configurable by the user
- Make the agent pod template private to the pkg
- Introduce a test helper to reduce some duplicated test code
- Remove some `it.Focus` lines that were accidentally committed, and
repair the broken tests that they were hiding
- Lots of TODOs added that need to be resolved to finish this WIP
- execer_test.go seems like it should be passing, but it fails (sigh)
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Annotations do not have this restriction, so we can put it there instead. This only currently occurs on clusters without the cluster signing capability (GKE).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
New resource naming conventions:
- Do not repeat the Kind in the name,
e.g. do not call it foo-cluster-role-binding, just call it foo
- Names will generally start with a prefix to identify our component,
so when a user lists all objects of that kind, they can tell to which
component it is related,
e.g. `kubectl get configmaps` would list one named "pinniped-config"
- It should be possible for an operator to make the word "pinniped"
mostly disappear if they choose, by specifying the app_name in
values.yaml, to the extent that is practical (but not from APIService
names because those are hardcoded in golang)
- Each role/clusterrole and its corresponding binding have the same name
- Pinniped resource names that must be known by the server golang code
are passed to the code at run time via ConfigMap, rather than
hardcoded in the golang code. This also allows them to be prepended
with the app_name from values.yaml while creating the ConfigMap.
- Since the CLI `get-kubeconfig` command cannot guess the name of the
CredentialIssuerConfig resource in advance anymore, it lists all
CredentialIssuerConfig in the app's namespace and returns an error
if there is not exactly one found, and then uses that one regardless
of its name
I also started updating the script to deploy the test-webhook instead of
doing TMC stuff. I think the script should live in this repo so that
Pinniped contributors only need to worry about one repo for running
integration tests.
There are a bunch of TODOs in the script, but I figured this was a good
checkpoint. The script successfully runs on my machine and sets up the
test-webhook and pinniped on a local kind cluster. The integration tests
are failing because of some issue with pinniped talking to the test-webhook,
but this is step in the right direction.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
It looks like requests to our aggregated API service on GKE vacillate
between success and failure until they reach a converged successful
state. I think this has to do with our pods updating the API serving
cert at different times. If only one pod updates its serving cert to
the correct value, then it should respond with success. However, the
other pod would respond with failure. Depending on the load balancing
algorithm that GKE uses to send traffic to pods in a service, we could
end up with a success that we interpret as "all pods have rotated
their certs" when it really just means "at least one pod has rotated
its certs."
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This should simplify our build/test setup quite a bit, since it means we have only a single module (at the top level) with all hand-written code. I'll leave `module.sh` alone for now but we may be able to simplify that a bit more.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
We were using this at one point to control which tests ran with `go test ./...`, but now we're also using the `-short` flag to differentiate unit vs. integration tests.
Hopefully this will simplify things a bit.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
kubectl pulls these in in their main package...I wonder if we should do
the same for our main packages?
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Controller and aggregated API server are allowed to run
- Keep retrying to borrow the cluster signing key in case the failure
to get it was caused by a transient failure
- The CredentialRequest endpoint will always return an authentication
failure as long as the cluster signing key cannot be borrowed
- Update which integration tests are skipped to reflect what should
and should not work based on the cluster's capability under this
new behavior
- Move CreateOrUpdateCredentialIssuerConfig() and related methods
to their own file
- Update the CredentialIssuerConfig's Status every time we try to
refresh the cluster signing key
- Indicate the success or failure of the cluster signing key strategy
- Also introduce the concept of "capabilities" of an integration test
cluster to allow the integration tests to be run against clusters
that do or don't allow the borrowing of the cluster signing key
- Tests that are not expected to pass on clusters that lack the
borrowing of the signing key capability are now ignored by
calling the new library.SkipUnlessClusterHasCapability test helper
- Rename library.Getenv to library.GetEnv
- Add copyrights where they were missing
Didn't fix CI. I didn't think it would.
I have never seen the integration tests fail like this locally, so I
have to imagine the failure has something to do with the environment
on which we are testing.
This reverts commit ba2e2f509a.
We are getting these weird flakes in CI where the kube client that we
create with these helper functions doesn't work against the kube API.
The kube API tells us that we are unauthorized (401). Seems like something
is wrong with the keypair itself, but when I create a one-off kubeconfig
with the keypair, I get 200s from the API. Hmmm...I wonder what CI will
think of this change?
I also tried to align some naming in this package.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
These configuration knobs are much more human-understandable than the
previous percentage-based threshold flag.
We now allow users to set the lifetime of the serving cert via a ConfigMap.
Previously this was hardcoded to 1 year.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
The rotation is forced by a new controller that deletes the serving cert
secret, as other controllers will see this deletion and ensure that a new
serving cert is created.
Note that the integration tests now have an addition worst case runtime of
60 seconds. This is because of the way that the aggregated API server code
reloads certificates. We will fix this in a future story. Then, the
integration tests should hopefully get much faster.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This switches us back to an approach where we use the Pod "exec" API to grab the keys we need, rather than forcing our code to run on the control plane node. It will help us fail gracefully (or dynamically switch to alternate implementations) when the cluster is not self-hosted.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
- It would sometimes fail with this error:
namespaces is forbidden: User "tanzu-user-authentication@groups.vmware.com"
cannot list resource "namespaces" in API group "" at the cluster scope
- Seems like it was because the RBAC rule added by the test needs a
moment before it starts to take effect, so change the test to retry
the API until it succeeds or fail after 3 seconds of trying.
- We want to follow the <noun>Request convention.
- The actual operation does not login a user, but it does retrieve a
credential with which they can login.
- This commit includes changes to all LoginRequest-related symbols and
constants to try to update their names to follow the new
CredentialRequest type.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
As discussed in API review, this field exists for convenience right
now. Since the username/groups are encoded in the Credential sent in
the LoginRequestStatus, the client still has access to their
user/groups information. We want to remove this for now to be
conservative and limit our API surface area (smaller surface area =
less to maintain). We can always add this back in the future.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- When we call the LoginRequest endpoint in loginrequest_test.go,
do it with an unauthenticated client, to make sure that endpoint works
with unauthenticated clients.
- For tests which want to test using certs returned by LoginRequest to
make API calls back to kube to check if those certs are working, make
sure they start with a bare client and then add only those certs.
Avoid accidentally picking up other kubeconfig configuration like
tokens, etc.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- For high availability reasons, we would like our app to scale linearly
with the size of the control plane. Using a DaemonSet allows us to run
one pod on each node-role.kubernetes.io/master node.
- The hope is that the Service that we create should load balance
between these pods appropriately.
- Add integration test for serving cert auto-generation and rotation
- Add unit test for `WithInitialEvent` of the cert manager controller
- Move UpdateAPIService() into the `apicerts` package, since that is
the only user of the function.
I suppose we could solve this other ways, but this utility was
only used in one place right now, so it is easiest to copy it over.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Instead, make the integration tests a separate module. You can't run
these tests by accident because they will not run at all when you
`go test` from the top-level directory. You will need to `cd test`
before using `go test` in order to run the integration tests.
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Why? Because the discovery URL is already there in the kubeconfig; let's
not make our lives more complicated by passing it in via an env var.
- Also allow for ytt callers to not specify data.values.discovery_url - there
are going to be a non-trivial number of installers of placeholder-name
that want to use the server URL found in the cluster-info ConfigMap.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Seems like the next step is to allow override of the CA bundle; I didn't
do that here for simplicity of the commit, but seems like it is the right
thing to do in the future.
- Also includes bumping the api and client-go dependencies to the newer
version which also moved LoginDiscoveryConfig to the
crds.placeholder.suzerain-io.github.io group in the generated code
It turns out these fields are not meant to be base64 encoded, even though that's how they are in the kubeconfig.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Dynamically grant RBAC permission to the test user to allow them
to make read requests via the API
- Then use the credential returned from the LoginRequest to make a
request back to the API server which should be successful