We were previously issuing both client certs and server certs with
both extended key usages included. Split the Issue*() methods into
separate methods for issuing server certs versus client certs so
they can have different extended key usages tailored for each use
case.
Also took the opportunity to clean up the parameters of the Issue*()
methods and New() methods to more closely match how we prefer to call
them. We were always only passing the common name part of the
pkix.Name to New(), so now the New() method just takes the common name
as a string. When making a server cert, we don't need to set the
deprecated common name field, so remove that param. When making a client
cert, we're always making it in the format expected by the Kube API
server, so just accept the username and group as parameters directly.
The impersonator_test.go unit test now starts the impersonation
server and makes real HTTP requests against it using client-go.
It is backed by a fake Kube API server.
The CA IssuePEM() method was missing the argument to allow a slice
of IP addresses to be passed in.
To make an impersonation request, first make a TokenCredentialRequest
to get a certificate. That cert will either be issued by the Kube
API server's CA or by a new CA specific to the impersonator. Either
way, you can then make a request to the impersonator and present
that client cert for auth and the impersonator will accept it and
make the impesonation call on your behalf.
The impersonator http handler now borrows some Kube library code
to handle request processing. This will allow us to more closely
mimic the behavior of a real API server, e.g. the client cert
auth will work exactly like the real API server.
Signed-off-by: Monis Khan <mok@vmware.com>
This makes output that's easier to copy-paste into the test. We could also make it ignore the order of key/value pairs in the future.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
All controller unit tests were accidentally using a timeout context
for the informers, instead of a cancel context which stays alive until
each test is completely finished. There is no reason to risk
unpredictable behavior of a timeout being reached during an individual
test, even though with the previous 3 second timeout it could only be
reached on a machine which is running orders of magnitude slower than
usual, since each test usually runs in about 100-300 ms. Unfortunately,
sometimes our CI workers might get that slow.
This sparked a review of other usages of timeout contexts in other
tests, and all of them were increased to a minimum value of 1 minute,
under the rule of thumb that our tests will be more reliable on slow
machines if they "pass fast and fail slow".
In impersonator_config_test.go, instead of waiting for the resource
version to appear in the informers, wait for the actual object to
appear.
This is an attempt to resolve flaky failures that only happen in CI,
but it also cleans up the test a bit by avoiding inventing fake resource
version numbers all over the test.
Signed-off-by: Monis Khan <mok@vmware.com>
- Use `Eventually` when making tls connections because the production
code's handling of starting and stopping the TLS server port
has some async behavior.
- Don't use resource version "0" because that has special meaning
in the informer libraries.
This updates our issuerconfig.UpdateStrategy to sort strategies according to a weighted preference.
The TokenCredentialRequest API strategy is preffered, followed by impersonation proxy, followed by any other unknown types.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- This commit does not include the updates that we plan to make to
the `status.strategies[].frontend` field of the CredentialIssuer.
That will come in a future commit.
This is more than an automatic merge. It also includes a rewrite of the CredentialIssuer API impersonation proxy fields using the new structure, and updates to the CLI to account for that new API.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
These controllers were a bit inconsistent. There were cases where the controllers ran out of the expected order and the custom labels might not have been applied.
We should still plan to remove this label handling or move responsibility into the middleware layer, but this avoids any regression.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- The CA cert will end up in the end user's kubeconfig on their client
machine, so if it changes they would need to fetch the new one and
update their kubeconfig. Therefore, we should avoid changing it as
much as possible.
- Now the controller writes the CA to a different Secret. It writes both
the cert and the key so it can reuse them to create more TLS
certificates in the future.
- For now, it only needs to make more TLS certificates if the old
TLS cert Secret gets deleted or updated to be invalid. This allows
for manual rotation of the TLS certs by simply deleting the Secret.
In the future, we may want to implement some kind of auto rotation.
- For now, rotation of both the CA and TLS certs will also happen if
you manually delete the CA Secret. However, this would cause the end
users to immediately need to get the new CA into their kubeconfig,
so this is not as elegant as a normal rotation flow where you would
have a window of time where you have more than one CA.
Also update concierge_impersonation_proxy_test.go integration test
to use real TLS when calling the impersonator.
Signed-off-by: Ryan Richard <richardry@vmware.com>
These are prone to breaking when stdr is upgraded because they rely on the exact ordering of keys in the log message. If we have more problems we can rewrite the assertions to be more robust, but for this time I'm just fixing them to match the new output.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Also:
- Changed base64 encoding of impersonator bearer tokens to use
`base64.StdEncoding` to make it easier for users to manually
create a token using the unix `base64` command
- Test the headers which are and are not passed through to the Kube API
by the impersonator more carefully in the unit tests
- More WIP on concierge_impersonation_proxy_test.go
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This change adds a new virtual aggregated API that can be used by
any user to echo back who they are currently authenticated as. This
has general utility to end users and can be used in tests to
validate if authentication was successful.
Signed-off-by: Monis Khan <mok@vmware.com>
This is a more reliable way to determine whether the load balancer
is already running.
Also added more unit tests for the load balancer.
Signed-off-by: Ryan Richard <richardry@vmware.com>
If someone has already set impersonation headers in their request, then
we should fail loudly so the client knows that its existing impersonation
headers will not work.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Watch a configmap to read the configuration of the impersonation
proxy and reconcile it.
- Implements "auto" mode by querying the API for control plane nodes.
- WIP: does not create a load balancer or proper TLS certificates yet.
Those will come in future commits.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
This is a partial revert of 288d9c999e. For some reason it didn't occur to me
that we could do it this way earlier. Whoops.
This also contains a middleware update: mutation funcs can return an error now
and short-circuit the rest of the request/response flow. The idea here is that
if someone is configuring their kubeclient to use middleware, they are agreeing
to a narrow-er client contract by doing so (e.g., their TokenCredentialRequest's
must have an Spec.Authenticator.APIGroup set).
I also updated some internal/groupsuffix tests to be more realistic.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I added that test helper to create an http.Request since I wanted to properly
initialize the http.Request's context.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This allows us to keep all of our resources in the pinniped category
while not having kubectl return errors for calls such as:
kubectl get pinniped -A
Signed-off-by: Monis Khan <mok@vmware.com>
When the Pinniped server has been installed with the `api_group_suffix`
option, for example using `mysuffix.com`, then clients who would like to
submit a `TokenCredentialRequest` to the server should set the
`Spec.Authenticator.APIGroup` field as `authentication.concierge.mysuffix.com`.
This makes more sense from the client's point of view than using the
default `authentication.concierge.pinniped.dev` because
`authentication.concierge.mysuffix.com` is the name of the API group
that they can observe their cluster and `authentication.concierge.pinniped.dev`
does not exist as an API group on their cluster.
This commit includes both the client and server-side changes to make
this work, as well as integration test updates.
Co-authored-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
Co-authored-by: Margo Crawford <margaretc@vmware.com>
This makes sure that if our clients ever send types with the wrong
group, the server will refuse to decode it.
Signed-off-by: Monis Khan <mok@vmware.com>
- I realized that the hardcoded fakekubeapi 404 not found response was invalid,
so we were getting a default error message. I fixed it so the tests follow a
higher fidelity code path.
- I caved and added a test for making sure the request body was always closed,
and believe it or not, we were double closing a body. I don't *think* this will
matter in production, since client-go will pass us ioutil.NopReader()'s, but
at least we know now.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Yes, this is a huge commit.
The middleware allows you to customize the API groups of all of the
*.pinniped.dev API groups.
Some notes about other small things in this commit:
- We removed the internal/client package in favor of pkg/conciergeclient. The
two packages do basically the same thing. I don't think we use the former
anymore.
- We re-enabled cluster-scoped owner assertions in the integration tests.
This code was added in internal/ownerref. See a0546942 for when this
assertion was removed.
- Note: the middlware code is in charge of restoring the GV of a request object,
so we should never need to write mutations that do that.
- We updated the supervisor secret generation to no longer manually set an owner
reference to the deployment since the middleware code now does this. I think we
still need some way to make an initial event for the secret generator
controller, which involves knowing the namespace and the name of the generated
secret, so I still wired the deployment through. We could use a namespace/name
tuple here, but I was lazy.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
This reverts commit 4a28d1f800.
This commit was originally made to fix a bug that caused TokenCredentialRequest
to become slow when the server was idle for an extended period of time. This was
to address a Kubernetes issue that was fixed in 1.19.5 and onward. We are now
running with Kubernetes 1.20, so we should be able to pick up this fix.
This change updates our clients to always set an owner ref when:
1. The operation is a create
2. The object does not already have an owner ref set
Signed-off-by: Monis Khan <mok@vmware.com>
- JWKSWriterController
- JWKSObserverController
- FederationDomainSecretsController for HMAC keys
- FederationDomainSecretsController for state signature key
- FederationDomainSecretsController for state encryption key
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Only sync on add/update of secrets in the same namespace which
have the "storage.pinniped.dev/garbage-collect-after" annotation, and
also during a full resync of the informer whenever secrets in the
same namespace with that annotation exist.
- Ignore deleted secrets to avoid having this controller trigger itself
unnecessarily when it deletes a secret. This controller is never
interested in deleted secrets, since its only job is to delete
existing secrets.
- No change to the self-imposed rate limit logic. That still applies
because secrets with this annotation will be created and updated
regularly while the system is running (not just during rare system
configuration steps).
We stared at this very carefully and we don't think there are any structural changes. Maybe something small happened to get the RNG off by one?
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This implementation is janky because I wanted to make the smallest change
possible to try to get the code back to stable so we can release.
Also deep copy an object so we aren't mutating the cache.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This is a bit more clear. We're changing this now because it is a non-backwards-compatible change that we can make now since none of this RFC8693 token exchange stuff has been released yet.
There is also a small typo fix in some flag usages (s/RF8693/RFC8693/)
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Fosite overrides the `Cache-Control` header we set, which is basically fine even though it's not exactly what we want.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
From RFC2616 (https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2):
> It MUST be possible to combine the multiple header fields into one "field-name: field-value" pair,
> without changing the semantics of the message, by appending each subsequent field-value to the first,
> each separated by a comma.
This was correct before, but this simplifes a bit and shaves off a few bytes from the response.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
The bug itself has to do with when headers are streamed to the client. Once a wrapped handler has sent any bytes to the `http.ResponseWriter`, the value of the map returned from `w.Header()` no longer matters for the response. The fix is fairly trivial, which is to add those response headers before invoking the wrapped handler.
The existing unit test didn't catch this due to limitations in `httptest.NewRecorder()`. It is now replaced with a new test that runs a full HTTP test server, which catches the previous bug.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Because the library that we are using which returns that error
formats the timestamp in localtime, which is LMT when running
on a laptop, but is UTC when running in CI.
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Refactor the test to avoid testing a private method and instead
always test the results of running the controller.
- Also remove the `if testing.Short()` check because it will always
be short when running unit tests. This prevented the unit test
from ever running, both locally and in CI.
Signed-off-by: Ryan Richard <richardry@vmware.com>
We believe this API is more forwards compatible with future secrets management
use cases. The implementation is a cry for help, but I was trying to follow the
previously established pattern of encapsulating the secret generation
functionality to a single group of packages.
This commit makes a breaking change to the current OIDCProvider API, but that
OIDCProvider API was added after the latest release, so it is technically still
in development until we release, and therefore we can continue to thrash on it.
I also took this opportunity to make some things private that didn't need to be
public.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This forced us to add labels to the CSRF cookie secret, just as we do
for other Supervisor secrets. Yay tests.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- AudienceMatchingStrategy: we want to use the default matcher from
fosite, so remove that line
- AllowedPromptValues: We can use the default if we add a small
change to the auth_handler.go to account for it (in a future commit)
- MinParameterEntropy: Use the fosite default to make it more likely
that off the shelf OIDC clients can work with the supervisor
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Also add more log statements to the controller
- Also have the controller apply a rate limit to itself, to avoid
having a very chatty controller that runs way more often than is
needed.
- Also add an integration test for the controller's behavior.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
When we try to decode with the wrong decryption key, we could get any number of
error messages, depending on what failure mode we are in (couldn't authenticate
plaintext after decryption, couldn't deserialize, etc.). This change makes the
test weaker, but at least we know we will get an error message in the case where
the decryption key is wrong.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This also sets the CSRF cookie Secret's OwnerReference to the Pod's grandparent
Deployment so that when the Deployment is cleaned up, then the Secret is as
well.
Obviously this controller implementation has a lot of issues, but it will at
least get us started.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>