I'm worried that these errors are going to be really burried from the user, so
add some log statements to try to make them a tiny bit more observable.
Also follow some of our error message convetions by using lowercase error
messages.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This CSRF cookie needs to be included on the request to the callback endpoint triggered by the redirect from the OIDC upstream provider. This is not allowed by `Same-Site=Strict` but is allowed by `Same-Site=Lax` because it is a "cross-site top-level navigation" [1].
We didn't catch this earlier with our Dex-based tests because the upstream and downstream issuers were on the same parent domain `*.svc.cluster.local` so the cookie was allowed even with `Strict` mode.
[1]: https://tools.ietf.org/html/draft-ietf-httpbis-cookie-same-site-00#section-3.2
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This commit includes a failing test (amongst other compiler failures) for the
dynamic signing key fetcher that we will inject into fosite. We are checking it
in so that we can pass the WIP off.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
We are currently using EC keys to sign ID tokens, so we should reflect that in
our OIDC discovery metadata.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
We missed this in the original interface specification, but the `grant_type=authorization_code` requires it, per RFC6749 (https://tools.ietf.org/html/rfc6749#section-4.1.3).
Signed-off-by: Matt Moyer <moyerm@vmware.com>
This allows the token exchange request to be performed with the correct TLS configuration.
We go to a bit of extra work to make sure the `http.Client` object is cached between reconcile operations so that connection pooling works as expected.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Note that this WIP commit includes a failing unit test, which will
be addressed in the next commit
Signed-off-by: Ryan Richard <richardry@vmware.com>
Mainly, avoid using some `testing` helpers that were added in 1.14, as well as a couple of other niceties we can live without.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Generate a new cookie for the user and move on as if they had not sent
a bad cookie. Hopefully this will make the user experience better if,
for example, the server rotated cookie signing keys and then a user
submitted a very old cookie.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Also use ConstantTimeCompare() to compare CSRF tokens to prevent
leaking any information in how quickly we reject bad tokens.
Signed-off-by: Ryan Richard <richardry@vmware.com>
This is much nicer UX for an administrator installing a UpstreamOIDCProvider
CRD. They don't have to guess as hard at what the callback endpoint path should
be for their UpstreamOIDCProvider.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Also aggresively refactor for readability:
- Make helper validations functions for each type of storage
- Try to label symbols based on their downstream/upstream use and group them
accordingly
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Also handle several more error cases
- Move RequireTimeInDelta to shared testutils package so other tests
can also use it
- Move all of the oidc test helpers into a new oidc/oidctestutils
package to break a circular import dependency. The shared testutil
package can't depend on any of our other packages or else we
end up with circular dependencies.
- Lots more assertions about what was stored at the end of the
request to build confidence that we are going to pass all of the
right settings over to the token endpoint through the storage, and
also to avoid accidental regressions in that area in the future
Signed-off-by: Ryan Richard <richardry@vmware.com>
Also refactor to get rid of duplicate test structs.
Also also don't default groups ID token claim because there is no standard one.
Also also also add some logging that will hopefully help us in debugging in the
future.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Because we want it to implement an AuthcodeExchanger interface and
do it in a way that will be more unit test-friendly than the underlying
library that we intend to use inside its implementation.
This will allow it to be imported by Go code outside of our repository, which was something we have planned for since this code was written.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- To better support having multiple downstream providers configured,
the authorize endpoint will share a CSRF cookie between all
downstream providers' authorize endpoints. The first time a
user's browser hits the authorize endpoint of any downstream
provider, that endpoint will set the cookie. Then if the user
starts an authorize flow with that same downstream provider or with
any other downstream provider which shares the same domain name
(i.e. differentiated by issuer path), then the same cookie will be
submitted and respected.
- Just in case we are sharing the domain name with some other app,
we sign the value of any new CSRF cookie and check the signature
when we receive the cookie. This wasn't strictly necessary since
we probably won't share a domain name with other apps, but it
wasn't hard to add this cookie signing.
Signed-off-by: Ryan Richard <richardry@vmware.com>
Our unit tests are gonna touch a lot more corner cases than our
integration tests, so let's make them run as close to the real
implementation as possible.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
We want to run all of the fosite validations in the authorize
endpoint, but we don't need to store anything yet because
we are storing what we need for later in the upstream state
parameter.
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Add a new helper method to plog to make a consistent way to log
expected errors at the info level (as opposed to unexpected
system errors that would be logged using plog.Error)
Signed-off-by: Ryan Richard <richardry@vmware.com>
Also move definition of our oauth client and the general fosite
configuration to a helper so we can use the same config to construct
the handler for both test and production code.
Signed-off-by: Ryan Richard <richardry@vmware.com>
This prevents unnecessary sync loop runs when the controller is
running with a single worker. When the controller is running with
more than one worker, it prevents subtle bugs that can cause the
controller to go "back in time."
Signed-off-by: Monis Khan <mok@vmware.com>
Signed-off-by: Matt Moyer <moyerm@vmware.com>
Previously we were injecting the whole oauth handler chain into this function,
which meant we were essentially writing unit tests to test our tests. Let's push
some of this logic into the source code.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Does not validate incoming request parameters yet. Also is not
served on the http/https ports yet. Those will come in future commits.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
I tried to follow a principle of encapsulation here - we can still default to
peeps making connections to 80/443 on a Service object, but internally we will
use 8080/8443.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
And delete the agent pod when it needs its custom labels to be
updated, so that the creator controller will notice that it is missing
and immediately create it with the new custom labels.
This is the first of a few related changes that re-organize our API after the big recent changes that introduced the supervisor component.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- Setting a Secret in the supervisor's namespace with a special name
will cause it to get picked up and served as the supervisor's TLS
cert for any request which does not have a matching SNI cert.
- This is especially useful for when there is no DNS record for an
issuer and the user will be accessing it via IP address. This
is not how we would expect it to be used in production, but it
might be useful for other cases.
- Includes a new integration test
- Also suppress all of the warnings about ignoring the error returned by
Close() in lines like `defer x.Close()` to make GoLand happier
- TLS certificates can be configured on the OIDCProviderConfig using
the `secretName` field.
- When listening for incoming TLS connections, choose the TLS cert
based on the SNI hostname of the incoming request.
- Because SNI hostname information on incoming requests does not include
the port number of the request, we add a validation that
OIDCProviderConfigs where the issuer hostnames (not including port
number) are the same must use the same `secretName`.
- Note that this approach does not yet support requests made to an
IP address instead of a hostname. Also note that `localhost` is
considered a hostname by SNI.
- Add port 443 as a container port to the pod spec.
- A new controller watches for TLS secrets and caches them in memory.
That same in-memory cache is used while servicing incoming connections
on the TLS port.
- Make it easy to configure both port 443 and/or port 80 for various
Service types using our ytt templates for the supervisor.
- When deploying to kind, add another nodeport and forward it to the
host on another port to expose our new HTTPS supervisor port to the
host.
- When two different Issuers have the same host (i.e. they differ
only by path) then they must have the same secretName. This is because
it wouldn't make sense for there to be two different TLS certificates
for one host. Find any that do not have the same secret name to
put an error status on them and to avoid serving OIDC endpoints for
them. The host comparison is case-insensitive.
- Issuer hostnames should be treated as case-insensitive, because
DNS hostnames are case-insensitive. So https://me.com and
https://mE.cOm are duplicate issuers. However, paths are
case-sensitive, so https://me.com/A and https://me.com/a are
different issuers. Fixed this in the issuer validations and in the
OIDC Manager's request router logic.
EC keys are smaller and take less time to generate. Our integration
tests were super flakey because generating an RSA key would take up to
10 seconds *gasp*. The main token verifier that we care about is
Kubernetes, which supports P256, so hopefully it won't be that much of
an issue that our default signing key type is EC. The OIDC spec seems
kinda squirmy when it comes to using non-RSA signing algorithms...
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- The OIDCProviderConfigWatcherController synchronizes the
OIDCProviderConfig settings to dynamically mount and unmount the
OIDC discovery endpoints for each provider
- Integration test passes but unit tests need to be added still
This should fix integration tests running on clusters that don't have
visible controller manager pods (e.g., GKE). Pinniped should boot, not
find any controller manager pods, but still post a status in the CIC.
I also updated a test helper so that we could tell the difference
between when an event was not added and when an event was added with
an empty key.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Right now in the YTT templates we assume that the agent pods are gonna use
the same image as the main Pinniped deployment, so we can use the same logic
for the image pull secrets.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Simplifies the implementation, makes it more consistent with other
updaters of the cic (CredentialIssuerConfig), and also retries on
update conflicts
Signed-off-by: Ryan Richard <richardry@vmware.com>
- Only inject things through the constructor that the controller
will need
- Use pkg private constants when possible for things that are not
actually configurable by the user
- Make the agent pod template private to the pkg
- Introduce a test helper to reduce some duplicated test code
- Remove some `it.Focus` lines that were accidentally committed, and
repair the broken tests that they were hiding
I think we want to reconcile on these pod template fields so that if
someone were to redeploy Pinniped with a new image for the agent, the
agent would get updated immediately. Before this change, the agent image
wouldn't get updated until the agent pod was deleted.
This thing is supposed to be used to help our CredentialRequest handler issue certs with a dynamic
CA keypair.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
3 main reasons:
- The cert and key that we store in this object are not always used for TLS.
- The package name "provider" was a little too generic.
- dynamiccert.Provider reads more go-ish than provider.DynamicCertProvider.
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
- Lots of TODOs added that need to be resolved to finish this WIP
- execer_test.go seems like it should be passing, but it fails (sigh)
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
This also has fallback compatibility support if no IDP is specified and there is exactly one IDP in the cache.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
- All of the `kubecertagent` controllers now take two informers
- This is moving in the direction of creating the agent pods in the
Pinniped installation namespace, but that will come in a future
commit
New resource naming conventions:
- Do not repeat the Kind in the name,
e.g. do not call it foo-cluster-role-binding, just call it foo
- Names will generally start with a prefix to identify our component,
so when a user lists all objects of that kind, they can tell to which
component it is related,
e.g. `kubectl get configmaps` would list one named "pinniped-config"
- It should be possible for an operator to make the word "pinniped"
mostly disappear if they choose, by specifying the app_name in
values.yaml, to the extent that is practical (but not from APIService
names because those are hardcoded in golang)
- Each role/clusterrole and its corresponding binding have the same name
- Pinniped resource names that must be known by the server golang code
are passed to the code at run time via ConfigMap, rather than
hardcoded in the golang code. This also allows them to be prepended
with the app_name from values.yaml while creating the ConfigMap.
- Since the CLI `get-kubeconfig` command cannot guess the name of the
CredentialIssuerConfig resource in advance anymore, it lists all
CredentialIssuerConfig in the app's namespace and returns an error
if there is not exactly one found, and then uses that one regardless
of its name
Eventually we could refactor to remove support for the old APIs, but they are so similar that a single implementation seems to handle both easily.
Signed-off-by: Matt Moyer <moyerm@vmware.com>