Commit Graph

47 Commits

Author SHA1 Message Date
Monis Khan
9599ffcfb9
Update all deps to latest where possible, bump Kube deps to v0.23.1
Highlights from this dep bump:

1. Made a copy of the v0.4.0 github.com/go-logr/stdr implementation
   for use in tests.  We must bump this dep as Kube code uses a
   newer version now.  We would have to rewrite hundreds of test log
   assertions without this copy.
2. Use github.com/felixge/httpsnoop to undo the changes made by
   ory/fosite#636 for CLI based login flows.  This is required for
   backwards compatibility with older versions of our CLI.  A
   separate change after this will update the CLI to be more
   flexible (it is purposefully not part of this change to confirm
   that we did not break anything).  For all browser login flows, we
   now redirect using http.StatusSeeOther instead of http.StatusFound.
3. Drop plog.RemoveKlogGlobalFlags as klog no longer mutates global
   process flags
4. Only bump github.com/ory/x to v0.0.297 instead of the latest
   v0.0.321 because v0.0.298+ pulls in a newer version of
   go.opentelemetry.io/otel/semconv which breaks k8s.io/apiserver.
   We should update k8s.io/apiserver to use the newer code.
5. Migrate all code from k8s.io/apimachinery/pkg/util/clock to
   k8s.io/utils/clock and k8s.io/utils/clock/testing
6. Delete testutil.NewDeleteOptionsRecorder and migrate to the new
   kubetesting.NewDeleteActionWithOptions
7. Updated ExpectedAuthorizeCodeSessionJSONFromFuzzing caused by
   fosite's new rotated_secrets OAuth client field.  This new field
   is currently not relevant to us as we have no private clients.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-12-16 21:15:27 -05:00
Monis Khan
4bf715758f
Do not rotate impersonation proxy signer CA unless necessary
This change fixes a copy paste error that led to the impersonation
proxy signer CA being rotated based on the configuration of the
rotation of the aggregated API serving certificate.  This would lead
to occasional "Unauthorized" flakes in our CI environments that
rotate the serving certificate at a frequent interval.

Updated the certs_expirer controller logs to be more detailed.

Updated CA common names to be more specific (this does not update
any previously generated CAs).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-10-06 12:03:49 -04:00
Monis Khan
c356710f1f
Add leader election middleware
Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-20 12:18:25 -04:00
Monis Khan
7a812ac5ed
impersonatorconfig: only unload dynamiccert when proxy is disabled
In the upstream dynamiccertificates package, we rely on two pieces
of code:

1. DynamicServingCertificateController.newTLSContent which calls
   - clientCA.CurrentCABundleContent
   - servingCert.CurrentCertKeyContent
2. unionCAContent.VerifyOptions which calls
   - unionCAContent.CurrentCABundleContent

This results in calls to our tlsServingCertDynamicCertProvider and
impersonationSigningCertProvider.  If we Unset these providers, we
subtly break these consumers.  At best this results in test slowness
and flakes while we wait for reconcile loops to converge.  At worst,
it results in actual errors during runtime.  For example, we
previously would Unset the impersonationSigningCertProvider on any
sync loop error (even a transient one caused by a network blip or
a conflict between writes from different replicas of the concierge).
This would cause us to transiently fail to issue new certificates
from the token credential require API.  It would also cause us to
transiently fail to authenticate previously issued client certs
(which results in occasional Unauthorized errors in CI).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-16 16:07:46 -04:00
Monis Khan
8b4ed86071
certs_expirer: be specific about what secret to delete
This change fixes a race that can occur because we have multiple
writers with no leader election lock.

1. TestAPIServingCertificateAutoCreationAndRotation/automatic
   expires the current serving certificate
2. CertsExpirerController 1 deletes expired serving certificate
3. CertsExpirerController 2 starts deletion of expired serving
   certificate but has not done so yet
4. CertsManagerController 1 creates new serving certificate
5. TestAPIServingCertificateAutoCreationAndRotation/automatic
   records the new serving certificate
6. CertsExpirerController 2 finishes deletion, and thus deletes the
   newly created serving certificate instead of the old one
7. CertsManagerController 2 creates new serving certificate
8. TestAPIServingCertificateAutoCreationAndRotation/automatic keeps
   running and eventually times out because it is expecting the
   serving certificate created by CertsManagerController 2 to match
   the value it recorded from CertsManagerController 1 (which will
   never happen since that certificate was incorrectly deleted).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-07-28 09:56:05 -04:00
Monis Khan
00694c9cb6
dynamiccert: split into serving cert and CA providers
Signed-off-by: Monis Khan <mok@vmware.com>
2021-03-15 12:24:07 -04:00
Ryan Richard
c82f568b2c certauthority.go: Refactor issuing client versus server certs
We were previously issuing both client certs and server certs with
both extended key usages included. Split the Issue*() methods into
separate methods for issuing server certs versus client certs so
they can have different extended key usages tailored for each use
case.

Also took the opportunity to clean up the parameters of the Issue*()
methods and New() methods to more closely match how we prefer to call
them. We were always only passing the common name part of the
pkix.Name to New(), so now the New() method just takes the common name
as a string. When making a server cert, we don't need to set the
deprecated common name field, so remove that param. When making a client
cert, we're always making it in the format expected by the Kube API
server, so just accept the username and group as parameters directly.
2021-03-12 16:09:37 -08:00
Monis Khan
2d28d1da19
Implement all optional methods in dynamic certs provider
Signed-off-by: Monis Khan <mok@vmware.com>
2021-03-11 16:24:08 -05:00
Ryan Richard
0b300cbe42 Use TokenCredentialRequest instead of base64 token with impersonator
To make an impersonation request, first make a TokenCredentialRequest
to get a certificate. That cert will either be issued by the Kube
API server's CA or by a new CA specific to the impersonator. Either
way, you can then make a request to the impersonator and present
that client cert for auth and the impersonator will accept it and
make the impesonation call on your behalf.

The impersonator http handler now borrows some Kube library code
to handle request processing. This will allow us to more closely
mimic the behavior of a real API server, e.g. the client cert
auth will work exactly like the real API server.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-03-10 10:30:06 -08:00
Ryan Richard
d8c6894cbc All controller unit tests should not cancel context until test is over
All controller unit tests were accidentally using a timeout context
for the informers, instead of a cancel context which stays alive until
each test is completely finished. There is no reason to risk
unpredictable behavior of a timeout being reached during an individual
test, even though with the previous 3 second timeout it could only be
reached on a machine which is running orders of magnitude slower than
usual, since each test usually runs in about 100-300 ms. Unfortunately,
sometimes our CI workers might get that slow.

This sparked a review of other usages of timeout contexts in other
tests, and all of them were increased to a minimum value of 1 minute,
under the rule of thumb that our tests will be more reliable on slow
machines if they "pass fast and fail slow".
2021-03-04 17:26:01 -08:00
Ryan Richard
126f9c0da3 certs_manager.go: Rename some local variables
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-02-18 11:16:34 -08:00
Matt Moyer
6565265bee
Use new 'go.pinniped.dev/generated/latest' package.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-02-16 13:00:08 -06:00
Margo Crawford
5611212ea9 Changing references from 1.19 to 1.20 2021-01-07 15:25:47 -08:00
Monis Khan
9c8b081906
Prevent multiple pinnipeds from thrashing on the API service
Signed-off-by: Monis Khan <mok@vmware.com>
2020-11-11 20:09:49 -05:00
Matt Moyer
f0320dfbd8
Rename login API to login.concierge.pinniped.dev.
This is the first of a few related changes that re-organize our API after the big recent changes that introduced the supervisor component.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-10-30 09:58:28 -05:00
Ryan Richard
38802c2184 Add a way to set a default supervisor TLS cert for when SNI won't work
- Setting a Secret in the supervisor's namespace with a special name
  will cause it to get picked up and served as the supervisor's TLS
  cert for any request which does not have a matching SNI cert.
- This is especially useful for when there is no DNS record for an
  issuer and the user will be accessing it via IP address. This
  is not how we would expect it to be used in production, but it
  might be useful for other cases.
- Includes a new integration test
- Also suppress all of the warnings about ignoring the error returned by
  Close() in lines like `defer x.Close()` to make GoLand happier
2020-10-27 16:33:08 -07:00
Andrew Keesler
110c72a5d4
dynamiccertauthority: fix cert expiration test failure
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-10-23 15:34:25 -04:00
Ryan Richard
94f20e57b1 Concierge controllers add labels to all created resources 2020-10-15 10:14:23 -07:00
Andrew Keesler
9e0195e024
kubecertagent: use initial event for when key can't be found
This should fix integration tests running on clusters that don't have
visible controller manager pods (e.g., GKE). Pinniped should boot, not
find any controller manager pods, but still post a status in the CIC.

I also updated a test helper so that we could tell the difference
between when an event was not added and when an event was added with
an empty key.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-09-24 16:54:20 -04:00
Andrew Keesler
6c555f94e3
internal/provider -> internal/dynamiccert
3 main reasons:
- The cert and key that we store in this object are not always used for TLS.
- The package name "provider" was a little too generic.
- dynamiccert.Provider reads more go-ish than provider.DynamicCertProvider.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-09-23 08:29:35 -04:00
Ryan Richard
6989e5da63 Merge branch 'main' into rename_stuff 2020-09-18 16:39:58 -07:00
Ryan Richard
80a520390b Rename many of resources that are created in Kubernetes by Pinniped
New resource naming conventions:
- Do not repeat the Kind in the name,
  e.g. do not call it foo-cluster-role-binding, just call it foo
- Names will generally start with a prefix to identify our component,
  so when a user lists all objects of that kind, they can tell to which
  component it is related,
  e.g. `kubectl get configmaps` would list one named "pinniped-config"
- It should be possible for an operator to make the word "pinniped"
  mostly disappear if they choose, by specifying the app_name in
  values.yaml, to the extent that is practical (but not from APIService
  names because those are hardcoded in golang)
- Each role/clusterrole and its corresponding binding have the same name
- Pinniped resource names that must be known by the server golang code
  are passed to the code at run time via ConfigMap, rather than
  hardcoded in the golang code. This also allows them to be prepended
  with the app_name from values.yaml while creating the ConfigMap.
- Since the CLI `get-kubeconfig` command cannot guess the name of the
  CredentialIssuerConfig resource in advance anymore, it lists all
  CredentialIssuerConfig in the app's namespace and returns an error
  if there is not exactly one found, and then uses that one regardless
  of its name
2020-09-18 15:56:50 -07:00
Matt Moyer
78ac27c262
Remove deprecated "pinniped.dev" API group.
This has been replaced by the "login.pinniped.dev" group with a slightly different API.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-09-18 17:32:15 -05:00
Matt Moyer
2d4d7e588a
Add Go vanity import paths.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-09-18 14:56:24 -05:00
Matt Moyer
8c9c1e206d
Update module/package names to match GitHub org switch.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-09-17 12:56:54 -05:00
Matt Moyer
af034befb0
Paramaterize the APIService name in apiServiceUpdaterController rather than hardcoding.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-09-17 09:52:23 -05:00
Andrew Keesler
eab5c2b86b
Save 2 lines by using inline-style comments for Copyright
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-09-16 10:35:19 -04:00
Andrew Keesler
e7b389ae6c
Update copyright to reference Pinniped contributors
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-09-16 10:05:51 -04:00
Ryan Richard
20b21e8639 Prefactor: Move updating of APIService to a separate controller
- The certs manager controller, along with its sibling certs expirer
  and certs observer controllers, are generally useful for any process
  that wants to create its own CA and TLS certs, but only if the
  updating of the APIService is not included in those controllers
- So that functionality for updating APIServices is moved to a new
  controller which watches the same Secret which is used by those
  other controllers
- Also parameterize `NewCertsManagerController` with the service name
  and the CA common name to make the controller more reusable
2020-09-08 16:36:49 -07:00
Matt Moyer
a503fa8673 Pull controller-go back into this repository as internal/controllerlib.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-28 13:07:47 -05:00
Andrew Keesler
ddb7a20c53
Use EC crypto (instead of RSA) to workaround weird test timeout
When we use RSA private keys to sign our test certificates, we run
into strange test timeouts. The internal/controller/apicerts package
was timing out on my machine more than once every 3 runs. When I
changed the RSA crypto to EC crypto, this timeout goes away. I'm not
gonna try to figure out what the deal is here because I think it would
take longer than it would be worth (although I am sure it is some fun
story involving prime numbers; the goroutine traces for timed out
tests would always include some big.Int operations involving prime
numbers...).

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-28 11:19:52 -04:00
Ryan Richard
cbc80d5bc4 RetryOnConflict when updating CredentialIssuerConfig from outside any controller
- Controllers will automatically run again when there's an error,
  but when we want to update CredentialIssuerConfig from server.go
  we should be careful to retry on conflicts
- Add unit tests for `issuerconfig.CreateOrUpdateCredentialIssuerConfig()`
  which was covered by integration tests in previous commits, but not
  covered by units tests yet.
2020-08-27 17:11:10 -07:00
Andrew Keesler
92a6b7f4a4
Use same lifetime for serving cert and CA cert
So that operators won't look at the lifetime of the CA cert and be
like, "wtf, why does the serving cert have the lifetime that I
specified, but its CA cert is valid for 100 years".

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-27 15:59:47 -04:00
Matt Moyer
8b36f2e8ae Convert code to use the new generated packages.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-24 14:42:27 -05:00
Andrew Keesler
142e9a1583
internal/certauthority: backdate certs even further
We are seeing between 1 and 2 minutes of difference between the current time
reported in the API server pod and the pinniped pods on one of our testing
environments. Hopefully this change makes our tests pass again.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-24 15:01:07 -04:00
Andrew Keesler
39c299a32d
Use duration and renewBefore to control API cert rotation
These configuration knobs are much more human-understandable than the
previous percentage-based threshold flag.

We now allow users to set the lifetime of the serving cert via a ConfigMap.
Previously this was hardcoded to 1 year.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 16:35:04 -04:00
Ryan Richard
3929fa672e Rename project 2020-08-20 10:54:15 -07:00
Andrew Keesler
43888e9e0a
Make CA age threshold delta more observable via more precision
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 11:42:29 -04:00
Andrew Keesler
6b90dc8bb7
Auto-rotate serving certificate
The rotation is forced by a new controller that deletes the serving cert
secret, as other controllers will see this deletion and ensure that a new
serving cert is created.

Note that the integration tests now have an addition worst case runtime of
60 seconds. This is because of the way that the aggregated API server code
reloads certificates. We will fix this in a future story. Then, the
integration tests should hopefully get much faster.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-08-20 10:03:36 -04:00
Matt Moyer
1b9a70d089
Switch back to an exec-based approach to grab the controller-manager CA. (#65)
This switches us back to an approach where we use the Pod "exec" API to grab the keys we need, rather than forcing our code to run on the control plane node. It will help us fail gracefully (or dynamically switch to alternate implementations) when the cluster is not self-hosted.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
2020-08-19 13:21:07 -05:00
Matt Moyer
864db74306 Make sure we have an explicit DNS SAN on our API serving certificate.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-12 11:01:06 -05:00
Ryan Richard
e0f0eca512 Add another assertion to certs_manager_test.go 2020-08-11 17:33:06 -07:00
Ryan Richard
5ec1fbd1ca Add an assertion that the private key and cert chain match in certs_manager_test.go
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-08-11 10:39:50 -07:00
Ryan Richard
fadd718d08 Add integration and more unit tests
- Add integration test for serving cert auto-generation and rotation
- Add unit test for `WithInitialEvent` of the cert manager controller
- Move UpdateAPIService() into the `apicerts` package, since that is
  the only user of the function.
2020-08-11 10:14:57 -07:00
Ryan Richard
8034ef24ff Fix a mistake from the previous commit
- Got the order of multiple return values backwards, which was caught
  by the integration tests
2020-08-10 19:34:45 -07:00
Ryan Richard
cc9ae23a0c Add tests for the new cert controllers and some other small refactorings
- Add a unit test for each cert controller
- Make DynamicTLSServingCertProvider an interface and use a mutex
  internally
- Create a shared ToPEM function instead of having two very similar
  functions
- Move the ObservableWithInformerOption test helper to testutils
- Rename some variables and imports
2020-08-10 18:53:53 -07:00
Ryan Richard
86c3f89b2e First draft of moving API server TLS cert generation to controllers
- Refactors the existing cert generation code into controllers
  which read and write a Secret containing the certs
- Does not add any new functionality yet, e.g. no new handling
  for cert expiration, and no leader election to allow for
  multiple servers running simultaneously
- This commit also doesn't add new tests for the cert generation
  code, but it should be more unit testable now as controllers
2020-08-09 10:04:05 -07:00