Commit Graph

52 Commits

Author SHA1 Message Date
Joshua Casey
e8490c0244 Do not use long-lived service account tokens in secrets 2023-10-30 23:18:21 -05:00
Ryan Richard
ca6c29e463 Fix deadlock during shutdown which prevented leader election cleanup
Before this fix, the deadlock would prevent the leader pod from giving
up its lease, which would make it take several minutes for new pods to
be allowed to elect a new leader. During that time, no Pinniped
controllers could write to the Kube API, so important resources were not
being updated during that window. It would also make pod shutdown take
about 1 minute.

After this fix, the leader gives up its lease immediately, and pod
shutdown takes about 1 second. This improves restart/upgrade time and
also fixes the problem where there was no leader for several minutes
after a restart/upgrade.

The deadlock was between the post-start hook and the pre-shutdown hook.
The pre-shutdown hook blocked until a certain background goroutine in
the post-start hook finished, but that goroutine could not finish until
the pre-shutdown hook finished. Thus, they were both blocked, waiting
for each other infinitely. Eventually the process would be externally
killed.

This deadlock was most likely introduced by some change in Kube's
generic api server package related to how the many complex channels used
during server shutdown interact with each other, and was not noticed
when we upgraded to the version which introduced the change.
2023-09-20 16:54:24 -07:00
Benjamin A. Petersen
3160b5bad1 Reorganized FederationDomain packages to avoid circular dependency
Co-authored-by: Ryan Richard <richardry@vmware.com>
2023-09-11 11:09:50 -07:00
Joshua Casey
38230fc518 Use pversion to retrieve buildtime information 2023-08-28 11:54:27 -05:00
Joshua Casey
ca05969f8d Integration tests should use 'kubectl explain --output plaintext-openapiv2'
- OpenAPIV3 discovery of aggregate APIs seems to need a little more work in K8s 1.28
2023-08-28 10:50:11 -05:00
Joshua Casey
1b504b6fbd Expose OpenAPIv3 explanations 2023-08-28 10:50:11 -05:00
Joshua Casey
741ccfd2ce Fix lint 2023-07-19 15:47:48 -05:00
Joshua Casey
52b0cf43ca Fix godoc 2023-07-19 15:47:47 -05:00
Ryan Richard
af01c3aeb6 Make kubectl explain work for Pinniped aggregated APIs
- Change update-codegen.sh script to also generated openapi code for the
  aggregated API types
- Update both aggregated API servers' configuration to make them serve
  the openapi docs for the aggregated APIs
- Add new integration test which runs `kubectl explain` for all Pinniped
  API resources, and all fields and subfields of those resources
- Update some the comments on the API structs
- Change some names of the tmpl files to make the filename better match
  the struct names
2022-09-21 15:15:37 -07:00
Monis Khan
0674215ef3
Switch to go.uber.org/zap for JSON formatted logging
Signed-off-by: Monis Khan <mok@vmware.com>
2022-05-24 11:17:42 -04:00
Ryan Richard
814399324f Merge branch 'main' into upstream_access_revocation_during_gc 2022-01-14 10:49:22 -08:00
Monis Khan
9599ffcfb9
Update all deps to latest where possible, bump Kube deps to v0.23.1
Highlights from this dep bump:

1. Made a copy of the v0.4.0 github.com/go-logr/stdr implementation
   for use in tests.  We must bump this dep as Kube code uses a
   newer version now.  We would have to rewrite hundreds of test log
   assertions without this copy.
2. Use github.com/felixge/httpsnoop to undo the changes made by
   ory/fosite#636 for CLI based login flows.  This is required for
   backwards compatibility with older versions of our CLI.  A
   separate change after this will update the CLI to be more
   flexible (it is purposefully not part of this change to confirm
   that we did not break anything).  For all browser login flows, we
   now redirect using http.StatusSeeOther instead of http.StatusFound.
3. Drop plog.RemoveKlogGlobalFlags as klog no longer mutates global
   process flags
4. Only bump github.com/ory/x to v0.0.297 instead of the latest
   v0.0.321 because v0.0.298+ pulls in a newer version of
   go.opentelemetry.io/otel/semconv which breaks k8s.io/apiserver.
   We should update k8s.io/apiserver to use the newer code.
5. Migrate all code from k8s.io/apimachinery/pkg/util/clock to
   k8s.io/utils/clock and k8s.io/utils/clock/testing
6. Delete testutil.NewDeleteOptionsRecorder and migrate to the new
   kubetesting.NewDeleteActionWithOptions
7. Updated ExpectedAuthorizeCodeSessionJSONFromFuzzing caused by
   fosite's new rotated_secrets OAuth client field.  This new field
   is currently not relevant to us as we have no private clients.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-12-16 21:15:27 -05:00
Ryan Richard
b8a93b6b90 Merge branch 'main' into customize_ports 2021-11-18 09:31:18 -08:00
Monis Khan
cd686ffdf3
Force the use of secure TLS config
This change updates the TLS config used by all pinniped components.
There are no configuration knobs associated with this change.  Thus
this change tightens our static defaults.

There are four TLS config levels:

1. Secure (TLS 1.3 only)
2. Default (TLS 1.2+ best ciphers that are well supported)
3. Default LDAP (TLS 1.2+ with less good ciphers)
4. Legacy (currently unused, TLS 1.2+ with all non-broken ciphers)

Highlights per component:

1. pinniped CLI
   - uses "secure" config against KAS
   - uses "default" for all other connections
2. concierge
   - uses "secure" config as an aggregated API server
   - uses "default" config as a impersonation proxy API server
   - uses "secure" config against KAS
   - uses "default" config for JWT authenticater (mostly, see code)
   - no changes to webhook authenticater (see code)
3. supervisor
   - uses "default" config as a server
   - uses "secure" config against KAS
   - uses "default" config against OIDC IDPs
   - uses "default LDAP" config against LDAP IDPs

Signed-off-by: Monis Khan <mok@vmware.com>
2021-11-17 16:55:35 -05:00
Ryan Richard
ca2cc40769 Add impersonationProxyServerPort to the Concierge's static ConfigMap
- Used to determine on which port the impersonation proxy will bind
- Defaults to 8444, which is the old hard-coded port value
- Allow the port number to be configured to any value within the
  range 1024 to 65535
- This commit does not include adding new config knobs to the ytt
  values file, so while it is possible to change this port without
  needing to recompile, it is not convenient
2021-11-17 13:27:59 -08:00
Ryan Richard
2383a88612 Add aggregatedAPIServerPort to the Concierge's static ConfigMap
- Allow the port number to be configured to any value within the
  range 1024 to 65535
- This commit does not include adding new config knobs to the ytt
  values file, so while it is possible to change this port without
  needing to recompile, it is not convenient
2021-11-16 16:43:51 -08:00
Monis Khan
0d285ce993
Ensure concierge and supervisor gracefully exit
Changes made to both components:

1. Logs are always flushed on process exit
2. Informer cache sync can no longer hang process start up forever

Changes made to concierge:

1. Add pre-shutdown hook that waits for controllers to exit cleanly
2. Informer caches are synced in post-start hook

Changes made to supervisor:

1. Add shutdown code that waits for controllers to exit cleanly
2. Add shutdown code that waits for active connections to become idle

Waiting for controllers to exit cleanly is critical as this allows
the leader election logic to release the lock on exit.  This reduces
the time needed for the next leader to be elected.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-30 20:29:52 -04:00
Matt Moyer
58bbffded4
Switch to a slimmer distroless base image.
At a high level, it switches us to a distroless base container image, but that also includes several related bits:

- Add a writable /tmp but make the rest of our filesystems read-only at runtime.

- Condense our main server binaries into a single pinniped-server binary. This saves a bunch of space in
  the image due to duplicated library code. The correct behavior is dispatched based on `os.Args[0]`, and
  the `pinniped-server` binary is symlinked to `pinniped-concierge` and `pinniped-supervisor`.

- Strip debug symbols from our binaries. These aren't really useful in a distroless image anyway and all the
  normal stuff you'd expect to work, such as stack traces, still does.

- Add a separate `pinniped-concierge-kube-cert-agent` binary with "sleep" and "print" functionality instead of
  using builtin /bin/sleep and /bin/cat for the kube-cert-agent. This is split from the main server binary
  because the loading/init time of the main server binary was too large for the tiny resource footprint we
  established in our kube-cert-agent PodSpec. Using a separate binary eliminates this issue and the extra
  binary adds only around 1.5MiB of image size.

- Switch the kube-cert-agent code to use a JSON `{"tls.crt": "<b64 cert>", "tls.key": "<b64 key>"}` format.
  This is more robust to unexpected input formatting than the old code, which simply concatenated the files
  with some extra newlines and split on whitespace.

- Update integration tests that made now-invalid assumptions about the `pinniped-server` image.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-08-09 15:05:13 -04:00
Monis Khan
00694c9cb6
dynamiccert: split into serving cert and CA providers
Signed-off-by: Monis Khan <mok@vmware.com>
2021-03-15 12:24:07 -04:00
Ryan Richard
c82f568b2c certauthority.go: Refactor issuing client versus server certs
We were previously issuing both client certs and server certs with
both extended key usages included. Split the Issue*() methods into
separate methods for issuing server certs versus client certs so
they can have different extended key usages tailored for each use
case.

Also took the opportunity to clean up the parameters of the Issue*()
methods and New() methods to more closely match how we prefer to call
them. We were always only passing the common name part of the
pkix.Name to New(), so now the New() method just takes the common name
as a string. When making a server cert, we don't need to set the
deprecated common name field, so remove that param. When making a client
cert, we're always making it in the format expected by the Kube API
server, so just accept the username and group as parameters directly.
2021-03-12 16:09:37 -08:00
Monis Khan
2d28d1da19
Implement all optional methods in dynamic certs provider
Signed-off-by: Monis Khan <mok@vmware.com>
2021-03-11 16:24:08 -05:00
Ryan Richard
0b300cbe42 Use TokenCredentialRequest instead of base64 token with impersonator
To make an impersonation request, first make a TokenCredentialRequest
to get a certificate. That cert will either be issued by the Kube
API server's CA or by a new CA specific to the impersonator. Either
way, you can then make a request to the impersonator and present
that client cert for auth and the impersonator will accept it and
make the impesonation call on your behalf.

The impersonator http handler now borrows some Kube library code
to handle request processing. This will allow us to more closely
mimic the behavior of a real API server, e.g. the client cert
auth will work exactly like the real API server.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-03-10 10:30:06 -08:00
Andrew Keesler
069b3fba37
Merge remote-tracking branch 'upstream/main' into impersonation-proxy
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2021-02-23 12:10:52 -05:00
Monis Khan
abc941097c
Add WhoAmIRequest Aggregated Virtual REST API
This change adds a new virtual aggregated API that can be used by
any user to echo back who they are currently authenticated as.  This
has general utility to end users and can be used in tests to
validate if authentication was successful.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-02-22 20:02:41 -05:00
Monis Khan
62630d6449
getAggregatedAPIServerScheme: move group version logic internally
Signed-off-by: Monis Khan <mok@vmware.com>
2021-02-19 11:10:54 -05:00
Andrew Keesler
957cb2d56c
Merge remote-tracking branch 'upstream/main' into impersonation-proxy
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2021-02-18 13:37:28 -05:00
Matt Moyer
6565265bee
Use new 'go.pinniped.dev/generated/latest' package.
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-02-16 13:00:08 -06:00
Andrew Keesler
fdd8ef5835
internal/concierge/impersonator: handle custom login API group
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2021-02-16 07:55:09 -05:00
Ryan Richard
5cd60fa5f9 Move starting/stopping impersonation proxy server to a new controller
- Watch a configmap to read the configuration of the impersonation
  proxy and reconcile it.
- Implements "auto" mode by querying the API for control plane nodes.
- WIP: does not create a load balancer or proper TLS certificates yet.
  Those will come in future commits.

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-02-11 17:25:52 -08:00
Ryan Richard
e4c49c37b9 Merge branch 'main' into impersonation-proxy 2021-02-09 13:45:37 -08:00
Monis Khan
2679d27ced
Use server scheme to handle credential request API group changes
Signed-off-by: Monis Khan <mok@vmware.com>
2021-02-09 15:51:38 -05:00
Monis Khan
6b71b8d8ad
Revert server side token credential request API group changes
Signed-off-by: Monis Khan <mok@vmware.com>
2021-02-09 15:51:35 -05:00
Margo Crawford
dfcc2a1eb8 Introduce clusterhost package to determine whether a cluster has control plane nodes
Also added hasExternalLoadBalancerProvider key to cluster capabilities
for integration testing.

Signed-off-by: Ryan Richard <richardry@vmware.com>
2021-02-09 11:16:01 -08:00
Ryan Richard
288d9c999e Use custom suffix in Spec.Authenticator.APIGroup of TokenCredentialRequest
When the Pinniped server has been installed with the `api_group_suffix`
option, for example using `mysuffix.com`, then clients who would like to
submit a `TokenCredentialRequest` to the server should set the
`Spec.Authenticator.APIGroup` field as `authentication.concierge.mysuffix.com`.

This makes more sense from the client's point of view than using the
default `authentication.concierge.pinniped.dev` because
`authentication.concierge.mysuffix.com` is the name of the API group
that they can observe their cluster and `authentication.concierge.pinniped.dev`
does not exist as an API group on their cluster.

This commit includes both the client and server-side changes to make
this work, as well as integration test updates.

Co-authored-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
Co-authored-by: Margo Crawford <margaretc@vmware.com>
2021-02-03 15:49:15 -08:00
Margo Crawford
b6abb022f6 Add initial implementation of impersonation proxy.
Signed-off-by: Margo Crawford <margaretc@vmware.com>
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-02-03 09:31:13 -08:00
Monis Khan
300d7bd99c
Drop duplicate logic for unversioned type registration
Signed-off-by: Monis Khan <mok@vmware.com>
2021-02-03 12:16:57 -05:00
Monis Khan
012bebd66e
Avoid double registering types in server scheme
This makes sure that if our clients ever send types with the wrong
group, the server will refuse to decode it.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-02-03 12:16:57 -05:00
Monis Khan
efe1fa89fe Allow multiple Pinnipeds to work on same cluster
Yes, this is a huge commit.

The middleware allows you to customize the API groups of all of the
*.pinniped.dev API groups.

Some notes about other small things in this commit:
- We removed the internal/client package in favor of pkg/conciergeclient. The
  two packages do basically the same thing. I don't think we use the former
  anymore.
- We re-enabled cluster-scoped owner assertions in the integration tests.
  This code was added in internal/ownerref. See a0546942 for when this
  assertion was removed.
- Note: the middlware code is in charge of restoring the GV of a request object,
  so we should never need to write mutations that do that.
- We updated the supervisor secret generation to no longer manually set an owner
  reference to the deployment since the middleware code now does this. I think we
  still need some way to make an initial event for the secret generator
  controller, which involves knowing the namespace and the name of the generated
  secret, so I still wired the deployment through. We could use a namespace/name
  tuple here, but I was lazy.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
Co-authored-by: Ryan Richard <richardry@vmware.com>
2021-02-02 15:18:41 -08:00
Andrew Keesler
50c3e4c00f
Merge branch 'main' into reenable-max-inflight-checks 2021-01-19 18:14:27 -05:00
Andrew Keesler
88fd9e5c5e
internal/config: wire API group suffix through to server components
Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2021-01-19 17:23:20 -05:00
Andrew Keesler
792bb98680
Revert "Temporarily disable max inflight checks for mutating requests"
This reverts commit 4a28d1f800.

This commit was originally made to fix a bug that caused TokenCredentialRequest
to become slow when the server was idle for an extended period of time. This was
to address a Kubernetes issue that was fixed in 1.19.5 and onward. We are now
running with Kubernetes 1.20, so we should be able to pick up this fix.
2021-01-13 11:12:09 -05:00
Margo Crawford
6f04613aed Merge branch 'main' of github.com:vmware-tanzu/pinniped into kubernetes-1.20 2021-01-08 13:22:31 -08:00
Margo Crawford
5611212ea9 Changing references from 1.19 to 1.20 2021-01-07 15:25:47 -08:00
Monis Khan
bba0f3a230
Always set an owner ref back to our deployment
This change updates our clients to always set an owner ref when:

1. The operation is a create
2. The object does not already have an owner ref set

Signed-off-by: Monis Khan <mok@vmware.com>
2021-01-07 15:25:40 -05:00
Monis Khan
4a28d1f800
Temporarily disable max inflight checks for mutating requests
Signed-off-by: Monis Khan <mok@vmware.com>
2020-11-19 21:21:10 -05:00
Monis Khan
9356f64c55
Remove global klog --log-flush-frequency flag
Signed-off-by: Monis Khan <mok@vmware.com>
2020-11-10 08:48:42 -05:00
Andrew Keesler
fcea48c8f9
Run as non-root
I tried to follow a principle of encapsulation here - we can still default to
peeps making connections to 80/443 on a Service object, but internally we will
use 8080/8443.

Signed-off-by: Andrew Keesler <akeesler@vmware.com>
2020-11-02 12:51:15 -05:00
Matt Moyer
34da8c7877
Rename existing references to "IDP" and "Identity Provider".
Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-10-30 15:12:01 -05:00
Matt Moyer
f0320dfbd8
Rename login API to login.concierge.pinniped.dev.
This is the first of a few related changes that re-organize our API after the big recent changes that introduced the supervisor component.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2020-10-30 09:58:28 -05:00
Andrew Keesler
617c5608ca Supervisor controllers apply custom labels to JWKS secrets
Signed-off-by: Ryan Richard <richardry@vmware.com>
2020-10-15 12:40:56 -07:00