Commit Graph

415 Commits

Author SHA1 Message Date
Monis Khan 0674215ef3
Switch to go.uber.org/zap for JSON formatted logging
Signed-off-by: Monis Khan <mok@vmware.com>
2022-05-24 11:17:42 -04:00
Margo Crawford 0b72f7084c JWTAuthenticator distributed claims resolution honors tls config
Kube 1.23 introduced a new field on the OIDC Authenticator which
allows us to pass in a client with our own TLS config. See
https://github.com/kubernetes/kubernetes/pull/106141.

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-04-19 11:36:46 -07:00
Margo Crawford d5337c9c19 Error format of untrusted certificate errors should depend on OS
Go 1.18.1 started using MacOS' x509 verification APIs on Macs
rather than Go's own. The error messages are different.

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-04-14 17:37:36 -07:00
Margo Crawford 53597bb824 Introduce FIPS compatibility
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-03-29 16:58:41 -07:00
Ryan Richard fffcb7f5b4 Update to github.com/golangci/golangci-lint/cmd/golangci-lint@v1.44.2
- Two of the linters changed their names
- Updated code and nolint comments to make all linters pass with 1.44.2
- Added a new hack/install-linter.sh script to help developers install
  the expected version of the linter for local development
2022-03-08 12:28:09 -08:00
Margo Crawford fdac4d16f0 Only run group refresh when the skipGroupRefresh boolean isn't set
for AD and LDAP
2022-02-17 12:50:28 -08:00
Ryan Richard 75e4093067 Merge branch 'main' into ldap_and_activedirectory_status_conditions_bug 2022-01-14 16:50:34 -08:00
Ryan Richard 7551af3eb8 Fix code that did not auto-merge correctly in previous merge from main 2022-01-14 10:59:39 -08:00
Ryan Richard 814399324f Merge branch 'main' into upstream_access_revocation_during_gc 2022-01-14 10:49:22 -08:00
Ryan Richard db0a765b98 Merge branch 'main' into ldap_and_activedirectory_status_conditions_bug 2022-01-14 10:06:16 -08:00
Ryan Richard 092a80f849 Refactor some variable names and update one comment
Change variable names to match previously renamed interface name.
2022-01-14 10:06:00 -08:00
Ryan Richard 651d392b00 Refuse logins when no upstream refresh token and no userinfo endpoint
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-01-12 18:03:25 -08:00
Ryan Richard 91924ec685 Revert adding allowAccessTokenBasedRefresh flag to OIDCIdentityProvider
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-01-12 18:03:25 -08:00
Margo Crawford 683a2c5b23 WIP adding access token to storage upon login 2022-01-12 18:03:25 -08:00
Margo Crawford 2958461970 Addressing PR feedback
store issuer and subject in storage for refresh
Clean up some constants

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2022-01-10 11:03:37 -08:00
Margo Crawford 74b007ff66 Validate that issuer url and urls returned from discovery are https
and that they have no query or fragment

Signed-off-by: Ryan Richard <richardry@vmware.com>
2022-01-10 11:03:37 -08:00
Ryan Richard 7f99d78462 Fix bug where LDAP or AD status conditions were not updated correctly
When the LDAP and AD IDP watcher controllers encountered an update error
while trying to update the status conditions of the IDP resources, then
they would drop the computed desired new value of the condition on the
ground. Next time the controller ran it would not try to update the
condition again because it wants to use the cached settings and had
already forgotten the desired new value of the condition computed during
the previous run of the controller. This would leave the outdated value
of the condition on the IDP resource.

This bug would manifest in CI as random failures in which the expected
condition message and the actual condition message would refer to
different versions numbers of the bind secret. The actual condition
message would refer to an older version of the bind secret because the
update failed and then the new desired message got dropped on the
ground.

This commit changes the in-memory caching strategy to also cache the
computed condition messages, allowing the conditions to be updated
on the IDP resource during future calls to Sync() in the case of a
failed update.
2022-01-07 17:19:13 -08:00
Monis Khan c155c6e629
Clean up nits in AD code
- Make everything private
- Drop unused AuthTime field
- Use %q format string instead of "%s"
- Only rely on GetRawAttributeValues in AttributeUnchangedSinceLogin

Signed-off-by: Monis Khan <mok@vmware.com>
2021-12-17 08:53:44 -05:00
Monis Khan 9599ffcfb9
Update all deps to latest where possible, bump Kube deps to v0.23.1
Highlights from this dep bump:

1. Made a copy of the v0.4.0 github.com/go-logr/stdr implementation
   for use in tests.  We must bump this dep as Kube code uses a
   newer version now.  We would have to rewrite hundreds of test log
   assertions without this copy.
2. Use github.com/felixge/httpsnoop to undo the changes made by
   ory/fosite#636 for CLI based login flows.  This is required for
   backwards compatibility with older versions of our CLI.  A
   separate change after this will update the CLI to be more
   flexible (it is purposefully not part of this change to confirm
   that we did not break anything).  For all browser login flows, we
   now redirect using http.StatusSeeOther instead of http.StatusFound.
3. Drop plog.RemoveKlogGlobalFlags as klog no longer mutates global
   process flags
4. Only bump github.com/ory/x to v0.0.297 instead of the latest
   v0.0.321 because v0.0.298+ pulls in a newer version of
   go.opentelemetry.io/otel/semconv which breaks k8s.io/apiserver.
   We should update k8s.io/apiserver to use the newer code.
5. Migrate all code from k8s.io/apimachinery/pkg/util/clock to
   k8s.io/utils/clock and k8s.io/utils/clock/testing
6. Delete testutil.NewDeleteOptionsRecorder and migrate to the new
   kubetesting.NewDeleteActionWithOptions
7. Updated ExpectedAuthorizeCodeSessionJSONFromFuzzing caused by
   fosite's new rotated_secrets OAuth client field.  This new field
   is currently not relevant to us as we have no private clients.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-12-16 21:15:27 -05:00
Margo Crawford 59d999956c Move ad specific stuff to controller
also make extra refresh attributes a separate field rather than part of
Extra

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford acaad05341 Make pwdLastSet stuff more generic and not require parsing the timestamp
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford ee4f725209 Incorporate PR feedback 2021-12-09 16:16:36 -08:00
Margo Crawford ef5a04c7ce Check for locked users on ad upstream refresh
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford f62e9a2d33 Active directory checks for deactivated user
Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:36 -08:00
Margo Crawford da9b4620b3 Active Directory checks whether password has changed recently during
upstream refresh

Signed-off-by: Margo Crawford <margaretc@vmware.com>
2021-12-09 16:16:35 -08:00
Ryan Richard f410d2bd00 Add revocation of upstream access tokens to garbage collector
Also refactor the code that decides which types of revocation failures
are worth retrying. Be more selective by only retrying those types of
errors that are likely to be worth retrying.
2021-12-08 14:29:25 -08:00
Ryan Richard b981055d31 Support revocation of access tokens in UpstreamOIDCIdentityProviderI
- Rename the RevokeRefreshToken() function to RevokeToken() and make it
  take the token type (refresh or access) as a new parameter.
- This is a prefactor getting ready to support revocation of upstream
  access tokens in the garbage collection handler.
2021-12-03 13:44:24 -08:00
Ryan Richard 69be273e01
Merge branch 'main' into upstream_refresh_revocation_during_gc 2021-11-23 14:55:44 -08:00
Ryan Richard 91eed1ab24 Merge branch 'main' into upstream_refresh_revocation_during_gc 2021-11-23 12:11:39 -08:00
Ryan Richard 3ca8c49334 Improve garbage collector log format and some comments 2021-11-23 12:11:17 -08:00
Ryan Richard b8a93b6b90 Merge branch 'main' into customize_ports 2021-11-18 09:31:18 -08:00
Ryan Richard 3b3641568a GC retries failed upstream revocations for a while, but not forever 2021-11-17 15:58:44 -08:00
Monis Khan cd686ffdf3
Force the use of secure TLS config
This change updates the TLS config used by all pinniped components.
There are no configuration knobs associated with this change.  Thus
this change tightens our static defaults.

There are four TLS config levels:

1. Secure (TLS 1.3 only)
2. Default (TLS 1.2+ best ciphers that are well supported)
3. Default LDAP (TLS 1.2+ with less good ciphers)
4. Legacy (currently unused, TLS 1.2+ with all non-broken ciphers)

Highlights per component:

1. pinniped CLI
   - uses "secure" config against KAS
   - uses "default" for all other connections
2. concierge
   - uses "secure" config as an aggregated API server
   - uses "default" config as a impersonation proxy API server
   - uses "secure" config against KAS
   - uses "default" config for JWT authenticater (mostly, see code)
   - no changes to webhook authenticater (see code)
3. supervisor
   - uses "default" config as a server
   - uses "secure" config against KAS
   - uses "default" config against OIDC IDPs
   - uses "default LDAP" config against LDAP IDPs

Signed-off-by: Monis Khan <mok@vmware.com>
2021-11-17 16:55:35 -05:00
Ryan Richard ca2cc40769 Add impersonationProxyServerPort to the Concierge's static ConfigMap
- Used to determine on which port the impersonation proxy will bind
- Defaults to 8444, which is the old hard-coded port value
- Allow the port number to be configured to any value within the
  range 1024 to 65535
- This commit does not include adding new config knobs to the ytt
  values file, so while it is possible to change this port without
  needing to recompile, it is not convenient
2021-11-17 13:27:59 -08:00
Ryan Richard 2388e25235 Revoke upstream OIDC refresh tokens during GC 2021-11-10 15:34:19 -08:00
Ryan Richard d0ced1fd74 WIP towards revoking upstream refresh tokens during GC
- Discover the revocation endpoint of the upstream provider in
  oidc_upstream_watcher.go and save it into the cache for future use
  by the garbage collector controller
- Adds RevokeRefreshToken to UpstreamOIDCIdentityProviderI
- Implements the production version of RevokeRefreshToken
- Implements test doubles for RevokeRefreshToken for future use in
  garbage collector's unit tests
- Prefactors the crud and session storage types for future use in the
  garbage collector controller
- See remaining TODOs in garbage_collector.go
2021-10-22 14:32:26 -07:00
Ryan Richard e0db59fd09 More small updates based on PR feedback 2021-10-22 10:23:21 -07:00
Ryan Richard dec43289f6 Lots of small updates based on PR feedback 2021-10-20 15:53:25 -07:00
Ryan Richard c43e019d3a Change default of additionalScopes and disallow "hd" in additionalAuthorizeParameters 2021-10-18 16:41:31 -07:00
Ryan Richard d68bebeb49 Merge branch 'main' into upstream_refresh 2021-10-18 15:35:46 -07:00
Ryan Richard ddb23bd2ed Add upstream refresh related config to OIDCIdentityProvider CRD
Also update related docs.
2021-10-14 15:49:44 -07:00
Margo Crawford 1bd346cbeb Require refresh tokens for upstream OIDC and save more session data
- Requiring refresh tokens to be returned from upstream OIDC idps
- Storing refresh tokens (for oidc) and idp information (for all idps) in custom session data during authentication
- Don't pass access=offline all the time
2021-10-08 15:48:21 -07:00
Monis Khan 4bf715758f
Do not rotate impersonation proxy signer CA unless necessary
This change fixes a copy paste error that led to the impersonation
proxy signer CA being rotated based on the configuration of the
rotation of the aggregated API serving certificate.  This would lead
to occasional "Unauthorized" flakes in our CI environments that
rotate the serving certificate at a frequent interval.

Updated the certs_expirer controller logs to be more detailed.

Updated CA common names to be more specific (this does not update
any previously generated CAs).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-10-06 12:03:49 -04:00
Monis Khan 266d64f7d1
Do not truncate x509 errors
Signed-off-by: Monis Khan <mok@vmware.com>
2021-09-29 09:38:22 -04:00
Monis Khan 0d6bf9db3e
kubecertagent: attempt to load signer as long as agent labels match
This change updates the kube cert agent to a middle ground behavior
that balances leader election gating with how quickly we load the
signer.

If the agent labels have not changed, we will attempt to load the
signer even if we cannot roll out the latest version of the kube
cert agent deployment.

This gives us the best behavior - we do not have controllers
fighting over the state of the deployment and we still get the
signer loaded quickly.

We will have a minute of downtime when the kube cert agent deployment
changes because the new pods will have to wait to become a leader
and for the new deployment to rollout the new pods.  We would need
to have a per pod deployment if we want to avoid that downtime (but
this would come at the cost of startup time and would require
coordination with the kubelet in regards to pod readiness).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-09-21 16:20:56 -04:00
Monis Khan 91c8f747f4
certauthority: tolerate larger clock skew between API server and pinniped
This change updates our certificate code to use the same 5 minute
backdate that is used by the Kubernetes controller manager.  This
helps to account for clock skews between the API servers and the
kubelets that are running the pinniped pods.  While this backdating
reflects a large percentage of the lifetime of our short lived
certificates (100% for the 5 minute client certificates), even a 10
minute irrevocable client certificate is within our limits.  When
we move to the CSR based short lived certificates, they will always
have at least a 15 minute lifetime (5 minute backdating plus 10 minute
minimum valid duration).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-09-21 09:32:24 -04:00
Monis Khan 09467d3e24
kubecertagent: fix flakey tests
This commit makes the following changes to the kube cert agent tests:

1. Informers are synced on start using the controllerinit code
2. Deployment client and informer are synced per controller sync loop
3. Controller sync loop exits after two consistent errors
4. Use assert instead of require to avoid ending the test early

Signed-off-by: Monis Khan <mok@vmware.com>
2021-09-16 14:48:04 -04:00
Ryan Richard bdcf468e52 Add log statement for when kube cert agent key has been loaded
Because it makes things easier to debug on a real cluster
2021-09-15 14:02:46 -07:00
Ryan Richard 55de160551 Bump the version number of the kube cert agent label
Not required, but within the spirit of using the version number.
Since the existing kube cert agent deployment will get deleted anyway
during an upgrade, it shouldn't hurt to change the version number.
New installations will get the new version number on the new kube cert
agent deployment.
2021-09-14 15:27:15 -07:00
Ryan Richard cec9f3c4d7 Improve the selectors of Deployments and Services
Fixes #801. The solution is complicated by the fact that the Selector
field of Deployments is immutable. It would have been easy to just
make the Selectors of the main Concierge Deployment, the Kube cert agent
Deployment, and the various Services use more specific labels, but
that would break upgrades. Instead, we make the Pod template labels and
the Service selectors more specific, because those not immutable, and
then handle the Deployment selectors in a special way.

For the main Concierge and Supervisor Deployments, we cannot change
their selectors, so they remain "app: app_name", and we make other
changes to ensure that only the intended pods are selected. We keep the
original "app" label on those pods and remove the "app" label from the
pods of the Kube cert agent Deployment. By removing it from the Kube
cert agent pods, there is no longer any chance that they will
accidentally get selected by the main Concierge Deployment.

For the Kube cert agent Deployment, we can change the immutable selector
by deleting and recreating the Deployment. The new selector uses only
the unique label that has always been applied to the pods of that
deployment. Upon recreation, these pods no longer have the "app" label,
so they will not be selected by the main Concierge Deployment's
selector.

The selector of all Services have been updated to use new labels to
more specifically target the intended pods. For the Concierge Services,
this will prevent them from accidentally including the Kube cert agent
pods. For the Supervisor Services, we follow the same convention just
to be consistent and to help future-proof the Supervisor app in case it
ever has a second Deployment added to it.

The selector of the auto-created impersonation proxy Service was
also previously using the "app" label. There is no change to this
Service because that label will now select the correct pods, since
the Kube cert agent pods no longer have that label. It would be possible
to update that selector to use the new more specific label, but then we
would need to invent a way to pass that label into the controller, so
it seemed like more work than was justified.
2021-09-14 13:35:10 -07:00