ContainerImage.Pinniped

Author	SHA1	Message	Date
Ryan Richard	2388e25235	Revoke upstream OIDC refresh tokens during GC	2021-11-10 15:34:19 -08:00
Margo Crawford	cb60a44f8a	Extract LDAP refresh search into a helper function. Also added an integration test for refresh failing after updating the username attribute	2021-11-05 14:22:43 -07:00
Margo Crawford	b5b8cab717	Refactors: - pull construction of authenticators.Response into searchAndBindUser - remove information about the identity provider in the error that gets returned to users. Put it in debug instead, where it may show up in logs. Signed-off-by: Margo Crawford <margaretc@vmware.com>	2021-11-05 14:22:43 -07:00
Margo Crawford	f988879b6e	Addressing code review changes - changed to use custom authenticators.Response rather than the k8s one that doesn't include space for a DN - Added more checking for correct idp type in token handler - small style changes Signed-off-by: Margo Crawford <margaretc@vmware.com>	2021-11-05 14:22:43 -07:00
Margo Crawford	84edfcb541	Refactor out a function, add tests for getting the wrong idp uid	2021-11-05 14:22:43 -07:00
Margo Crawford	8396937503	Updates to tests and some error assertions	2021-11-05 14:22:43 -07:00
Margo Crawford	2c4dc2951d	resolved a couple of testing related todos	2021-11-05 14:22:43 -07:00
Margo Crawford	7a58086040	Check that username and subject remain the same for ldap refresh	2021-11-05 14:22:43 -07:00
Margo Crawford	19281313dd	Basic upstream LDAP/AD refresh This stores the user DN in the session data upon login and checks that the entry still exists upon refresh. It doesn't check anything else about the entry yet.	2021-11-05 14:22:42 -07:00
Ryan Richard	d0ced1fd74	WIP towards revoking upstream refresh tokens during GC - Discover the revocation endpoint of the upstream provider in oidc_upstream_watcher.go and save it into the cache for future use by the garbage collector controller - Adds RevokeRefreshToken to UpstreamOIDCIdentityProviderI - Implements the production version of RevokeRefreshToken - Implements test doubles for RevokeRefreshToken for future use in garbage collector's unit tests - Prefactors the crud and session storage types for future use in the garbage collector controller - See remaining TODOs in garbage_collector.go	2021-10-22 14:32:26 -07:00
Ryan Richard	e0db59fd09	More small updates based on PR feedback	2021-10-22 10:23:21 -07:00
Ryan Richard	dec43289f6	Lots of small updates based on PR feedback	2021-10-20 15:53:25 -07:00
Ryan Richard	c43e019d3a	Change default of additionalScopes and disallow "hd" in additionalAuthorizeParameters	2021-10-18 16:41:31 -07:00
Ryan Richard	d68bebeb49	Merge branch 'main' into upstream_refresh	2021-10-18 15:35:46 -07:00
Ryan Richard	c51d7c08b9	Add a comment that might be useful some day	2021-10-18 15:35:22 -07:00
Ryan Richard	ddb23bd2ed	Add upstream refresh related config to OIDCIdentityProvider CRD Also update related docs.	2021-10-14 15:49:44 -07:00
Ryan Richard	a34dae549b	When performing an upstream refresh, use the configured http client Otherwise, the CA and proxy settings will not be used for the call to the upstream token endpoint while performing the refresh. This mistake was exposed by the TestSupervisorLogin integration test, so it has test coverage.	2021-10-13 14:05:00 -07:00
Ryan Richard	79ca1d7fb0	Perform an upstream refresh during downstream refresh for OIDC upstreams - If the upstream refresh fails, then fail the downstream refresh - If the upstream refresh returns an ID token, then validate it (we use its claims in the future, but not in this commit) - If the upstream refresh returns a new refresh token, then save it into the user's session in storage - Pass the provider cache into the token handler so it can use the cached providers to perform upstream refreshes - Handle unexpected errors in the token handler where the user's session does not contain the expected data. These should not be possible in practice unless someone is manually editing the storage, but handle them anyway just to be safe. - Refactor to share the refresh code between the CLI and the token endpoint by moving it into the UpstreamOIDCIdentityProviderI interface, since the token endpoint needed it to be part of that interface anyway	2021-10-13 12:31:20 -07:00
Margo Crawford	1bd346cbeb	Require refresh tokens for upstream OIDC and save more session data - Requiring refresh tokens to be returned from upstream OIDC idps - Storing refresh tokens (for oidc) and idp information (for all idps) in custom session data during authentication - Don't pass access=offline all the time	2021-10-08 15:48:21 -07:00
Margo Crawford	43244b6599	Do not pass through downstream prompt param - throw an error when prompt=none because the spec says we can't ignore it - ignore the other prompt params Signed-off-by: Ryan Richard <richardry@vmware.com>	2021-10-06 16:30:30 -07:00
Ryan Richard	c6f1d29538	Use PinnipedSession type instead of fosite's DefaultSession type This will allow us to store custom data inside the fosite session storage for all downstream OIDC sessions. Signed-off-by: Margo Crawford <margaretc@vmware.com>	2021-10-06 15:28:13 -07:00
Monis Khan	4bf715758f	Do not rotate impersonation proxy signer CA unless necessary This change fixes a copy paste error that led to the impersonation proxy signer CA being rotated based on the configuration of the rotation of the aggregated API serving certificate. This would lead to occasional "Unauthorized" flakes in our CI environments that rotate the serving certificate at a frequent interval. Updated the certs_expirer controller logs to be more detailed. Updated CA common names to be more specific (this does not update any previously generated CAs). Signed-off-by: Monis Khan <mok@vmware.com>	2021-10-06 12:03:49 -04:00
Monis Khan	266d64f7d1	Do not truncate x509 errors Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-29 09:38:22 -04:00
Monis Khan	03bbc54023	upstreamoidc: log claim keys at debug level At debug level: upstreamoidc.go:213] "claims from ID token and userinfo" providerName="oidc" keys=[at_hash aud email email_verified exp iat iss sub] At all level: upstreamoidc.go:207] "claims from ID token and userinfo" providerName="oidc" claims="{\"at_hash\":\"C55S-BgnHTmr2_TNf...hYmVhYWESBWxvY2Fs\"}" Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-28 12:58:00 -04:00
Monis Khan	e86488615a	upstreamoidc: directly detect user info support Avoid reliance on an error string from the Core OS OIDC lib. Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-28 11:29:38 -04:00
Monis Khan	0d6bf9db3e	kubecertagent: attempt to load signer as long as agent labels match This change updates the kube cert agent to a middle ground behavior that balances leader election gating with how quickly we load the signer. If the agent labels have not changed, we will attempt to load the signer even if we cannot roll out the latest version of the kube cert agent deployment. This gives us the best behavior - we do not have controllers fighting over the state of the deployment and we still get the signer loaded quickly. We will have a minute of downtime when the kube cert agent deployment changes because the new pods will have to wait to become a leader and for the new deployment to rollout the new pods. We would need to have a per pod deployment if we want to avoid that downtime (but this would come at the cost of startup time and would require coordination with the kubelet in regards to pod readiness). Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-21 16:20:56 -04:00
Mo Khan	9851035e40	Merge pull request #847 from enj/enj/i/tcr_log token credential request: fix trace log kind	2021-09-21 12:36:16 -04:00
Mo Khan	aa5ff162b4	Merge pull request #849 from enj/enj/i/clock_skew certauthority: tolerate larger clock skew between API server and pinniped	2021-09-21 12:18:49 -04:00
Monis Khan	91c8f747f4	certauthority: tolerate larger clock skew between API server and pinniped This change updates our certificate code to use the same 5 minute backdate that is used by the Kubernetes controller manager. This helps to account for clock skews between the API servers and the kubelets that are running the pinniped pods. While this backdating reflects a large percentage of the lifetime of our short lived certificates (100% for the 5 minute client certificates), even a 10 minute irrevocable client certificate is within our limits. When we move to the CSR based short lived certificates, they will always have at least a 15 minute lifetime (5 minute backdating plus 10 minute minimum valid duration). Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-21 09:32:24 -04:00
Ryan Richard	4e98c1bbdb	Tests use CertificatesV1 when available, otherwise use CertificatesV1beta1 CertificatesV1beta1 was removed in Kube 1.22, so the tests cannot blindly rely on it anymore. Use CertificatesV1 whenever the server reports that it is available, and otherwise use the old CertificatesV1beta1. Note that CertificatesV1 was introduced in Kube 1.19.	2021-09-20 17:14:58 -07:00
Monis Khan	e65817ad5b	token credential request: fix trace log kind Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-20 15:34:05 -04:00
Monis Khan	09467d3e24	kubecertagent: fix flakey tests This commit makes the following changes to the kube cert agent tests: 1. Informers are synced on start using the controllerinit code 2. Deployment client and informer are synced per controller sync loop 3. Controller sync loop exits after two consistent errors 4. Use assert instead of require to avoid ending the test early Signed-off-by: Monis Khan <mok@vmware.com>	2021-09-16 14:48:04 -04:00
Ryan Richard	bdcf468e52	Add log statement for when kube cert agent key has been loaded Because it makes things easier to debug on a real cluster	2021-09-15 14:02:46 -07:00
Ryan Richard	55de160551	Bump the version number of the kube cert agent label Not required, but within the spirit of using the version number. Since the existing kube cert agent deployment will get deleted anyway during an upgrade, it shouldn't hurt to change the version number. New installations will get the new version number on the new kube cert agent deployment.	2021-09-14 15:27:15 -07:00
Ryan Richard	cec9f3c4d7	Improve the selectors of Deployments and Services Fixes #801. The solution is complicated by the fact that the Selector field of Deployments is immutable. It would have been easy to just make the Selectors of the main Concierge Deployment, the Kube cert agent Deployment, and the various Services use more specific labels, but that would break upgrades. Instead, we make the Pod template labels and the Service selectors more specific, because those are not immutable, and then handle the Deployment selectors in a special way. For the main Concierge and Supervisor Deployments, we cannot change their selectors, so they remain "app: app_name", and we make other changes to ensure that only the intended pods are selected. We keep the original "app" label on those pods and remove the "app" label from the pods of the Kube cert agent Deployment. By removing it from the Kube cert agent pods, there is no longer any chance that they will accidentally get selected by the main Concierge Deployment. For the Kube cert agent Deployment, we can change the immutable selector by deleting and recreating the Deployment. The new selector uses only the unique label that has always been applied to the pods of that deployment. Upon recreation, these pods no longer have the "app" label, so they will not be selected by the main Concierge Deployment's selector. The selectors of all Services have been updated to use new labels to more specifically target the intended pods. For the Concierge Services, this will prevent them from accidentally including the Kube cert agent pods. For the Supervisor Services, we follow the same convention just to be consistent and to help future-proof the Supervisor app in case it ever has a second Deployment added to it. The selector of the auto-created impersonation proxy Service was also previously using the "app" label. There is no change to this Service because that label will now select the correct pods, since the Kube cert agent pods no longer have that label. It would be possible to update that selector to use the new more specific label, but then we would need to invent a way to pass that label into the controller, so it seemed like more work than was justified.	2021-09-14 13:35:10 -07:00
Margo Crawford	0a1ee9e37c	Remove unused functions	2021-09-08 10:34:42 -07:00
Margo Crawford	05f5bac405	ValidatedSettings is all or nothing If either the search base or the tls settings is invalid, just recheck everything.	2021-09-07 13:09:35 -07:00
Margo Crawford	0195894a50	Test fix for ldap upstream watcher	2021-09-07 13:09:35 -07:00
Margo Crawford	27c1d2144a	Make sure search base in the validatedSettings cache is properly updated when the bind secret changes	2021-09-07 13:09:35 -07:00
Monis Khan	0d285ce993	Ensure concierge and supervisor gracefully exit Changes made to both components: 1. Logs are always flushed on process exit 2. Informer cache sync can no longer hang process start up forever Changes made to concierge: 1. Add pre-shutdown hook that waits for controllers to exit cleanly 2. Informer caches are synced in post-start hook Changes made to supervisor: 1. Add shutdown code that waits for controllers to exit cleanly 2. Add shutdown code that waits for active connections to become idle Waiting for controllers to exit cleanly is critical as this allows the leader election logic to release the lock on exit. This reduces the time needed for the next leader to be elected. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-30 20:29:52 -04:00
Monis Khan	5489f68e2f	supervisor: ensure graceful exit The kubelet will send the SIGTERM signal when it wants a process to exit. After a grace period, it will send the SIGKILL signal to force the process to terminate. The concierge has always handled both SIGINT and SIGTERM as indicators for it to gracefully exit (i.e. stop watches, controllers, etc). This change updates the supervisor to do the same (previously it only handled SIGINT). This is required to allow the leader election lock release logic to run. Otherwise it can take a few minutes for new pods to acquire the lease since they believe it is already held. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-28 11:23:11 -04:00
Margo Crawford	e5718351ba	Merge pull request #695 from vmware-tanzu/active-directory-identity-provider Active directory identity provider	2021-08-27 08:39:12 -07:00
Monis Khan	6c29f347b4	go 1.17 bump: fix unit test failures Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 09:46:58 -04:00
Margo Crawford	19100d68ef	Merge branch 'main' of github.com:vmware-tanzu/pinniped into active-directory-identity-provider	2021-08-26 20:42:16 -07:00
Mayank Bhatt	68547f767d	Copy hostNetwork field for kube-cert-agent For clusters where the control plane nodes aren't running a CNI, the kube-cert-agent pods deployed by concierge cannot be scheduled as they don't know to use `hostNetwork: true`. This change allows embedding the host network setting in the Concierge configuration (by copying it from the kube-controller-manager pod spec when generating the kube-cert-agent Deployment). Also fixed a stray double comma in one of the nearby tests.	2021-08-26 17:09:59 -07:00
Margo Crawford	2d32e0fa7d	Merge branch 'main' of github.com:vmware-tanzu/pinniped into active-directory-identity-provider	2021-08-26 16:21:08 -07:00
Margo Crawford	6f221678df	Change sAMAccountName env vars to userPrincipalName and add E2E ActiveDirectory test. Also fixed regexes in supervisor_login_test to be anchored to the beginning and end	2021-08-26 16:18:05 -07:00
Monis Khan	74daa1da64	test/integration: run parallel tests concurrently with serial tests Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-26 12:59:52 -04:00
Margo Crawford	1c5a2b8892	Add a couple more unit tests	2021-08-25 11:33:42 -07:00
Monis Khan	c71ffdcd1e	leader election: use better duration defaults OpenShift has good defaults for these duration fields that we can use instead of coming up with them ourselves: `e14e06ba8d/pkg/config/leaderelection/leaderelection.go (L87-L109)` Copied here for easy future reference: // We want to be able to tolerate 60s of kube-apiserver disruption without causing pod restarts. // We want the graceful lease re-acquisition fairly quick to avoid waits on new deployments and other rollouts. // We want a single set of guidance for nearly every lease in openshift. If you're special, we'll let you know. // 1. clock skew tolerance is leaseDuration-renewDeadline == 30s // 2. kube-apiserver downtime tolerance is == 78s // lastRetry=floor(renewDeadline/retryPeriod)*retryPeriod == 104 // downtimeTolerance = lastRetry-retryPeriod == 78s // 3. worst non-graceful lease acquisition is leaseDuration+retryPeriod == 163s // 4. worst graceful lease acquisition is retryPeriod == 26s if ret.LeaseDuration.Duration == 0 { ret.LeaseDuration.Duration = 137 * time.Second } if ret.RenewDeadline.Duration == 0 { // this gives 107/26=4 retries and allows for 137-107=30 seconds of clock skew // if the kube-apiserver is unavailable for 60s starting just before t=26 (the first renew), // then we will retry on 26s intervals until t=104 (kube-apiserver came back up at 86), and there will // be 33 seconds of extra time before the lease is lost. ret.RenewDeadline.Duration = 107 * time.Second } if ret.RetryPeriod.Duration == 0 { ret.RetryPeriod.Duration = 26 * time.Second } Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-24 16:21:53 -04:00
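
The arithmetic quoted in the leader election commit (c71ffdcd1e) can be checked with a short Go sketch. This is only an illustration of the derived numbers, not code from the repository: the `leaseTimings` type, its methods, and `defaultTimings` are hypothetical names introduced here, and only the three durations (137s, 107s, 26s) come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// leaseTimings is a hypothetical holder for the three durations
// quoted in the commit message; it is not a type from Pinniped or OpenShift.
type leaseTimings struct {
	LeaseDuration time.Duration
	RenewDeadline time.Duration
	RetryPeriod   time.Duration
}

// defaultTimings returns the values quoted in the commit message.
func defaultTimings() leaseTimings {
	return leaseTimings{
		LeaseDuration: 137 * time.Second,
		RenewDeadline: 107 * time.Second,
		RetryPeriod:   26 * time.Second,
	}
}

// clockSkewTolerance is leaseDuration - renewDeadline.
func (t leaseTimings) clockSkewTolerance() time.Duration {
	return t.LeaseDuration - t.RenewDeadline
}

// lastRetry is floor(renewDeadline/retryPeriod) * retryPeriod.
// Integer division on time.Duration performs the floor.
func (t leaseTimings) lastRetry() time.Duration {
	return (t.RenewDeadline / t.RetryPeriod) * t.RetryPeriod
}

// downtimeTolerance is lastRetry - retryPeriod.
func (t leaseTimings) downtimeTolerance() time.Duration {
	return t.lastRetry() - t.RetryPeriod
}

// worstNonGracefulAcquisition is leaseDuration + retryPeriod.
func (t leaseTimings) worstNonGracefulAcquisition() time.Duration {
	return t.LeaseDuration + t.RetryPeriod
}

// worstGracefulAcquisition is simply retryPeriod.
func (t leaseTimings) worstGracefulAcquisition() time.Duration {
	return t.RetryPeriod
}

func main() {
	t := defaultTimings()
	fmt.Println("clock skew tolerance:", t.clockSkewTolerance())
	fmt.Println("last retry:", t.lastRetry())
	fmt.Println("kube-apiserver downtime tolerance:", t.downtimeTolerance())
	fmt.Println("worst non-graceful acquisition:", t.worstNonGracefulAcquisition())
	fmt.Println("worst graceful acquisition:", t.worstGracefulAcquisition())
}
```

The five methods correspond to the numbered items in the quoted comment: 30s of clock skew tolerance, a last retry at 104s, 78s of downtime tolerance, and 163s and 26s for the worst non-graceful and graceful acquisitions.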

1009 Commits