ContainerImage.Pinniped

Author	SHA1	Message	Date
Monis Khan	0d285ce993	Ensure concierge and supervisor gracefully exit Changes made to both components: 1. Logs are always flushed on process exit 2. Informer cache sync can no longer hang process start up forever Changes made to concierge: 1. Add pre-shutdown hook that waits for controllers to exit cleanly 2. Informer caches are synced in post-start hook Changes made to supervisor: 1. Add shutdown code that waits for controllers to exit cleanly 2. Add shutdown code that waits for active connections to become idle Waiting for controllers to exit cleanly is critical as this allows the leader election logic to release the lock on exit. This reduces the time needed for the next leader to be elected. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-30 20:29:52 -04:00
Matt Moyer	e43bd59688	Merge pull request #830 from mattmoyer/update-youtube-demo-link Update YouTube demo link to our official page.	2021-08-30 14:30:15 -07:00
Matt Moyer	0c8d885c26	Update YouTube demo link to our official page. Signed-off-by: Matt Moyer <moyerm@vmware.com>	2021-08-30 16:29:32 -05:00
Mo Khan	d2dfe3634a	Merge pull request #828 from enj/enj/i/supervisor_graceful_exit supervisor: ensure graceful exit	2021-08-28 13:40:13 -04:00
Monis Khan	5489f68e2f	supervisor: ensure graceful exit The kubelet will send the SIGTERM signal when it wants a process to exit. After a grace period, it will send the SIGKILL signal to force the process to terminate. The concierge has always handled both SIGINT and SIGTERM as indicators for it to gracefully exit (i.e. stop watches, controllers, etc). This change updates the supervisor to do the same (previously it only handled SIGINT). This is required to allow the leader election lock release logic to run. Otherwise it can take a few minutes for new pods to acquire the lease since they believe it is already held. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-28 11:23:11 -04:00
Ryan Richard	4eb500cc41	Merge pull request #826 from vmware-tanzu/simplify_readme Simplify the main README.md to reduce duplication with website	2021-08-27 16:40:53 -07:00
Ryan Richard	871a9fb0c6	Simplify the main README.md to reduce duplication with website	2021-08-27 15:52:51 -07:00
Mo Khan	d580695faa	Merge pull request #824 from enj/enj/t/disruptive_hang test/integration: use short timeouts with distinct requests to prevent hangs	2021-08-27 16:38:39 -04:00
Monis Khan	ba80b691e1	test/integration: use short timeouts with distinct requests to prevent hangs Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 16:10:36 -04:00
Mo Khan	41c017c9da	Merge pull request #821 from enj/enj/t/increase_disruptive_test_timeout test/integration: increase timeout on disruptive tests	2021-08-27 15:24:43 -04:00
Monis Khan	5078cdbc90	test/integration: increase timeout on disruptive tests Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 14:56:51 -04:00
Margo Crawford	e5718351ba	Merge pull request #695 from vmware-tanzu/active-directory-identity-provider Active directory identity provider	2021-08-27 08:39:12 -07:00
Mo Khan	36ff0d52da	Merge pull request #818 from enj/enj/i/bump_go1.17 Bump to Go 1.17.0	2021-08-27 10:30:51 -04:00
Monis Khan	ad3086b8f1	Downgrade go mod compat to 1.16 for golangci-lint Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 10:03:48 -04:00
Monis Khan	6c29f347b4	go 1.17 bump: fix unit test failures Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 09:46:58 -04:00
Monis Khan	a86949d0be	Use go 1.17 module lazy loading See https://golang.org/doc/go1.17#go-command for details. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 09:46:58 -04:00
Monis Khan	44f03af4b9	Bump to Go 1.17.0 Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 09:00:49 -04:00
Mo Khan	ce5cfde11e	Merge pull request #816 from enj/enj/i/bump_1.22.1 Bump Kube to v0.22.1	2021-08-27 08:40:23 -04:00
Monis Khan	40d70bf1fc	Bump Kube to v0.22.1 Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-27 07:36:12 -04:00
Margo Crawford	19100d68ef	Merge branch 'main' of github.com:vmware-tanzu/pinniped into active-directory-identity-provider	2021-08-26 20:42:16 -07:00
Mo Khan	1d44aa945d	Merge pull request #814 from mayankbh/topic/bmayank/inherit-hostnetwork Allow use of hostNetwork for kube-cert-agent	2021-08-26 21:13:29 -04:00
Mayank Bhatt	68547f767d	Copy hostNetwork field for kube-cert-agent For clusters where the control plane nodes aren't running a CNI, the kube-cert-agent pods deployed by concierge cannot be scheduled as they don't know to use `hostNetwork: true`. This change allows embedding the host network setting in the Concierge configuration. (by copying it from the kube-controller-manager pod spec when generating the kube-cert-agent Deployment) Also fixed a stray double comma in one of the nearby tests.	2021-08-26 17:09:59 -07:00
Margo Crawford	43694777d5	Change some comments on API docs, fix lint error by ignoring it	2021-08-26 16:55:43 -07:00
Ryan Richard	f579b1cb9f	Merge pull request #812 from vmware-tanzu/resources_section_web_site Add "Resources" section to pinniped.dev web site	2021-08-26 16:23:36 -07:00
Margo Crawford	2d32e0fa7d	Merge branch 'main' of github.com:vmware-tanzu/pinniped into active-directory-identity-provider	2021-08-26 16:21:08 -07:00
Margo Crawford	6f221678df	Change sAMAccountName env vars to userPrincipalName and add E2E ActiveDirectory test also fixed regexes in supervisor_login_test to be anchored to the beginning and end	2021-08-26 16:18:05 -07:00
Ryan Richard	e24040b0a9	add link to CNCF presentation slides	2021-08-26 15:52:04 -07:00
Mo Khan	1d269d2f6d	Merge pull request #815 from enj/enj/t/integration_parallel_disruptive test/integration: mark certain tests as disruptive	2021-08-26 17:32:14 -04:00
Monis Khan	d4a7f0b3e1	test/integration: mark certain tests as disruptive This prevents them from running with any other test, including other parallel tests. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-26 15:11:47 -04:00
Mo Khan	d22099ac33	Merge pull request #808 from enj/enj/t/integration_parallel test/integration: run parallel tests concurrently with serial tests	2021-08-26 14:34:18 -04:00
Monis Khan	e2cf9f6b74	leader election test: approximate that followers have observed change Instead of blindly waiting long enough for a disruptive change to have been observed by the old leader and followers, we instead rely on the approximation that checkOnlyLeaderCanWrite provides - i.e. only a single actor believes they are the leader. This does not account for clients that were in the followers list before and after the disruptive change, but it serves as a reasonable approximation. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-26 12:59:52 -04:00
Monis Khan	74daa1da64	test/integration: run parallel tests concurrently with serial tests Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-26 12:59:52 -04:00
Ryan Richard	475da05185	Merge pull request #810 from vmware-tanzu/docs_gitops_example Install docs use more GitOps-friendly style	2021-08-25 16:46:58 -07:00
Ryan Richard	86bfd4f5e4	Number each install step using "1."	2021-08-25 16:37:36 -07:00
Ryan Richard	d453bf3403	Add "Resources" section to pinniped.dev web site	2021-08-25 16:25:53 -07:00
Mo Khan	2b9b034bd2	Merge pull request #811 from vmware-tanzu/test_shell_container_image Replace one-off usages of busybox and debian images in integration tests	2021-08-25 19:13:13 -04:00
Ryan Richard	d20cab10b9	Replace one-off usages of busybox and debian images in integration tests Those images that are pulled from Dockerhub will cause pull failures on some test clusters due to Dockerhub rate limiting. Because we already have some images that we use for testing, and because those images are already pre-loaded onto our CI clusters to make the tests faster, use one of those images and always specify PullIfNotPresent to avoid pulling the image again during the integration test.	2021-08-25 15:12:07 -07:00
Ryan Richard	399737e7c6	Install docs use more GitOps-friendly style	2021-08-25 14:33:48 -07:00
Margo Crawford	1c5a2b8892	Add a couple more unit tests	2021-08-25 11:33:42 -07:00
Mo Khan	c17e7bec49	Merge pull request #800 from enj/enj/i/leader_election_release leader election: fix small race duration lease release	2021-08-25 10:29:19 -04:00
Monis Khan	c71ffdcd1e	leader election: use better duration defaults OpenShift has good defaults for these duration fields that we can use instead of coming up with them ourselves: `e14e06ba8d/pkg/config/leaderelection/leaderelection.go (L87-L109)` Copied here for easy future reference: // We want to be able to tolerate 60s of kube-apiserver disruption without causing pod restarts. // We want the graceful lease re-acquisition fairly quick to avoid waits on new deployments and other rollouts. // We want a single set of guidance for nearly every lease in openshift. If you're special, we'll let you know. // 1. clock skew tolerance is leaseDuration-renewDeadline == 30s // 2. kube-apiserver downtime tolerance is == 78s // lastRetry=floor(renewDeadline/retryPeriod)*retryPeriod == 104 // downtimeTolerance = lastRetry-retryPeriod == 78s // 3. worst non-graceful lease acquisition is leaseDuration+retryPeriod == 163s // 4. worst graceful lease acquisition is retryPeriod == 26s if ret.LeaseDuration.Duration == 0 { ret.LeaseDuration.Duration = 137 * time.Second } if ret.RenewDeadline.Duration == 0 { // this gives 107/26=4 retries and allows for 137-107=30 seconds of clock skew // if the kube-apiserver is unavailable for 60s starting just before t=26 (the first renew), // then we will retry on 26s intervals until t=104 (kube-apiserver came back up at 86), and there will // be 33 seconds of extra time before the lease is lost. ret.RenewDeadline.Duration = 107 * time.Second } if ret.RetryPeriod.Duration == 0 { ret.RetryPeriod.Duration = 26 * time.Second } Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-24 16:21:53 -04:00
Margo Crawford	c590c8ff41	Merge branch 'main' of github.com:vmware-tanzu/pinniped into active-directory-identity-provider	2021-08-24 12:19:29 -07:00
Monis Khan	c0617ceda4	leader election: in-memory leader status is stopped before release This change fixes a small race condition that occurred when the current leader failed to renew its lease. Before this change, the leader would first release the lease via the Kube API and then would update its in-memory status to reflect that change. Now those events occur in the reverse (i.e. correct) order. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-24 15:02:56 -04:00
Mo Khan	f7751d13fe	Merge pull request #778 from vmware-tanzu/oidc_password_grant Optionally allow OIDC password grant for CLI-based login experience	2021-08-24 13:02:07 -04:00
Mo Khan	3077034b2d	Merge branch 'main' into oidc_password_grant	2021-08-24 12:23:52 -04:00
Mo Khan	89cef2ea6c	Merge pull request #796 from enj/enj/i/leader_election_flake leader election test: fix flake related to invalid assumption	2021-08-20 19:06:51 -04:00
Ryan Richard	211f4b23d1	Log auth endpoint errors with stack traces	2021-08-20 14:41:02 -07:00
Monis Khan	132ec0d2ad	leader election test: fix flake related to invalid assumption Even though a client may hold the leader election lock in the Kube lease API, that does not mean it has had a chance to update its internal state to reflect that. Thus we retry the checks in checkOnlyLeaderCanWrite a few times to allow the client to catch up. Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-20 17:04:26 -04:00
Mo Khan	ae505d8009	Merge pull request #788 from enj/enj/i/leader_election Add Leader Election Middleware	2021-08-20 12:58:27 -04:00
Monis Khan	c356710f1f	Add leader election middleware Signed-off-by: Monis Khan <mok@vmware.com>	2021-08-20 12:18:25 -04:00

2359 Commits