Commit Graph

2288 Commits

Author SHA1 Message Date
Ryan Richard
475da05185
Merge pull request #810 from vmware-tanzu/docs_gitops_example
Install docs use more GitOps-friendly style
2021-08-25 16:46:58 -07:00
Ryan Richard
86bfd4f5e4 Number each install step using "1." 2021-08-25 16:37:36 -07:00
Mo Khan
2b9b034bd2
Merge pull request #811 from vmware-tanzu/test_shell_container_image
Replace one-off usages of busybox and debian images in integration tests
2021-08-25 19:13:13 -04:00
Ryan Richard
d20cab10b9 Replace one-off usages of busybox and debian images in integration tests
Those images that are pulled from Dockerhub will cause pull failures
on some test clusters due to Dockerhub rate limiting.

Because we already have some images that we use for testing, and
because those images are already pre-loaded onto our CI clusters
to make the tests faster, use one of those images and always specify
PullIfNotPresent to avoid pulling the image again during the integration
test.
2021-08-25 15:12:07 -07:00
Ryan Richard
399737e7c6 Install docs use more GitOps-friendly style 2021-08-25 14:33:48 -07:00
Mo Khan
c17e7bec49
Merge pull request #800 from enj/enj/i/leader_election_release
leader election: fix small race duration lease release
2021-08-25 10:29:19 -04:00
Monis Khan
c71ffdcd1e
leader election: use better duration defaults
OpenShift has good defaults for these duration fields that we can
use instead of coming up with them ourselves:

e14e06ba8d/pkg/config/leaderelection/leaderelection.go (L87-L109)

Copied here for easy future reference:

// We want to be able to tolerate 60s of kube-apiserver disruption without causing pod restarts.
// We want the graceful lease re-acquisition fairly quick to avoid waits on new deployments and other rollouts.
// We want a single set of guidance for nearly every lease in openshift.  If you're special, we'll let you know.
// 1. clock skew tolerance is leaseDuration-renewDeadline == 30s
// 2. kube-apiserver downtime tolerance is == 78s
//      lastRetry=floor(renewDeadline/retryPeriod)*retryPeriod == 104
//      downtimeTolerance = lastRetry-retryPeriod == 78s
// 3. worst non-graceful lease acquisition is leaseDuration+retryPeriod == 163s
// 4. worst graceful lease acquisition is retryPeriod == 26s
if ret.LeaseDuration.Duration == 0 {
	ret.LeaseDuration.Duration = 137 * time.Second
}

if ret.RenewDeadline.Duration == 0 {
	// this gives 107/26=4 retries and allows for 137-107=30 seconds of clock skew
	// if the kube-apiserver is unavailable for 60s starting just before t=26 (the first renew),
	// then we will retry on 26s intervals until t=104 (kube-apiserver came back up at 86), and there will
	// be 33 seconds of extra time before the lease is lost.
	ret.RenewDeadline.Duration = 107 * time.Second
}
if ret.RetryPeriod.Duration == 0 {
	ret.RetryPeriod.Duration = 26 * time.Second
}

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-24 16:21:53 -04:00
Monis Khan
c0617ceda4
leader election: in-memory leader status is stopped before release
This change fixes a small race condition that occurred when the
current leader failed to renew its lease.  Before this change, the
leader would first release the lease via the Kube API and then would
update its in-memory status to reflect that change.  Now those
events occur in the reverse (i.e. correct) order.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-24 15:02:56 -04:00
Mo Khan
f7751d13fe
Merge pull request #778 from vmware-tanzu/oidc_password_grant
Optionally allow OIDC password grant for CLI-based login experience
2021-08-24 13:02:07 -04:00
Mo Khan
3077034b2d
Merge branch 'main' into oidc_password_grant 2021-08-24 12:23:52 -04:00
Mo Khan
89cef2ea6c
Merge pull request #796 from enj/enj/i/leader_election_flake
leader election test: fix flake related to invalid assumption
2021-08-20 19:06:51 -04:00
Ryan Richard
211f4b23d1 Log auth endpoint errors with stack traces 2021-08-20 14:41:02 -07:00
Monis Khan
132ec0d2ad
leader election test: fix flake related to invalid assumption
Even though a client may hold the leader election lock in the Kube
lease API, that does not mean it has had a chance to update its
internal state to reflect that.  Thus we retry the checks in
checkOnlyLeaderCanWrite a few times to allow the client to catch up.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-20 17:04:26 -04:00
Mo Khan
ae505d8009
Merge pull request #788 from enj/enj/i/leader_election
Add Leader Election Middleware
2021-08-20 12:58:27 -04:00
Monis Khan
c356710f1f
Add leader election middleware
Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-20 12:18:25 -04:00
Matt Moyer
b9d186e8a3
Merge pull request #786 from mattmoyer/cleanup-go-mod
Cleanup `go.mod` replace directives that are no longer needed.
2021-08-20 08:43:36 -07:00
Matt Moyer
03a8160a91
Remove replace directive for dgrijalva/jwt-go.
We no longer have a transitive dependency on this older repository, so we don't need the replace directive anymore.

There is a new fork of this that we should move to (https://github.com/golang-jwt/jwt), but we can't easily do that until a couple of our direct dependencies upgrade.

This is a revert of d162cb9adf.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-08-20 10:15:55 -05:00
Matt Moyer
f379eee7a3
Drop replace directive for oleiade/reflections.
This is reverting 8358c26107.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-08-20 10:15:55 -05:00
Matt Moyer
4f5312807b
Undo dep hacks to work around gRPC example module.
This is essentially reverting 87c7e89b13.

Signed-off-by: Matt Moyer <moyerm@vmware.com>
2021-08-20 10:15:54 -05:00
Ryan Richard
6239a567a8 remove one nolint:unparam comment 2021-08-19 10:57:00 -07:00
Ryan Richard
e4d418a076 Merge branch 'main' into oidc_password_grant 2021-08-19 10:55:54 -07:00
Ryan Richard
c4727d57c8
Merge pull request #789 from vmware-tanzu/remove_unparam_linter
Remove `unparam` linter
2021-08-19 10:55:04 -07:00
Ryan Richard
b4a39ba3c4 Remove unparam linter
We decided that this linter does not provide very useful feedback
for our project.
2021-08-19 10:20:24 -07:00
Ryan Richard
cf627a82cb Merge branch 'main' into oidc_password_grant 2021-08-19 10:00:11 -07:00
Ryan Richard
42d31a7085 Update login.md doc to mention OIDC CLI-based flow 2021-08-19 09:59:47 -07:00
anjalitelang
02b8ed7e0b
Update ROADMAP.md
Removing features listed for July as they are shipped.
2021-08-19 12:19:31 -04:00
Ryan Richard
61c21d2977 Refactor some authorize and callback error handling, and add more tests 2021-08-18 12:06:46 -07:00
Ryan Richard
04b8f0b455 Extract Supervisor authorize endpoint string constants into apis pkg 2021-08-18 10:20:33 -07:00
Ryan Richard
0089540b07 Extract Supervisor IDP discovery endpoint string constants into apis pkg 2021-08-17 17:50:02 -07:00
Ryan Richard
62c6d53a21 Merge branch 'main' into oidc_password_grant 2021-08-17 15:23:29 -07:00
Ryan Richard
96474b3d99 Extract Supervisor IDP discovery endpoint types into apis package 2021-08-17 15:23:03 -07:00
Ryan Richard
964d16110e Some refactors based on PR feedback from @enj 2021-08-17 13:14:09 -07:00
Matt Moyer
d57637ee56
Merge pull request #783 from enj/enj/t/ignore_test_pods
test/integration: ignore restarts associated with test pods
2021-08-17 11:00:19 -07:00
Mo Khan
8ce4bb6dc1
Merge pull request #784 from enj/enj/r/specific_private
dynamiccert: prevent misuse of NewServingCert
2021-08-17 13:56:23 -04:00
Ryan Richard
a7c88b599c Merge branch 'main' into oidc_password_grant 2021-08-17 10:45:00 -07:00
Monis Khan
e0901f4fe5
dynamiccert: prevent misuse of NewServingCert
The Kube API server code that we use will cast inputs in an attempt
to see if they implement optional interfaces.  This change adds a
simple wrapper struct to prevent such casts from causing us any
issues.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-17 12:58:32 -04:00
Monis Khan
cf25c308cd
test/integration: ignore restarts associated with test pods
Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-17 12:57:41 -04:00
Mo Khan
9d11be899c
Merge pull request #785 from enj/enj/i/no_proxy_env
Provide good defaults for NO_PROXY
2021-08-17 12:55:12 -04:00
Monis Khan
66ddcf98d3
Provide good defaults for NO_PROXY
This change updates the default NO_PROXY for the supervisor to not
proxy requests to the Kubernetes API and other Kubernetes endpoints
such as Kubernetes services.

It also adds https_proxy and no_proxy settings for the concierge
with the same default.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-17 10:03:19 -04:00
Ryan Richard
3fb683f64e Update expected error message in e2e integration test 2021-08-16 15:40:34 -07:00
Ryan Richard
52409f86e8 Merge branch 'main' into oidc_password_grant 2021-08-16 15:17:55 -07:00
Ryan Richard
91c8a3ebed Extract private helper in auth_handler.go 2021-08-16 15:17:30 -07:00
Ryan Richard
52cb0bbc07 More unit tests and small error handling changes for OIDC password grant 2021-08-16 14:27:40 -07:00
Mo Khan
eb2a68fec0
Merge pull request #782 from vmware-tanzu/dependabot/go_modules/github.com/go-ldap/ldap/v3-3.4.1
Bump github.com/go-ldap/ldap/v3 from 3.3.0 to 3.4.1
2021-08-16 17:20:06 -04:00
dependabot[bot]
e05a46b7f5
Bump github.com/go-ldap/ldap/v3 from 3.3.0 to 3.4.1
Bumps [github.com/go-ldap/ldap/v3](https://github.com/go-ldap/ldap) from 3.3.0 to 3.4.1.
- [Release notes](https://github.com/go-ldap/ldap/releases)
- [Commits](https://github.com/go-ldap/ldap/compare/v3.3.0...v3.4.1)

---
updated-dependencies:
- dependency-name: github.com/go-ldap/ldap/v3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-08-16 20:46:25 +00:00
Mo Khan
46304c8137
Merge pull request #775 from enj/enj/i/dynamiccert_no_unload
impersonatorconfig: only unload dynamiccert when proxy is disabled
2021-08-16 16:36:03 -04:00
Monis Khan
7a812ac5ed
impersonatorconfig: only unload dynamiccert when proxy is disabled
In the upstream dynamiccertificates package, we rely on two pieces
of code:

1. DynamicServingCertificateController.newTLSContent which calls
   - clientCA.CurrentCABundleContent
   - servingCert.CurrentCertKeyContent
2. unionCAContent.VerifyOptions which calls
   - unionCAContent.CurrentCABundleContent

This results in calls to our tlsServingCertDynamicCertProvider and
impersonationSigningCertProvider.  If we Unset these providers, we
subtly break these consumers.  At best this results in test slowness
and flakes while we wait for reconcile loops to converge.  At worst,
it results in actual errors during runtime.  For example, we
previously would Unset the impersonationSigningCertProvider on any
sync loop error (even a transient one caused by a network blip or
a conflict between writes from different replicas of the concierge).
This would cause us to transiently fail to issue new certificates
from the token credential require API.  It would also cause us to
transiently fail to authenticate previously issued client certs
(which results in occasional Unauthorized errors in CI).

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-16 16:07:46 -04:00
Ryan Richard
71d6281e39 Merge branch 'main' into oidc_password_grant 2021-08-16 09:30:13 -07:00
Mo Khan
bb30569e41
Merge pull request #780 from enj/enj/i/browser_stderr
cli: prevent browser output from breaking ExecCredential output
2021-08-16 10:34:33 -04:00
Monis Khan
942c55cf51
cli: prevent browser output from breaking ExecCredential output
This change updates the pinniped CLI entrypoint to prevent browser
processes that we spawn from polluting our std out stream.

For example, chrome will print the following message to std out:

Opening in existing browser session.

Which leads to the following incomprehensible error message from
kubectl:

Unable to connect to the server: getting credentials:
decoding stdout: couldn't get version/kind; json parse error:
json: cannot unmarshal string into Go value of type struct
{ APIVersion string "json:\"apiVersion,omitempty\"";
  Kind string "json:\"kind,omitempty\"" }

This would only occur on the initial login when we opened the
browser.  Since credentials would be cached afterwards, kubectl
would work as expected for future invocations as no browser was
opened.

I could not think of a good way to actually test this change.  There
is a clear gap in our integration tests - we never actually launch a
browser in the exact same way a user does - we instead open a chrome
driver at the login URL as a subprocess of the integration test
binary and not the pinniped CLI.  Thus even if the chrome driver was
writing to std out, we would not notice any issues.

It is also unclear if there is a good way to prevent future related
bugs since std out is global to the process.

Signed-off-by: Monis Khan <mok@vmware.com>
2021-08-16 09:13:57 -04:00